Difference between pages "Making the Distribution, Part 2" and "Package:Nftables"

(this needs daniels review!)
 
 
Line 1: Line 1:
{{Article
+
{{Ebuild
|Summary=In his previous article, Daniel Robbins told the story of how he became a Stampede Linux developer and why he eventually left Stampede to start the Enoch Linux distribution. In this go-round he lets you in on the strange events that happened after the Enoch development team discovered a little-known, blazingly fast compiler.
+
|Summary=Linux kernel (3.13+) firewall, NAT and packet mangling tools
|Article Category=General
+
|CatPkg=net-firewall/nftables
|Author=Drobbins
+
|Repository=Gentoo Portage Tree
|Previous in Series=Making the Distribution, Part 1
+
|Next in Series=Making the Distribution, Part 3
+
 
}}
 
}}
== From Enoch to Gentoo, via minor setbacks and corporate run-ins ==
+
=== What is nftables? ===
 +
'''nftables''' is the successor to [[iptables]]. It replaces the existing iptables, ip6tables, arptables and ebtables frameworks. It consists of a Linux kernel component and a new userspace utility called nft. nftables also provides a compatibility layer for the ip(6)tables framework.
  
=== First steps to Enoch ===
+
==Introduction==
 +
As with the iptables framework, nftables is built upon rules which specify the actions. These rules are attached to chains. A chain can contain a collection of rules and is registered into the netfilter hooks. Chains are stored inside tables. A table is specific to one of the layer 3 protocols. One of the main differences from iptables is that there are no predefined tables and chains anymore.
  
In my previous article, I gave you the low-down on my days with the Stampede development team and why I left (to get away from lower-level politically-minded, project-controlling "freaks"). Because of the interference from these meddlesome by-standers, I figured it would be easier to put together my own Linux distribution than to continue improving Stampede under such dirty conditions! Fortunately I took with me a considerable amount of experience based on my (may I say substantial?) work for Stampede, including maintaining several of their packages, designing the initialization scripts, and leading the slpv6 (next-generation package management project).
+
===Tables===
 +
A table is nothing more than a container for your chains. With nftables there are no predefined tables (filter, raw, mangle...) anymore. You are free to recreate the iptables-like structure, but any other layout will work as well.
 +
Currently there are 5 different families of tables:
 +
* '''ip''': Used for IPv4 related chains;
 +
* '''ip6''': Used for IPv6 related chains;
 +
* '''arp''': Used for ARP related chains;
 +
* '''bridge''': Used for bridging related chains;
 +
* '''inet''': Mixed IPv4/IPv6 chains (kernel 3.14 and up).
  
The distribution I began working on, code-named Enoch, was going to be blazingly fast because it would completely automate the package creation and upgrading process. I have to admit that this was in large part because I was a one-member team and couldn't afford to spend my time on repetitive work that my development box could be automated to do for me. And since I was designing a complete distribution from scratch (rather than "spinning off" from someone like RedHat), I had my work cut out for me and needed all the free time I could scrounge up.
+
It is not hard to recognize the old tables framework in these tables. The only new one is the inet table which is used for both IPv4 and IPv6 traffic. It should make firewalling for dual-stack hosts easier by combining the rules for IPv4 and IPv6.
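As a hedged illustration of the inet family, a ruleset file along the following lines should cover both IPv4 and IPv6 traffic with a single chain. The table name, the chain name and the choice of port 22 are assumptions made for this sketch; nothing in the package prescribes them:
<pre>
# sketch only: one dual-stack input chain (names and port are hypothetical)
table inet filter {
        chain input {
                type filter hook input priority 0;
                # keep replies to already established connections
                ct state established,related accept
                # always allow loopback traffic
                iifname "lo" accept
                # assumed example service: SSH over IPv4 and IPv6
                tcp dport 22 accept
                drop
        }
}
</pre>
With separate ip and ip6 tables, the same rules would have to be written twice, once per family.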
  
After getting my basic Enoch system up and running, I headed back to irc.openprojects.net and started my own channel called #enoch. From there I gradually assembled a team of about ten developers. In those early days we all hung out on IRC and worked on the distribution in our spare time. As we communally and cooperatively hacked away at it, finding and fixing new bugs, Enoch became more functional and professional every day.
+
===Chains===
 +
Chains are used to group rules together. As with tables, nftables does not have any predefined chains. Chains come in two kinds: base and non-base. Base chains are registered with one of the netfilter hooks. A base chain has a hook it is registered with, a type and a priority. Non-base chains are not attached to a hook and do not see any traffic by default. They can be used to arrange a ruleset in a tree of chains.
 +
There are currently three types of chains:
 +
* '''filter''': for filtering packets
 +
* '''route''': for rerouting packets
 +
* '''nat''': for performing Network Address Translation. Only the first packet of a flow hits this chain, making it impossible to use it for filtering.
 +
The hooks that can be used are:
 +
* '''prerouting''': Before the routing decision; all packets entering the machine hit this hook
 +
* '''input''': All packets destined for the local system hit this hook
 +
* '''forward''': Packets not destined for the local system, i.e. those that need to be forwarded, hit this hook
 +
* '''output''': Packets that originate from the local system pass this hook
 +
* '''postrouting''': After the routing decision; all packets leaving the machine hit this hook
 +
{{Note|The ARP address family only supports the input and output hooks}}
 +
{{Note|The bridge address family only seems to support the input, forward and output hooks}}
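The difference between base and non-base chains can be sketched in a ruleset file as follows. The table and chain names (filter, input, tcp_checks) are hypothetical and only serve the example:
<pre>
table ip filter {
        # non-base chain: no type/hook/priority line, so it sees no traffic by default
        chain tcp_checks {
        }
        # base chain: of type filter, registered with the input hook at priority 0
        chain input {
                type filter hook input priority 0;
        }
}
</pre>
The only thing that makes input a base chain here is the type/hook/priority line; without it, the chain is only reached when another chain jumps to it.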
  
=== The first roadblock ===
+
====Priorities====
 +
{{Note| Priorities do not currently appear to have any effect on which chain sees packets first.}}
 +
{{Note| Since the priority seems to be an unsigned integer, negative priorities will be converted into very high priorities.}}
  
One inevitable day, Enoch hit its first roadblock. After adding Xfree86, glib, and gtk+, I decided to get xmms (an X11/gtk+-based MP3/CD player app) working. I figured it was time to celebrate with some music! But after installing xmms, I tried to start it... and X locked up! At first I thought xmms locked up because I used insane compiler optimizations ("-O6 -mpentiumpro", in case you were wondering). My first thought, to compile xmms with standard optimizations, didn't solve the problem. So I started looking elsewhere. After spending a full week of development time trying to track down the problem, I got an e-mail from an Enoch user, Omegadan, who was also experiencing xmms lockups.
+
===Rules===
 +
Rules specify which action has to be taken for which packets. Rules are attached to chains. Each rule can have an expression to match packets with and one or more actions to perform when it matches. The main differences from iptables are that a single rule can specify multiple actions and that counters are off by default. If you want packet and byte counters for a rule, you must request them explicitly in that rule.
 +
Each rule has a unique handle number by which it can be distinguished.
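A minimal sketch of rules that enable counters explicitly and combine several actions might look like this. The table and chain names, the port and the 192.0.2.0/24 documentation prefix are assumptions for illustration only:
<pre>
table ip filter {
        chain input {
                type filter hook input priority 0;
                # counters are opt-in: this rule counts the packets and bytes it accepts
                tcp dport 22 counter accept
                # two non-terminal actions (counter, log) followed by one verdict (drop)
                ip saddr 192.0.2.0/24 counter log drop
        }
}
</pre>
Rules written without the counter keyword keep no packet or byte statistics.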
 +
The following matches are available:
 +
* '''ip''': IP protocol
 +
* '''ip6''': IPv6 protocol
 +
* '''tcp''': TCP protocol
 +
* '''udp''': UDP protocol
 +
* '''udplite''': UDP-lite protocol
 +
* '''sctp''': SCTP protocol
 +
* '''dccp''': DCCP protocol
 +
* '''ah''': Authentication headers
 +
* '''esp''': Encrypted security payload headers
 +
* '''ipcomp''': IPcomp headers
 +
* '''icmp''': icmp protocol
 +
* '''icmpv6''': icmpv6 protocol
 +
* '''ct''': Connection tracking
 +
* '''meta''': meta properties such as interfaces
  
We corresponded for a while, and after many hours of testing we determined that the problem was a POSIX threads-related issue. For some reason, a pthread_mutex_trylock() call did not return the way it should. As the creator of a distribution, these were the types of bugs I really didn't want to encounter. I counted on the developers to release perfect sources so I could focus on enhancing the Linux experience rather than getting buggy sources to work. Of course I soon learned that this was an unrealistic expectation, and that problems like this will always pop up from time to time.
+
====Matches====
 +
{|class=wikitable
 +
| Match
 +
| Arguments
 +
| Description/Example
 +
|-
 +
| rowspan="11" | '''ip'''
 +
| version
 +
| IP header version
 +
|-
 +
| hdrlength
 +
| IP header length
 +
|-
 +
| tos
 +
|Type of Service
 +
|-
 +
| length
 +
| Total packet length
 +
|-
 +
| id
 +
| IP ID
 +
|-
 +
| frag-off
 +
| Fragmentation offset
 +
|-
 +
| ttl
 +
| Time to live
 +
|-
 +
| protocol
 +
| Upper layer protocol
 +
|-
 +
| checksum
 +
| IP header checksum
 +
|-
 +
| saddr
 +
| Source address
 +
|-
 +
| daddr
 +
| Destination address
 +
|-
 +
| rowspan="8" | '''ip6'''
 +
| version
 +
| IP header version
 +
|-
 +
| priority
 +
|
 +
|-
 +
| flowlabel
 +
| Flow label
 +
|-
 +
| length
 +
| Payload length
 +
|-
 +
| nexthdr
 +
| Next header type (Upper layer protocol number)
 +
|-
 +
| hoplimit
 +
| Hop limit
 +
|-
 +
|saddr
 +
| Source Address
 +
|-
 +
|daddr
 +
| Destination Address
 +
|-
 +
| rowspan="9" | '''tcp'''
 +
| sport
 +
| Source port
 +
|-
 +
| dport
 +
| Destination port
 +
|-
 +
| sequence
 +
| Sequence number
 +
|-
 +
| ackseq
 +
| Acknowledgement number
 +
|-
 +
| doff
 +
| Data offset
 +
|-
 +
| flags
 +
| TCP flags
 +
|-
 +
| window
 +
| Window
 +
|-
 +
| checksum
 +
| Checksum
 +
|-
 +
| urgptr
 +
| Urgent pointer
 +
|-
 +
| rowspan="4" | '''udp'''
 +
| sport
 +
| Source port
 +
|-
 +
| dport
 +
| Destination port
 +
|-
 +
| length
 +
| Total packet length
 +
|-
 +
| checksum
 +
| Checksum
 +
|-
 +
| rowspan="4" | '''udplite'''
 +
| sport
 +
| Source port
 +
|-
 +
| dport
 +
| Destination port
 +
|-
 +
| cscov
 +
| Checksum coverage
 +
|-
 +
| checksum
 +
| Checksum
 +
|-
 +
| rowspan="4" |'''sctp'''
 +
| sport
 +
| Source port
 +
|-
 +
| dport
 +
| Destination port
 +
|-
 +
|vtag
 +
|Verification tag
 +
|-
 +
| checksum
 +
| Checksum
 +
|-
 +
| rowspan="2" |'''dccp'''
 +
| sport
 +
| Source port
 +
|-
 +
| dport
 +
| Destination port
 +
|-
 +
| rowspan="4" |'''ah'''
 +
| nexthdr
 +
| Next header protocol (Upper layer protocol)
 +
|-
 +
| hdrlength
 +
| AH header length
 +
|-
 +
| spi
 +
| Security Parameter Index
 +
|-
 +
| sequence
 +
| Sequence Number
 +
|-
 +
| rowspan="2" | '''esp'''
 +
| spi
 +
| Security Parameter Index
 +
|-
 +
| sequence
 +
| Sequence Number
 +
|-
 +
| rowspan="3" | '''ipcomp'''
 +
| nexthdr
 +
| Next header protocol (Upper layer protocol)
 +
|-
 +
| flags
 +
| Flags
 +
|-
 +
| cpi
 +
| Compression Parameter Index
 +
|-
 +
| '''icmp'''
 +
| type
 +
| icmp packet type
 +
|-
 +
| '''icmpv6'''
 +
| type
 +
| icmpv6 packet type
 +
|-
 +
|rowspan="12"|'''ct'''
 +
|state
 +
|State of the connection
 +
|-
 +
|direction
 +
|Direction of the packet relative to the connection
 +
|-
 +
|status
 +
|Status of the connection
 +
|-
 +
|mark
 +
|Connection mark
 +
|-
 +
|expiration
 +
|Connection expiration time
 +
|-
 +
|helper
 +
|Helper associated with the connection
 +
|-
 +
|l3proto
 +
|Layer 3 protocol of the connection
 +
|-
 +
|saddr
 +
|Source address of the connection for the given direction
 +
|-
 +
|daddr
 +
|Destination address of the connection for the given direction
 +
|-
 +
|protocol
 +
|Layer 4 protocol of the connection for the given direction
 +
|-
 +
|proto-src
 +
|Layer 4 protocol source for the given direction
 +
|-
 +
|proto-dst
 +
|Layer 4 protocol destination for the given direction
 +
|-
 +
| rowspan="13" | '''meta'''
 +
| length
 +
| Length of the packet in bytes: ''meta length > 1000''
 +
|-
 +
| protocol
 +
| ethertype protocol: ''meta protocol vlan''
 +
|-
 +
| priority
 +
| TC packet priority
 +
|-
 +
| mark
 +
| Packet mark
 +
|-
 +
| iif
 +
| Input interface index
 +
|-
 +
| iifname
 +
| Input interface name
 +
|-
 +
| iiftype
 +
| Input interface type
 +
|-
 +
| oif
 +
| Output interface index
 +
|-
 +
| oifname
 +
| Output interface name
 +
|-
 +
| oiftype
 +
| Output interface hardware type
 +
|-
 +
| skuid
 +
| UID associated with originating socket
 +
|-
 +
| skgid
 +
| GID associated with originating socket
 +
|-
 +
| rtclassid
 +
| Routing realm
 +
|-
 +
|}
 +
====Statements====
 +
Statements represent the action to be performed when the rule matches. They come in two kinds: terminal statements, which unconditionally terminate the evaluation of the current rule, and non-terminal statements, which terminate it either conditionally or never. A rule can contain an arbitrary number of non-terminal statements, but only a single terminal statement, which must be the last statement of the rule.
 +
The terminal statements can be:
 +
* '''accept''': Accept the packet and stop the ruleset evaluation.
 +
* '''drop''': Drop the packet and stop the ruleset evaluation.
 +
* '''reject''': Reject the packet with an icmp message
 +
* '''queue''': Queue the packet to userspace and stop the ruleset evaluation.
 +
* '''continue''': Continue the ruleset evaluation with the next rule.
 +
* '''return''': Return from the current chain and continue at the next rule of the last chain. In a base chain it is equivalent to accept
 +
* '''jump <chain>''': Continue at the first rule of <chain>. It will continue at the next rule after a return statement is issued
 +
* '''goto <chain>''': Similar to jump, but after the new chain the evaluation will continue at the last chain instead of the one containing the goto statement
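To make the jump and return verdicts more concrete, here is a hedged sketch of a ruleset file. The chain name tcp_checks and the port are hypothetical:
<pre>
table ip filter {
        chain tcp_checks {
                tcp dport 22 accept
                # hand control back to the chain that jumped here
                return
        }
        chain input {
                type filter hook input priority 0;
                # evaluate tcp_checks for TCP packets, then resume at the next rule
                ip protocol tcp jump tcp_checks
                # reached by everything not accepted above, including packets returned from tcp_checks
                drop
        }
}
</pre>
Had the input chain used goto instead of jump, evaluation would not come back to the drop rule after tcp_checks finished.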
  
As it turned out, the problem wasn't with xmms, gtk+, or glib. And it wasn't an issue with Xfree86 3.3.5 not being thread-safe and locking up. Surprisingly, we found the bug in the Linux POSIX threads implementation itself, part of the GNU C library (glibc) version 2.1.2. I was shocked at the time to find that such a critical part of Linux had such a major bug. (And we used a release version of glibc in Enoch, not a prerelease or CVS version!).
+
== Installing nftables ==
 +
=== Kernel ===
 +
These kernel options must be set:
  
So how did we track down the problem? Actually, we never were able to come up with a bug fix, but at one point I stumbled across a couple of e-mails on the glibc developer mailing list from another person who had the same problem. The glibc developer who replied posted a patch that solved the thread problem for us. But I was curious why RedHat 6 (which also used glibc 2.1.2) didn't suffer from this problem since the patch was just posted and RedHat 6 had been available for some time. To find out, I downloaded RedHat's glibc SRPM (source RPM) and took a look at their patches.
+
[*] Networking support  --->
 +
    Networking options  --->
 +
        [*] Network packet filtering framework (Netfilter) --->
 +
            Core Netfilter Configuration  --->
 +
                <M> Netfilter nf_tables support
 +
                <M>  Netfilter nf_tables IPv6 exthdr module
 +
                <M>  Netfilter nf_tables meta module
 +
                <M>  Netfilter nf_tables conntrack module
 +
                <M>  Netfilter nf_tables rbtree set module
 +
                <M>  Netfilter nf_tables hash set module
 +
                <M>  Netfilter nf_tables counter module
 +
                <M>  Netfilter nf_tables log module
 +
                <M>  Netfilter nf_tables limit module
 +
                <M>  Netfilter nf_tables nat module
 +
                <M>  Netfilter x_tables over nf_tables module
 +
            IP: Netfilter Configuration  --->
 +
                <M> IPv4 nf_tables support
 +
                <M>  nf_tables IPv4 reject support
 +
                <M>  IPv4 nf_tables route chain support
 +
                <M>  IPv4 nf_tables nat chain support
 +
            IPv6: Netfilter Configuration  --->
 +
                <M> IPv6 nf_tables support
 +
                <M>  IPv6 nf_tables route chain support
 +
                <M>  IPv6 nf_tables nat chain support
 +
            <M>  Ethernet Bridge nf_tables support
  
RedHat had their own homegrown glibc patch that solved the pthread_mutex_trylock() issue. Apparently they experienced the same problem and created their own custom fix. Too bad they didn't send this patch "upstream" to the glibc developers so it could be shared with the rest of the world. But who knows, maybe RedHat sent the patch upstream and for some reason the glibc developers didn't accept it. Or maybe the thread bug was triggered by a specific combination of compiler and binutils versions, and RedHat never ran into it (although they did have a thread patch in their SRPM). I suppose we'll never know exactly what happened. But I did learn that RedHat SRPMs contain a lot of private bug fixes and tweaks that never seem to make it upstream to the original developers. I'm going to rant about this for a little while.
+
=== Emerging ===
 +
To install nftables, run the following command:
 +
<console>
 +
###i## emerge net-firewall/nftables
 +
</console>
  
=== Rant ===
 
  
When you put together a Linux distribution it's really important that any bug fixes you create are sent upstream to the original developers. As I see it, this is one of the many ways that distribution creators contribute to Linux. We're the guys who actually get all these different programs working as a unified whole. We should send our fixes upstream as we unify so that other users and distributions can benefit from our discoveries. If you decide to keep bug fixes to yourself, you're not helping anyone; you're just ensuring that a lot of people will waste time fixing the same problem over and over again. This kind of policy goes against the whole open source ethic and stunts the growth of Linux development. Maybe I should say that it "bugs" us all.
+
== OpenRC configuration ==
 +
Don't forget to add the nftables service to the default runlevel so that it is started at boot:
 +
<console>
 +
###i## rc-update add nftables default
 +
</console>
  
It's unfortunate that some distributions (ahem) aren't as good (RedHat) as others (Debian) about sharing their work with the community.
+
You cannot use iptables and nft to perform NAT at the same time, so make sure that the iptable_nat module is unloaded. To remove the iptable_nat module:
 +
<console>
 +
###i## rmmod iptable_nat
 +
</console>
  
=== Compiler drama ===
+
Start nftables:
 +
<console>
 +
###i## /etc/init.d/nftables start
 +
</console>
  
During the time we were trying to fix the glibc threads problem, I e-mailed Ulrich Drepper (one of the guys at Cygnus who is heavily involved with glibc development). I mentioned the POSIX thread problem we were having, and that Enoch was using pgcc for optimum performance. And he responded with something like this (I'm paraphrasing here): "Our own compiler included with the CodeFusion product has an excellent x86 backend that produces executables far faster than those generated with pgcc." Obviously, I was very interested in testing out this mystery "turbo" compiler the Cygnus guys had created.
 
  
I thereupon requested a demo copy of Cygnus Codefusion 1.0 so that I could test it out, and Omegadan and I were amazed to find that this compiler was everything that Ulrich claimed and then some more. The x86 backend increased the performance of some of the CPU-intensive executables (like bzip2) by close to 90%! All applications seemed to benefit from at least a 10% real-world performance increase, and all we did was swap out compilers. Enoch even booted 30 - 40% faster. The performance gains were far, far greater than what we gained by switching from gcc to pgcc. Obviously, after experiencing it for ourselves, we wanted to use this compiler for Enoch. Fortunately, the sources were included on the CodeFusion CD and were released under the GPL, so we were fully permitted to use this compiler... or so we thought.
+
== Using nftables ==
 +
All nftables commands are run with the nft utility from {{Package|net-firewall/nftables}}.
 +
===Tables===
 +
====Creating tables====
 +
The following command adds a table called filter for the ip(v4) layer
 +
<console>
 +
###i## nft add table ip filter
 +
</console>
 +
Likewise a table for arp can be created with
 +
<console>
 +
###i## nft add table arp filter
 +
</console>
 +
{{Note|The name "filter" used here is completely arbitrary. The table could have any name.}}
 +
====Listing tables====
 +
The following command lists all tables for the ip(v4) layer
 +
<console>
 +
###i## nft list tables ip
 +
</console>
 +
<pre>
 +
table filter
 +
</pre>
 +
The contents of the table filter can be listed with:
 +
<console>
 +
###i## nft list table ip filter
 +
</console>
 +
<pre>
 +
table ip filter {
 +
        chain input {
 +
                type filter hook input priority 0;
 +
                ct state established,related accept
 +
                iifname "lo" accept
 +
                ip protocol icmp accept
 +
                drop
 +
        }
 +
}
 +
</pre>
 +
Running nft with the -a option also shows the handle of each rule. Handles are used for various operations on specific rules:
 +
<console>
 +
###i## nft -a list table ip filter
 +
</console>
 +
<pre>
 +
table ip filter {
 +
        chain input {
 +
                type filter hook input priority 0;
 +
                ct state established,related accept # handle 2
 +
                iifname "lo" accept # handle 3
 +
                ip protocol icmp accept # handle 4
 +
                drop # handle 5
 +
        }
 +
}
 +
</pre>
  
=== Let the freakiness begin ===
+
====Deleting tables====
 +
The following command deletes the table called filter for the ip(v4) layer:
 +
<console>
 +
###i## nft delete table ip filter
 +
</console>
 +
===Chains===
 +
====Adding chains====
 +
The following command adds a chain called input to the ip filter table and registers it with the input hook at priority 0. The chain is of type filter.
 +
<console>
 +
###i## nft add chain ip filter input { type filter hook input priority 0 \; }
 +
</console>
 +
{{Note|If you are running this command from Bash, you need to escape the semicolon.}}
 +
A non-base chain can be added by not specifying the chain configurations between the curly braces.
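For instance, assuming the ip filter table created earlier still exists, a non-base chain with the hypothetical name tcp_checks could be added like this:
<console>
###i## nft add chain ip filter tcp_checks
</console>
Because no hook is given, this chain only sees packets that another chain sends to it with jump or goto.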
  
I sent an e-mail to the marketing manager at Cygnus to let them know our intentions, expecting a "yeah, go for it, thanks for using our compiler" response. Instead the reply was that although we were (technically) allowed to use the Cygnus compiler, we were strongly urged not to use or include the compiler sources with Enoch. I responded by asking why they had released the source under the GPL, if that was the case. It's my guess that if they had a choice, they wouldn't have used the GPL, but because they derived their compiler from egcs (released under the GPL), they had no choice.
+
====Removing chains====
 +
The following command deletes the chain called input
 +
<console>
 +
###i## nft delete chain ip filter input
 +
</console>
 +
{{Note|Chains can only be deleted if there are no rules in them.}}
 +
===Rules===
 +
====Adding rules====
 +
The following command adds a rule to the chain called input, on the ip filter table, dropping all traffic to port 80:
 +
<console>
 +
###i## nft add rule ip filter input tcp dport 80 drop
 +
</console>
 +
====Deleting rules====
 +
To delete a rule, you first need to get the handle number of the rule. This can be done by using the -a flag on nft:
 +
<console>
 +
###i## nft -a list table ip filter
 +
</console>
 +
<pre>
 +
table ip filter {
 +
        chain input {
 +
                type filter hook input priority 0;
 +
                tcp dport http drop # handle 2
 +
        }
 +
}
 +
</pre>
 +
It is then possible to delete the rule with:
 +
<console>
 +
###i## nft delete rule ip filter input handle 2
 +
</console>
 +
== Management ==
 +
=== Backup ===
 +
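You can also back up your rules to a file. The first command writes a header that flushes the existing ruleset when the file is loaded; the second appends the current ruleset: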
 +
<console>
 +
###i## echo "flush ruleset" > backup.nft
 +
</console>
  
This is a good example of a situation where the GPL prevented a company from creating a proprietary product based on open sources. My educated guess is that Cygnus was afraid that if we used their compiler we would undermine their boxed product sales, which would be especially strange because none of their marketing materials (nor the InfoWorld review) mentioned the new compiler included with CodeFusion. CodeFusion was marketed solely as a "development IDE" product, not as a compiler.
+
<console>
 +
###i## nft list ruleset >> backup.nft
 +
</console>
  
In an attempt to put some of their paranoia to rest, I offered to endorse CodeFusion and place the endorsement on our Web site with a link to help spur CodeFusion sales. Personally I didn't think that a "turbo" Enoch would negatively affect their sales, since CodeFusion was marketed as an IDE. But I tried nevertheless to make them happy. The IDE component of CodeFusion was a commercial product, and we had no desire or intention (or right) to distribute it with Enoch.
+
=== Restoration ===
 +
To restore the backup, load the file atomically:
 +
<console>
 +
###i## nft -f backup.nft
 +
</console>
  
I e-mailed my (generous?) offer to Cygnus and received another strange response. They wanted authority over all of our "marketing materials" (apparently, this also included the content of our Web site!) Another shocker. The Cygnus marketing team seemed to have no grasp of how the Linux community or the GPL worked, so I decided to cut off communication with Cygnus for the indefinite future. In the mean time, we created a private "turbo" and public "non-turbo" version of Enoch, leaving the final decision for later.
+
== OpenRC configuration ==
  
But after several months they integrated the CodeFusion x86 backend into gcc 2.95.2. Now everyone could benefit from the nice new backend, not just the people who knew about the "secret GPL compiler" included on the CodeFusion CD. But we decided to go ahead and use gcc rather than the CodeFusion compiler. In addition to being more stable, gcc 2.95.2 also allowed us to avoid Cygnus, which by this time had been purchased by RedHat for a ridiculous sum of money. (Note: the new x86 backend in gcc 2.95.2 is what gave newer Linux distributions the significant speed boost that we all got to experience. It also gave FreeBSD 4.0 a nice speed boost over 3.3.6. Notice the difference?)
+
Don't forget to add nftables service to startup:
 +
<console>
 +
###i## rc-update add nftables default
 +
</console>
 +
== Init script - an iptables-style firewall service for nftables ==
 +
<pre>
 +
#!/sbin/runscript
 +
#      Raphael Bastos aka coffnix        #
 +
#      Init Script for Funtoo Linux     #
 +
##########################################
  
=== On the soapbox ===
+
depend() {
 +
        need net
 +
        need nftables
 +
        }
  
Thanks to this and other experiences, I've learned a lot about for-profit open source companies. There's absolutely nothing bad about being a for-profit open source company. Nor is there anything morally wrong with producing proprietary closed-source software, if that's what you'd like to do. But it doesn't make any sense for open source companies to subvert or refuse to cooperate with the rest of the open source world, either by not supporting the GPL or by any other means. This is a practical point that clearly makes business sense.
+
start(){
 +
##################### PARTE 1 #####################
 +
ebegin "Starting Firewall NFTables"
  
Open source companies should realize that the free exchange of ideas and code is what they profit from. By opposing things like the standard GPL practices, they undermine the environment they rely upon to prosper and grow. If open source is the soil from which your business has sprouted, it makes sense to keep the soil healthy.
+
#######################################################################
 +
### Incompatibilities ###
 +
# You cannot use iptables and nft to perform NAT at the same time.
 +
# So make sure that the iptable_nat module is unloaded
 +
rmmod iptable_nat
  
I understand that there's a temptation to keep at least some information secret for short-term financial gain. Advanced code or special techniques provide a coveted competitive advantage, which could potentially result in increased sales and profit. But if the goal is to be the sole provider of a product, the product should be commercial rather than open source. Open source does not allow for exclusive access to the inner workings of anything. That's what it means.
+
#######################################################################
  
=== Back to Enoch ===
+
echo 1 > /proc/sys/net/ipv4/ip_forward
 +
echo 1 > /proc/sys/net/ipv4/ip_dynaddr
 +
echo 1 > /proc/sys/net/ipv4/conf/all/rp_filter
 +
for f in /proc/sys/net/ipv4/conf/*/rp_filter ; do echo 1 > $f ; done
  
Now, I'll step down from my soapbox and continue my story.
+
#######################################################################
  
As Enoch became more and more refined, we decided that a name change was in order, and "Gentoo Linux" was born. By this time we had released a couple of versions of Enoch (now Gentoo), and were racing to get to Gentoo Linux version 1.0. Around this time I also decided to upgrade my old Celeron 300 box (overclocked and rock-solid at 450Mhz) to a brand-new Abit BP6 (a dual Celeron board that had just hit the market). I sold my old box and put my dual Celeron 366 system together. After overclocking the processors to something on the order of 500Mhz, I was cruising. But I noticed that my new machine wasn't very stable.
+
iptables -t nat -F
  
Obviously my first reaction was to go back down to 2x366Mhz. But now I experienced an even stranger problem. As long as my machine kept the CPUs chugging away, the machine didn't lock up. But if I left the machine idle overnight, there was a good probability that the system would lock up completely. Yes, an idle bug -- argh! After some research, I found several other Linux users with the same problem on this particular motherboard. A chip on the BP6 (was it the PCI controller?) seemed to be flaky or out of spec, which caused Linux to lock up at idle.
+
#######################################################################
  
I was more than a wee bit upset, and because I couldn't afford to order more PC parts, Gentoo development effectively halted. I became more and more pessimistic about Linux and decided to switch over to FreeBSD. Yes, FreeBSD. And that's where I'll end this installment -- see you in Part 3. :)
+
# ipv4
{{ArticleFooter}}
+
nft -f /etc/nftables/ipv4-filter
 +
 
 +
# ipv4 nat
 +
nft -f /etc/nftables/ipv4-nat
 +
 
 +
# ipv6
 +
nft -f /etc/nftables/ipv6-filter
 +
 
 +
# nftables firewall rules
 +
nft -f /etc/nftables/firewall.rules
 +
 
 +
#######################################################################
 +
 
 +
eend $?
 +
}
 +
 
 +
stop(){
 +
ebegin "Stopping Firewall NFTables"
 +
 
 +
#######################################################################
 +
 
 +
#iptables -t nat -F
 +
NFT=nft
 +
FAMILIES="ip ip6 arp bridge"
 +
 
 +
for FAMILY in $FAMILIES; do
 +
  TABLES=$($NFT list tables $FAMILY | grep "^table\s" | cut -d' ' -f2)
 +
 
 +
  for TABLE in $TABLES; do
 +
    CHAINS=$($NFT list table $FAMILY $TABLE | grep "^\schain\s" | cut -d' ' -f2)
 +
 
 +
    for CHAIN in $CHAINS; do
 +
      echo "Flushing chain: $FAMILY->$TABLE->$CHAIN"
 +
      $NFT flush chain $FAMILY $TABLE $CHAIN
 +
      $NFT delete chain $FAMILY $TABLE $CHAIN
 +
    done
 +
 
 +
    echo "Flushing table: $FAMILY->$TABLE"
 +
    $NFT flush table $FAMILY $TABLE
 +
    $NFT delete table $FAMILY $TABLE
 +
  done
 +
done
 +
eend $?
 +
}
 +
 
 +
status(){
 +
nft list ruleset
 +
}
 +
 
 +
# End
 +
</pre>
 +
 
 +
[[Category:System]]
 +
[[Category:First Steps]]
 +
{{EbuildFooter}}

Revision as of 14:22, February 22, 2015

net-firewall/nftables


Source Repository:Gentoo Portage Tree

Summary: Linux kernel (3.13+) firewall, NAT and packet mangling tools


Nftables

What is nftables?

nftables is the successor to iptables. It replaces the existing iptables, ip6tables, arptables and ebtables framework. It consists of a packet classification engine in the Linux kernel and a new userspace utility called nft. nftables also provides a compatibility layer for the ip(6)tables framework.

Introduction

As with the iptables framework, nftables is built upon rules which specify actions. These rules are attached to chains. A chain contains a collection of rules and can be registered with one of the netfilter hooks. Chains are stored inside tables. A table is specific to one protocol family. One of the main differences with iptables is that there are no predefined tables and chains anymore.

Tables

A table is nothing more than a container for your chains. With nftables there are no predefined tables (filter, raw, mangle...) anymore. You are free to recreate the iptables-like structure, but anything might do. Currently there are 5 different families of tables:

  • ip: Used for IPv4 related chains;
  • ip6: Used for IPv6 related chains;
  • arp: Used for ARP related chains;
  • bridge: Used for bridging related chains;
  • inet: Mixed IPv4/IPv6 chains (kernel 3.14 and up).

It is not hard to recognize the old tables framework in these tables. The only new one is the inet table which is used for both IPv4 and IPv6 traffic. It should make firewalling for dual-stack hosts easier by combining the rules for IPv4 and IPv6.
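
As a minimal sketch of the inet family described above, the following shell function writes a small dual-stack ruleset to a file. The function name write_inet_ruleset, the table name firewall and the output path are assumptions chosen for illustration. The resulting file would be loaded as root with nft -f on a kernel that supports inet tables.

```shell
#!/bin/sh
# Write a minimal dual-stack ruleset to the file given as $1.
# "firewall" is an arbitrary table name (hypothetical, pick your own).
write_inet_ruleset() {
    cat > "$1" <<'EOF'
table inet firewall {
        chain input {
                type filter hook input priority 0;
                ct state established,related accept
                iifname "lo" accept
                ip protocol icmp accept
                ip6 nexthdr icmpv6 accept
                tcp dport 22 accept
                drop
        }
}
EOF
}

# Example (the path is an assumption):
#   write_inet_ruleset /tmp/inet-firewall.nft
#   nft -f /tmp/inet-firewall.nft    # as root
```

ICMPv6 is accepted explicitly because IPv6 neighbor discovery depends on it; a bare drop at the end of the chain would otherwise break IPv6 connectivity.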

Chains

Chains are used to group rules together. As with tables, nftables does not have any predefined chains. Chains come in two kinds: base and non-base. A base chain is registered with one of the netfilter hooks and has a hook, a type and a priority. Non-base chains are not attached to a hook and do not see any traffic by default; they can be used to arrange a ruleset in a tree of chains. There are currently three types of chains:

  • filter: for filtering packets
  • route: for rerouting packets
  • nat: for performing Network Address Translation. Only the first packet of a flow hits this chain, making it impossible to use it for filtering.

The hooks that can be used are:

  • prerouting: This hook is before the routing decision; all packets entering the machine hit this hook
  • input: All packets for the local system hit this hook
  • forward: Packets that are not for the local system and need to be forwarded hit this hook
  • output: Packets that originate from the local system pass this hook
  • postrouting: This hook is after the routing decision; all packets leaving the machine hit this hook
Note

The ARP address family only supports the input and output hooks.

Note

The bridge address family only seems to support the input, forward and output hooks.
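
The hook list and the two notes above can be sketched as a dry-run helper that prints one base chain command per hook a family supports. The function names hooks_for_family and print_base_chains are hypothetical, and the per-family hook lists simply mirror the notes on this page rather than querying the kernel. Nothing is executed; the commands are only printed for review.

```shell
#!/bin/sh
# Print the hooks a family supports, following the notes on this page.
hooks_for_family() {
    case "$1" in
        arp)         echo "input output" ;;
        bridge)      echo "input forward output" ;;
        ip|ip6|inet) echo "prerouting input forward output postrouting" ;;
        *)           return 1 ;;
    esac
}

# Print (do not run) one "nft add chain" command per supported hook.
# $1 = family, $2 = table name. Each chain is named after its hook.
print_base_chains() {
    family=$1
    table=$2
    hooks=$(hooks_for_family "$family") || return 1
    for hook in $hooks; do
        printf 'nft add chain %s %s %s { type filter hook %s priority 0 \\; }\n' \
            "$family" "$table" "$hook" "$hook"
    done
}

# Example: print_base_chains arp filter
```

Printing the commands keeps the semicolon escaped, so the output can be pasted into a Bash session as is.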

Priorities

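Note
The priority orders base chains attached to the same hook: chains with a lower priority value are evaluated first. On early nftables releases, priorities did not appear to have any effect on which chain sees packets first.
Note
The priority is a signed integer, so negative values are valid and run before priority 0. Early releases appeared to treat it as unsigned, converting negative priorities into very high ones; check the behavior of your nft version.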

Rules

Rules specify which action has to be taken for which packets. Rules are attached to chains. Each rule has an expression to match packets with and one or more actions to perform when it matches. The main differences with iptables are that a rule can specify multiple actions and that counters are off by default: if you want packet and byte counters for a rule, you must request them explicitly in that rule. Each rule has a unique handle number by which it can be identified. The following matches are available:

  • ip: IP protocol
  • ip6: IPv6 protocol
  • tcp: TCP protocol
  • udp: UDP protocol
  • udplite: UDP-lite protocol
  • sctp: SCTP protocol
  • dccp: DCCP protocol
  • ah: Authentication headers
  • esp: Encapsulating Security Payload headers
  • ipcomp: IPComp headers
  • icmp: ICMP protocol
  • icmpv6: ICMPv6 protocol
  • ct: Connection tracking
  • meta: meta properties such as interfaces

Matches

Match    Arguments   Description/Example
ip       version     IP header version
         hdrlength   IP header length
         tos         Type of Service
         length      Total packet length
         id          IP ID
         frag-off    Fragmentation offset
         ttl         Time to live
         protocol    Upper layer protocol
         checksum    IP header checksum
         saddr       Source address
         daddr       Destination address
ip6      version     IP header version
         priority
         flowlabel   Flow label
         length      Payload length
         nexthdr     Next header type (upper layer protocol number)
         hoplimit    Hop limit
         saddr       Source address
         daddr       Destination address
tcp      sport       Source port
         dport       Destination port
         sequence    Sequence number
         ackseq      Acknowledgement number
         doff        Data offset
         flags       TCP flags
         window      Window
         checksum    Checksum
         urgptr      Urgent pointer
udp      sport       Source port
         dport       Destination port
         length      Total packet length
         checksum    Checksum
udplite  sport       Source port
         dport       Destination port
         cscov       Checksum coverage
         checksum    Checksum
sctp     sport       Source port
         dport       Destination port
         vtag        Verification tag
         checksum    Checksum
dccp     sport       Source port
         dport       Destination port
ah       nexthdr     Next header protocol (upper layer protocol)
         hdrlength   AH header length
         spi         Security Parameter Index
         sequence    Sequence number
esp      spi         Security Parameter Index
         sequence    Sequence number
ipcomp   nexthdr     Next header protocol (upper layer protocol)
         flags       Flags
         cpi         Compression Parameter Index
icmp     type        ICMP packet type
icmpv6   type        ICMPv6 packet type
ct       state       State of the connection
         direction   Direction of the packet relative to the connection
         status      Status of the connection
         mark        Connection mark
         expiration  Connection expiration time
         helper      Helper associated with the connection
         l3proto     Layer 3 protocol of the connection
         saddr       Source address of the connection for the given direction
         daddr       Destination address of the connection for the given direction
         protocol    Layer 4 protocol of the connection for the given direction
         proto-src   Layer 4 protocol source for the given direction
         proto-dst   Layer 4 protocol destination for the given direction
meta     length      Length of the packet in bytes: meta length > 1000
         protocol    Ethertype protocol: meta protocol vlan
         priority    TC packet priority
         mark        Packet mark
         iif         Input interface index
         iifname     Input interface name
         iiftype     Input interface type
         oif         Output interface index
         oifname     Output interface name
         oiftype     Output interface hardware type
         skuid       UID associated with originating socket
         skgid       GID associated with originating socket
         rtclassid   Routing realm
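
To make the table more concrete, here is a hedged sketch that combines a few of the matches into rules. It assumes a table ip filter with a chain input already exists, the address range 192.0.2.0/24 is a documentation placeholder, and the function name add_match_examples is hypothetical. By default the NFT variable is set to a dry run that only prints each command; set NFT=nft and run as root to apply them.

```shell
#!/bin/sh
# Dry run by default: print the commands instead of executing them.
NFT=${NFT:-echo nft}

add_match_examples() {
    # ip match: drop everything from a source network (placeholder range)
    $NFT add rule ip filter input ip saddr 192.0.2.0/24 drop
    # tcp and ct matches combined: accept new SSH connections
    $NFT add rule ip filter input tcp dport 22 ct state new accept
    # udp match: accept DNS queries
    $NFT add rule ip filter input udp dport 53 accept
    # icmp match: accept echo requests (ping)
    $NFT add rule ip filter input icmp type echo-request accept
    # meta match on the input interface name
    $NFT add rule ip filter input meta iifname lo accept
    # meta match on packet length; ">" is escaped so the shell does not
    # treat it as a redirection
    $NFT add rule ip filter input meta length \> 1000 drop
}

# Example: add_match_examples
```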

Statements

Statements represent the action to be performed when the rule matches. They come in two kinds: terminal statements, which unconditionally end the evaluation of the current rule, and non-terminal statements, which end it either conditionally or never. A rule can contain any number of non-terminal statements, but only a single terminal statement. The terminal statements can be:

  • accept: Accept the packet and stop the ruleset evaluation.
  • drop: Drop the packet and stop the ruleset evaluation.
  • reject: Reject the packet with an ICMP message and stop the ruleset evaluation.
  • queue: Queue the packet to userspace and stop the ruleset evaluation.
  • continue: Continue the ruleset evaluation with the next rule. This is the default when a rule issues no verdict.
  • return: Return from the current chain and continue at the next rule of the calling chain. In a base chain it is equivalent to accept.
  • jump <chain>: Continue at the first rule of <chain>. Evaluation resumes at the rule after the jump once a return statement is issued.
  • goto <chain>: Similar to jump, but the current position is not remembered: after <chain> finishes, evaluation continues in the chain that called the current one instead of the chain containing the goto statement.
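
The interplay of terminal and non-terminal statements can be sketched with a small ruleset. The function name print_jump_ruleset and the chain name tcp_services are hypothetical; the function only prints the ruleset, which could be redirected to a file and loaded with nft -f. The log statement assumes the nf_tables log module is available.

```shell
#!/bin/sh
# Print a ruleset that uses jump, return, and non-terminal statements.
print_jump_ruleset() {
    cat <<'EOF'
table ip filter {
        chain tcp_services {
                tcp dport 22 counter accept
                tcp dport 80 counter accept
                return
        }
        chain input {
                type filter hook input priority 0;
                ct state established,related accept
                ip protocol tcp jump tcp_services
                counter log drop
        }
}
EOF
}

# Example (the path is an assumption):
#   print_jump_ruleset > /tmp/jump-example.nft
#   nft -f /tmp/jump-example.nft    # as root
```

TCP packets that match neither port reach the return in tcp_services and resume in input at the rule after the jump, where counter and log (both non-terminal) run before the terminal drop.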

Installing nftables

Kernel

These kernel options must be set:

[*] Networking support  --->
   Networking options  --->
       [*] Network packet filtering framework (Netfilter)  --->
           Core Netfilter Configuration  --->
               <M> Netfilter nf_tables support
               <M>   Netfilter nf_tables IPv6 exthdr module
               <M>   Netfilter nf_tables meta module
               <M>   Netfilter nf_tables conntrack module
               <M>   Netfilter nf_tables rbtree set module
               <M>   Netfilter nf_tables hash set module
               <M>   Netfilter nf_tables counter module
               <M>   Netfilter nf_tables log module
               <M>   Netfilter nf_tables limit module
               <M>   Netfilter nf_tables nat module
               <M>   Netfilter x_tables over nf_tables module
           IP: Netfilter Configuration  --->
               <M> IPv4 nf_tables support
               <M>   nf_tables IPv4 reject support
               <M>   IPv4 nf_tables route chain support
               <M>   IPv4 nf_tables nat chain support
           IPv6: Netfilter Configuration  --->
               <M> IPv6 nf_tables support
               <M>   IPv6 nf_tables route chain support
               <M>   IPv6 nf_tables nat chain support
           <M>   Ethernet Bridge nf_tables support
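
A quick way to verify a kernel configuration is to grep it for the relevant symbols. The sketch below checks only four of them, and the symbol names (CONFIG_NF_TABLES, CONFIG_NF_TABLES_IPV4, CONFIG_NF_TABLES_IPV6, CONFIG_NFT_NAT) are the Kconfig names assumed to correspond to the menu entries above; verify them against your kernel version. The function name check_nft_kconfig is hypothetical.

```shell
#!/bin/sh
# Report whether each option is built in (=y) or a module (=m) in the
# kernel config file given as $1. Returns non-zero if any is missing.
check_nft_kconfig() {
    config=$1
    missing=0
    for opt in CONFIG_NF_TABLES CONFIG_NF_TABLES_IPV4 \
               CONFIG_NF_TABLES_IPV6 CONFIG_NFT_NAT; do
        if grep -Eq "^${opt}=(y|m)$" "$config"; then
            echo "ok:      $opt"
        else
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Example (typical location of the kernel config, adjust as needed):
#   check_nft_kconfig /usr/src/linux/.config
```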

Emerging

To install nftables, run the following command:

# emerge net-firewall/nftables


OpenRC configuration

Don't forget to add the nftables service to the default runlevel:

# rc-update add nftables default

You cannot use iptables and nft to perform NAT at the same time, so make sure that the iptable_nat module is unloaded. To remove the iptable_nat module:

# rmmod iptable_nat
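
Since rmmod reports an error when the module is not loaded, a script may want to check first. This is a small sketch; the helper name module_loaded is hypothetical, and it reads /proc/modules unless another file is given as the second argument.

```shell
#!/bin/sh
# Succeed if module $1 is listed in /proc/modules (or in the file $2).
# Each line of that file starts with the module name followed by a space.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Example, as root:
#   module_loaded iptable_nat && rmmod iptable_nat
```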

Start nftables:

# /etc/init.d/nftables start


Using nftables

All nftables commands are run with the nft utility from the nftables package.

Tables

Creating tables

The following command adds a table called filter for the ip(v4) layer

# nft add table ip filter

Likewise a table for arp can be created with

# nft add table arp filter
Note

The name "filter" used here is completely arbitrary. The table could have any name.

Listing tables

The following command lists all tables for the ip(v4) layer

# nft list tables ip
table filter

The contents of the table filter can be listed with:

# nft list table ip filter
table ip filter {
        chain input {
                 type filter hook input priority 0;
                 ct state established,related accept
                 iifname "lo" accept
                 ip protocol icmp accept
                 drop
        }
}

With the -a option, nft also shows the handle of each rule. Handles are used for various operations on specific rules:

# nft -a list table ip filter
table ip filter {
        chain input {
                 type filter hook input priority 0;
                 ct state established,related accept # handle 2
                 iifname "lo" accept # handle 3
                 ip protocol icmp accept # handle 4
                 drop # handle 5
        }
}

Deleting tables

The following command deletes the table called filter for the ip(v4) layer:

# nft delete table ip filter

Chains

Adding chains

The following command adds a chain called input to the ip filter table and registers it with the input hook at priority 0. The chain is of type filter.

# nft add chain ip filter input { type filter hook input priority 0 \; }
Note

If you're running this command from Bash, you need to escape the semicolon.

A non-base chain can be added by not specifying the chain configurations between the curly braces.

Removing chains

The following command deletes the chain called input

# nft delete chain ip filter input
Note

Chains can only be deleted if there are no rules in them.

Rules

Adding rules

The following command adds a rule to the chain called input in the ip filter table, dropping all TCP traffic to port 80:

# nft add rule ip filter input tcp dport 80 drop

Deleting rules

To delete a rule, you first need to get the handle number of the rule. This can be done by using the -a flag on nft:

# nft -a list table ip filter
table ip filter {
        chain input {
                 type filter hook input priority 0;
                 tcp dport http drop # handle 2
        }
}

It is then possible to delete the rule with:

# nft delete rule ip filter input handle 2
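
Looking up a handle by eye gets tedious in scripts. As a sketch, the following helper reads nft -a list output on standard input and prints the handle of the first rule whose text matches exactly. The function name rule_handle is hypothetical, and the parsing assumes the "# handle N" suffix shown in the listings above.

```shell
#!/bin/sh
# Print the handle of the first rule whose text equals $1.
# Input: the output of "nft -a list table <family> <table>" on stdin.
rule_handle() {
    awk -v rule="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # strip indentation
            n = index(line, " # handle ")       # start of the handle suffix
            if (n > 0 && substr(line, 1, n - 1) == rule) {
                print substr(line, n + length(" # handle "))
                exit
            }
        }'
}

# Example, as root:
#   h=$(nft -a list table ip filter | rule_handle 'tcp dport http drop')
#   [ -n "$h" ] && nft delete rule ip filter input handle "$h"
```

The rule text must be given as nft prints it; in the listing above, port 80 is shown as http.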

Management

Backup

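You can also back up your rules. The backup file is read by nft itself, so its first line is the bare flush ruleset command, without the nft prefix:

# echo "flush ruleset" > backup.nft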
# nft list ruleset >> backup.nft

Restoration

To restore the rules, load the backup file atomically:

# nft -f backup.nft
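
The backup steps can be wrapped in a small helper. This is a sketch: the function name make_backup is hypothetical, and it takes the ruleset text on standard input so that it can be fed from nft list ruleset. The file starts with flush ruleset so that loading it replaces the running ruleset instead of adding to it.

```shell
#!/bin/sh
# Write a restorable dump to the file $1: "flush ruleset" first, then
# the ruleset text read from stdin.
make_backup() {
    {
        echo "flush ruleset"
        cat
    } > "$1"
}

# Example, as root:
#   nft list ruleset | make_backup backup.nft
#   nft -f backup.nft
```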

Init script: an nftables firewall service in the style of an iptables firewall

#!/sbin/runscript
#      Raphael Bastos aka coffnix        #
#      Init Script for Funtoo Linux      #
##########################################

depend() {
        need net
        need nftables
}

start(){
##################### PART 1 #####################
ebegin "Starting Firewall NFTables"

#######################################################################
### Incompatibilities ###
# You cannot use iptables and nft to perform NAT at the same time.
# Flush any iptables NAT rules first (running iptables can load the
# iptable_nat module), then make sure that the module is unloaded.
iptables -t nat -F
rmmod iptable_nat 2>/dev/null

#######################################################################

# Enable forwarding, dynamic addresses and reverse path filtering
echo 1 > /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/ipv4/ip_dynaddr
echo 1 > /proc/sys/net/ipv4/conf/all/rp_filter
for f in /proc/sys/net/ipv4/conf/*/rp_filter ; do echo 1 > "$f" ; done

#######################################################################

# ipv4
nft -f /etc/nftables/ipv4-filter

# ipv4 nat
nft -f /etc/nftables/ipv4-nat

# ipv6
nft -f /etc/nftables/ipv6-filter

# Firewall rules for nftables
nft -f /etc/nftables/firewall.rules
# Report the result of the last load to OpenRC
eend $?

#######################################################################

}

stop(){
ebegin "Stopping Firewall NFTables"

#######################################################################

#iptables -t nat -F
NFT=nft
FAMILIES="ip ip6 arp bridge"

# Walk every family, table and chain, emptying and deleting each one
for FAMILY in $FAMILIES; do
  TABLES=$($NFT list tables $FAMILY | grep "^table\s" | cut -d' ' -f2)

  for TABLE in $TABLES; do
    CHAINS=$($NFT list table $FAMILY $TABLE | grep "^\schain\s" | cut -d' ' -f2)

    # A chain can only be deleted once it contains no rules
    for CHAIN in $CHAINS; do
      echo "Flushing chain: $FAMILY->$TABLE->$CHAIN"
      $NFT flush chain $FAMILY $TABLE $CHAIN
      $NFT delete chain $FAMILY $TABLE $CHAIN
    done

    echo "Flushing table: $FAMILY->$TABLE"
    $NFT flush table $FAMILY $TABLE
    $NFT delete table $FAMILY $TABLE
  done
done
eend 0
}

status(){
nft list ruleset
}

# End