= Slowloris DOS Mitigation Guide =

== Introduction ==

[http://ha.ckers.org/slowloris/ Slowloris] is the name of a perl-based HTTP client that can be used to launch a denial of service attack against Apache-based HTTP servers and the Squid caching proxy server. It operates by repeatedly initiating several hundred valid HTTP requests to the server and keeping these connections open using a minimal amount of TCP traffic, in order to consume server resources. Once server resources are exhausted, the server is no longer able to respond to legitimate traffic.


Slowloris was written by 'RSnake', and was announced in a [http://ha.ckers.org/blog/20090617/slowloris-http-dos/ ha.ckers.org blog] post on June 17, 2009.


As of July 5, 2009, vulnerable HTTP servers and proxies include:


*'''Apache HTTP Server'''
*'''IBM HTTP Server'''
*'''IBM WebSphere Edge Server Caching Proxy'''
*'''Squid caching proxy server'''


There has been much discussion on the Internet relating to what HTTP servers, HTTP proxies, and network configurations are not affected by Slowloris.


It's important to note that, based on our testing, much of the conventional wisdom about supposedly non-vulnerable configurations is misleading at best. One of the primary goals of this document is to dispel some of these myths and provide reliable information on properly mitigating Slowloris and similar denials of service.


In particular, it's important to note that hardware load balancers typically do not protect against this denial of service without additional configuration, which we detail below. In addition, other supposedly non-vulnerable HTTP servers and proxies can be affected by this denial of service using non-default Slowloris settings.


{{fancyimportant|Networks that utilize hardware load balancers and alternative Web servers may still be vulnerable to Slowloris.}}


=== What Makes Slowloris "Different" ===


Before we review Slowloris mitigations, let's review what makes Slowloris different from other denials of service.


Slowloris is different from typical denials of service in that Slowloris traffic utilizes legitimate HTTP traffic, and does not rely on using special "bad" HTTP requests that exploit bugs in specific HTTP servers. Because of this, existing IPS and IDS solutions that rely on signatures to detect attacks will generally not recognize Slowloris. This means that Slowloris is capable of being effective even when standard enterprise-grade IPS and IDS systems are in place.


The second issue that makes Slowloris different is that it is an easy-to-use perl script. While similar denials of service have been documented in security publications, RSnake has provided a "weaponized" ready-to-use version of this denial of service that is trivial to use.


Combine these two facts, throw in the fact that Apache is vulnerable to Slowloris, and script kiddies now have an easy way to take down a large portion of the Internet.


== Hardware Load Balancers ==


=== Introduction ===
As we mentioned earlier, conventional wisdom is that if your infrastructure is behind a hardware load balancer, then you are not vulnerable to Slowloris. This is not true. Slowloris can traverse hardware load balancers even if they are properly configured. However, many load balancers can be configured to protect your infrastructure against Slowloris. Here's how to do it.


=== Load Balancer Mitigation ===
Many hardware load balancers, including F5 Big-IP, Cisco CSS, Cisco ACE and Citrix NetScaler, have a feature generically known as delayed binding, or TCP splicing. This feature allows the load balancer itself to complete the TCP three-way handshake between the client and the virtual IP address configured in front of the Web server(s). Only after this handshake has completed does the client send the HTTP request header, which the load balancer can inspect to determine what action to perform on the request. The load balancer can be configured to respond to the client in a number of ways, such as sending an HTTP 302 redirect, or selecting an appropriate Web server to handle the request based upon something identifiable in the HTTP request header, such as a cookie, URL or User-Agent.


Delayed binding typically causes the load balancer to perform an HTTP request header completeness check, which means that the HTTP request will not be sent to the appropriate Web server until the final two carriage-return/line-feed pairs are sent by the HTTP client. This is the key bit of information: delayed binding ensures that your Web server or proxy will never see any of the incomplete requests being sent out by Slowloris. Because of this, delayed binding is a very effective way to protect against Slowloris, but it must be properly configured. We'll look at an example configuration next.
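To illustrate what such an incomplete request looks like, each Slowloris connection sends something along these lines (a schematic sketch of the technique, not literal output from the script):
<pre>
GET / HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/4.0
X-a: b
    [ ...another "X-a: b"-style header is trickled out every few seconds,
      but the final blank line that terminates the request header is
      never sent, so the connection is held open indefinitely... ]
</pre>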


=== Cisco CSS Configuration ===

Here is a Cisco CSS configuration that enables delayed binding:
<pre>
content www_80_rule
        vip address 10.5.154.200
        protocol tcp
        port 80
        add service wwwserver1_80
        add service wwwserver2_80
        url "/*"
        active
</pre>
The second-to-last line (<tt>url "/*"</tt>) is what enables a Layer 5 rule that performs delayed binding on the Cisco CSS, ''and without this enabled in the content rule, any Apache-based HTTP server that receives an incomplete HTTP header will be vulnerable to this exploit''.


Other load balancers, such as F5 Big-IP, can be configured to protect against Slowloris using a similar approach.


{{fancyimportant|This hardware load balancer mitigation only works with HTTP traffic. Please read below to understand how to protect against SSL-based attacks.}}
=== Load Balancers and SSL ===
We've addressed the HTTP side of the house, but what about HTTPS (SSL)? If you have an environment where you must run end-to-end encryption and are running an Apache-based HTTP server with an SSL listener, '''then it is vulnerable. A hardware load balancer placed in front of your Apache-based SSL listener does not mitigate this vulnerability'''.


Here is a typical scenario:

*The client attempts an HTTPS connection to the Web site.
*The load balancer 'fronts' the Web site and may be able to detect the SSL session ID, but cannot inspect the SSL traffic, and therefore has no further HTTP protocol inspection capability.
*The load balancer passes the HTTPS connection through to the Apache-based HTTP server, which performs the SSL handshake with the client and then receives the HTTP data inside the SSL connection.


If the HTTP request employs the exploit, the Apache-based HTTP server's worker threads will be tied up. Since this is done over SSL, you are limited in your ability to inspect and detect the exploit using the hardware load balancer.


Since the hardware load balancer must be able to inspect the HTTP request header in order to perform delayed binding and protect against Slowloris, it must first be able to see the HTTP data. To do this, your load balancer must be configured to decrypt the SSL traffic and then perform the delayed binding which we previously covered.


This configuration can be accomplished with back-end SSL configuration on a Cisco CSS, or by utilizing a similar feature on a different hardware load balancer. It does require that your load balancer decrypt incoming SSL traffic using SSL encryption offload.


=== Hardware Load Balancer Summary ===
Hardware load balancers do not protect against Slowloris by default. When configured to perform delayed binding, however, they can be a very effective protection for HTTP. For HTTPS, things become more complex, as the load balancer is limited in its ability to inspect the traffic. This can be addressed by configuring your load balancer to decrypt SSL traffic so that the HTTP request can be inspected and delayed binding can be performed.


== Apache ==


We've looked at how to configure load balancers, but what about fixing Apache itself? The problem with Apache in its current incarnation is that it can be taken offline by a relatively mild Slowloris attack. Certainly, Apache developers must be working on an imminent fix for this issue -- or are they?


Reaction from Apache developers has been mixed. Take, for example, the response quoted in the [http://ha.ckers.org/blog/20090617/slowloris-http-dos/ ha.ckers.org blog post], where this issue is not taken seriously, as well as the official apache.org Slowloris bug, where this issue was marked as "INVALID". Fortunately, some Apache developers are being more candid about the issue, such as Apache committer Paul Querna in this apache-dev post:
<pre>
Mitagation is the wrong approach.

We all know our architecture is wrong.

We have started on fixing it, but we need to finish the async input
rewrite on trunk, but all of the people who have hacked on it, myself
included have hit ENOTIME for the last several years.

Hopefully the publicity this has generated will get renewed interest
in solving this problem the right way, once and for all :)
</pre>
There are a number of efforts to either modify Apache or add an additional Apache module to deal effectively with Slowloris. We have looked at one of these modifications so far.


=== Anti-slowloris.diff ===
Andreas Krennmair posted a patch to the apache-dev list on June 21, 2009 called [http://thread.gmane.org/gmane.comp.apache.devel/37773 "anti-slowloris.diff."] This patch applies only to the prefork MPM and is a basic proof of concept of how Apache can be made more resilient to Slowloris attacks.


In our testing, we've found this patch to not be fully effective. While it does allow Apache to survive basic Slowloris attacks, Apache still goes offline when a more powerful Slowloris attack is used.


=== Other Options ===
There are other options, such as [http://thread.gmane.org/gmane.comp.apache.devel/37773 mod_noloris], which are being developed and may be incorporated into Apache. We have not yet had time to test mod_noloris to determine whether it's effective.


=== Summary ===
We have not yet found a defense against Slowloris that can be integrated directly into Apache.


== NetFilter ==


Another option to protect against Slowloris is to utilize Linux's netfilter connlimit functionality (available in kernel 2.6.23+) to limit the number of simultaneous connections from a particular host. This can be implemented for HTTP as follows:
<pre>
iptables -A INPUT -p tcp --syn --dport 80 -m connlimit --connlimit-above 100 -j DROP
</pre>
While this can be an effective defense against Slowloris, we do have a number of concerns about this approach.


First, the connlimit threshold must be set very low to protect Apache. This can result in legitimate requests being denied, and is probably not a workable solution for higher-traffic sites. It also presents problems for traffic originating from behind "mega-proxies," as these may easily hit the connection limit.


Second, since this solution doesn't provide any kind of delayed binding functionality, it is also possible to craft the Slowloris attack so that it utilizes multiple hosts, each staying under the connection limit, and Apache would still easily be taken offline. Therefore, we cannot recommend netfilter alone as a complete mitigation for Slowloris. It may work for some low-traffic sites, but it does not provide robust protection against a motivated attacker.
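If you do want to experiment further with netfilter-based limiting, the <tt>hashlimit</tt> module offers a complementary knob: rather than capping concurrent connections, it limits the ''rate'' of new connections per source IP. The following is only a rough sketch to show the mechanics; the chain name and thresholds are placeholders, and it shares the same fundamental limitations described above:
<source lang="bash">
# Sketch only: rate-limit new HTTP connections per source IP.
# Thresholds are placeholders and would need tuning for a real site.
iptables -N http-newconn
iptables -A INPUT -p tcp --syn --dport 80 -j http-newconn
# connections within the per-source rate/burst return and are processed normally
iptables -A http-newconn -m hashlimit --hashlimit 5/sec --hashlimit-burst 50 \
  --hashlimit-mode srcip --hashlimit-name http-newconn -j RETURN
# anything beyond the limit is dropped
iptables -A http-newconn -j DROP
</source>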


== Alternate Web Servers ==
For some organizations, switching to an alternative Web server that is not supposed to be vulnerable to Slowloris may be a viable option. We utilized the [http://www.cherokee-project.com/ Cherokee Web server] for our testing.


=== Initial Cherokee Tests ===
We configured Cherokee on Funtoo Linux and found that it was resilient to Slowloris when Slowloris was used with its default settings. We were even able to ramp up Slowloris quite a bit and were still able to load pages from the Web server.


However, as we ramped up Slowloris further, we found that Cherokee began to refuse connections. Upon investigation, we discovered that Cherokee had exhausted its 1024 available file descriptors, limiting the server to 512 simultaneous connections. After raising the default ulimits on the system, Cherokee was able to withstand a very significant Slowloris attack and still serve pages. However, several thousand file descriptors were consumed on the host system.


=== Configuring Cherokee for Slowloris ===
On Gentoo Linux or Funtoo Linux, you can configure Cherokee to be resilient to Slowloris by increasing the number of file descriptors available to the cherokee user. To do this, first raise the default number of file descriptors available to a user by adding the following lines to <span style="color:green">/etc/security/limits.conf</span>:
<pre>
*              soft    nofile          4096
*              hard    nofile          5028
</pre>
To apply this change to just the cherokee user, change the <span style="color:green">*</span>'s to <span style="color:green">cherokee</span>.
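For example, the user-specific version of the lines above would look like this:
<pre>
cherokee       soft    nofile          4096
cherokee       hard    nofile          5028
</pre>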


Next, you will need to configure Cherokee to utilize all these file descriptors by modifying <span style="color:green">/etc/cherokee/cherokee.conf</span> or utilizing <span style="color:green">cherokee-admin</span> (Advanced settings). This can be done by adding the following line to <span style="color:green">/etc/cherokee/cherokee.conf</span> and restarting Cherokee:
<pre>
server!fdlimit = 4096
</pre>
{{fancyimportant|While cherokee-admin states that Cherokee will automatically detect the number of available file descriptors, it didn't seem to do so in our testing.}}


== Conclusion ==


We will continue to update this document as more information is available to us. If you are trying to protect against Slowloris, our current top recommendations are:


*'''Test''' your infrastructure to determine its level of resiliency.
*'''Don't assume''' that your infrastructure is not vulnerable to Slowloris.
*'''Utilize multiple layers of defense''' when possible.
*'''Find limits when testing.''' Don't just use default Slowloris settings.
 
In general, a hardware load balancer with SSL encryption offload when ''configured to perform delayed binding'' is a great first defense against Slowloris. This will prevent Apache from seeing the Slowloris traffic. Our one concern about just relying on a hardware load balancer is that it is only one layer of defense. However, one layer of defense may be an acceptable interim solution.
 
If you do not have a hardware load balancer that can be configured to protect against Slowloris, and you are in a position where you are able to switch Web servers, then our recommendation is to deploy an alternative HTTP server such as Cherokee, and configure it so that it can handle a large number of simultaneous connections.
 
Also consider combining Cherokee with Linux netfilter connection limiting (for kernels 2.6.23+) with a ''high threshold''. This should provide a reasonable defense against Slowloris: the Web server will have enough resources to handle typical Slowloris attacks, and extreme attacks will hit the connection limit and be denied. Our one concern with this approach is that it will still result in significant kernel resources being utilized, since no delayed binding is being performed. This can cause resource problems that result in performance degradation, and could allow sophisticated multi-host attacks to be effective.
 
And if at all possible, use a combination of approaches, such as a properly-configured hardware load balancer combined with a resilient Web server. And please test your configuration to ensure that your defenses are working properly! :)


[[Category:Investigations]]
[[Category:Articles]]
[[Category:Featured]]
[[Category:Networking]]
{{ArticleFooter}}

= Traffic Control =

== Introduction ==

Linux's traffic control functionality offers many capabilities for influencing the rate of flow, as well as the latency, of network traffic -- primarily outgoing traffic, but in some cases incoming traffic as well. It is designed to be a "construction kit" rather than a turn-key system, where complex network traffic policing and shaping decisions can be made using a variety of algorithms. The Linux traffic control code is also often used in academia for research purposes, where it can be a useful mechanism to simulate and explore the impact of a variety of different network behaviors. See [http://www.linuxfoundation.org/collaborate/workgroups/networking/netem netem] for an example of a simulation framework that can be used for this purpose.

Of course, Linux traffic control can also be extremely useful in an IT context, and this document focuses on the practical applications of Linux traffic control that can be used to solve problems often experienced on modern networks.

== Incoming and Outgoing Traffic ==

One common use of Linux traffic control is to configure a Linux system as a router or bridge, so that it sits between two networks, or between the "inside" of the network and the real router, and can shape traffic going to local machines as well as traffic headed out to the Internet. This provides a way to prioritize, shape and police both incoming (from the Internet) and outgoing (from local machines) network traffic. It is easiest to create traffic control rules for traffic flowing ''out'' of an interface, since we can control when the system ''sends'' data; controlling when we ''receive'' data requires an additional ''intermediate queue'' to be created to buffer incoming data. When a Linux system is configured as a firewall or router with a physical interface for each part of the network, we can avoid using intermediate queues.
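For reference, here is a minimal sketch of how such an intermediate queue can be created with IFB (see the IFB link under Recommended Resources). It assumes the <tt>ifb</tt> module is available and that <tt>$wanif</tt> is the interface receiving the traffic you want to shape:
<source lang="bash">
# create an IFB device and bring it up
modprobe ifb numifbs=1
ip link set dev ifb0 up

# redirect all ingress traffic on $wanif through ifb0
tc qdisc add dev $wanif handle ffff: ingress
tc filter add dev $wanif parent ffff: protocol ip u32 match u32 0 0 \
  action mirred egress redirect dev ifb0

# egress-style qdiscs and classes can now be attached to ifb0, e.g. an HTB root:
tc qdisc add dev ifb0 root handle 1: htb default 10
</source>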

A simple way to set up a layer 2 bridge using Linux involves creating a bridge device with <tt>brctl</tt>, adding two Ethernet ports to this bridge (again using <tt>brctl</tt>), and then applying prioritization, shaping and policing rules to both interfaces. The rules will apply to ''outgoing'' traffic on each interface. One physical interface will be connected to an upstream router on the same network, while the other port will be connected to a layer 2 access switch to which local machines are connected. This allows powerful egress shaping policies to be created on both interfaces, to control the flows in and out of the network.
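A minimal sketch of the bridge setup itself might look like the following (interface names are placeholders; adjust for your hardware):
<source lang="bash">
# create the bridge and attach the two physical ports
brctl addbr br0
brctl addif br0 eth0
brctl addif br0 eth1

# the physical ports carry no IP addresses of their own
ifconfig eth0 0.0.0.0 up
ifconfig eth1 0.0.0.0 up
ifconfig br0 up

# shaping and prioritization rules are then attached to eth0 and eth1,
# covering egress traffic in each direction through the bridge
</source>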

== Recommended Resources ==

Resources you should take a look at, in order:

* [http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm HTB documentation] by Martin Devera. The best way to learn how to create different priority classes and bandwidth allocations.
* [http://www.opalsoft.net/qos/DS.htm Differentiated Services On Linux HOWTO] by Leonardo Balliache. Good general documentation.
* [http://blog.edseek.com/~jasonb/articles/traffic_shaping/index.html A Practical Guide to Linux Traffic Control] by Jason Boxman. Good general documentation.
* [http://www.linuxfoundation.org/collaborate/workgroups/networking/ifb IFB - replacement for Linux IMQ], with examples. This is the officially recommended way to do ''inbound'' traffic control when you don't have dedicated in/out interfaces.
* [http://seclists.org/fulldisclosure/2006/Feb/702 Use of iptables hashlimit] - great functionality in iptables. There's a hashlimit example below as well.

Related Interesting Links:

* [http://wiki.secondlife.com/wiki/BLT Second Life Bandwidth Testing Protocol] - an example of netem in use
* [http://www.29west.com/docs/THPM/udp-buffer-sizing.html UDP Buffer Sizing], part of [http://www.29west.com/docs/THPM/index.html Topics in High Performance Messaging]

== Recommended Approaches ==

Daniel Robbins has had very good results with the [http://luxik.cdi.cz/~devik/qos/htb/ HTB queuing discipline]: it has very good features, it has [http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm very good documentation] (which is just as important), and it is designed to deliver useful results in a production environment. And it works. If you use traffic control under Funtoo Linux, please use the HTB queuing discipline as the root queuing discipline; you will get good results in very little time. Avoid using any other queuing discipline under Funtoo Linux as the ''root'' queuing discipline on any interface. If you are creating a tree of classes and qdiscs, HTB should be at the top, and you should avoid hanging classes under any other qdisc unless you have plenty of time to experiment and verify that your QoS rules are working as expected. Please see [[#State_of_the_Code|State of the Code]] for more info on what Daniel Robbins considers to be the current state of the traffic control implementation in Linux.

== State of the Code ==

If you are using enterprise kernels, especially any RHEL5-based kernels, be aware that the traffic control code in these kernels is about five years old and contains many significant bugs. In general, it is possible to avoid these bugs by using HTB as your root queueing discipline and testing things carefully to ensure that you are getting the proper behavior. The <tt>prio</tt> queueing discipline is known not to work reliably in RHEL5 kernels. See [[Broken Traffic Control]] for more information on known bugs in older kernels.

If you are using a more modern kernel, Linux traffic control should be fairly robust. The examples below should work with RHEL5 as well as newer kernels.

== Inspect Your Rules ==

If you are implementing Linux traffic control, you should run these commands frequently to monitor the behavior of your queuing discipline. Replace <tt>$wanif</tt> with the actual network interface name.

<source lang="bash">
tc -s qdisc ls dev $wanif
tc -s class ls dev $wanif
</source>
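When experimenting, it is also handy to be able to wipe an interface's traffic control configuration and start over. Deleting the root qdisc removes the entire qdisc/class/filter tree and reverts the interface to the kernel default:
<source lang="bash">
tc qdisc del dev $wanif root
</source>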

== Matching ==

Here are some examples you can use as the basis for your own filters/classifiers:

# <tt>protocol arp u32 match u32 0 0</tt> - match ARP packets
# <tt>protocol ip u32 match ip protocol 0x11 0xff</tt> - match UDP packets
# <tt>protocol ip u32 match ip protocol 17 0xff</tt> - (also) match UDP packets
# <tt>protocol ip u32 match ip protocol 0x6 0xff</tt> - match TCP packets
# <tt>protocol ip u32 match ip protocol 1 0xff</tt> - match ICMP (ping) packets
# <tt>protocol ip u32 match ip dst 4.3.2.1/32</tt> - match all IP traffic headed for IP 4.3.2.1
# <tt>protocol ip u32 match ip src 4.3.2.1/32 match ip sport 80 0xffff</tt> - match all IP traffic from 4.3.2.1 port 80
# <tt>protocol ip u32 match ip sport 53 0xffff</tt> - match originating DNS (both TCP and UDP)
# <tt>protocol ip u32 match ip dport 53 0xffff</tt> - match response DNS (both TCP and UDP)
# <tt>protocol ip u32 match ip protocol 6 0xff match u8 0x10 0xff at nexthdr+13</tt> - match packets with the ACK bit set
# <tt>protocol ip u32 match ip protocol 6 0xff match u8 0x10 0xff at nexthdr+13 match u16 0x0000 0xffc0 at 2</tt> - match packets less than 64 bytes in size with the ACK bit set
# <tt>protocol ip u32 match ip tos 0x10 0xff</tt> - match IP packets with "type of service" set to "Minimize delay"/"Interactive"
# <tt>protocol ip u32 match ip tos 0x08 0xff</tt> - match IP packets with "type of service" set to "Maximize throughput"/"Bulk" (see "QDISC PARAMETERS" in the <tt>tc-prio</tt> man page)
# <tt>protocol ip u32 match tcp dport 53 0xffff match ip protocol 0x6 0xff</tt> - match TCP packets heading for destination port 53 (may not work)
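Each of these matches is used as the <tt>u32</tt> portion of a <tt>tc filter</tt> command, in the same form as the filter in the sample script below. For example, here is a sketch that classifies traffic headed to destination port 53 (match 9 above) into class 1:10, assuming the 1: HTB hierarchy from the sample script below:
<source lang="bash">
# classify traffic headed to destination port 53 into the high-priority class 1:10
# (assumes the 1: HTB hierarchy from the sample script below)
tc filter add dev $modemif parent 1:0 protocol ip prio 2 u32 \
  match ip dport 53 0xffff flowid 1:10
</source>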

== Sample Traffic Control Code ==

<source lang="bash">
modemif=eth4

# classify interactive (Minimize-Delay), DNS, HTTP and HTTPS TCP traffic into the high-priority class 1:10
iptables -t mangle -A POSTROUTING -o $modemif -p tcp -m tos --tos Minimize-Delay -j CLASSIFY --set-class 1:10
iptables -t mangle -A POSTROUTING -o $modemif -p tcp --dport 53 -j CLASSIFY --set-class 1:10
iptables -t mangle -A POSTROUTING -o $modemif -p tcp --dport 80 -j CLASSIFY --set-class 1:10
iptables -t mangle -A POSTROUTING -o $modemif -p tcp --dport 443 -j CLASSIFY --set-class 1:10

# HTB hierarchy: 1:10 is high priority (700kbit, may borrow up to 1500kbit), 1:12 is the default/bulk class
tc qdisc add dev $modemif root handle 1: htb default 12
tc class add dev $modemif parent 1: classid 1:1 htb rate 1500kbit ceil 1500kbit burst 10k
tc class add dev $modemif parent 1:1 classid 1:10 htb rate 700kbit ceil 1500kbit prio 1 burst 10k
tc class add dev $modemif parent 1:1 classid 1:12 htb rate 800kbit ceil 800kbit prio 2
# put all UDP traffic (IP protocol 0x11) into the high-priority class
tc filter add dev $modemif protocol ip parent 1:0 prio 1 u32 match ip protocol 0x11 0xff flowid 1:10
# SFQ inside each class so that no single flow can starve the others
tc qdisc add dev $modemif parent 1:10 handle 20: sfq perturb 10
tc qdisc add dev $modemif parent 1:12 handle 30: sfq perturb 10
</source>

The code above is a working traffic control script that is even compatible with RHEL5 kernels, for a 1500kbit outbound link (T1, cable or similar). In this example, <tt>eth4</tt> is part of a bridge. The code should work regardless of whether <tt>eth4</tt> is in a bridge or not -- just make sure that <tt>modemif</tt> is set to the interface out of which traffic is flowing and to which you wish to apply traffic control.

=== <tt>tc</tt> code walkthrough ===

This script uses the <tt>tc</tt> command to create two priority classes: 1:10 and 1:12. By default, all traffic goes into the low-priority class, 1:12. 1:10 has priority over 1:12 (<tt>prio 1</tt> vs. <tt>prio 2</tt>), so if there is any traffic in 1:10 ready to be sent, it will be sent ahead of 1:12. 1:10 has a rate of 700kbit but can use up to the full outbound bandwidth of 1500kbit by borrowing from 1:12.

UDP traffic (traffic that matches <tt>ip protocol 0x11 0xff</tt>) is put in the high-priority class 1:10. This can be good for things like FPS games, ensuring that their latency stays low and they are not drowned out by lower-priority traffic.

If we stopped here, however, we would get slightly worse results than if we didn't use <tt>tc</tt> at all. We have basically created two outgoing sub-channels of different priorities. The higher-priority class ''can'' drown out the lower-priority class; this is intentional, so it isn't the issue -- in this case we ''want'' that behavior. The problem is that the high- and low-priority classes can each be dominated by a high-bandwidth flow, causing other traffic flows of the same priority to be drowned out. To fix this, an <tt>sfq</tt> queuing discipline is added to each of the two classes; it identifies individual traffic flows and gives each one a fair shot at sending data out of its class. This should prevent starvation within the classes themselves.

=== <tt>iptables</tt> code walkthrough ===

First, note that we are adding netfilter rules to the <tt>POSTROUTING</tt> chain of the <tt>mangle</tt> table. This table allows us to modify packets ''right before'' they are queued to be sent out of an interface, which is exactly what we want. At this point, the packets could have been locally generated or forwarded; as long as they are on their way out of <tt>modemif</tt> (eth4 in this case), the <tt>mangle</tt> <tt>POSTROUTING</tt> chain will see them, and we can classify them and perform other useful tweaks.

The <tt>iptables</tt> code puts all traffic with the "minimize-delay" TOS marking (interactive ssh traffic, for example) in the high-priority traffic class. In addition, all HTTP, HTTPS and DNS TCP traffic is classified as high priority. Remember that all UDP traffic is already classified as high priority via the <tt>tc</tt> filter described above, so DNS-over-UDP traffic is taken care of automatically.
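To confirm that these classification rules are actually matching traffic, you can watch the per-rule packet and byte counters in the <tt>mangle</tt> table:
<source lang="bash">
# list the POSTROUTING rules in the mangle table with packet/byte counters
iptables -t mangle -L POSTROUTING -v -n
</source>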

=== Further optimizations ===

==== SSH ====

<source lang="bash">
iptables -t mangle -N tosfix
# small packets (individual keystrokes) keep their Minimize-Delay marking
iptables -t mangle -A tosfix -p tcp -m length --length 0:512 -j RETURN
# allow screen redraws under interactive SSH sessions to be fast:
iptables -t mangle -A tosfix -m hashlimit --hashlimit 20/sec --hashlimit-burst 20 \
 --hashlimit-mode srcip,srcport,dstip,dstport --hashlimit-name minlat -j RETURN
# everything over the burst limit is demoted to Maximize-Throughput
iptables -t mangle -A tosfix -j TOS --set-tos Maximize-Throughput
iptables -t mangle -A tosfix -j RETURN

iptables -t mangle -A POSTROUTING -p tcp -m tos --tos Minimize-Delay -j tosfix
</source>

To use this code, place it ''near the top of the file'', just below the <tt>modemif="eth4"</tt> line, but ''before'' the main <tt>iptables</tt> and <tt>tc</tt> rules. These rules will apply to ''all'' packets about to be queued to any interface, but this is not necessarily a bad thing, since the TOS values being set are not specific to our traffic control functionality. To make these rules specific to <tt>modemif</tt>, add "-o $modemif" after "-A POSTROUTING" on the last line above. As-is, the rules will adjust the TOS field on packets flowing out of all interfaces, but the traffic control rules will only take effect for <tt>modemif</tt>, because they are only configured for that interface.

SSH is a tricky protocol. By default, all outgoing SSH traffic is classified as "minimize-delay" traffic, which would cause it all to flow into our high-priority class, even if it is a bulk <tt>scp</tt> transfer running in the background. This code grabs all "minimize-delay" traffic, such as SSH and telnet, and routes it through some special rules. Individual keystrokes (small packets) are left as "minimize-delay" packets. For anything else, we run the <tt>hashlimit</tt> iptables module, which identifies individual outbound flows and allows small bursts of traffic (even big packets) to remain "minimize-delay" packets. These settings have been specifically tuned so that most GNU screen window switches (^A^N) when logged in to your server(s) remotely will still be fast. Any traffic over these burst limits will be reclassified as "maximize-throughput" and thus drop into our lower-priority class 1:12. Combined with the traffic control rules, this allows you to have very responsive SSH sessions to your servers, even when they are doing some kind of bulk outbound copy, like rsync over SSH.

Code in our main iptables rules will ensure that any "minimize-delay" traffic is tagged to be in the high-priority 1:10 class.

The net effect is to keep interactive SSH and telnet keystrokes in the high-priority class, to allow GNU screen full redraws and reasonable full-screen editor scrolling to remain in the high-priority class, and to force bulk transfers into the lower-priority class.

==== ACKs ====

<source lang="bash">
iptables -t mangle -N ack
# leave packets that already carry a non-default TOS alone
iptables -t mangle -A ack -m tos ! --tos Normal-Service -j RETURN
# small ACKs are latency-sensitive; larger ACK-flagged packets are carrying bulk data
iptables -t mangle -A ack -p tcp -m length --length 0:128 -j TOS --set-tos Minimize-Delay
iptables -t mangle -A ack -p tcp -m length --length 128: -j TOS --set-tos Maximize-Throughput
iptables -t mangle -A ack -j RETURN

iptables -t mangle -A POSTROUTING -p tcp -m tcp --tcp-flags SYN,RST,ACK ACK -j ack
</source>

To use this code, place it ''near the top of the file'', just below the <tt>modemif="eth4"</tt> line, but ''before'' the main <tt>iptables</tt> and <tt>tc</tt> rules.

ACK optimization is another useful thing to do. If we prioritize small ACKs heading out to the modem, it will allow TCP traffic to flow more smoothly without unnecessary delay. The lines above accomplish this.

This code basically sets the "minimize-delay" flag on small ACKs. Code in our main iptables rules will then tag these packets so they enter high-priority traffic class 1:10.

== Other Links of Interest ==

* http://manpages.ubuntu.com/manpages/maverick/en/man8/ufw.8.html
* https://help.ubuntu.com/community/UFW
