Difference between revisions of "Traffic Control"

From Funtoo Linux
Revision as of 05:21, 17 February 2011

Introduction

Linux's traffic control functionality offers many capabilities for influencing the rate of flow, as well as the latency, of network traffic. It applies primarily to outgoing traffic but in some cases also to incoming traffic. It is designed to be a "construction kit" rather than a turn-key system, in which complex traffic policing and shaping decisions can be made using a variety of algorithms. The Linux traffic control code is also often used in academia for research purposes, where it can be a useful mechanism to simulate and explore the impact of a variety of different network behaviors. See netem for an example of a simulation framework that can be used for this purpose.

Of course, Linux traffic control can also be extremely useful in an IT context, and this document is intended to focus on the practical, useful applications of Linux traffic control, where these capabilities can be applied to solve problems that are often experienced on modern networks.

Incoming and Outgoing Traffic

One common use of Linux traffic control is to configure a Linux system as a router or bridge, so that it sits between two networks, or between the "inside" of the network and the real router, and can shape traffic going to local machines as well as out to the Internet. This provides a mechanism to prioritize, shape and police both incoming (from the Internet) and outgoing (from local machines) network traffic. The simplest configuration using this approach is to create a bridge device with brctl, add two Ethernet ports to this bridge (again using brctl), and then apply prioritization, shaping and policing rules to both interfaces. One physical interface is connected to an upstream router on the same network, while the other is connected to a layer 2 access switch to which local machines are connected. Because traffic arriving from the Internet leaves the bridge through the switch-facing port, and traffic from local machines leaves through the router-facing port, powerful egress shaping policies on the two interfaces control the flows both in and out of the network.
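The bridge setup described above can be sketched as follows. This is only an illustration under stated assumptions: the bridge name brwan and the port eth4 are borrowed from the sample script later in this document, while eth5 is a hypothetical name for the switch-facing port. The script only prints the commands so that they can be reviewed first.

```shell
#!/bin/sh
# Sketch: print the commands that build a two-port bridge for traffic shaping.
# Assumed names: brwan and eth4 come from the sample script in this document,
# eth5 is a hypothetical LAN-side port. Adjust all three for your system.
bridge=brwan
wanif=eth4   # port facing the upstream router
lanif=eth5   # port facing the layer 2 access switch (hypothetical)

bridge_commands() {
    echo "brctl addbr $bridge"          # create the bridge device
    echo "brctl addif $bridge $wanif"   # add the router-facing port
    echo "brctl addif $bridge $lanif"   # add the switch-facing port
    echo "ip link set dev $wanif up"
    echo "ip link set dev $lanif up"
    echo "ip link set dev $bridge up"
}

# Dry run: nothing is changed on the system, the commands are only printed.
bridge_commands
```

To apply the commands after reviewing them, the output can be piped to sh as root. Shaping rules are then attached to eth4 and eth5 individually, not to the bridge device.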

Recommended Resources

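Resources you should take a look at, in order:

  * HTB documentation (http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm) by Martin Devera. Best way to create different priority classes and bandwidth allocations.
  * Differentiated Services On Linux HOWTO (http://www.opalsoft.net/qos/DS.htm) by Leonardo Balliache. Good general docs.
  * A Practical Guide to Linux Traffic Control (http://blog.edseek.com/~jasonb/articles/traffic_shaping/index.html) by Jason Boxman. Good general docs.
  * IFB - replacement for Linux IMQ (http://www.linuxfoundation.org/collaborate/workgroups/networking/ifb), with examples. This is the official best way to do inbound traffic control.
  * Use of iptables hashlimit (http://seclists.org/fulldisclosure/2006/Feb/702) - Great functionality in iptables.

Related Interesting Links:

  * Second Life Bandwidth Testing Protocol (http://wiki.secondlife.com/wiki/BLT) - example of Netem
  * UDP Buffer Sizing (http://www.29west.com/docs/THPM/udp-buffer-sizing.html), part of Topics in High Performance Messaging (http://www.29west.com/docs/THPM/index.html)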

Recommended Approaches

Daniel Robbins has had very good results with the HTB queuing discipline. It has good features, it is well documented (which is just as important), and it is designed to deliver useful results in a production environment. If you use traffic control under Funtoo Linux, please use HTB as the root queuing discipline, because you will get good results in very little time. Avoid using any other queuing discipline as the root queuing discipline on any interface. If you are creating a tree of classes and qdiscs, HTB should be at the top, and you should avoid hanging classes under any other qdisc unless you have plenty of time to experiment and verify that your QoS rules work as expected. Please see State of the Code for more info on what Daniel Robbins considers to be the current state of the traffic control implementation in Linux.

State of the Code

If you are using enterprise kernels, especially any RHEL5-based kernels, be aware that the traffic control code in these kernels is about 5 years old and contains many significant bugs. In general, it is possible to avoid these bugs by using HTB as your root queuing discipline and testing carefully to ensure that you are getting the proper behavior. The prio queuing discipline is known not to work reliably. See Broken Traffic Control for more information on known bugs with older kernels.

If you are using a more modern kernel, Linux traffic control should be fairly robust.

Inspect Your Rules

If you are implementing Linux traffic control, you should be running these commands frequently to monitor the behavior of your queuing discipline. Replace $wanif with the actual network interface name.

tc -s qdisc ls dev $wanif
tc -s class ls dev $wanif
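
When running these commands often, it can help to condense the per-class counters into one line per class. The following is a minimal sketch, assuming the usual layout of tc -s class ls output: a "class ..." line followed by a "Sent ... (dropped N, ...)" line. The function name summarize_classes and the sample input are made up for illustration, and the layout may differ between iproute2 versions.

```shell
#!/bin/sh
# Sketch: print one line per class with its byte and drop counters.
# Assumes each class is reported as a "class <kind> <classid> ..." line
# followed by a " Sent <bytes> bytes <pkts> pkt (dropped <n>, ...)" line.
summarize_classes() {
    awk '
        $1 == "class" { cls = $3 }                      # remember the classid
        $1 == "Sent"  {
            dropped = $7; gsub(/[^0-9]/, "", dropped)   # strip the trailing comma
            printf "%s sent_bytes=%s dropped=%s\n", cls, $2, dropped
        }
    '
}

# Illustrative input only (not real measurements). On a live system use:
#   tc -s class ls dev "$wanif" | summarize_classes
summarize_classes <<'EOF'
class htb 1:10 parent 1:1 leaf 20: prio 1 rate 700Kbit ceil 1500Kbit
 Sent 48213 bytes 310 pkt (dropped 4, overlimits 0 requeues 0)
class htb 1:12 parent 1:1 leaf 30: prio 2 rate 800Kbit ceil 800Kbit
 Sent 912004 bytes 1204 pkt (dropped 57, overlimits 0 requeues 0)
EOF
```

A steadily rising drop counter on a class is a hint that the class is saturated and that its rate or ceil may need review.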

Matching

Here are some examples you can use as the basis for your own filters/classifiers:

  1. protocol arp u32 match u32 0 0 - match ARP packets
  2. protocol ip u32 match ip protocol 0x11 0xff - match UDP packets
  3. protocol ip u32 match ip protocol 17 0xff - match UDP packets (same as above, with the protocol number in decimal)
  4. protocol ip u32 match ip protocol 0x6 0xff - match TCP packets
  5. protocol ip u32 match ip protocol 1 0xff - match ICMP packets (including ping)
  6. protocol ip u32 match ip dst 4.3.2.1/32 - match all IP traffic headed for IP 4.3.2.1
  7. protocol ip u32 match ip src 4.3.2.1/32 match ip sport 80 0xffff - match all IP traffic from 4.3.2.1, source port 80
  8. protocol ip u32 match ip sport 53 0xffff - match traffic from source port 53, i.e. DNS replies coming from a server (both TCP and UDP)
  9. protocol ip u32 match ip dport 53 0xffff - match traffic to destination port 53, i.e. DNS queries heading to a server (both TCP and UDP)
  10. protocol ip u32 match ip protocol 6 0xff match u8 0x10 0xff at nexthdr+13 - match TCP packets whose only flag is ACK (the 0xff mask compares the whole flags byte)
  11. protocol ip u32 match ip protocol 6 0xff match u8 0x10 0xff at nexthdr+13 match u16 0x0000 0xffc0 at 2 - match ACK-only TCP packets less than 64 bytes in size
  12. protocol ip u32 match ip tos 0x10 0xff - match IP packets with "type of service" set to "Minimize delay"/"Interactive"
  13. protocol ip u32 match ip tos 0x08 0xff - match IP packets with "type of service" set to "Maximize throughput"/"Bulk" (see "QDISC PARAMETERS" in tc-prio man page)
  14. protocol ip u32 match tcp dport 53 0xffff match ip protocol 0x6 0xff - match TCP packets heading for dest. port 53 (may not work)
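
The entries in the list above are only the selector part of a filter. A complete command also needs the device, the parent qdisc, a priority and a target class. The sketch below shows one way to assemble it; the helper name filter_cmd is hypothetical, and the parent 1:0, prio 1, device eth4 and class 1:10 are assumptions taken from the sample script later in this document. It prints the command so that it can be checked before use.

```shell
#!/bin/sh
# Sketch: build a full "tc filter add" command around one selector.
# filter_cmd is a hypothetical helper. Parent 1:0 and prio 1 are assumed
# values that match the sample script in this document.
filter_cmd() {
    dev=$1       # interface the filter is attached to
    flowid=$2    # class that matching packets are sent to
    shift 2      # the remaining arguments are the selector
    echo "tc filter add dev $dev parent 1:0 prio 1 $* flowid $flowid"
}

# Example: send ICMP packets on eth4 to class 1:10.
filter_cmd eth4 1:10 protocol ip u32 match ip protocol 1 0xff
```

Filters are attached to a qdisc that already exists, so the root qdisc and the target class have to be created before the printed command is run.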

Sample Traffic Control Code

modemif=eth4

# Classify latency-sensitive TCP traffic leaving through bridge port $modemif
# into class 1:10: packets with TOS Minimize-Delay, plus destination ports
# 53 (FORWARD only), 80 and 443.
iptables -t mangle -A FORWARD -o brwan -m physdev --physdev-out $modemif -p tcp -m tos --tos Minimize-Delay -j CLASSIFY --set-class 1:10
iptables -t mangle -A POSTROUTING -o brwan -m physdev --physdev-out $modemif -p tcp -m tos --tos Minimize-Delay -j CLASSIFY --set-class 1:10
iptables -t mangle -A POSTROUTING -o brwan -m physdev --physdev-out $modemif -p tcp --dport 80 -j CLASSIFY --set-class 1:10
iptables -t mangle -A POSTROUTING -o brwan -m physdev --physdev-out $modemif -p tcp --dport 443 -j CLASSIFY --set-class 1:10
iptables -t mangle -A FORWARD -o brwan -m physdev --physdev-out $modemif -p tcp --dport 53 -j CLASSIFY --set-class 1:10
iptables -t mangle -A FORWARD -o brwan -m physdev --physdev-out $modemif -p tcp --dport 80 -j CLASSIFY --set-class 1:10
iptables -t mangle -A FORWARD -o brwan -m physdev --physdev-out $modemif -p tcp --dport 443 -j CLASSIFY --set-class 1:10

# Root HTB qdisc; unclassified traffic falls into class 1:12.
tc qdisc add dev $modemif root handle 1: htb default 12
# Parent class, capped at the 1500kbit link rate.
tc class add dev $modemif parent 1: classid 1:1 htb rate 1500kbit ceil 1500kbit burst 10k
# 1:10: high priority, 700kbit guaranteed, may borrow up to 1500kbit.
tc class add dev $modemif parent 1:1 classid 1:10 htb rate 700kbit ceil 1500kbit prio 1 burst 10k
# 1:12: low priority (default class), fixed at 800kbit.
tc class add dev $modemif parent 1:1 classid 1:12 htb rate 800kbit ceil 800kbit prio 2
# Send all UDP traffic to 1:10.
tc filter add dev $modemif protocol ip parent 1:0 prio 1 u32 match ip protocol 0x11 0xff flowid 1:10
# Attach sfq to each leaf class so that flows within a class share it fairly.
tc qdisc add dev $modemif parent 1:10 handle 20: sfq perturb 10
tc qdisc add dev $modemif parent 1:12 handle 30: sfq perturb 10

The code above is a working traffic control script for a 1500kbit outbound link (T1, cable or similar), and it is compatible even with RHEL5 kernels. In this example, eth4 is part of the bridge brwan. If you are not using a bridge, replace -o brwan -m physdev --physdev-out $modemif in each iptables rule with -o $modemif.
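
The substitution for the non-bridge case can be applied mechanically. The following sketch uses sed to rewrite rules read from standard input; the function name unbridge is hypothetical, and the text it looks for is the exact bridge-specific option string used in the script above.

```shell
#!/bin/sh
# Sketch: rewrite bridge-specific iptables rules for a non-bridged interface.
# unbridge is a hypothetical helper; it reads rules on stdin and replaces
# the physdev match with a plain "-o $modemif" match.
unbridge() {
    # \$ keeps the dollar sign literal, so the text "$modemif" is matched
    # as written in the script, not expanded by the shell.
    sed 's/-o brwan -m physdev --physdev-out \$modemif/-o $modemif/'
}

# Example with one rule from the script above (single quotes keep
# $modemif unexpanded, exactly as it appears in the script).
echo 'iptables -t mangle -A FORWARD -o brwan -m physdev --physdev-out $modemif -p tcp --dport 80 -j CLASSIFY --set-class 1:10' | unbridge
```

The tc commands need no change, since they already refer to $modemif directly.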

tc code walkthrough

This script uses the tc command to create two priority classes, 1:10 and 1:12. 1:10 has priority over 1:12 (prio 1 vs. prio 2), so if there is any traffic in 1:10 ready to be sent, it will be sent ahead of 1:12. 1:10 has a guaranteed rate of 700kbit but can use up to the full outbound bandwidth of 1500kbit by borrowing from the parent class 1:1 whatever bandwidth 1:12 is not using. 1:12 is the default class for unclassified traffic and is limited to 800kbit.
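
The rates in this script add up: 700kbit plus 800kbit equals the 1500kbit of the parent class. A common rule of thumb with HTB is that the guaranteed rates of the child classes should not add up to more than the parent's rate. The sketch below checks that arithmetic for a planned class layout; check_rates is a hypothetical helper, not part of tc, and all values are in kbit.

```shell
#!/bin/sh
# Sketch: check that the child rates do not exceed the parent rate.
# check_rates is a hypothetical helper. First argument: parent rate in kbit.
# Remaining arguments: child rates in kbit.
check_rates() {
    parent=$1
    shift
    total=0
    for rate in "$@"; do
        total=$((total + rate))
    done
    if [ "$total" -le "$parent" ]; then
        echo "ok: children total ${total}kbit, parent ${parent}kbit"
    else
        echo "warning: children total ${total}kbit exceeds parent ${parent}kbit"
        return 1
    fi
}

# The layout from the script above: parent 1:1 with children 1:10 and 1:12.
check_rates 1500 700 800
```

The check covers only the rate values. A ceil above the rate, as on 1:10, is how a class is allowed to borrow unused bandwidth.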

UDP traffic (traffic that matches ip protocol 0x11 0xff) will be put in the high priority class 1:10. This can be good for things like FPS games, to ensure that latency is low and not drowned out by lower-priority traffic.

If we stopped here, however, we would get somewhat worse results than if we didn't use tc at all. We have basically created two outgoing sub-channels of different priorities. The higher-priority class can drown out the lower-priority class, but this is intentional, so it isn't the problem. The problem is that within each class, a single high-bandwidth flow can dominate and drown out other flows of the same priority. To fix this, an sfq queuing discipline is attached to each of the two classes. sfq identifies the individual traffic flows and gives each one a fair shot at sending data out of the link.

Other Links of Interest
