Difference between pages "Package:Squid" and "IPv4 calculations"

From Funtoo
== The Squid Proxy Server ==
'''This is a quick and dirty howto about getting Squid up and running in 5 minutes...'''

What benefits may one get from using an anonymous proxy server? Many things, but the most important one is that you can browse the web anonymously, without exposing your IP address, location and so on. Even though I usually use OpenVPN or PPTP for safe browsing and similar tasks, having a private anonymous proxy server in your toolbox is a nice thing.

Furthermore, a cache speeds up your daily Internet connection: repeatedly requested objects are served out of the cache instead of being downloaded again. Advanced filtering techniques (antivirus, content filtering, ad blocking, etc.) are also possible.

Always start by refreshing your Portage tree:

<console>
###i## emerge --sync
</console>

Next, we search the Portage tree for {{Package|net-proxy/squid}}:
<console>
###i## emerge --search squid
=> net-analyzer/squid-graph
=> net-analyzer/squidsites
=> net-analyzer/squidview
=> net-proxy/squid
=> net-proxy/squidclamav
=> net-proxy/squidguard
=> sec-policy/selinux-squid
</console>
  
Next, we emerge ''<code>squid</code>'' using:
<console>
###i## emerge -av net-proxy/squid
</console>
  
Once it is installed: since this Squid proxy setup will authenticate users via the ''ncsa_auth'' helper, we need to know the location of this helper so that we can use it in our squid.conf configuration file. To find it, we use a tool named ''qfile'', which is shipped in {{Package|app-portage/portage-utils}}:

<console>
###i## qfile ncsa_auth
net-proxy/squid (/usr/libexec/squid/ncsa_auth)
</console>

OK, so the auth helper is located in /usr/libexec/squid/ncsa_auth. Let's now set up Squid's configuration file (/etc/squid/squid.conf). Make sure you replace XXX.XX.XX.XXX with your actual server's IP address, and edit anything else you want to suit your needs.

<pre># cp /etc/squid/squid.conf{,_orig} && \cat > /etc/squid/squid.conf <<EOF
auth_param basic program /usr/libexec/squid/ncsa_auth /etc/squid/passwd
auth_param basic children 5
auth_param basic realm please login?
auth_param basic credentialsttl 2 hours
auth_param basic casesensitive off
acl ncsa_users proxy_auth REQUIRED
http_access allow ncsa_users
acl manager proto cache_object
acl localhost src 127.0.0.1/32 ::1
acl to_localhost dst 127.0.0.0/8 0.0.0.0/32 ::1
acl localnet src 10.0.0.0/8     # RFC 1918 possible internal network
acl localnet src 172.16.0.0/12  # RFC 1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC 1918 possible internal network
acl localnet src fc00::/7       # RFC 4193 local private network range
acl localnet src fe80::/10      # RFC 4291 link-local (directly plugged) machines
acl SSL_ports port 443
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl Safe_ports port 901         # SWAT
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localnet
http_access allow localhost
http_access deny all
http_port 2222
coredump_dir /var/cache/squid
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern -i (/cgi-bin/|\?) 0     0%      0
refresh_pattern .               0       20%     4320
icp_access allow localnet
icp_access deny all
acl ip1 myip XXX.XX.XX.XXX
tcp_outgoing_address XXX.XX.XX.XXX ip1
cache_mgr mail@maiwald.tk
cache_mem 128 MB
visible_hostname ViruSzZ
maximum_object_size 20 MB
cache_dir ufs /var/cache/squid 512 32 512
forwarded_for off
request_header_access Allow allow all
request_header_access Authorization allow all
request_header_access WWW-Authenticate allow all
request_header_access Proxy-Authorization allow all
request_header_access Proxy-Authenticate allow all
request_header_access Cache-Control allow all
request_header_access Content-Encoding allow all
request_header_access Content-Length allow all
request_header_access Content-Type allow all
request_header_access Date allow all
request_header_access Expires allow all
request_header_access Host allow all
request_header_access If-Modified-Since allow all
request_header_access Last-Modified allow all
request_header_access Location allow all
request_header_access Pragma allow all
request_header_access Accept allow all
request_header_access Accept-Charset allow all
request_header_access Accept-Encoding allow all
request_header_access Accept-Language allow all
request_header_access Content-Language allow all
request_header_access Mime-Version allow all
request_header_access Retry-After allow all
request_header_access Title allow all
request_header_access Connection allow all
request_header_access Proxy-Connection allow all
request_header_access User-Agent allow all
request_header_access Cookie allow all
request_header_access All deny all
shutdown_lifetime 3 seconds
EOF
</pre>
  
Proceed with creating the /etc/squid/passwd file and adding your user by executing:

<console>
###i## htpasswd -c /etc/squid/passwd your_user
</console>

(note that you need to omit the ''-c'' switch when adding another user to the file)

Then run <code>squid -z</code> to create the cache directories.

Finally, restart your Squid server and check whether it is actually listening:

<console>
###i## /etc/init.d/squid restart
###i## netstat -tunlp | grep 2222
tcp        0      0 0.0.0.0:2222            0.0.0.0:*               LISTEN      482/(squid)
</console>

If you would like it to start at system boot, execute:

<console>
###i## rc-update add squid default
</console>

To test it, I use Opera for example: go to ''Settings → Preferences → Advanced → Network → Proxy Servers'' and set the browser to use the proxy server we just created.

[[Category:HOWTO]]
Revision as of 17:52, 16 January 2014

WARNING: Work in progress. Do not edit this article unless you are the original author.


= Refresh on TCP/IP model =

When the ARPANet (a packet-oriented network) was born in those good old seventies, engineers had to solve the problem of enabling computers to exchange packets of information over a network, and in 1974 they invented something you are using right now to view this page: TCP/IP! TCP/IP is a collection of network protocols organized as a stack. Just like your boss does not do everything in the company but delegates to lower levels, which in turn delegate to even lower levels, no protocol in the TCP/IP suite takes on all responsibilities: the protocols work together in a hierarchical and cooperative manner. A given level of the TCP/IP stack knows what its immediate subordinate can do for it, trusts that the job will be done the right way, and does not worry about how it is done. Conversely, the only concern of a given level is to fulfill its own duties and deliver the service requested by the upper layer; it does not have to worry about the ultimate goal of the upper levels.

<illustration goes here TCP/IP model>

The above illustration sounds horribly familiar: yes, it looks like the good old OSI model. Indeed, it is a tailored view of the original OSI model and it works the exact same way: data sent by an application A1 (residing on computer C1) to another application A2 (residing on computer C2) goes through C1's TCP/IP stack (from top to bottom) and reaches C1's lower layers, which take responsibility for moving the bits from C1 to C2 over a physical link (electrical or light pulses, radio waves... sorry, no quantum mechanism yet). C2's lower layers receive the bits sent by C1 and pass what has been received up C2's TCP/IP stack (bottom to top), which delivers the data to A2. If C1 and C2 are not on the same network the process is a bit more complex because it involves relays (routers), but the global idea remains the same. There are no shortcuts in the process: both TCP/IP stacks are crossed in their entirety, from top to bottom for the sender and from bottom to top for the receiver. The transportation process itself is also absolutely transparent from an application's point of view: A1 knows it can rely on the TCP/IP stack to transmit data to A2; ''how'' the data is transmitted is not its problem, A1 just assumes the data can be transmitted by some means. The TCP/IP stack is likewise only loosely coupled to any particular network technology, because its frontier is precisely the physical transportation of bits over a medium: just as A1 does not care how the TCP/IP stack moves the data from one computer to another, the TCP/IP stack itself does not care how the bits are physically moved, and thus it can work with any network technology, be it Ethernet, Token Ring or FDDI, for example.

= The Internet layer =

This article being focused on the calculation of addresses used at the ''Internet layer'', let's forget the gory details of how the TCP/IP stack works (an extremely detailed discussion can be found in [[How the TCP/IP stack works]]... to be written...). From here on, we assume you have a good general understanding of its functionality and of how a network transmission works. As you know, the ''Internet'' layer is responsible for handling the logical addressing of a TCP segment (or UDP datagram) that either has to be transmitted over the network to a remote computer or has been received from the network from a remote computer. That layer is governed by a strict set of rules called the ''Internet Protocol'', or ''IP'', originally specified by [RFC 791] in September 1981. What is pretty amazing about IP is that, although its original RFC has been amended by several others since 1981, its specification remains absolutely valid! If you have a look at [RFC 791] you won't see "obsoleted". Sure, IPv4 reached its limits in this first half of the XXIst century, but it will remain in the IT landscape for several years, not to say decades (you know, the COBOL language...). To finish with the historical details, you might find it interesting to know that TCP/IP was not the original protocol suite used on the ARPANet: in 1983 it superseded another protocol suite, the [http://en.wikipedia.org/wiki/Network_Control_Program Network Control Program]. NCP looks, from our point of view, quite prehistoric, but it is of big importance as it established a lot of concepts still in use today: PDUs, splitting an address into various components, connection management and so on all come from NCP. A historical reward for those who are still reading this long paragraph: first, even a computer user was addressable in NCP messages; second, even in 1970 the engineers were concerned by network congestion issues ([http://www.cs.utexas.edu/users/chris/think/ARPANET/Timeline this page]).

Let's go back to those good old seventies: the engineers who designed the Internet Protocol retained a 32-bit addressing scheme for IP; after all, the ARPANet would never need to address billions of hosts! If you look at some ARPANet diagrams, it counted less than 100 hosts in

who would ''ever'' need millions of addresses, after all? So, in theory, with those 32 bits we can have around 4 billion computers within that network: we arbitrarily decide that the very first connected computer is given the number "0", the second one "1", the third one "2" and so on, until we exhaust the address pool at number 4294967295, giving no more than 4294967296 (2^32) computers on that network, because no number can be duplicated.
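As a quick sanity check on that arithmetic, here is a small Python sketch (an illustration added to this article; the helper names are ours) converting between a 32-bit address number and the familiar dotted-quad notation:

```python
# Size of the IPv4 address space: 2^32 numbers, from 0 to 4294967295.
TOTAL = 2 ** 32

def to_dotted_quad(n):
    """Render a 32-bit address number in dotted-quad notation."""
    return ".".join(str((n >> shift) & 0xFF) for shift in (24, 16, 8, 0))

def from_dotted_quad(s):
    """Parse dotted-quad notation back into a 32-bit number."""
    a, b, c, d = (int(part) for part in s.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

print(TOTAL)                            # 4294967296
print(to_dotted_quad(0))                # 0.0.0.0
print(to_dotted_quad(TOTAL - 1))        # 255.255.255.255
print(from_dotted_quad("192.168.0.1"))  # 3232235521
```

Python's standard ''ipaddress'' module does these conversions out of the box; they are spelled out here only to show the underlying bit arithmetic.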

= Classful and classless networks =

Those addresses follow the logic below:

{| class="wikitable"
|-
| colspan="2" | '''32 bits (fixed length)'''
|-
| '''Network''' part (variable length of N bits) || '''Host''' part (length: 32 - N bits)
|}

* The network address: this part is uniquely assigned amongst all of the organizations in the world (i.e. no one else in the world can hold the same network part)
* The host address: unique within a given network part
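To make the split concrete, here is a small Python sketch (an added illustration; the function name and the example address are ours): with N bits dedicated to the network part, both parts can be extracted with shifts and masks.

```python
def split_address(addr, n_bits):
    """Split a 32-bit address into (network part, host part),
    where the upper n_bits bits form the network part."""
    host_bits = 32 - n_bits
    network = addr >> host_bits           # keep the n_bits upper bits
    host = addr & ((1 << host_bits) - 1)  # keep the 32 - n_bits lower bits
    return network, host

# Example: 192.168.0.1 as a number, with a 24-bit network part.
addr = (192 << 24) | (168 << 16) | (0 << 8) | 1
network, host = split_address(addr, 24)
print(network)  # 12625920 (the bytes 192.168.0 packed as one 24-bit number)
print(host)     # 1
```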

So in theory we can have something like this (remember: the network is not by nature a single unique network, it has to be a collection of networks):

* Network 1 Host 1


Just like your birthday cake is divided into bigger or smaller parts depending on your guests' appetite, the IPv4 address space has been divided into bigger or smaller parts, simply because organizations need more or fewer computers on their networks. How is this made possible? Simply by dedicating a variable number of bits to the network part! Do you see the consequence? An IPv4 address being '''always''' 32 bits wide, the more bits you dedicate to the network part, the fewer you have left for the host part, and vice versa: this is a tradeoff, always. Basically, having more bits in:

* the network part: means more possible networks, at the cost of having fewer hosts per network
* the host part: means fewer networks, but more hosts per network

It might sound a bit abstract, so let's take an example: imagine we dedicate only 8 bits to the network part and the remaining 24 to the host part. What happens? First, if we only
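Sketching that tradeoff in Python (again an added illustration, with a function name of our choosing): dedicating only 8 bits to the network part gives very few, very large networks, while the opposite split gives many tiny ones.

```python
def network_and_host_counts(network_bits):
    """Number of possible networks and of host numbers per network
    for a given split of the 32 address bits (ignoring reserved values)."""
    host_bits = 32 - network_bits
    return 2 ** network_bits, 2 ** host_bits

networks, hosts = network_and_host_counts(8)
print(networks)  # 256 possible networks...
print(hosts)     # ...each with 16777216 host numbers

# The tradeoff the article describes, the other way around:
networks, hosts = network_and_host_counts(24)
print(networks)  # 16777216 networks...
print(hosts)     # ...of only 256 host numbers each
```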


Is the network part assigned by each organization to itself? Of course not! Assignments are coordinated at the worldwide level by what we call Regional Internet Registries, or RIRs, which, in turn, can delegate assignments to third parties located within their geographic jurisdiction. The latter are called Local Internet Registries, or LIRs (the system is detailed in RFC 7020). All of those RIRs are themselves placed under the responsibility of the now well-known Internet Assigned Numbers Authority, or [http://www.iana.org IANA]. As of 2014, five RIRs exist:

* ARIN (American Registry for Internet Numbers): covers North America
* LACNIC (Latin America and Caribbean Network Information Centre): covers South America and the Caribbean
* RIPE NCC (Réseaux IP Européens Network Coordination Centre): covers Europe, Russia and the Middle East
* AfriNIC (African Network Information Centre): covers the whole of Africa
* APNIC (Asia-Pacific Network Information Centre): covers Oceania and the Far East