Difference between pages "Install/pt-br/Configuring" and "Linux Containers"

=== Configuring your system ===
+
Linux Containers, or LXC, is a Linux feature that allows Linux to run one or more isolated virtual systems (with their own network interfaces, process namespace, user namespace, and power state) using a single Linux kernel on a single server.  
As is expected from a Linux distribution, Funtoo Linux has its share of configuration files. The one file you absolutely must edit to ensure that Funtoo Linux boots successfully is <code>/etc/fstab</code>. The others are optional.
+
  
==== Using Nano ====
+
== Status ==
  
The default editor included in the chroot environment is called <code>nano</code>. To edit one of the files below, run nano as follows:
+
As of Linux kernel 3.1.5, LXC is usable for isolating your own private workloads from one another. It is not yet ready to isolate potentially malicious users from one another or the host system. For a more mature containers solution that is appropriate for hosting environments, see [[OpenVZ]].
  
<console>
+
LXC containers don't yet have their own system uptime, and they see everything that's in the host's {{c|dmesg}} output, among other things. But in general, the technology works.
(chroot) # ##i##nano /etc/fstab
+
</console>
+
  
When in the editor, you can use arrow keys to move the cursor, and common keys like backspace and delete will work as expected. To save the file, press Control-X, and answer <code>y</code> when prompted to save the modified buffer if you would like to save your changes.
+
== Basic Info ==
  
==== Configuration Files ====
 
  
Here is a full list of files that you may want to edit, depending on your needs:
+
* Linux Containers are based on:
{{TableStart}}
+
** Kernel namespaces for resource isolation
<tr class="active"><th>File</th>
+
** CGroups for resource limitation and accounting
<th>Do I need to change it?</th>
+
<th>Description</th>
+
</tr><tr  class="danger">
+
<td><code>/etc/fstab</code></td>
+
<td>'''YES - required'''</td>
+
<td>Mount points for all filesystems to be used at boot time. This file must reflect your disk partition setup. We'll guide you through modifying this file below.</td>
+
</tr><tr>
+
<td><code>/etc/localtime</code></td>
+
<td>''Maybe - recommended''</td>
+
<td>Your timezone, which will default to UTC if not set. This should be a symbolic link to something located under /usr/share/zoneinfo (e.g. /usr/share/zoneinfo/America/Montreal) </td>
+
</tr><tr>
+
<td><code>/etc/make.conf</code> (symlink) - also known as:<br/><code>/etc/portage/make.conf</code></td>
+
<td>''Maybe - recommended''</td>
+
<td>Parameters used by gcc (compiler), portage, and make. It's a good idea to set MAKEOPTS. This is covered later in this document.</td>
+
</tr><tr>
+
<td><code>/etc/conf.d/hostname</code></td>
+
<td>''Maybe - recommended''</td>
+
<td>Used to set the system hostname. Set the <code>hostname</code> variable to the fully-qualified name (with dots, e.g. <code>foo.funtoo.org</code>) if you have one. Otherwise, set it to the local system hostname (without dots, e.g. <code>foo</code>). Defaults to <code>localhost</code> if not set.</td>
+
</tr><tr>
+
<td><code>/etc/hosts</code></td>
+
<td>''No''</td>
+
<td> You no longer need to manually set the hostname in this file. This file is automatically generated by <code>/etc/init.d/hostname</code>.</td>
+
</tr><tr>
+
<td><code>/etc/conf.d/keymaps</code></td>
+
<td>Optional</td>
+
<td>Keyboard mapping configuration file (for console pseudo-terminals). Set if you have a non-US keyboard. See [[Funtoo Linux Localization]].</td>
+
</tr><tr>
+
<td><code>/etc/conf.d/hwclock</code></td>
+
<td>Optional</td>
+
<td>How the time of the battery-backed hardware clock of the system is interpreted (UTC or local time). Linux uses the battery-backed hardware clock to initialize the system clock when the system is booted.</td>
+
</tr><tr>
+
<td><code>/etc/conf.d/modules</code></td>
+
<td>Optional</td>
+
<td>Kernel modules to load automatically at system startup. Typically not required. See [[Additional Kernel Resources]] for more info.</td>
+
</tr><tr>
+
<td><code>/etc/conf.d/consolefont</code></td>
+
<td>Optional</td>
+
<td>Allows you to specify the default console font. To apply this font, enable the consolefont service by running rc-update add consolefont.</td>
+
</tr><tr>
+
<td><code>profiles</code></td>
+
<td>Optional</td>
+
<td>Some useful portage settings that may help speed up initial configuration.</td>
+
</tr>
+
{{TableEnd}}
+
  
If you're installing an English version of Funtoo Linux, you're in luck as most of the configuration files can be used as-is. If you're installing for another locale, don't worry. We will walk you through the necessary configuration steps on the [[Funtoo Linux Localization]] page, and if needed, there's always plenty of friendly, helpful support. (See [[#Community portal|Community]])
+
{{Package|app-emulation/lxc}} is the userspace tool for Linux containers
  
Let's go ahead and see what we have to do. Use <code>nano -w <name_of_file></code> to edit files -- the "<code>-w</code>" disables word-wrapping, which is handy when editing configuration files. You can copy and paste from the examples.
+
== Control groups ==
  
{{fancywarning|It's important to edit your <code>/etc/fstab</code> file before you reboot! You will need to modify both the "fs" and "type" columns to match the settings for your partitions and filesystems that you created with <code>gdisk</code> or <code>fdisk</code>. Skipping this step may prevent Funtoo Linux from booting successfully.}}
+
* Control groups (cgroups) in kernel since 2.6.24
 +
** Allows aggregation of tasks and their children
 +
** Subsystems (cpuset, memory, blkio,...)
 +
** accounting - to measure how many resources certain systems use
 +
** resource limiting - groups can be set to not exceed a set memory limit
 +
** prioritization - some groups may get a larger share of CPU
 +
** control - freezing/unfreezing of cgroups, checkpointing and restarting
 +
** No disk quota limitation ( -> image file, LVM, XFS, directory tree quota,...)
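The resource-limiting feature listed above can be tried by hand, without LXC. The following is a minimal sketch, assuming a cgroup-v1 memory hierarchy is mounted on the host (commonly at {{c|/sys/fs/cgroup/memory}}, which is an assumption about your system). The function name {{c|cgroup_limit_memory}} and the group name {{c|demo}} are hypothetical.

```shell
# cgroup_limit_memory ROOT NAME LIMIT   (hypothetical helper, not part of LXC)
# Create the cgroup NAME below ROOT and cap its memory at LIMIT.
# ROOT is assumed to be a mounted cgroup-v1 memory hierarchy,
# for example /sys/fs/cgroup/memory.
cgroup_limit_memory() {
    root=$1
    name=$2
    limit=$3
    # On a cgroup filesystem, creating a directory creates the group.
    mkdir -p "$root/$name" || return 1
    # The kernel accepts plain byte counts as well as K, M and G suffixes.
    printf '%s\n' "$limit" > "$root/$name/memory.limit_in_bytes"
}
```

For example, {{c|cgroup_limit_memory /sys/fs/cgroup/memory demo 512M}} followed by {{c|echo $$ > /sys/fs/cgroup/memory/demo/tasks}} would place the current shell and its children under a 512 MB limit. LXC writes the same kind of values for you from the {{c|lxc.cgroup.*}} settings in the container config.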
  
==== /etc/fstab ====
+
== Subsystems ==
 +
<br>
 +
{{console|body=
 +
###i## cat /proc/cgroups
 +
subsys_name hierarchy num_cgroups enabled
 +
cpuset
 +
cpu
 +
cpuacct
 +
memory
 +
devices
 +
freezer
 +
blkio
 +
perf_event
 +
hugetlb
 +
}}
  
<code>/etc/fstab</code> is used by the <code>mount</code> command, which is run when your system boots. Entries in this file tell <code>mount</code> which partitions to mount and how to mount them. In order for the system to boot properly, you must edit <code>/etc/fstab</code> and ensure that it reflects the partition configuration you used earlier:
+
#cpuset    -> limits tasks to specific CPU/CPUs
 +
#cpu        -> CPU shares
 +
#cpuacct    -> CPU accounting
 +
#memory    -> memory and swap limitation and accounting
 +
#devices    -> device allow/deny list
 +
#freezer    -> suspend/resume tasks
 +
#blkio      -> I/O prioritization (weight, throttle, ...)
 +
#perf_event -> support for per-cpu per-cgroup monitoring [http://lwn.net/Articles/421574/ perf_events]
 +
#hugetlb    -> cgroup resource controller for HugeTLB pages  [http://lwn.net/Articles/499255/ hugetlb]
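To see which of these subsystems are actually enabled on a host, the {{c|/proc/cgroups}} table can be filtered on its last column. This is a minimal sketch, assuming the four-column layout shown above (name, hierarchy, number of cgroups, enabled flag). The function name {{c|list_enabled_cgroups}} is hypothetical.

```shell
# list_enabled_cgroups [FILE]   (hypothetical helper)
# Print the name of every cgroup controller whose "enabled" column is 1.
# FILE defaults to /proc/cgroups. The header line is skipped automatically
# because its fourth field is the word "enabled", not the number 1.
list_enabled_cgroups() {
    awk '$4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}
```

Running {{c|list_enabled_cgroups}} without arguments reads the live table. If {{c|memory}} or {{c|devices}} is missing from the output, the matching {{c|lxc.cgroup.*}} settings in a container config are unlikely to take effect.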
  
<console>
+
== Configuring the Funtoo Host System ==
(chroot) # ##i##nano -w /etc/fstab
+
</console>
+
  
<pre>
+
=== Install LXC kernel ===
# The root filesystem should have a pass number of either 0 or 1.
+
Any kernel newer than 3.1.5 will probably work. Personally I prefer {{Package|sys-kernel/gentoo-sources}}, as these kernels support all the namespaces without sacrificing XFS, FUSE or NFS support, for example. The dependency checks listed below were only introduced in kernel 3.5, so on kernels without them the user namespace may not work optimally.
# All other filesystems should have a pass number of 0 or greater than 1.
+
#
+
# NOTE: If your BOOT partition is ReiserFS, add the notail option to opts.
+
#
+
# See the manpage fstab(5) for more information.
+
#
+
# <fs>     <mountpoint>  <type>  <opts>        <dump/pass>
+
  
/dev/sda1    /boot        ext2    noauto,noatime 1 2
+
* User namespace (EXPERIMENTAL) depends on EXPERIMENTAL and on UIDGID_CONVERTED
/dev/sda2    none          swap    sw            0 0
+
** config UIDGID_CONVERTED
/dev/sda3    /            ext4    noatime        0 1
+
*** True if all of the selected software components are known to have uid_t and gid_t converted to kuid_t and kgid_t where appropriate and are otherwise safe to use with the user namespace.
#/dev/cdrom  /mnt/cdrom    auto    noauto,ro      0 0
+
**** Networking - depends on NET_9P = n
</pre>
+
**** Filesystems - 9P_FS = n, AFS_FS = n, AUTOFS4_FS = n, CEPH_FS = n, CIFS = n, CODA_FS = n, FUSE_FS = n, GFS2_FS = n, NCP_FS = n, NFSD = n, NFS_FS = n, OCFS2_FS = n, XFS_FS = n
 +
**** Security options - Grsecurity - GRKERNSEC = n (if applicable)
  
{{Note|Currently, our default <code>/etc/fstab</code> has the root filesystem as <code>/dev/sda4</code> and the swap partition as <code>/dev/sda3</code>. These will need to be changed to <code>/dev/sda3</code> and <code>/dev/sda2</code>, respectively.}}
+
** As of kernel 3.10.xx, all of the above options are safe to use with user namespaces except XFS_FS. With kernel >= 3.10.xx, answer XFS_FS = n if you want user namespace support.
 +
** In your kernel source directory, check init/Kconfig to find out what UIDGID_CONVERTED depends on.
  
{{Note|If you're using UEFI to boot, change the <code>/dev/sda1</code> line so it says <code>vfat</code> instead of <code>ext2</code>. Similarly, make sure that the <code>/dev/sda3</code> line specifies either <code>xfs</code> or <code>ext4</code>, depending on which filesystem you chose at filesystem-creation time.}}
+
==== Kernel configuration ====
 +
These options should be enabled in your kernel to take full advantage of LXC.
  
==== /etc/localtime ====
+
* General setup
 +
** CONFIG_NAMESPACES
 +
*** CONFIG_UTS_NS
 +
*** CONFIG_IPC_NS
 +
*** CONFIG_PID_NS
 +
*** CONFIG_NET_NS
 +
*** CONFIG_USER_NS
 +
** CONFIG_CGROUPS
 +
*** CONFIG_CGROUP_DEVICE
 +
*** CONFIG_CGROUP_SCHED
 +
*** CONFIG_CGROUP_CPUACCT
 +
*** CONFIG_CGROUP_MEM_RES_CTLR (in 3.6+ kernels it's called CONFIG_MEMCG)
 +
*** CONFIG_CGROUP_MEM_RES_CTLR_SWAP (in 3.6+ kernels it's called CONFIG_MEMCG_SWAP)
 +
*** CONFIG_CPUSETS (on multiprocessor hosts)
 +
* Networking support
 +
** Networking options
 +
*** CONFIG_VLAN_8021Q
 +
* Device Drivers
 +
** Character devices
 +
*** Unix98 PTY support
 +
**** CONFIG_DEVPTS_MULTIPLE_INSTANCES
 +
** Network device support
 +
*** Network core driver support
 +
**** CONFIG_VETH
 +
**** CONFIG_MACVLAN
  
<code>/etc/localtime</code> is used to specify the timezone that your machine is in, and defaults to UTC. If you would like your Funtoo Linux system to use local time, you should replace <code>/etc/localtime</code> with a symbolic link to the timezone that you wish to use.
+
Once you have lxc installed, you can then check your kernel config with:
 +
{{console|body=
 +
# ##i##CONFIG=/path/to/config /usr/sbin/lxc-checkconfig
 +
}}
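Before lxc is installed, the option list above can be checked against a kernel config file directly. This is a minimal sketch, assuming a plain-text config such as {{c|/usr/src/linux/.config}} (the path is an assumption about your host). The function name {{c|missing_kernel_opts}} is hypothetical and is not a replacement for {{c|lxc-checkconfig}}.

```shell
# missing_kernel_opts CONFIG_FILE OPTION...   (hypothetical helper)
# Print every OPTION that is neither built in (=y) nor a module (=m)
# in CONFIG_FILE. No output means all listed options are enabled.
missing_kernel_opts() {
    config_file=$1
    shift
    for opt in "$@"; do
        grep -Eq "^${opt}=(y|m)$" "$config_file" || printf '%s\n' "$opt"
    done
}
```

For example, {{c|missing_kernel_opts /usr/src/linux/.config CONFIG_NAMESPACES CONFIG_NET_NS CONFIG_VETH CONFIG_MACVLAN}} prints only the options that still need to be turned on.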
  
<console>
+
=== Emerge lxc ===
(chroot) # ##i##ln -sf /usr/share/zoneinfo/MST7MDT /etc/localtime
+
{{console|body=
</console>
+
# ##i##emerge app-emulation/lxc
 +
}}
  
The above sets the timezone to Mountain Standard Time (with daylight savings). Type <code>ls /usr/share/zoneinfo</code> to see what timezones are available. There are also sub-directories containing timezones described by location.
+
=== Configure Networking For Container ===
  
==== /etc/make.conf ====
+
Typically, one uses a bridge to allow containers to connect to the network. This is how to do it under Funtoo Linux:
  
MAKEOPTS can be used to define how many parallel compilations should occur when you compile a package, which can speed up compilation significantly. A rule of thumb is the number of CPUs (or CPU threads) in your system plus one. If for example you have a dual core processor without [[wikipedia:Hyper-threading|hyper-threading]], then you would set MAKEOPTS to 3:
+
# Create a bridge using the Funtoo network configuration scripts. Name the bridge something like {{c|brwan}} (using {{c|/etc/init.d/netif.brwan}}). Configure your bridge to have an IP address.
 +
# Make your physical interface, such as {{c|eth0}}, an interface with no IP address (use the Funtoo {{c|interface-noip}} template.)
 +
# Make {{c|netif.eth0}} a slave of {{c|netif.brwan}} in {{c|/etc/conf.d/netif.brwan}}.
 +
# Enable your new bridged network and make sure it is functioning properly on the host.
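The steps above can be sketched as two small {{c|/etc/conf.d}} files. The example below writes them under a staging directory so they can be reviewed before being copied into place. It assumes the Funtoo network scripts read the variables {{c|template}}, {{c|ipaddr}} and {{c|slaves}}; verify these names against the Funtoo networking documentation for your release. The function name {{c|write_bridge_conf}} and the address used below are hypothetical.

```shell
# write_bridge_conf ROOT BRIDGE IFACE IPADDR   (hypothetical helper)
# Write conf.d files for a bridge BRIDGE that enslaves IFACE.
# The variable names template/ipaddr/slaves are assumptions based on
# the Funtoo network configuration scripts.
write_bridge_conf() {
    root=$1
    bridge=$2
    iface=$3
    ipaddr=$4
    mkdir -p "$root/etc/conf.d" || return 1
    # The bridge carries the IP address and lists its slave interface.
    printf 'template="bridge"\nipaddr="%s"\nslaves="netif.%s"\n' \
        "$ipaddr" "$iface" > "$root/etc/conf.d/netif.$bridge" || return 1
    # The physical interface is brought up without an address of its own.
    printf 'template="interface-noip"\n' > "$root/etc/conf.d/netif.$iface"
}
```

For example, {{c|write_bridge_conf /tmp/stage brwan eth0 10.0.1.200/24}} produces {{c|/tmp/stage/etc/conf.d/netif.brwan}} and {{c|/tmp/stage/etc/conf.d/netif.eth0}}. The matching {{c|/etc/init.d/netif.*}} service entries still have to be created and enabled as described in the steps.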
  
<pre>
+
You will now be able to configure LXC to automatically add your container's virtual ethernet interface to the bridge when it starts, which will connect it to your network.
MAKEOPTS="-j3"
+
</pre>
+
  
If you are unsure about how many processors/threads you have then use nproc to help you.
+
== Setting up a Funtoo Linux LXC Container ==
<console>
+
(chroot) # ##i##nproc
+
16
+
</console>
+
  
Set MAKEOPTS to this number plus one:
+
Here are the steps required to get Funtoo Linux running <i>inside</i> a container. The steps below show you how to set up a container using an existing Funtoo Linux OpenVZ template. It is now also possible to use [[Metro]] to build an lxc container tarball directly, which will save you manual configuration steps and will provide an {{c|/etc/fstab.lxc}} file that you can use for your host container config. See [[Metro Recipes]] for info on how to use Metro to generate an lxc container.
  
 +
=== Create and Configure Container Filesystem ===
 +
 +
# Start with a Funtoo LXC template, and unpack it to a directory such as {{c|/lxc/funtoo0/rootfs/}}
 +
# Create an empty {{c|/lxc/funtoo0/fstab}} file
 +
# Ensure that the {{c|c1}} line is uncommented (enabled) and that the {{c|c2}} through {{c|c6}} lines are commented out (disabled) in {{c|/lxc/funtoo0/rootfs/etc/inittab}}
 +
 +
That's almost all you need to get the container filesystem ready to start.
 +
 +
=== Create Container Configuration Files ===
 +
 +
Create the following files:
 +
 +
==== {{c|/lxc/funtoo0/config}} ====
 +
 +
 +
and also create symlink from
 +
==== {{c|/lxc/funtoo0/config to /etc/lxc/funtoo0/config }} ====
 +
{{console|body=
 +
###i## install -d /etc/lxc/funtoo0
 +
###i## ln -s /lxc/funtoo0/config /etc/lxc/funtoo0/config
 +
}}
 +
 +
{{note| Daniel Robbins needs to update this config to be more in line with http://wiki.progress-linux.org/software/lxc/ -- this config appears to have nice, refined device node permissions and other goodies. // note by Havis to Daniel, this config is already superior.}}
 +
 +
 +
Read {{c|man 5 lxc.conf}} for more information about the Linux container configuration file.
 
<pre>
 
<pre>
MAKEOPTS="-j17"
+
## Container
 +
lxc.utsname                            = funtoo0
 +
lxc.rootfs                              = /lxc/funtoo0/rootfs/
 +
lxc.arch                                = x86_64
 +
#lxc.console                            = /var/log/lxc/funtoo0.console  # uncomment if you want to log containers console
 +
lxc.tty                                = 6  # if you plan to use container with physical terminals (eg F1..F6)
 +
#lxc.tty                                = 0  # set to 0 if you don't plan to use the container with a physical terminal; also comment out the c1 to c6 respawn lines in the container's /etc/inittab (e.g. c1:12345:respawn:/sbin/agetty 38400 tty1 linux)
 +
lxc.pts                                = 1024
 +
 
 +
 
 +
## Capabilities
 +
lxc.cap.drop                            = audit_control
 +
lxc.cap.drop                            = audit_write
 +
lxc.cap.drop                            = mac_admin
 +
lxc.cap.drop                            = mac_override
 +
lxc.cap.drop                            = mknod
 +
lxc.cap.drop                            = setfcap
 +
lxc.cap.drop                            = setpcap
 +
lxc.cap.drop                            = sys_admin
 +
#lxc.cap.drop                            = sys_boot # capability to reboot the container
 +
#lxc.cap.drop                            = sys_chroot # required by SSH
 +
lxc.cap.drop                            = sys_module
 +
#lxc.cap.drop                            = sys_nice
 +
lxc.cap.drop                            = sys_pacct
 +
lxc.cap.drop                            = sys_rawio
 +
lxc.cap.drop                            = sys_resource
 +
lxc.cap.drop                            = sys_time
 +
#lxc.cap.drop                            = sys_tty_config # required by getty
 +
 
 +
## Devices
 +
#lxc.cgroup.devices.allow              = a # Allow access to all devices
 +
lxc.cgroup.devices.deny                = a # Deny access to all devices
 +
 
 +
# Allow mknod for all devices (but not using them)
 +
lxc.cgroup.devices.allow                = c *:* m
 +
lxc.cgroup.devices.allow                = b *:* m
 +
 
 +
lxc.cgroup.devices.allow                = c 1:3 rwm # /dev/null
 +
lxc.cgroup.devices.allow                = c 1:5 rwm # /dev/zero
 +
lxc.cgroup.devices.allow                = c 1:7 rwm # /dev/full
 +
lxc.cgroup.devices.allow                = c 1:8 rwm # /dev/random
 +
lxc.cgroup.devices.allow                = c 1:9 rwm # /dev/urandom
 +
#lxc.cgroup.devices.allow                = c 4:0 rwm # /dev/tty0 ttys not required if you have lxc.tty = 0
 +
#lxc.cgroup.devices.allow                = c 4:1 rwm # /dev/tty1 devices with major number 4 are "real" tty devices
 +
#lxc.cgroup.devices.allow                = c 4:2 rwm # /dev/tty2
 +
#lxc.cgroup.devices.allow                = c 4:3 rwm # /dev/tty3
 +
lxc.cgroup.devices.allow                = c 5:0 rwm # /dev/tty
 +
lxc.cgroup.devices.allow                = c 5:1 rwm # /dev/console
 +
lxc.cgroup.devices.allow                = c 5:2 rwm # /dev/ptmx
 +
lxc.cgroup.devices.allow                = c 10:229 rwm # /dev/fuse
 +
lxc.cgroup.devices.allow                = c 136:* rwm # /dev/pts/* devices with major number 136 are pts
 +
lxc.cgroup.devices.allow                = c 254:0 rwm # /dev/rtc0
 +
 
 +
## Limits
 +
lxc.cgroup.cpu.shares                  = 1024
 +
lxc.cgroup.cpuset.cpus                = 0        # limits container to CPU0
 +
lxc.cgroup.memory.limit_in_bytes      = 512M
 +
lxc.cgroup.memory.memsw.limit_in_bytes = 1G
 +
#lxc.cgroup.blkio.weight                = 500      # requires cfq block scheduler
 +
 
 +
## Filesystem
 +
# The container's fstab should be outside its rootfs dir (e.g. /lxc/funtoo0/fstab is ok, but /lxc/funtoo0/rootfs/etc/fstab is wrong!!!)
 +
#lxc.mount                              = /lxc/funtoo0/fstab     
 +
 
 +
# lxc.mount.entry is preferred, because it supports relative paths
 +
lxc.mount.entry                        = proc proc proc nosuid,nodev,noexec  0 0
 +
lxc.mount.entry                        = sysfs sys sysfs nosuid,nodev,noexec,ro 0 0
 +
lxc.mount.entry                        = devpts dev/pts devpts nosuid,noexec,mode=0620,ptmxmode=000,newinstance 0 0
 +
lxc.mount.entry                        = tmpfs dev/shm tmpfs nosuid,nodev,mode=1777 0 0
 +
lxc.mount.entry                        = tmpfs run tmpfs nosuid,nodev,noexec,mode=0755,size=128m 0 0
 +
lxc.mount.entry                        = tmpfs tmp tmpfs nosuid,nodev,noexec,mode=1777,size=1g 0 0
 +
 
 +
##Example of having /var/tmp/portage as tmpfs in container
 +
#lxc.mount.entry                        = tmpfs var/tmp/portage tmpfs defaults,size=8g,uid=250,gid=250,mode=0775 0 0
 +
##Example of bind mount
 +
#lxc.mount.entry                        = /srv/funtoo0 /lxc/funtoo0/rootfs/srv/funtoo0 none defaults,bind 0 0
 +
 
 +
## Network
 +
lxc.network.type                        = veth
 +
lxc.network.flags                      = up
 +
lxc.network.hwaddr                      = #put your MAC address here, otherwise you will get a random one
 +
lxc.network.link                        = brwan
 +
lxc.network.name                        = eth0
 +
#lxc.network.veth.pair                  = veth-example
 
</pre>
 
</pre>
  
USE flags define what functionality is enabled when packages are built. It is not recommended to add a lot of them during installation; you should wait until you have a working, bootable system before changing your USE flags. A USE flag prefixed with a minus ("<code>-</code>") sign tells Portage not to use the flag when compiling.  A Funtoo guide to USE flags will be available in the future. For now, you can find out more information about USE flags in the [http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?part=2&chap=2 Gentoo Handbook].
+
Read {{c|man 7 capabilities}} for more information about Linux capabilities.
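When a container misbehaves, it helps to see which capabilities its config actually drops, ignoring the commented-out lines. This is a minimal sketch, assuming the {{c|lxc.cap.drop = name}} line format used in the config above. The function name {{c|dropped_caps}} is hypothetical.

```shell
# dropped_caps CONFIG_FILE   (hypothetical helper)
# Print one capability name per active "lxc.cap.drop = ..." line.
# Lines that start with "#" do not match and are therefore skipped;
# trailing comments after the capability name are discarded.
dropped_caps() {
    sed -n 's/^[[:space:]]*lxc\.cap\.drop[[:space:]]*=[[:space:]]*\([a-z_]*\).*/\1/p' "$1"
}
```

For example, {{c|dropped_caps /lxc/funtoo0/config | grep -x sys_boot}} prints {{c|sys_boot}} only if that capability is dropped, which matters if you want the container to be able to reboot itself.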
  
LINGUAS tells Portage which local language to compile the system and applications in (those who use LINGUAS variable like OpenOffice). It is not usually necessary to set this if you use English. If you want another language such as French (fr) or German (de), set LINGUAS appropriately:
+
Use the following command to generate a random MAC address for {{c|lxc.network.hwaddr}} in the config above:
 +
 
 +
{{console|body=
 +
###i## openssl rand -hex 6 | sed 's/\(..\)/\1:/g; s/.$//'
 +
}}
 +
 
 +
It is a very good idea to assign a static MAC address to your container using {{c|lxc.network.hwaddr}}. If you don't, LXC will auto-generate a new random MAC every time your container starts, which may confuse network equipment that expects MAC addresses to remain constant.
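A fully random first octet has an even chance of having the multicast bit (the lowest bit) set, and the kernel does not accept a multicast address as an interface address. One way to avoid this is to fix the first octet to {{c|02}}, which marks the address as locally administered and unicast. The following is a minimal sketch that reads from {{c|/dev/urandom}} instead of calling {{c|openssl}}. The function name {{c|random_mac}} is hypothetical.

```shell
# random_mac   (hypothetical helper)
# Print a random MAC address of the form 02:xx:xx:xx:xx:xx.
# The fixed first octet 02 sets the "locally administered" bit and
# leaves the multicast bit clear, so the address is valid for a NIC.
random_mac() {
    od -An -N5 -tx1 /dev/urandom |
        awk '{ printf "02"; for (i = 1; i <= NF; i++) printf ":%s", $i; print "" }'
}
```

Run {{c|random_mac}} once and paste the result into {{c|lxc.network.hwaddr}}, so that the address stays the same across container restarts.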
 +
 
 +
Occasionally a container fails to start with a MAC address generated as above. If you run into that problem, here is a small script that derives the MAC address from the container's IP address. Save the following code as {{c|/etc/lxc/hwaddr.sh}}, make it executable, and run it as {{c|/etc/lxc/hwaddr.sh xxx.xxx.xxx.xxx}}, where xxx.xxx.xxx.xxx is your container's IP address. <br>{{c|/etc/lxc/hwaddr.sh}}:
  
 
<pre>
 
<pre>
LINGUAS="fr"
+
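#!/bin/bash
# Derive a locally administered MAC address from an IPv4 address.
# Requires bash: the ${IP//./ } substitution is not POSIX sh.
IP=$1
HA=$(printf "02:00:%02x:%02x:%02x:%02x" ${IP//./ })
echo "$HA"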
 
</pre>
 
</pre>
  
==== /etc/conf.d/hwclock ====
+
==== {{c|/lxc/funtoo0/fstab}} ====
If you dual-boot with Windows, you'll need to edit this file and change the value of '''clock''' from '''UTC''' to '''local''', because Windows will set your hardware clock to local time every time you boot Windows. Otherwise you normally wouldn't need to edit this file.
+
{{fancynote| It is now preferable to have mount entries directly in config file instead of separate fstab:}}
<console>
+
Edit the file {{c|/lxc/funtoo0/fstab}}:
(chroot) # ##i##nano -w /etc/conf.d/hwclock
+
<pre>
</console>
+
none /lxc/funtoo0/dev/pts devpts defaults 0 0
 +
none /lxc/funtoo0/proc proc defaults 0 0
 +
none /lxc/funtoo0/sys sysfs defaults 0 0
 +
none /lxc/funtoo0/dev/shm tmpfs nodev,nosuid,noexec,mode=1777,rw 0 0
 +
</pre>
 +
 
 +
== LXC Networking ==
 +
*veth - Virtual Ethernet (bridge)
 +
*vlan - vlan interface (requires device able to do vlan tagging)
 +
*macvlan (mac-address based virtual lan tagging) has 3 modes:
 +
**private
 +
**vepa (Virtual Ethernet Port Aggregator)
 +
**bridge
 +
*phys - dedicated host NIC
 +
[https://blog.flameeyes.eu/2010/09/linux-containers-and-networking Linux Containers and Networking]
 +
 
 +
Enable routing on the host:
 +
By default Linux workstations and servers have IPv4 forwarding disabled.
 +
{{console|body=
 +
###i## echo "1" > /proc/sys/net/ipv4/ip_forward
 +
###i## cat /proc/sys/net/ipv4/ip_forward
 +
# 1
 +
}}
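The {{c|echo}} above only lasts until the next reboot. The sketch below checks the current state so that a script can decide whether to act. It assumes the standard {{c|/proc/sys/net/ipv4/ip_forward}} file, and the function name {{c|ip_forward_enabled}} is hypothetical.

```shell
# ip_forward_enabled [FILE]   (hypothetical helper)
# Succeed (exit status 0) if IPv4 forwarding is switched on.
# FILE defaults to /proc/sys/net/ipv4/ip_forward, which contains
# "1" when forwarding is enabled and "0" when it is disabled.
ip_forward_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}
```

For example, {{c|ip_forward_enabled || sysctl -w net.ipv4.ip_forward=1}} enables forwarding only when needed. To keep the setting across reboots, add the line {{c|net.ipv4.ip_forward = 1}} to {{c|/etc/sysctl.conf}}.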
 +
 
 +
== Initializing and Starting the Container ==
 +
 
 +
You will probably need to set the root password for the container before you can log in. You can use chroot to do this quickly:
 +
 
 +
{{console|body=
 +
###i## chroot /lxc/funtoo0/rootfs
 +
(chroot) ###i## passwd
 +
New password: XXXXXXXX
 +
Retype new password: XXXXXXXX
 +
passwd: password updated successfully
 +
(chroot) ###i## exit
 +
}}
 +
 
 +
Now that the root password is set, run:
 +
 
 +
{{console|body=
 +
###i## lxc-start -n funtoo0 -d
 +
}}
 +
 
 +
The {{c|-d}} option will cause it to run in the background.
 +
 
 +
To attach to the console:
 +
 
 +
{{console|body=
 +
###i## lxc-console -n funtoo0
 +
}}
 +
 
 +
You should now be able to log in and use the container. In addition, the container should now be accessible on the network.
 +
 
 +
To directly attach to container:
 +
 
 +
{{console|body=
 +
###i## lxc-attach -n funtoo0
 +
}}
 +
 
 +
To stop the container:
 +
 
 +
{{console|body=
 +
###i## lxc-stop -n funtoo0
 +
}}
 +
 
 +
Ensure that networking is working from within the container while it is running, and you're good to go!
 +
 
 +
== Starting LXC container during host boot ==
 +
 
 +
# Create a symlink to {{c|/etc/init.d/lxc}} in {{c|/etc/init.d/}}, named after your container:
#: {{c|ln -s /etc/init.d/lxc /etc/init.d/lxc.funtoo0}}
# Add {{c|lxc.funtoo0}} to the default runlevel:
#: {{c|rc-update add lxc.funtoo0 default}}
 +
{{console|body=
 +
###i## rc
 +
* Starting funtoo0 ...                  [ ok ]
 +
}}
 +
 
 +
== LXC Bugs/Missing Features ==
 +
 
 +
This section is devoted to documenting issues with the current implementation of LXC and its associated tools. We will be gradually expanding this section with detailed descriptions of problems, their status, and proposed solutions.
 +
 
 +
=== reboot ===
 +
 
 +
* By default, lxc does not support rebooting a container from within. The container will simply stop, and the host will not know to start it again.
* If you want your container to reboot gracefully, it needs the sys_boot capability (comment out lxc.cap.drop = sys_boot in your container config).
 +
 
 +
=== PID namespaces ===
 +
 
 +
Process ID namespaces are functional, but the container can still see the CPU utilization of the host via the system load (e.g. in {{c|top}}).
 +
 
 +
=== /dev/pts newinstance ===
 +
 
 +
* Some changes may be required to the host to properly implement "newinstance" {{c|/dev/pts}}. See [https://bugzilla.redhat.com/show_bug.cgi?id=501718 This Red Hat bug].
 +
 
 +
=== lxc-create and lxc-destroy ===
 +
 
 +
* LXC's shell scripts are badly designed and a sure way to destruction; avoid using lxc-create and lxc-destroy.
 +
 
 +
=== network initialization and cleanup ===
 +
 
 +
* If {{c|lxc.network.type = phys}} is used, the interface is renamed to the value of {{c|lxc.network.link}} after lxc-stop. This was supposed to be fixed in 0.7.4, but still happens in 0.7.5 - http://www.mail-archive.com/lxc-users@lists.sourceforge.net/msg01760.html
 +
 
 +
* Restarting a container can fail because network resources are still tied up by the already-defunct instance: [http://www.mail-archive.com/lxc-devel@lists.sourceforge.net/msg00824.html]
 +
 
 +
=== graceful shutdown ===
 +
 
 +
* To gracefully shut down a container, its init system needs to handle the SIGPWR signal ({{c|kill -PWR}}) properly
 +
* For funtoo/gentoo make sure that you have:
 +
** pf:12345:powerwait:/sbin/halt
 +
** in your container's /etc/inittab
 +
* For debian/ubuntu make sure that you have:
 +
** pf::powerwait:/sbin/shutdown -t1 -a -h now
 +
** in your container's /etc/inittab
 +
** and also comment out any other lines starting with pf: (such as pf::powerwait:/etc/init.d/powerfail start); these are used if you have a UPS monitoring daemon installed
 +
* /etc/init.d/lxc seems to have broken support for graceful shutdown (it sends the proper signal, but then also tries to kill init with lxc-stop)
 +
 
 +
=== funtoo ===
 +
 
 +
* Our udev should be updated to contain {{c|-lxc}} in scripts. (This has been done as of 02-Nov-2011, so should be resolved. But not fixed in our openvz templates, so need to regen them in a few days.)
 +
* Our openrc should be patched to handle the case where it cannot mount tmpfs, and gracefully handle this situation somehow. (The work-around in our docs above is to mount tmpfs to {{c|/libexec/rc/init.d}} using the container-specific {{c|fstab}} file on the host.)
 +
* Emerging udev within a container can/will fail when realdev is run, if a device node cannot be created (such as /dev/console) if there are no mknod capabilities within the container. This should be fixed.
 +
 
 +
== References ==
 +
 
 +
* {{c|man 7 capabilities}}
 +
* {{c|man 5 lxc.conf}}
 +
 
 +
== Links ==
 +
 
 +
* There are a number of additional lxc features that can be enabled via patches: [http://lxc.sourceforge.net/patches/linux/3.0.0/3.0.0-lxc1/]
 +
* [https://wiki.ubuntu.com/UserNamespace Ubuntu User Namespaces page]
 +
* lxc-gentoo setup script [https://github.com/globalcitizen/lxc-gentoo on GitHub]
 +
 
 +
* '''IBM developerWorks'''
 +
** [http://www.ibm.com/developerworks/linux/library/l-lxc-containers/index.html LXC: Linux Container Tools]
 +
** [http://www.ibm.com/developerworks/linux/library/l-lxc-security/ Secure Linux Containers Cookbook]
  
==== Localization ====
+
* '''Linux Weekly News'''
 +
** [http://lwn.net/Articles/244531/ Smack for simplified access control]
  
By default, Funtoo Linux is configured with Unicode (UTF-8) enabled, and for the US English locale and keyboard. If you would like to configure your system to use a non-English locale or keyboard, see [[Funtoo Linux Localization]].
+
[[Category:Labs]]
 +
[[Category:HOWTO]]
 +
[[Category:Virtualization]]

Revision as of 21:59, January 29, 2015

Linux Containers, or LXC, is a Linux feature that allows Linux to run one or more isolated virtual systems (with their own network interfaces, process namespace, user namespace, and power state) using a single Linux kernel on a single server.

Status

As of Linux kernel 3.1.5, LXC is usable for isolating your own private workloads from one another. It is not yet ready to isolate potentially malicious users from one another or the host system. For a more mature containers solution that is appropriate for hosting environments, see OpenVZ.

LXC containers don't yet have their own system uptime, and they see everything that's in the host's dmesg output, among other things. But in general, the technology works.

Basic Info

  • Linux Containers are based on:
    • Kernel namespaces for resource isolation
    • CGroups for resource limitation and accounting

Package:LXC is the userspace tool for Linux containers

Control groups

  • Control groups (cgroups) in kernel since 2.6.24
    • Allows aggregation of tasks and their children
    • Subsystems (cpuset, memory, blkio,...)
    • accounting - to measure how many resources certain systems use
    • resource limiting - groups can be set to not exceed a set memory limit
    • prioritization - some groups may get a larger share of CPU
    • control - freezing/unfreezing of cgroups, checkpointing and restarting
    • No disk quota limitation ( -> image file, LVM, XFS, directory tree quota,...)

Subsystems


# cat /proc/cgroups 
subsys_name	hierarchy	num_cgroups	enabled
cpuset	
cpu	
cpuacct	
memory	
devices	
freezer	
blkio	
perf_event
hugetlb


  1. cpuset -> limits tasks to specific CPU/CPUs
  2. cpu -> CPU shares
  3. cpuacct -> CPU accounting
  4. memory -> memory and swap limitation and accounting
  5. devices -> device allow/deny list
  6. freezer -> suspend/resume tasks
  7. blkio -> I/O prioritization (weight, throttle, ...)
  8. perf_event -> support for per-cpu per-cgroup monitoring perf_events
  9. hugetlb -> cgroup resource controller for HugeTLB pages hugetlb

Configuring the Funtoo Host System

Install LXC kernel

Any kernel newer than 3.1.5 will probably work. Personally I prefer sys-kernel/gentoo-sources, as these kernels support all the namespaces without sacrificing XFS, FUSE or NFS support, for example. The dependency checks listed below were only introduced in kernel 3.5, so on kernels without them the user namespace may not work optimally.

  • User namespace (EXPERIMENTAL) depends on EXPERIMENTAL and on UIDGID_CONVERTED
    • config UIDGID_CONVERTED
      • True if all of the selected software components are known to have uid_t and gid_t converted to kuid_t and kgid_t where appropriate and are otherwise safe to use with the user namespace.
        • Networking - depends on NET_9P = n
        • Filesystems - 9P_FS = n, AFS_FS = n, AUTOFS4_FS = n, CEPH_FS = n, CIFS = n, CODA_FS = n, FUSE_FS = n, GFS2_FS = n, NCP_FS = n, NFSD = n, NFS_FS = n, OCFS2_FS = n, XFS_FS = n
        • Security options - Grsecurity - GRKERNSEC = n (if applicable)
    • As of kernel 3.10.xx, all of the above options are safe to use with user namespaces except XFS_FS. With kernel >=3.10.xx, you should therefore answer XFS_FS = n if you want user namespace support.
    • In your kernel source directory, check init/Kconfig to find out what UIDGID_CONVERTED depends on.

Kernel configuration

These options should be enabled in your kernel to be able to take full advantage of LXC.

  • General setup
    • CONFIG_NAMESPACES
      • CONFIG_UTS_NS
      • CONFIG_IPC_NS
      • CONFIG_PID_NS
      • CONFIG_NET_NS
      • CONFIG_USER_NS
    • CONFIG_CGROUPS
      • CONFIG_CGROUP_DEVICE
      • CONFIG_CGROUP_SCHED
      • CONFIG_CGROUP_CPUACCT
      • CONFIG_CGROUP_MEM_RES_CTLR (in 3.6+ kernels it's called CONFIG_MEMCG)
      • CONFIG_CGROUP_MEM_RES_CTLR_SWAP (in 3.6+ kernels it's called CONFIG_MEMCG_SWAP)
      • CONFIG_CPUSETS (on multiprocessor hosts)
  • Networking support
    • Networking options
      • CONFIG_VLAN_8021Q
  • Device Drivers
    • Character devices
      • Unix98 PTY support
        • CONFIG_DEVPTS_MULTIPLE_INSTANCES
    • Network device support
      • Network core driver support
        • CONFIG_VETH
        • CONFIG_MACVLAN

Once you have lxc installed, you can then check your kernel config with:

# CONFIG=/path/to/config /usr/sbin/lxc-checkconfig
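
Before lxc is installed, a rough manual check of the same options can be done with grep against a kernel config file. This is only a sketch: the function name missing_kernel_opts is invented for the example, and it treats both =y and =m as "enabled".

```shell
#!/bin/sh
# missing_kernel_opts CONFIG_FILE OPTION...
# Print every OPTION that is not set to y or m in CONFIG_FILE.
missing_kernel_opts() {
    cfg=$1
    shift
    for opt in "$@"; do
        # A disabled option appears as "# CONFIG_FOO is not set",
        # which does not match this anchored pattern.
        grep -Eq "^${opt}=(y|m)$" "$cfg" || echo "$opt"
    done
}
```

For example, missing_kernel_opts /usr/src/linux/.config CONFIG_NAMESPACES CONFIG_NET_NS CONFIG_VETH prints nothing if all three are enabled (the path assumes your kernel sources live in /usr/src/linux).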


Emerge lxc

# emerge app-emulation/lxc


Configure Networking For Container

Typically, one uses a bridge to allow containers to connect to the network. This is how to do it under Funtoo Linux:

  1. Create a bridge using the Funtoo network configuration scripts. Name the bridge something like brwan (using /etc/init.d/netif.brwan). Configure your bridge to have an IP address.
  2. Make your physical interface, such as eth0, an interface with no IP address (use the Funtoo interface-noip template.)
  3. Make netif.eth0 a slave of netif.brwan in /etc/conf.d/netif.brwan.
  4. Enable your new bridged network and make sure it is functioning properly on the host.

You will now be able to configure LXC to automatically add your container's virtual ethernet interface to the bridge when it starts, which will connect it to your network.
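
As an illustration of steps 1 to 3, a bridge configuration might look roughly like the following. The addresses are placeholders from the documentation range and are assumptions, as is the exact set of variables; check the Funtoo networking documentation for the templates available on your system.

```shell
# /etc/conf.d/netif.brwan (placeholder addresses, adjust to your network)
template="bridge"
ipaddr="192.0.2.10/24"
gateway="192.0.2.1"
nameservers="192.0.2.1"
slaves="netif.eth0"

# /etc/conf.d/netif.eth0 would contain only this line:
#   template="interface-noip"
```

Both netif.brwan and netif.eth0 are expected to exist as init scripts in /etc/init.d before the bridge is started.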

Setting up a Funtoo Linux LXC Container

Here are the steps required to get Funtoo Linux running inside a container. The steps below show you how to set up a container using an existing Funtoo Linux OpenVZ template. It is now also possible to use Metro to build an lxc container tarball directly, which will save you manual configuration steps and will provide an /etc/fstab.lxc file that you can use for your host container config. See Metro Recipes for info on how to use Metro to generate an lxc container.

Create and Configure Container Filesystem

  1. Start with a Funtoo LXC template, and unpack it to a directory such as /lxc/funtoo0/rootfs/
  2. Create an empty /lxc/funtoo0/fstab file
  3. Ensure c1 line is uncommented (enabled) and c2 through c6 lines are disabled in /lxc/funtoo0/rootfs/etc/inittab

That's almost all you need to get the container filesystem ready to start.
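
Step 3 above can be scripted. The sketch below (the function name fix_container_gettys is hypothetical) uncomments a c1 line if it is commented out and comments out any active c2 through c6 lines; it assumes the inittab uses the usual cN: line prefixes.

```shell
#!/bin/sh
# fix_container_gettys INITTAB
# Enable the c1 getty line and disable c2..c6 in the given inittab.
fix_container_gettys() {
    _out=$(mktemp) || return 1
    # First expression: strip a leading "#" from a commented-out c1 line.
    # Second expression: prefix active c2..c6 lines with "#".
    sed -E -e 's/^#[[:space:]]*(c1:)/\1/' \
           -e 's/^(c[2-6]:)/#\1/' "$1" > "$_out" &&
        cat "$_out" > "$1"
    rm -f "$_out"
}
```

It would be run on the host as fix_container_gettys /lxc/funtoo0/rootfs/etc/inittab.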

Create Container Configuration Files

Create the following files:

/lxc/funtoo0/config

and also create a symlink from

/lxc/funtoo0/config to /etc/lxc/funtoo0/config

# install -d /etc/lxc/funtoo0
# ln -s /lxc/funtoo0/config /etc/lxc/funtoo0/config


Note
Daniel Robbins needs to update this config to be more in line with http://wiki.progress-linux.org/software/lxc/ -- this config appears to have nice, refined device node permissions and other goodies. // note by Havis to Daniel, this config is already superior.


Read "man 5 lxc.conf" to get more information about the Linux container configuration file.

## Container
lxc.utsname                             = funtoo0
lxc.rootfs                              = /lxc/funtoo0/rootfs/
lxc.arch                                = x86_64
#lxc.console                            = /var/log/lxc/funtoo0.console  # uncomment if you want to log the container's console
lxc.tty                                 = 6  # if you plan to use the container with physical terminals (e.g. F1..F6)
#lxc.tty                                = 0  # set to 0 if you don't plan to use the container with a physical terminal; also comment out the c1 to c6 respawn lines in your container's /etc/inittab (e.g. c1:12345:respawn:/sbin/agetty 38400 tty1 linux)
lxc.pts                                 = 1024


## Capabilities
lxc.cap.drop                            = audit_control
lxc.cap.drop                            = audit_write
lxc.cap.drop                            = mac_admin
lxc.cap.drop                            = mac_override
lxc.cap.drop                            = mknod
lxc.cap.drop                            = setfcap
lxc.cap.drop                            = setpcap
lxc.cap.drop                            = sys_admin
#lxc.cap.drop                            = sys_boot # capability to reboot the container
#lxc.cap.drop                            = sys_chroot # required by SSH
lxc.cap.drop                            = sys_module
#lxc.cap.drop                            = sys_nice
lxc.cap.drop                            = sys_pacct
lxc.cap.drop                            = sys_rawio
lxc.cap.drop                            = sys_resource
lxc.cap.drop                            = sys_time
#lxc.cap.drop                            = sys_tty_config # required by getty

## Devices
#lxc.cgroup.devices.allow               = a # Allow access to all devices
lxc.cgroup.devices.deny                 = a # Deny access to all devices

# Allow to mknod all devices (but not using them)
lxc.cgroup.devices.allow                = c *:* m
lxc.cgroup.devices.allow                = b *:* m

lxc.cgroup.devices.allow                = c 1:3 rwm # /dev/null
lxc.cgroup.devices.allow                = c 1:5 rwm # /dev/zero
lxc.cgroup.devices.allow                = c 1:7 rwm # /dev/full
lxc.cgroup.devices.allow                = c 1:8 rwm # /dev/random
lxc.cgroup.devices.allow                = c 1:9 rwm # /dev/urandom
#lxc.cgroup.devices.allow                = c 4:0 rwm # /dev/tty0 ttys not required if you have lxc.tty = 0
#lxc.cgroup.devices.allow                = c 4:1 rwm # /dev/tty1 devices with major number 4 are "real" tty devices
#lxc.cgroup.devices.allow                = c 4:2 rwm # /dev/tty2
#lxc.cgroup.devices.allow                = c 4:3 rwm # /dev/tty3
lxc.cgroup.devices.allow                = c 5:0 rwm # /dev/tty
lxc.cgroup.devices.allow                = c 5:1 rwm # /dev/console
lxc.cgroup.devices.allow                = c 5:2 rwm # /dev/ptmx
lxc.cgroup.devices.allow                = c 10:229 rwm # /dev/fuse
lxc.cgroup.devices.allow                = c 136:* rwm # /dev/pts/* devices with major number 136 are pts
lxc.cgroup.devices.allow                = c 254:0 rwm # /dev/rtc0

## Limits
lxc.cgroup.cpu.shares                  = 1024
lxc.cgroup.cpuset.cpus                 = 0        # limits container to CPU0
lxc.cgroup.memory.limit_in_bytes       = 512M
lxc.cgroup.memory.memsw.limit_in_bytes = 1G
#lxc.cgroup.blkio.weight                = 500      # requires cfq block scheduler

## Filesystem
#the container's fstab should be outside its rootfs dir (e.g. /lxc/funtoo0/fstab is ok, but /lxc/funtoo0/rootfs/etc/fstab is wrong!!!)
#lxc.mount                               = /lxc/funtoo0/fstab

#lxc.mount.entry is preferred, because it supports relative paths
lxc.mount.entry                         = proc proc proc nosuid,nodev,noexec  0 0
lxc.mount.entry                         = sysfs sys sysfs nosuid,nodev,noexec,ro 0 0
lxc.mount.entry                         = devpts dev/pts devpts nosuid,noexec,mode=0620,ptmxmode=000,newinstance 0 0
lxc.mount.entry                         = tmpfs dev/shm tmpfs nosuid,nodev,mode=1777 0 0
lxc.mount.entry                         = tmpfs run tmpfs nosuid,nodev,noexec,mode=0755,size=128m 0 0
lxc.mount.entry                         = tmpfs tmp tmpfs nosuid,nodev,noexec,mode=1777,size=1g 0 0

##Example of having /var/tmp/portage as tmpfs in container 
#lxc.mount.entry                         = tmpfs var/tmp/portage tmpfs defaults,size=8g,uid=250,gid=250,mode=0775 0 0
##Example of bind mount
#lxc.mount.entry                        = /srv/funtoo0 /lxc/funtoo0/rootfs/srv/funtoo0 none defaults,bind 0 0

## Network
lxc.network.type                        = veth
lxc.network.flags                       = up
lxc.network.hwaddr                      = #put your MAC address here, otherwise you will get a random one
# name of the host bridge; use brwan if you named the bridge as suggested earlier
lxc.network.link                        = br0
lxc.network.name                        = eth0
#lxc.network.veth.pair                   = veth-example

Read "man 7 capabilities" to get more information about Linux capabilities.

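For lxc.network.hwaddr in the config above, use the following command to generate a random MAC address. The fixed 02 prefix marks the address as locally administered and unicast, and the sed expression inserts the colon separators:

# openssl rand -hex 5 | sed 's/\(..\)/:\1/g; s/^/02/'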


It is a very good idea to assign a static MAC address to your container using lxc.network.hwaddr. If you don't, LXC will auto-generate a new random MAC every time your container starts, which may confuse network equipment that expects MAC addresses to remain constant.
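
If openssl is not available, a static MAC address can also be generated once from /dev/urandom. The sketch below is one possible approach, not an LXC tool; the function name random_mac is made up. It fixes the first octet to 02 so that the address is unicast and locally administered, since a random first octet may have the multicast bit set.

```shell
#!/bin/sh
# Print a random, locally administered, unicast MAC address.
random_mac() {
    # od prints 5 random bytes as hex pairs separated by spaces;
    # the unquoted expansion splits them into $1..$5.
    set -- $(od -An -N5 -tx1 /dev/urandom)
    printf '02:%s:%s:%s:%s:%s\n' "$@"
}
```

Run it once, paste the result into lxc.network.hwaddr, and keep that value for the life of the container.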

In some cases you may not be able to start your LXC container with the MAC address generated above. If you run into that problem, here is a small script that derives a MAC address from the container's IP address. Save the following code as /etc/lxc/hwaddr.sh, make it executable, and run it as /etc/lxc/hwaddr.sh xxx.xxx.xxx.xxx, where xxx.xxx.xxx.xxx is your container's IP address.
/etc/lxc/hwaddr.sh:

#!/bin/bash
# bash is required for the ${IP//./ } substitution below
IP=$1
# each IP octet becomes one zero-padded hex byte of the MAC address
HA=$(printf "02:00:%02x:%02x:%02x:%02x" ${IP//./ })
echo "$HA"

/lxc/funtoo0/fstab

Note
It is now preferable to have mount entries directly in the config file instead of a separate fstab.

If you do use a separate fstab, edit the file /lxc/funtoo0/fstab. The mount points must be located inside the container's rootfs:

none /lxc/funtoo0/rootfs/dev/pts devpts defaults 0 0
none /lxc/funtoo0/rootfs/proc proc defaults 0 0
none /lxc/funtoo0/rootfs/sys sysfs defaults 0 0
none /lxc/funtoo0/rootfs/dev/shm tmpfs nodev,nosuid,noexec,mode=1777,rw 0 0

LXC Networking

  • veth - Virtual Ethernet (bridge)
  • vlan - vlan interface (requires device able to do vlan tagging)
  • macvlan (mac-address based virtual lan tagging) has 3 modes:
    • private
    • vepa (Virtual Ethernet Port Aggregator)
    • bridge
  • phys - dedicated host NIC

Linux Containers and Networking

Enable routing on the host: By default Linux workstations and servers have IPv4 forwarding disabled.

# echo "1" > /proc/sys/net/ipv4/ip_forward
# cat /proc/sys/net/ipv4/ip_forward
1
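
Writing to /proc only lasts until the next reboot. To make the setting persistent, the same key can be placed in a sysctl configuration file. The helper below is a sketch (the name persist_ip_forward is invented); it assumes your system applies /etc/sysctl.conf at boot, which you should verify for your init setup.

```shell
#!/bin/sh
# persist_ip_forward [FILE]
# Ensure FILE (default /etc/sysctl.conf) enables IPv4 forwarding exactly once.
persist_ip_forward() {
    conf=${1:-/etc/sysctl.conf}
    touch "$conf"
    # Keep every line except an existing ip_forward setting ...
    rest=$(grep -Ev '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=' "$conf")
    # ... then rewrite the file with the desired setting appended.
    {
        [ -n "$rest" ] && printf '%s\n' "$rest"
        echo 'net.ipv4.ip_forward = 1'
    } > "$conf"
}
```

After running persist_ip_forward as root, sysctl -p reloads the file without a reboot.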


Initializing and Starting the Container

You will probably need to set the root password for the container before you can log in. You can use chroot to do this quickly:

# chroot /lxc/funtoo0/rootfs
(chroot) # passwd
New password: XXXXXXXX
Retype new password: XXXXXXXX
passwd: password updated successfully
(chroot) # exit


Now that the root password is set, run:

# lxc-start -n funtoo0 -d


The -d option will cause it to run in the background.

To attach to the console:

# lxc-console -n funtoo0


You should now be able to log in and use the container. In addition, the container should now be accessible on the network.

To directly attach to container:

# lxc-attach -n funtoo0


To stop the container:

# lxc-stop -n funtoo0


Ensure that networking is working from within the container while it is running, and you're good to go!

Starting LXC container during host boot

  1. Create a symlink to /etc/init.d/lxc in /etc/init.d/, named after your container:
     # ln -s /etc/init.d/lxc /etc/init.d/lxc.funtoo0
  2. Add lxc.funtoo0 to the default runlevel:
     # rc-update add lxc.funtoo0 default
  3. Run rc to start the newly added service:
     # rc
      * Starting funtoo0 ...                  [ ok ]

LXC Bugs/Missing Features

This section is devoted to documenting issues with the current implementation of LXC and its associated tools. We will be gradually expanding this section with detailed descriptions of problems, their status, and proposed solutions.

reboot

  • By default, lxc does not support rebooting a container from within. It will simply stop and the host will not know to start it.
  • If you want your container to reboot gracefully, you need sys_boot capability (comment out lxc.cap.drop = sys_boot in your container config)

PID namespaces

Process ID namespaces are functional, but the container can still see the CPU utilization of the host via the system load (i.e. in top).

/dev/pts newinstance

  • Some changes may be required to the host to properly implement "newinstance" /dev/pts. See This Red Hat bug.

lxc-create and lxc-destroy

  • LXC's shell scripts are badly designed and can be destructive; avoid using lxc-create and lxc-destroy.

network initialization and cleanup

  • Restarting a container can fail because network resources are still tied up by the already-defunct instance: [1]

graceful shutdown

  • To shut down a container gracefully, its init system needs to handle the SIGPWR signal (kill -PWR) properly.
  • For Funtoo/Gentoo, make sure that your container's /etc/inittab contains:
    • pf:12345:powerwait:/sbin/halt
  • For Debian/Ubuntu, make sure that your container's /etc/inittab contains:
    • pf::powerwait:/sbin/shutdown -t1 -a -h now
    • Also comment out any other pf line (such as pf::powerwait:/etc/init.d/powerfail start); these are used when a UPS monitoring daemon is installed.
  • /etc/init.d/lxc seems to have broken support for graceful shutdown (it sends the proper signal, but then also tries to kill init with lxc-stop).
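
Whether a container's inittab already contains an active powerwait entry like the ones above can be checked from the host. This is a small sketch with a made-up function name, has_powerwait; it only looks for an uncommented pf line with the powerwait action and does not validate the command that follows.

```shell
#!/bin/sh
# has_powerwait INITTAB
# Succeed if INITTAB has an uncommented "pf:<runlevels>:powerwait:" entry.
has_powerwait() {
    grep -Eq '^pf:[0-9]*:powerwait:' "$1"
}
```

For example: has_powerwait /lxc/funtoo0/rootfs/etc/inittab || echo "no powerwait entry".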

funtoo

  • Our udev should be updated to contain -lxc in scripts. (This has been done as of 02-Nov-2011, so should be resolved. But not fixed in our openvz templates, so need to regen them in a few days.)
  • Our openrc should be patched to gracefully handle the case where it cannot mount tmpfs. (The work-around in our docs above is to mount tmpfs to /libexec/rc/init.d using the container-specific fstab file on the host.)
  • Emerging udev within a container can fail when realdev is run if a device node (such as /dev/console) cannot be created because the container has no mknod capability. This should be fixed.

References

  • man 7 capabilities
  • man 5 lxc.conf

Links