Difference between revisions of "Linux Containers"
Revision as of 10:42, 14 August 2012
Linux Containers, or LXC, is a Linux feature that allows Linux to run one or more isolated virtual systems (with their own network interfaces, process namespace, user namespace, and power state) using a single Linux kernel on a single server.
Status
As of Linux kernel 3.1.5, LXC is usable for isolating your own private workloads from one another. It is not yet ready to isolate potentially malicious users from one another or the host system. For a more mature containers solution that is appropriate for hosting environments, see OpenVZ.
LXC containers don't yet have their own system uptime, and they see everything that's in the host's dmesg output, among other things. But in general, the technology works.
Configuring the Funtoo Host System
Install LXC kernel
I am using vanilla-sources-3.1.5 with no initrd.
Emerge lxc
Configure Networking For Container
Typically, one uses a bridge to allow containers to connect to the network. This is how to do it under Funtoo Linux:
- Create a bridge using the Funtoo network configuration scripts. Name the bridge something like brwan (using /etc/init.d/netif.brwan). Configure your bridge to have an IP address.
- Make your physical interface, such as eth0, an interface with no IP address (use the Funtoo interface-noip template.)
- Make netif.eth0 a slave of netif.brwan in /etc/conf.d/netif.brwan.
- Enable your new bridged network and make sure it is functioning properly on the host.
You will now be able to configure LXC to automatically add your container's virtual ethernet interface to the bridge when it starts, which will connect it to your network.
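The bridge setup above could look roughly like the following. This is a sketch only: the addresses are placeholder assumptions, and the variable names follow the Funtoo network templates as commonly documented, so check them against your installed version. Both files use shell variable syntax.

```shell
# /etc/conf.d/netif.brwan : the bridge, which carries the host's IP address
template="bridge"
ipaddr="10.0.1.200/24"     # assumption: replace with your host's address
gateway="10.0.1.1"         # assumption: replace with your gateway
nameservers="10.0.1.1"     # assumption: replace with your DNS server
slaves="netif.eth0"        # the physical interface joins the bridge

# /etc/conf.d/netif.eth0 : the physical interface, with no IP address
template="interface-noip"
```

With Funtoo's scripts, each interface is normally enabled by symlinking /etc/init.d/netif.tmpl to /etc/init.d/netif.brwan (and netif.eth0) and adding netif.brwan to the default runlevel with rc-update.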
Setting up a Funtoo Linux LXC Container
Here are the steps required to get Funtoo Linux running inside a container. The steps below show you how to set up a container using an existing Funtoo Linux OpenVZ template. It is now also possible to use Metro to build an lxc container tarball directly, which will save you manual configuration steps and will provide an /etc/fstab.lxc file that you can use for your host container config. See Metro Recipes for info on how to use Metro to generate an lxc container.
Create and Configure Container Filesystem
- Start with a Funtoo OpenVZ template, and unpack it to a directory such as /lxc/funtoo.
- Edit /lxc/funtoo/etc/rc.conf and change rc_sys=openvz to rc_sys=lxc.
- Create an empty /lxc/funtoo/etc/fstab file.
- Ensure the c1 line is uncommented (enabled) and the c2 through c6 lines are commented out (disabled) in /lxc/funtoo/etc/inittab.
- Edit the udev-mount, udev-postmount and udev-save init scripts and change the keyword line to have the arguments -openvz -vserver -lxc. (This step is expected to become unnecessary; a fix is planned within about a week.)
That's all you need to get the container filesystem ready to start.
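The rc.conf, fstab and inittab steps above can be sketched as a small shell function. This is an illustration under assumptions, not an official tool: prepare_rootfs is a hypothetical helper name, it assumes GNU sed (for -i), and it assumes the inittab console lines start with c1: through c6: as they do in a stock Funtoo inittab. It does not touch the udev init scripts.

```shell
#!/bin/sh
# prepare_rootfs DIR: adapt an unpacked OpenVZ template in DIR for LXC.
# (hypothetical helper, written for illustration only)
prepare_rootfs() {
    rootfs=$1

    # Tell OpenRC it runs under LXC instead of OpenVZ.
    # Matches both rc_sys=openvz and rc_sys="openvz".
    sed -i 's/^rc_sys=.*openvz.*/rc_sys="lxc"/' "$rootfs/etc/rc.conf"

    # Start with an empty fstab inside the container.
    : > "$rootfs/etc/fstab"

    # Keep the c1 console enabled and comment out c2 through c6.
    if [ -f "$rootfs/etc/inittab" ]; then
        sed -i 's/^\(c[2-6]:\)/#\1/' "$rootfs/etc/inittab"
    fi
}
```

After unpacking the template to /lxc/funtoo, it would be called as prepare_rootfs /lxc/funtoo.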
Create Container Configuration Files
Create the following files:
/etc/lxc/funtoo/config
Daniel Robbins needs to update this config to be more in line with http://wiki.progress-linux.org/software/lxc/ -- this config appears to have nice, refined device node permissions and other goodies.
See "man 5 lxc.conf" for more information about the Linux container configuration file.
lxc.utsname = funtoo
lxc.arch = x86_64

# mount configuration
lxc.mount = /etc/lxc/funtoo/fstab
lxc.rootfs = /lxc/funtoo

# network configuration
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = brwan
lxc.network.ipv4 = <your IPv4 address here, like 1.2.3.4/29>
lxc.network.hwaddr = <your randomly-generated MAC address here, like a2:97:b6:df:df:28>
lxc.network.name = eth0

# CPU & Memory Limits
# kernel/Documentation/cgroups/cpusets.txt # cores 0,1 of your CPU
lxc.cgroup.cpuset.cpus = 0,1
lxc.cgroup.cpu.shares = 1024
# kernel/Documentation/cgroups/memory.txt
lxc.cgroup.memory.limit_in_bytes = 1024M
lxc.cgroup.memory.memsw.limit_in_bytes = 2048M

# TTY configuration
lxc.tty = 12
lxc.pts = 128

# Device configuration:
# Deny access to all devices:
lxc.cgroup.devices.deny = a
# Allow only the following devices to be opened:
lxc.cgroup.devices.allow = c 1:3 rwm # dev/null
lxc.cgroup.devices.allow = c 1:5 rwm # dev/zero
lxc.cgroup.devices.allow = c 1:8 rwm # dev/random
lxc.cgroup.devices.allow = c 1:9 rwm # dev/urandom
lxc.cgroup.devices.allow = c 5:0 rwm # /dev/tty - allows ssh-add/password input
lxc.cgroup.devices.allow = c 5:1 rwm # /dev/console - allows lxc-start output
lxc.cgroup.devices.allow = c 254:0 rwm # rtc

# TTYs - we create only 3 TTYs: tty0, tty1, tty2 - you can create up to 12 (see lxc.tty = 12)
lxc.cgroup.devices.allow = c 4:0 rwm # /dev/tty0
lxc.cgroup.devices.allow = c 4:1 rwm # /dev/tty1
lxc.cgroup.devices.allow = c 4:2 rwm # /dev/tty2

# pts namespaces
lxc.cgroup.devices.allow = c 136:* rwm # dev/pts/*
lxc.cgroup.devices.allow = c 5:2 rwm # dev/pts/ptmx

# restrict capabilities:
lxc.cap.drop = audit_control
lxc.cap.drop = audit_write
lxc.cap.drop = mac_admin
lxc.cap.drop = mac_override
lxc.cap.drop = setpcap
lxc.cap.drop = sys_admin
lxc.cap.drop = sys_boot
lxc.cap.drop = sys_module
lxc.cap.drop = sys_rawio
lxc.cap.drop = sys_time
# By default, don't use lxc.cap.drop = mknod. This will allow mknod to create
# device nodes so build scripts and other things don't fail. Then, we'll
# rely on the devices.deny settings (default deny) to prevent any created
# device nodes inside the container from being used to access the host's
# hardware:
# lxc.cap.drop = mknod
See "man 7 capabilities" for more information about Linux capabilities.
Use the following command to generate a random MAC address for lxc.network.hwaddr in the config above:
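The major:minor pairs in the lxc.cgroup.devices.allow lines of the config refer to device numbers on the host, and some of them (the rtc entry, for instance) may differ from system to system. The following sketch prints the pair for a given device node so it can be compared with the config. It assumes GNU stat from coreutils; devnum is a hypothetical helper name.

```shell
#!/bin/sh
# devnum NODE: print the major:minor numbers of a device node in decimal.
# (hypothetical helper, written for illustration only)
devnum() {
    # GNU stat reports major (%t) and minor (%T) in hexadecimal,
    # so prefix them with 0x and let printf convert to decimal.
    printf '%d:%d\n' "0x$(stat -c %t "$1")" "0x$(stat -c %T "$1")"
}
```

For example, devnum /dev/null prints the pair that the "c 1:3 rwm" line is meant to match, and devnum /dev/rtc0 can be used to check the rtc entry before relying on it.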
# openssl rand -hex 6 | sed 's/\(..\)/\1:/g; s/.$//'
It is a very good idea to assign a static MAC address to your container using lxc.network.hwaddr. If you don't, LXC will auto-generate a new random MAC every time your container starts, which may confuse network equipment that expects MAC addresses to remain constant.
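A fully random first octet can have the multicast bit (its lowest bit) set, and such an address is not valid for a network interface. As a hedged alternative, the sketch below fixes the first octet to 02 (locally administered, unicast) and randomizes only the remaining five. It assumes a Linux host with od and /dev/urandom; gen_mac is a hypothetical helper name.

```shell
#!/bin/sh
# gen_mac: print a random, locally administered unicast MAC address.
# (hypothetical helper, written for illustration only)
gen_mac() {
    # Read 5 random bytes as space-separated hex, then join them with colons.
    tail=$(od -An -N5 -tx1 /dev/urandom | tr -d '\n' \
        | sed 's/^ *//; s/ *$//; s/  */:/g')
    # 02 = locally administered bit set, multicast bit clear.
    printf '02:%s\n' "$tail"
}
```

Run gen_mac once and paste the result into lxc.network.hwaddr so the address stays constant across container restarts.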
If the container fails to start with a MAC address generated by the openssl command above, you can instead derive the MAC address from the container's IP address with the small script below. Save it as /etc/lxc/hwaddr.sh, make it executable, and run it as /etc/lxc/hwaddr.sh xxx.xxx.xxx.xxx, where xxx.xxx.xxx.xxx is your container's IP address.
#!/bin/sh
# Derive a stable, locally administered MAC address from an IPv4 address.
IP=$1
# Split the address on dots and print each octet as two hex digits.
HA=$(printf '02:00:%02x:%02x:%02x:%02x' $(echo "$IP" | tr '.' ' '))
echo "$HA"
/etc/lxc/funtoo/fstab
none /lxc/funtoo/dev/pts devpts defaults 0 0
none /lxc/funtoo/proc proc defaults 0 0
none /lxc/funtoo/sys sysfs defaults 0 0
none /lxc/funtoo/dev/shm tmpfs nodev,nosuid,noexec,mode=1777,rw 0 0
none /lxc/funtoo/libexec/rc/init.d tmpfs rw,mode=755 0 0
Initializing and Starting the Container
You will probably need to set the root password for the container before you can log in. You can use chroot to do this quickly:
# chroot /lxc/funtoo
(chroot) # passwd
New password: XXXXXXXX
Retype new password: XXXXXXXX
passwd: password updated successfully
# exit
Now that the root password is set, run:
# lxc-start -n funtoo -d
The -d option will cause it to run in the background.
To attach to the console:
# lxc-console -n funtoo
You should now be able to log in and use the container. In addition, the container should now be accessible on the network.
To stop the container:
# lxc-stop -n funtoo
Ensure that networking is working from within the container while it is running, and you're good to go!
LXC Bugs/Missing Features
This section is devoted to documenting issues with the current implementation of LXC and its associated tools. We will be gradually expanding this section with detailed descriptions of problems, their status, and proposed solutions.
reboot
By default, lxc does not support rebooting a container from within. The container will simply stop, and the host will not know to start it again.
PID namespaces
Process ID namespaces are functional, but the container can still see the CPU utilization of the host via the system load (for example, in top).
/dev/pts newinstance
- Some changes may be required to the host to properly implement "newinstance" /dev/pts. See Red Hat bug 501718: https://bugzilla.redhat.com/show_bug.cgi?id=501718
lxc-create and lxc-destroy
- LXC's shell scripts are badly designed and are a sure path to destruction; avoid using lxc-create and lxc-destroy.
network initialization and cleanup
- If lxc.network.type = phys is used, the interface is renamed to the value of lxc.network.link after lxc-stop. This was supposed to be fixed in 0.7.4 but still happens in 0.7.5: http://www.mail-archive.com/lxc-users@lists.sourceforge.net/msg01760.html
- Restarting a container can fail because network resources are still tied up by the already-defunct instance: http://www.mail-archive.com/lxc-devel@lists.sourceforge.net/msg00824.html
lxc-halt
- A tool to gracefully shut down a container is missing. An 'lxc-halt' tool should be written, compatible with POSIX sh, that uses lxc-execute to run halt in the container.
funtoo
- Our udev should be updated to contain -lxc in its scripts. (This was done as of 02-Nov-2011, so it should be resolved. It is not yet fixed in our OpenVZ templates, so we need to regenerate them in a few days.)
- Our openrc should be patched to gracefully handle the case where it cannot mount tmpfs. (The work-around in the docs above is to mount tmpfs on /libexec/rc/init.d using the container-specific fstab file on the host.)
- Emerging udev within a container can fail when realdev is run if a device node (such as /dev/console) cannot be created because the container lacks the mknod capability. This should be fixed.
References
- man 7 capabilities
- man 5 lxc.conf
Links
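- There are a number of additional lxc features that can be enabled via patches: http://lxc.sourceforge.net/patches/linux/3.0.0/3.0.0-lxc1/
- Ubuntu User Namespaces page: https://wiki.ubuntu.com/UserNamespace
- lxc-gentoo setup script on GitHub: https://github.com/globalcitizen/lxc-gentoo
- IBM developerWorks
  - LXC: Linux Container Tools: http://www.ibm.com/developerworks/linux/library/l-lxc-containers/index.html
  - Secure Linux Containers Cookbook: http://www.ibm.com/developerworks/linux/library/l-lxc-security/
- Linux Weekly News
  - Smack for simplified access control: http://lwn.net/Articles/244531/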