Difference between pages "Funtoo Filesystem Guide, Part 3" and "FLOP:Ports-2015"

{{Article
{{FLOP
|Subtitle=Tmpfs and Bind Mounts
|Created on=2015/02/23
|Author=Drobbins
|Summary=Collection of ideas and changes for the ports-2015 tree. The goal is to perform many scheduled changes with a single user configuration change.
|Previous in Series=Funtoo Filesystem Guide, Part 2
|Author=Mgorny,
|Next in Series=Funtoo Filesystem Guide, Part 4
|Reference Bug=FL-1877
}}
}}
== Introduction ==
== Procedure ==
In my previous articles in this series, I introduced the benefits of journaling and ReiserFS and showed how to set up a rock-solid ReiserFS system. In this article, we're going to tackle a couple of semi-offbeat topics. First, we'll take a look at tmpfs, also known as the virtual memory (VM) filesystem. Tmpfs is probably the best RAM disk-like system available for Linux right now, and was introduced with Linux kernel 2.4. Then, we'll take a look at another capability introduced with Linux kernel 2.4 called "bind mounts", which allow a great deal of flexibility when it comes to mounting (and remounting) filesystems.
Users of the ports-2012 tree will be informed that the current repository is deprecated and will be provided with complete migration instructions. The instructions will cover both the necessary changes and optional changes that can conveniently be made along with the switch.


== Introducing Tmpfs ==
== Changes ==
If I had to explain tmpfs in one breath, I'd say that tmpfs is like a ramdisk, but different. Like a ramdisk, tmpfs can use your RAM, but it can also use your swap devices for storage. And while a traditional ramdisk is a block device and requires a mkfs command of some kind before you can actually use it, tmpfs is a filesystem, not a block device; you just mount it, and it's there.  
=== History cut-off ===
All in all, this makes tmpfs the niftiest RAM-based filesystem I've had the opportunity to meet.
Because a new tree is being started, all history is cut off. Although users were cloning the repository with --depth=1, old clones have accumulated a large history of changes. This history will be discarded with the new clone, which will significantly decrease the size of the Portage tree and of its compressed tarball, for those who prefer to use it. Over time, the Portage tree will grow again.


== Tmpfs and VM ==
=== Portage upgrade / repos.conf switch ===
Let's take a look at some of tmpfs's more interesting properties. As I mentioned above, tmpfs can use both RAM and swap. This might seem a bit arbitrary at first, but remember that tmpfs is also known as the "virtual memory filesystem". And, as you probably know, the Linux kernel's virtual memory resources come from both your RAM and swap devices. The VM subsystem in the kernel allocates these resources to other parts of the system and takes care of managing these resources behind-the-scenes, often transparently moving RAM pages to swap and vice-versa.
Reference: {{Bug|FL-1761}}, [[Repository Configuration]]
The tmpfs filesystem requests pages from the VM subsystem to store files. tmpfs itself doesn't know whether these pages are on swap or in RAM; it's the VM subsystem's job to make those kinds of decisions. All the tmpfs filesystem knows is that it is using some form of virtual memory.


== Not a Block Device ==
As part of the upstream Portage changes, the upgrade is accompanied by some configuration file changes. Besides these, repository configuration is moved to repos.conf and the repository name becomes significant. Merging this with the ports-2015 switch lets users move their configuration to the new format at the same time.
Here's another interesting property of the tmpfs filesystem. Unlike most "normal" filesystems, like ext3, ext2, XFS, JFS, ReiserFS and friends, tmpfs does not exist on top of an underlying block device. Because tmpfs sits on top of VM directly, you can create a tmpfs filesystem with a simple mount command:


<pre># mount tmpfs /mnt/tmpfs -t tmpfs</pre>After executing this command, you'll have a new tmpfs filesystem mounted at /mnt/tmpfs, ready for use. Note that there's no need to run mkfs.tmpfs; in fact, it's impossible, as no such command exists. Immediately after the mount command, the filesystem is mounted and available for use, and is of type tmpfs. This is very different from how Linux ramdisks are used; standard Linux ramdisks are block devices, so they must be formatted with a filesystem of your choice before you can use them. In contrast, tmpfs is a filesystem. So, you can just mount it and go.
=== Repository rename ===
Reference: {{Bug|FL-1801}}


== Tmpfs Advantages ==
Right now, the main repository inherits the name 'gentoo'. This is a bit confusing, considering that it is a modified Funtoo variant of the package tree. Changing the name to 'funtoo' would improve consistency and carry some 'branding' into packages. Merging this into the ports-2015 switch lets users consciously update all repository references if necessary, and combines the change with the need to specify the repository name in repos.conf.


=== Dynamic Filesystem Size ===
== Other possible changes ==
You're probably wondering about how big that tmpfs filesystem was that we mounted at <tt>/mnt/tmpfs</tt>, above. The answer to that question is a bit unexpected, especially when compared to disk-based filesystems. <tt>/mnt/tmpfs</tt> will initially have a very small capacity, but as files are copied and created, the tmpfs filesystem driver will allocate more VM and will dynamically increase the filesystem capacity as needed. And, as files are removed from <tt>/mnt/tmpfs</tt>, the tmpfs filesystem driver will dynamically shrink the size of the filesystem and free VM resources, and by doing so return VM into circulation so that it can be used by other parts of the system as needed. Since VM is a precious resource, you don't want anything hogging more VM than it actually needs, and the great thing about tmpfs is that this all happens automatically.
=== Filesystem structure reorganization ===
Since users will be required to clone the new repository, it may be worth suggesting some best practices for filesystem layout. This specifically includes separating ebuilds from distfiles and packages, and using a multi-repository layout. Historically, the Portage tree, inspired by FreeBSD ports, has been located in <code>/usr/portage</code>. The filesystem hierarchy conventions differ between BSD and Linux, so we will probably move it elsewhere; see below for a suggestion and possible pros and cons.


=== Speed ===
Example layout suggested by mgorny:
The other major benefit of tmpfs is its blazing speed. Because a typical tmpfs filesystem will reside completely in RAM, reads and writes can be almost instantaneous. Even if some swap is used, performance is still excellent and those parts of the tmpfs filesystem will be moved to RAM as more free VM resources become available. Having the VM subsystem automatically move parts of the tmpfs filesystem to swap can actually be good for performance, since by doing so, the VM subsystem can free up RAM for processes that need it. This, along with its dynamic resizing abilities, allows for much better overall OS performance and flexibility than the alternative of using a traditional RAM disk.
# all repositories in ''/var/db/repos/${repo_name}'' (e.g. the Funtoo repository in ''/var/db/repos/funtoo'', and possible overlays as other directories in ''/var/db/repos''),
# distfiles in ''/var/cache/portage/distfiles'',
# binary packages in ''/var/cache/portage/packages''.


=== No Persistence ===
Users can be advised to use a separate filesystem that handles small files efficiently for ''/var/db/repos'', e.g. btrfs, reiserfs or possibly squashfs.


While this may not seem like a positive, tmpfs data is not preserved between reboots, because virtual memory is volatile in nature. I guess you probably figured that tmpfs was called "tmpfs" for a reason, didn't you? However, this can actually be a good thing. It makes tmpfs an excellent filesystem for holding data that you don't need to keep, such as temporary files (those found in <tt>/tmp</tt>) and parts of the <tt>/var</tt> filesystem tree.
{{FLOPFooter}}
 
== Using Tmpfs ==
 
To use tmpfs, all you need is a modern (2.4+) kernel with <tt>Virtual memory file system support (former shm fs)</tt> enabled; this option lives under the <tt>File systems</tt> section of the kernel configuration options. Once you have a tmpfs-enabled kernel, you can go ahead and mount tmpfs filesystems. In fact, it's a good idea to enable tmpfs in all your kernels if you compile them yourself - whether you plan to use tmpfs or not. This is because you need to have kernel tmpfs support in order to use POSIX shared memory. System V shared memory will work without tmpfs in your kernel, however. Note that you do not need a tmpfs filesystem to be mounted for POSIX shared memory to work; you simply need the support in your kernel. POSIX shared memory isn't used too much right now, but this situation will likely change as time goes on.
 
=== Avoiding low VM conditions ===
 
The fact that tmpfs dynamically grows and shrinks as needed makes one wonder: what happens when your tmpfs filesystem grows to the point where it exhausts all of your virtual memory, and you have no RAM or swap left? Well, generally, this kind of situation is a bit ugly. With kernel 2.4.4, the kernel would immediately lock up. With more recent kernels, the VM subsystem has in many ways been fixed, and while exhausting VM isn't exactly a wonderful experience, things don't blow up completely, either. When a modern kernel gets to the point where it can't allocate any more VM, you obviously won't be able to write any new data to your tmpfs filesystem. In addition, it's likely that some other things will happen. First, the other processes on the system will be unable to allocate much more memory; generally, this means that the system will most likely become extremely sluggish and almost unresponsive. Thus, it may be tricky or unusually time-consuming for the superuser to take the necessary steps to alleviate this low-VM condition.
 
In addition, the kernel has a built-in last-ditch system for freeing memory when no more is available; it'll find a process that's hogging VM resources and kill it. Unfortunately, this "kill a process" solution generally backfires when tmpfs growth is to blame for VM exhaustion. Here's the reason. Tmpfs itself can't (and shouldn't) be killed, since it is part of the kernel and not a user process, and there's no easy way for the kernel to find out which process is filling up the tmpfs filesystem. So, the kernel mistakenly attacks the biggest VM-hog of a process it can find, which is generally your X server if you happen to be running one. So, your X server dies, and the root cause of the low-VM condition (tmpfs) isn't addressed. Ick.
 
=== Low VM: the solution ===
 
Fortunately, tmpfs allows you to specify an upper bound for the filesystem size when a filesystem is mounted or remounted. Actually, as of kernel 2.4.6 and util-linux-2.11g, these parameters can only be set on mount, not on remount, but we can expect them to be settable on remount sometime in the near future. The optimal maximum tmpfs size setting depends on the resources and usage pattern of your particular Linux box; the idea is to prevent a completely full tmpfs filesystem from exhausting all virtual memory and thus causing the ugly low-VM conditions that we talked about earlier. One way to find a suitable tmpfs upper bound is to use top to monitor your system's swap usage during peak usage periods. Then, make sure that you specify a tmpfs upper bound that's slightly less than the sum of all free swap and free RAM during these peak usage times.
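
The rule of thumb above can be sketched as a small shell function that adds up free RAM and free swap and takes a little off the top. This is only an illustration: the function name <tt>suggest_tmpfs_size_kb</tt> and the 90% safety margin are assumptions made for this example, and the figures only mean something if they are sampled during a peak usage period.

```shell
#!/bin/sh
# Suggest a tmpfs size cap (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo, whose MemFree and SwapFree values are in kB.
# The 90% margin is an illustrative assumption; tune it for your workload.
suggest_tmpfs_size_kb() {
    awk '
        $1 == "MemFree:"  { free_ram  = $2 }
        $1 == "SwapFree:" { free_swap = $2 }
        END { printf "%d\n", (free_ram + free_swap) * 9 / 10 }
    ' "${1:-/proc/meminfo}"
}

# Example with made-up peak-time figures: 200000 kB RAM + 300000 kB swap free.
sample=$(mktemp)
printf 'MemFree: 200000 kB\nSwapFree: 300000 kB\n' > "$sample"
echo "size=$(suggest_tmpfs_size_kb "$sample")k"   # prints size=450000k
rm -f "$sample"
```

Called without an argument, the function reads the live <tt>/proc/meminfo</tt>, so the result reflects the moment it is run rather than a true peak.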
 
Creating a tmpfs filesystem with a maximum size is easy. To create a new tmpfs filesystem with a maximum filesystem size of 32 MB, type:
 
<pre># mount tmpfs /dev/shm -t tmpfs -o size=32m</pre>
 
This time, instead of mounting our new tmpfs filesystem at /mnt/tmpfs, we created it at /dev/shm, which is a directory that happens to be the "official" mount point for a tmpfs filesystem. If you happen to be using devfs, you'll find that this directory has already been created for you.
Also, if we want to limit the filesystem size to 512 KB or 1 GB, we can specify size=512k and size=1g, respectively. In addition to limiting size, we can also limit the number of inodes (filesystem objects) by specifying the nr_inodes=x parameter. When using nr_inodes, x can be a simple integer, and can also be followed with a k, m, or g to specify thousands, millions, or billions (!) of inodes.
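
To get a feel for what the <tt>size=</tt> suffixes mean, here is a minimal sketch that converts such a value into bytes. It assumes the suffixes are powers of 1024 (so <tt>k</tt> is 1024 bytes), and the helper name <tt>size_to_bytes</tt> is made up for this example. It covers only the <tt>size</tt> parameter, not <tt>nr_inodes</tt>.

```shell
#!/bin/sh
# Convert a tmpfs-style size value (plain bytes, or with a k/m/g suffix)
# to bytes. Assumes each suffix step is a factor of 1024.
size_to_bytes() {
    case $1 in
        *[kK]) echo $(( ${1%?} * 1024 )) ;;
        *[mM]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[gG]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$1" ;;
    esac
}

size_to_bytes 512k   # prints 524288
size_to_bytes 32m    # prints 33554432
size_to_bytes 1g     # prints 1073741824
```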
 
Also, if you'd like to add the equivalent of the above mount tmpfs command to your /etc/fstab, it'd look like this:
 
<pre>tmpfs  /dev/shm    tmpfs  size=32m    0  0</pre>
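
Since an unbounded tmpfs is what leads to the low-VM trouble described earlier, it can be handy to check which tmpfs entries in an fstab file lack a <tt>size=</tt> option. The following is a rough sketch; the function name <tt>tmpfs_without_limit</tt> is hypothetical, and it only inspects the options field of the file it is given.

```shell
#!/bin/sh
# Print the mount point of every tmpfs entry in an fstab-style file
# (default: /etc/fstab) whose options field has no size= limit.
tmpfs_without_limit() {
    awk '
        $1 ~ /^#/ { next }    # skip comment lines
        $3 == "tmpfs" && $4 !~ /(^|,)size=/ { print $2 }
    ' "${1:-/etc/fstab}"
}

# Example with a sample file: only the second entry is unbounded.
sample=$(mktemp)
cat > "$sample" <<'EOF'
tmpfs  /dev/shm  tmpfs  size=32m  0  0
tmpfs  /tmp      tmpfs  defaults  0  0
EOF
tmpfs_without_limit "$sample"   # prints /tmp
rm -f "$sample"
```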
 
=== Mounting On Top of Existing Mount Points ===
Back in the 2.2 days, any attempt to mount something to a mount point where something had already been mounted resulted in an error. However, thanks to a rewrite of the kernel mounting code, using mount points multiple times is not a problem. Here's an example scenario: let's say that we have an existing filesystem mounted at <tt>/tmp</tt>. However, we decide that we'd like to start using tmpfs for <tt>/tmp</tt> storage. In the old days, your only option would be to unmount <tt>/tmp</tt> and remount your new tmpfs <tt>/tmp</tt> filesystem in its place, as follows:
 
<pre>#  umount /tmp
#  mount tmpfs /tmp -t tmpfs -o size=64m</pre>
However, this solution may not work for you. Maybe there are a number of running processes that have open files in <tt>/tmp</tt>; if so, when trying to unmount <tt>/tmp</tt>, you'd get the following error:
 
<pre>umount: /tmp: device is busy</pre>
However, with Linux 2.4+, you can mount your new <tt>/tmp</tt> filesystem without getting the "device is busy" error:
 
<pre># mount tmpfs /tmp -t tmpfs -o size=64m</pre>
With a single command, your new tmpfs <tt>/tmp</tt> filesystem is mounted at <tt>/tmp</tt>, on top of the already-mounted partition, which can no longer be directly accessed. However, while you can't get to the original <tt>/tmp</tt>, any processes that still have open files on this original filesystem can continue to access them. And, if you umount your tmpfs-based <tt>/tmp</tt>, your original mounted <tt>/tmp</tt> filesystem will reappear. In fact, you can mount any number of filesystems to the same mount point, and the mount point will act like a stack; unmount the current filesystem, and the next most recently mounted filesystem will reappear from underneath.
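
One way to look at this stack is to list every entry recorded for a mount point. The sketch below assumes a <tt>/proc/mounts</tt>-style file that lists mounts in the order they were made and still includes the ones hidden underneath; the function name <tt>mount_stack</tt> and the sample device name are made up for the example.

```shell
#!/bin/sh
# List the filesystem types stacked on a mount point, oldest first, from a
# /proc/mounts-style file (fields: device, mount point, type, options, ...).
# Assumption: the file lists mounts in mount order, including hidden ones.
mount_stack() {
    awk -v mp="$1" '$2 == mp { print $3 }' "${2:-/proc/mounts}"
}

# Example with sample data: a disk filesystem with tmpfs mounted on top.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/sda3 /tmp ext3 rw 0 0
tmpfs /tmp tmpfs rw,size=65536k 0 0
EOF
mount_stack /tmp "$sample"               # prints ext3, then tmpfs
mount_stack /tmp "$sample" | tail -n 1   # top of the stack: tmpfs
rm -f "$sample"
```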
 
== Bind Mounts ==
 
Using bind mounts, we can mount all, or even part of an already-mounted filesystem to another location, and have the filesystem accessible from both mount points at the same time! For example, you can use bind mounts to mount your existing root filesystem to <tt>/home/drobbins/nifty</tt>, as follows:
 
<pre>#  mount --bind / /home/drobbins/nifty</pre>
Now, if you look inside <tt>/home/drobbins/nifty</tt>, you'll see your root filesystem (<tt>/home/drobbins/nifty/etc</tt>, <tt>/home/drobbins/nifty/opt</tt>, etc.). And if you modify a file on your root filesystem, you'll see the modifications in <tt>/home/drobbins/nifty</tt> as well. This is because they are one and the same filesystem; the kernel is simply mapping the filesystem to two different mount points for us. Note that when you mount a filesystem somewhere else, any filesystems that were mounted to mount points inside the bind-mounted filesystem will not be moved along. In other words, if you have <tt>/usr</tt> on a separate filesystem, the bind mount we performed above will leave <tt>/home/drobbins/nifty/usr</tt> empty. You'll need an additional bind mount command to allow you to browse the contents of <tt>/usr</tt> at <tt>/home/drobbins/nifty/usr</tt>:
 
<pre>#  mount --bind /usr /home/drobbins/nifty/usr</pre>
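
If several filesystems are mounted below the tree you are bind mounting, it may help to generate the additional commands rather than type them one by one. The sketch below only prints the commands, it does not run them; the function name <tt>bind_commands</tt> is hypothetical, and it assumes a <tt>/proc/mounts</tt>-style file in which parent mounts are listed before their children.

```shell
#!/bin/sh
# Print one "mount --bind" command for every filesystem mounted below SRC,
# mapped to the matching path below DEST. Nothing is actually mounted.
# Usage: bind_commands SRC DEST [MOUNTS_FILE]
bind_commands() {
    awk -v top="${1%/}" -v dest="$2" '
        index($2, top "/") == 1 && $2 != top "/" {
            print "mount --bind " $2 " " dest substr($2, length(top) + 1)
        }
    ' "${3:-/proc/mounts}"
}

# Example with sample data: /usr lives on its own filesystem.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/sda1 / ext3 rw 0 0
/dev/sda2 /usr ext3 rw 0 0
EOF
bind_commands / /home/drobbins/nifty "$sample"
# prints: mount --bind /usr /home/drobbins/nifty/usr
rm -f "$sample"
```

Review the printed commands before running them as root; on a live system the list will also include virtual filesystems you may not want to mirror.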
=== Bind mounting parts of filesystems ===
 
Bind mounting makes even more neat things possible. Let's say that you have a tmpfs filesystem mounted at <tt>/dev/shm</tt>, its traditional location, and you decide that you'd like to start using tmpfs for <tt>/tmp</tt>, which currently lives on your root filesystem. Rather than mounting a new tmpfs filesystem to <tt>/tmp</tt> (which is possible), you may decide that you'd like the new <tt>/tmp</tt> to share the currently mounted <tt>/dev/shm</tt> filesystem. However, while you could bind mount <tt>/dev/shm</tt> to <tt>/tmp</tt> and be done with it, your <tt>/dev/shm</tt> contains some directories that you don't want to appear in <tt>/tmp</tt>. So, what do you do? How about this:
 
<pre># mkdir /dev/shm/tmp
# chmod 1777 /dev/shm/tmp
# mount --bind /dev/shm/tmp /tmp</pre>
In this example, we first create a <tt>/dev/shm/tmp</tt> directory and then give it 1777 perms, the proper permissions for <tt>/tmp</tt>. Now that our directory is ready, we can mount <tt>/dev/shm/tmp</tt>, and only <tt>/dev/shm/tmp</tt>, to <tt>/tmp</tt>. So, while <tt>/tmp/foo</tt> would map to <tt>/dev/shm/tmp/foo</tt>, there's no way for you to access the <tt>/dev/shm/bar</tt> file from <tt>/tmp</tt>.
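
Before doing the bind mount, it is worth confirming that the directory really ended up with mode 1777. A minimal sketch follows, assuming GNU <tt>stat</tt> (for the <tt>-c</tt> option); the function name <tt>prepare_tmp_dir</tt> is made up, and the example works in a scratch directory so it needs no root privileges.

```shell
#!/bin/sh
# Create a directory, give it /tmp-style permissions (world-writable plus
# sticky bit), and print the resulting octal mode.
# Assumes GNU stat, which provides the -c FORMAT option.
prepare_tmp_dir() {
    mkdir -p "$1" && chmod 1777 "$1" && stat -c '%a' "$1"
}

# Example in a throwaway location instead of /dev/shm.
scratch=$(mktemp -d)
prepare_tmp_dir "$scratch/tmp"   # prints 1777
rm -rf "$scratch"
```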
 
As you can see, bind mounts are extremely powerful and make it easy to make modifications to your filesystem layout without any fuss. Next article, we'll check out ext3.
 
== Resources ==
Be sure to check out the other articles in this series:
* [[Funtoo Filesystem Guide, Part 1|Part 1]]: Journaling and ReiserFS
* [[Funtoo Filesystem Guide, Part 2|Part 2]]: Using ReiserFS and Linux
* [[Funtoo Filesystem Guide, Part 3|Part 3]]: Tmpfs and bind mounts
* [[Funtoo Filesystem Guide, Part 4|Part 4]]: Introducing Ext3
* [[Funtoo Filesystem Guide, Part 5|Part 5]]: Ext3 in action
 
[[Category:Filesystem Guides]]
[[Category:Articles]]
{{ArticleFooter}}

Revision as of 13:50, February 23, 2015

Created on
2015/02/23
Original Author(s)
Mgorny
Status
Reference Bug
FL-1877

Funtoo Linux Optimization Proposal: Ports-2015

Collection of ideas and changes for the ports-2015 tree. The goal is to perform many scheduled changes with a single user configuration change.

Procedure

Users of the ports-2012 tree will be informed that the current repository is deprecated and will be provided with complete migration instructions. The instructions will cover both the necessary changes and optional changes that can conveniently be made along with the switch.

Changes

History cut-off

Because a new tree is being started, all history is cut off. Although users were cloning the repository with --depth=1, old clones have accumulated a large history of changes. This history will be discarded with the new clone, which will significantly decrease the size of the Portage tree and of its compressed tarball, for those who prefer to use it. Over time, the Portage tree will grow again.

Portage upgrade / repos.conf switch

Reference: FL-1761, Repository Configuration

As part of the upstream Portage changes, the upgrade is accompanied by some configuration file changes. Besides these, repository configuration is moved to repos.conf and the repository name becomes significant. Merging this with the ports-2015 switch lets users move their configuration to the new format at the same time.

Repository rename

Reference: FL-1801

Right now, the main repository inherits the name 'gentoo'. This is a bit confusing, considering that it is a modified Funtoo variant of the package tree. Changing the name to 'funtoo' would improve consistency and carry some 'branding' into packages. Merging this into the ports-2015 switch lets users consciously update all repository references if necessary, and combines the change with the need to specify the repository name in repos.conf.

Other possible changes

Filesystem structure reorganization

Since users will be required to clone the new repository, it may be worth suggesting some best practices for filesystem layout. This specifically includes separating ebuilds from distfiles and packages, and using a multi-repository layout. Historically, the Portage tree, inspired by FreeBSD ports, has been located in /usr/portage. The filesystem hierarchy conventions differ between BSD and Linux, so we will probably move it elsewhere; see below for a suggestion and possible pros and cons.

Example layout suggested by mgorny:

  1. all repositories in /var/db/repos/${repo_name} (e.g. the Funtoo repository in /var/db/repos/funtoo, and possible overlays as other directories in /var/db/repos),
  2. distfiles in /var/cache/portage/distfiles,
  3. binary packages in /var/cache/portage/packages.

Users can be advised to use a separate filesystem that handles small files efficiently for /var/db/repos, e.g. btrfs, reiserfs or possibly squashfs.
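
To make the suggested layout more concrete, here is a rough sketch of the configuration it might translate into. It is not part of the proposal: the helper names <tt>print_repos_conf</tt> and <tt>print_make_conf</tt> are made up, the repository name 'funtoo' assumes the rename is adopted, and the paths are simply those from the example layout. <tt>main-repo</tt> and <tt>location</tt> are repos.conf keys, and <tt>DISTDIR</tt> and <tt>PKGDIR</tt> are make.conf variables.

```shell
#!/bin/sh
# Print a repos.conf fragment for one repository under the proposed layout.
# The repository name and base directory follow the proposal and may change.
print_repos_conf() {
    printf '[DEFAULT]\nmain-repo = %s\n\n[%s]\nlocation = /var/db/repos/%s\n' \
        "$1" "$1" "$1"
}

# Print the make.conf settings matching the proposed cache directories.
print_make_conf() {
    printf 'DISTDIR="/var/cache/portage/distfiles"\n'
    printf 'PKGDIR="/var/cache/portage/packages"\n'
}

print_repos_conf funtoo
print_make_conf
```

The functions only write to standard output, so the result can be reviewed before anything is copied into the real configuration files.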