Difference between pages "Funtoo Filesystem Guide, Part 1" and "Install/Stage3"

(Difference between pages)
 
(Installing the Stage 3 tarball)
 
Line 1: Line 1:
{{Article
+
<noinclude>
|Author=Drobbins
+
{{InstallPart|the process of installing the Stage3 tarball}}
|Next in Series=Funtoo Filesystem Guide, Part 2
+
</noinclude>
}}
+
==== Setting the Date ====
== Journaling and ReiserFS ==
+
  
=== What's in Store ===
+
{{fancyimportant|If your system's date and time are too far off (typically by months or years,) then it may prevent Portage from properly downloading source tarballs. This is because some of our sources are downloaded via HTTPS, which use SSL certificates and are marked with an activation and expiration date. However, if you system time is relatively close to correct, you can probably skip this step for now.}}
The purpose of this series is to give you a solid, practical introduction to Linux's various new filesystems, including ReiserFS, XFS, JFS, GFS, ext3 and others. I want to equip you with the necessary practical knowledge you need to actually start using these filesystems. My goal is to help you avoid as many potential pitfalls as possible; this means that we're going to take a careful look at filesystem stability, performance issues (both good and bad), any negative application interactions that you should be aware of, the best kernel/patch combinations, and more. Consider this series an "insider's guide" to these next-generation filesystems.
+
  
So, that's what's in store. But to begin this series, I'm going to diverge from this plan for just one article and prepare you for the journey ahead. I'll cover two topics very important to the Linux development community -- journaling, and the design vision behind ReiserFS. Journaling is very important because it's a technology that we've been anticipating for a long time, and it's finally here. It's used in ReiserFS, XFS, JFS, ext3 and GFS. It's important to understand exactly what journaling does and why Linux needs it. Even if you have a good grasp of journaling, I hope that my journaling intro will serve as a good model for explaining the technology to others, something that'll be common practice as departments and organizations worldwide begin transitioning to these new journaling filesystems. Often, this process begins with a "Linux guy/gal" such as yourself convincing others that it's the right thing to do.
+
Now is a good time to verify the date and time are correctly set to UTC. Use the <code>date</code> command to verify the date and time:
  
In the second half of this article, we're going to take a look at the design vision behind ReiserFS. By doing so, we're going to get a good grasp on the fact that these new filesystems aren't just about doing the same old thing a bit faster. They also allow us to do things in ways that simply weren't possible before. Developers, keep this in mind as you read this series. The capabilities of these new filesystems will likely affect how you code your future Linux software development projects.
+
<console>
 +
# ##i##date
 +
Fri Jul 15 19:47:18 UTC 2011
 +
</console>
  
=== Understanding Journaling: Meta-data ===
+
If the date and/or time need to be corrected, do so using <code>date MMDDhhmmYYYY</code>, keeping in mind <code>hhmm</code> are in 24-hour format. The example below changes the date and time to "July 16th, 2011 @ 8:00PM" UTC:
As you well know, filesystems exist to allow you to store, retrieve and manipulate data. And, in order to do this, a filesystem needs to maintain an internal data structure that keeps all your data organized and readily accessible. This internal data structure (literally, "the data about the data") is called meta-data. It is the structure of this meta-data that gives a filesystem its particular identity and performance characteristics.
+
  
Normally, we don't interact with a filesystem's meta-data directly. Instead, a specific Linux filesystem driver takes care of that job for us. A Linux filesystem driver is specially written to manipulate this maze of meta-data. However, in order for the filesystem driver to work properly, it has one important requirement; it expects to find the meta-data in some kind of reasonable, consistent, non-corrupted state. Otherwise, the filesystem driver won't be able to understand or manipulate the meta-data, and you won't be able to access your files.
+
<console>
 +
# ##i##date 071620002011
 +
Fri Jul 16 20:00:00 UTC 2011
 +
</console>
  
=== Understanding Journaling: fsck ===
+
Once you have set the system clock, it's a very good idea to copy the time to the hardware clock, so it persists across reboots:
This is where <span style="color:green">fsck</span> comes in. When a Linux system boots, <span style="color:green">fsck</span> starts up and scans all local filesystems listed in the system's '''/etc/fstab''' file. <span style="color:green">fsck</span>'s job is to ensure that the to-be-mounted filesystems' meta-data is in a usable state. Most of the time, it is. When Linux shuts down, it carefully flushes all cached data to disk and ensures that the filesystem is cleanly unmounted, so that it's ready for use when the system starts up again. Typically, <span style="color:green">fsck</span> scans the to-be-mounted filesystems and finds that they were cleanly unmounted, and makes the reasonable assumption that all meta-data is OK.
+
  
However, we all know that every now and then, something atypical happens, such as an unexpected power failure or system lock-up. When these unfortunate situations occur, Linux doesn't have the opportunity to cleanly unmount the filesystem. When the system is rebooted and <span style="color:green">fsck</span> starts its scan, it detects that these filesystems were not cleanly unmounted and makes a reasonable assumption that the filesystems probably aren't ready to be seen by the Linux filesystem drivers. It's very likely that the meta-data is messed up in some way.
+
<console>
 +
# ##i##hwclock --systohc
 +
</console>
  
So, to fix this situation, <span style="color:green">fsck</span> will begin an exhaustive scan and sanity check on the meta-data, correcting any errors that it finds along the way. Once fsck is complete, the filesystem is ready for use. Although some recently-modified data may have been lost due to the unexpected power failure or system lockup, since the meta-data is now consistent, the filesystem is ready to be mounted and be put to use.
+
=== Installing the Stage 3 tarball ===
  
=== The Problem With fsck ===
+
Now that filesystems are created and your hardware and system clock are set, the next step is downloading the initial Stage 3 tarball. The Stage 3 is a pre-compiled system used as a starting point to install Funtoo Linux.  
So far, this may not sound like a bad approach to ensuring filesystem consistency, but the solution isn't optimal. Problems arise from the fact that <span style="color:green">fsck</span> must scan a filesystem's entire meta-data in order to ensure filesystem consistency. Doing a complete consistency check on all meta-data is a time-consuming task in itself, normally taking at least several minutes to complete. Even worse, the bigger the filesystem, the longer this exhaustive scan takes. This is a big problem, because while <span style="color:green">fsck</span> is doing its thing, your Linux system is effectively offline, and if you have a large amount of filesystem storage, your system could be <span style="color:green">fsck</span>-ing for half an hour or more. Of course, standard <span style="color:green">fsck</span> behavior can have devastating results in mission-critical datacenter environments where system uptime is extremely important. Fortunately, there's a better solution.
+
  
=== The Journal ===
+
To download the correct build of Funtoo Linux for your system, head over to the [[Subarches]] page. Subarches are builds of Funtoo Linux that are designed to run on a particular type of CPU, to offer the best possible performance. They also take advantage of the instruction sets available for each CPU.
Journaling filesystems solve this <span style="color:green">fsck</span> problem by adding a new data structure, called a journal, to the mix. This journal is an on-disk structure. Before the filesystem driver makes any changes to the meta-data, it writes an entry to the journal that describes what it's about to do. Then, it goes ahead and modifies the meta-data. By doing so, a journaling filesystem maintains a log of recent meta-data modifications, and this comes in handy when it comes time to check the consistency of a filesystem that wasn't cleanly unmounted.
+
  
Think of journaling filesystems this way -- in addition to storing data (your stuff) and meta-data (the data about the stuff), they also have a journal, which you could call meta-meta-data (the data about the data about the stuff).
+
The [[Subarches]] page lists all CPU-optimized versions of Funtoo Linux. Find the one that is appropriate for the type of CPU that your system has, and then click on its name in the first column (such as <code>corei7</code>, for example.) You will then go to a page dedicated to that subarch, and the available stage3's available for download will be listed.
  
=== Journaling in Action ===
+
For most subarches, you will have several stage3's available to choose from. This next section will help you understand which one to pick.
So, what does <span style="color:green">fsck</span> do with a journaling filesystem? Actually, normally, it does nothing. It simply ignores the filesystem and allows it to be mounted. The real magic behind quickly restoring the filesystem to a consistent state is found in the Linux filesystem driver. When the filesystem is mounted, the Linux filesystem driver checks to see whether the filesystem is OK. If for some reason it isn't, then the meta-data needs to be fixed, but instead of performing an exhaustive meta-data scan (like <span style="color:green">fsck</span>) it instead takes a look at the journal. Since the journal contains a chronological log of all recent meta-data changes, it simply inspects those portions of the meta-data that have been recently modified. Thus, it is able to bring the filesystem back to a consistent state in a matter of seconds. And unlike the more traditional approach that <span style="color:green">fsck</span> takes, this journal replaying process does not take longer on larger filesystems. Thanks to the journal, hundreds of Gigabytes of filesystem meta-data can be brought to a consistent state almost instantaneously.
+
  
=== ReiserFS ===
+
==== Which Build? ====
Now, we come to ReiserFS, the first of several journaling filesystems we're going to be investigating. ReiserFS 3.6.x (the version included as part of Linux 2.4+) is designed and developed by Hans Reiser and his team of developers at Namesys. Hans and his team share the philosophy that the best filesystems are those that help create a single shared environment, or namespace, where applications can interact more directly, efficiently and powerfully. To do this, a filesystem should meet the performance and feature needs of its users. That way, users can continue using the filesystem directly rather than building special-purpose layers that run on top of the filesystem, such as databases and the like.
+
  
=== Small File Performance ===
+
'''If you're not sure, pick <code>funtoo-current</code>.'''
So, how does one go about making the filesystem more accommodating? Namesys has decided to focus on one aspect of the filesystem, at least initially -- small file performance. In general, filesystems like ext2 and ufs don't do very well in this area, often forcing developers to turn to databases or special organizational hacks to get the kind of performance they need. Over time, this kind of "I'll code around the problem" approach encourages code bloat and lots of incompatible special-purpose APIs, which isn't a good thing.
+
  
Here's an example of how ext2 can tend to encourage this kind of programming. ext2 is good at storing lots of twenty-plus k files, but isn't an ideal technology for storing 2,000 50-byte files. Not only does performance drop significantly when ext2 has to deal with extremely small files, but storage efficiency drops as well, since ext2 allocates space in either one or four k chunks (configurable when the filesystem is created).
+
Funtoo Linux has various different 'builds':
  
Now, conventional wisdom would say that you aren't supposed to store that many ridiculously small files on a filesystem. Instead, they should be stored in some kind of database that runs above the filesystem. In reply, Hans Reiser would point out that whenever you need to build a layer on top of the filesystem, it means that the filesystem isn't meeting your needs. If the filesystem met your needs, then you could avoid using a special-purpose solution in the first place. You would thus save development time and eliminate the code bloat that you would have created by hand-rolling your own proprietary storage or caching mechanism, interfacing with a database library, etc.
+
{{TableStart}}
 +
<tr><th class="info">Build</th><th class="info">Description</th></tr>
 +
<tr><td><code>funtoo-current</code></td><td>The most commonly-selected build of Funtoo Linux. Receives rapid updates and preferred by desktop users.</td></tr>
 +
<tr><td><code>funtoo-stable</code></td><td>Emphasizes less-frequent package updates and trusted, reliable versions of packages over the latest versions.</td></tr>
 +
{{TableEnd}}
  
Well, that's the theory. But how good is ReiserFS' small file performance in practice? Amazingly good. In fact, ReiserFS is around eight to fifteen times faster than ext2 when handling files smaller than one k in size! Even better, these performance improvements don't come at the expense of performance for other file types. In general, ReiserFS outperforms ext2 in nearly every area, but really shines when it comes to handling small files.
+
If you want to read more about this, have a look at [[Funtoo_Linux#What_are_the_differences_between_.27stable.27.2C_.27current.27_and_.27experimental.27_.3F|Differences between stable, current and experimental]].
  
=== ReiserFS Technology ===
+
==== Which Variant? ====
So how does ReiserFS go about offering such excellent small file performance? ReiserFS uses a specially optimized b* balanced tree (one per filesystem) to organize all filesystem data. This in itself offers a nice performance boost, as well as easing artificial restrictions on filesystem layouts. It's now possible to have a directory that contains 100,000 other directories, for example. Another benefit of using a b*tree is that ReiserFS, like most other next-generation filesystems, dynamically allocates inodes as needed rather than creating a fixed set of inodes at filesystem creation time. This helps the filesystem to be more flexible to the various storage requirements that may be thrown at it, while at the same time allowing for some additional space-efficiency.
+
  
ReiserFS also has a host of features aimed specifically at improving small file performance. Unlike ext2, ReiserFS doesn't allocate storage space in fixed one k or four k blocks. Instead, it can allocate the exact size it needs. And ReiserFS also includes some special optimizations centered around tails, a name for files and end portions of files that are smaller than a filesystem block. In order to increase performance, ReiserFS is able to store files inside the b*tree leaf nodes themselves, rather than storing the data somewhere else on the disk and pointing to it.
+
'''If you're not sure, pick <code>(None)</code>.'''
  
This does two things. First, it dramatically increases small file performance. Since the file data and the stat_data (inode) information are stored right next to each other, they can normally be read with a single disk IO operation. Second, ReiserFS is able to pack the tails together, saving a lot of space. In fact, a ReiserFS filesystem with tail packing enabled (the default) can store six percent more data than the equivalent ext2 filesystem, which is amazing in itself.
+
Besides our "regular" stage3's (listed with a variant of <code>(None)</code>, the following variant builds are available:
  
However, tail packing does cause a slight performance hit since it forces ReiserFS to repack data as files are modified. For this reason, ReiserFS tail packing can be turned off, allowing the administrator to choose between good speed and space efficiency, or opt for even more speed at the cost of some storage capacity.
+
{{TableStart}}
 +
<tr><th class="info">Variant</th><th class="info">Description</th></tr>
 +
<tr><td>(None)</td><td>The "standard" version of Funtoo Linux</td></tr>
 +
<tr><td><code>pure64</code></td><td>A 64-bit build that drops multilib (32-bit compatibility) support. Can be ideal for server systems.</td></tr>
 +
<tr><td><code>hardened</code></td><td>Includes PIE/SSP toolchain for enhanced security. PIE does require the use of PaX in the kernel, while SSP works with any kernel, and provides enhanced security in user-space to avoid stack-based exploits.</td></tr>
 +
{{TableEnd}}
  
ReiserFS truly is an excellent filesystem. In my next article, I'll guide you through the process of setting up ReiserFS under Linux. We'll also take a close look at performance tuning, application interactions (and how to work around them), the best kernels to use, and more.
+
==== Download the Stage3 ====
  
== Resources ==
+
Once you have found the stage3 that you would like to download, use <code>wget</code> to download the Stage 3 tarball you have chosen to use as the basis for your new Funtoo Linux system. It should be saved to the <code>/mnt/funtoo</code> directory as follows:
Be sure to checkout the other articles in this series:
+
* [[Funtoo Filesystem Guide, Part 1|Part 1]]: Journaling and ReiserFS
+
* [[Funtoo Filesystem Guide, Part 2|Part 2]]: Using ReiserFS and Linux
+
* [[Funtoo Filesystem Guide, Part 3|Part 3]]: Tmpfs and bind mounts
+
* [[Funtoo Filesystem Guide, Part 4|Part 4]]: Introducing Ext3
+
* [[Funtoo Filesystem Guide, Part 5|Part 5]]: Ext3 in action
+
  
[[Category:Filesystem Guides]]
+
<console># ##i##cd /mnt/funtoo
[[Category:Articles]]
+
# ##i##wget http://build.funtoo.org/funtoo-current/x86-64bit/generic_64/stage3-latest.tar.xz
{{ArticleFooter}}
+
</console>
 +
 
 +
Note that 64-bit systems can run 32-bit or 64-bit stages, but 32-bit systems can only run 32-bit stages. Make sure that you select a Stage 3 build that is appropriate for your CPU. If you are not certain, it is a safe bet to choose the <code>generic_64</code> or <code>generic_32</code> stage. Consult the [[Subarches]] page for more information.
 +
 
 +
Once the stage is downloaded, extract the contents with the following command, substituting in the actual name of your stage 3 tarball:
 +
<console>
 +
# ##i##tar xpf stage3-latest.tar.xz
 +
</console>
 +
 
 +
{{important|It is very important to use <code>tar's</code> "<code>'''p'''</code>" option when extracting the Stage 3 tarball - it tells <code>tar</code> to ''preserve'' any permissions and ownership that exist within the archive. Without this option, your Funtoo Linux filesystem permissions will be incorrect.}}

Revision as of 07:01, December 29, 2014


Note

This is a template that is used as part of the Installation instructions which covers: the process of installing the Stage3 tarball. Templates are being used to allow multiple variant install guides that use most of the same re-usable parts.


Setting the Date

Important

If your system's date and time are too far off (typically by months or years,) then it may prevent Portage from properly downloading source tarballs. This is because some of our sources are downloaded via HTTPS, which use SSL certificates and are marked with an activation and expiration date. However, if you system time is relatively close to correct, you can probably skip this step for now.

Now is a good time to verify the date and time are correctly set to UTC. Use the date command to verify the date and time:

# date
Fri Jul 15 19:47:18 UTC 2011

If the date and/or time need to be corrected, do so using date MMDDhhmmYYYY, keeping in mind hhmm are in 24-hour format. The example below changes the date and time to "July 16th, 2011 @ 8:00PM" UTC:

# date 071620002011
Fri Jul 16 20:00:00 UTC 2011

Once you have set the system clock, it's a very good idea to copy the time to the hardware clock, so it persists across reboots:

# hwclock --systohc

Installing the Stage 3 tarball

Now that filesystems are created and your hardware and system clock are set, the next step is downloading the initial Stage 3 tarball. The Stage 3 is a pre-compiled system used as a starting point to install Funtoo Linux.

To download the correct build of Funtoo Linux for your system, head over to the Subarches page. Subarches are builds of Funtoo Linux that are designed to run on a particular type of CPU, to offer the best possible performance. They also take advantage of the instruction sets available for each CPU.

The Subarches page lists all CPU-optimized versions of Funtoo Linux. Find the one that is appropriate for the type of CPU that your system has, and then click on its name in the first column (such as corei7, for example.) You will then go to a page dedicated to that subarch, and the available stage3's available for download will be listed.

For most subarches, you will have several stage3's available to choose from. This next section will help you understand which one to pick.

Which Build?

If you're not sure, pick funtoo-current.

Funtoo Linux has various different 'builds':

BuildDescription
funtoo-currentThe most commonly-selected build of Funtoo Linux. Receives rapid updates and preferred by desktop users.
funtoo-stableEmphasizes less-frequent package updates and trusted, reliable versions of packages over the latest versions.

If you want to read more about this, have a look at Differences between stable, current and experimental.

Which Variant?

If you're not sure, pick (None).

Besides our "regular" stage3's (listed with a variant of (None), the following variant builds are available:

VariantDescription
(None)The "standard" version of Funtoo Linux
pure64A 64-bit build that drops multilib (32-bit compatibility) support. Can be ideal for server systems.
hardenedIncludes PIE/SSP toolchain for enhanced security. PIE does require the use of PaX in the kernel, while SSP works with any kernel, and provides enhanced security in user-space to avoid stack-based exploits.

Download the Stage3

Once you have found the stage3 that you would like to download, use wget to download the Stage 3 tarball you have chosen to use as the basis for your new Funtoo Linux system. It should be saved to the /mnt/funtoo directory as follows:

# cd /mnt/funtoo
# wget http://build.funtoo.org/funtoo-current/x86-64bit/generic_64/stage3-latest.tar.xz

Note that 64-bit systems can run 32-bit or 64-bit stages, but 32-bit systems can only run 32-bit stages. Make sure that you select a Stage 3 build that is appropriate for your CPU. If you are not certain, it is a safe bet to choose the generic_64 or generic_32 stage. Consult the Subarches page for more information.

Once the stage is downloaded, extract the contents with the following command, substituting in the actual name of your stage 3 tarball:

# tar xpf stage3-latest.tar.xz
Important

It is very important to use tar's "p" option when extracting the Stage 3 tarball - it tells tar to preserve any permissions and ownership that exist within the archive. Without this option, your Funtoo Linux filesystem permissions will be incorrect.