{{Fancyimportant|This tutorial is under heavy revision to switch from ZFS FUSE to ZFS on Linux.}}

= Introduction =

This tutorial will show you how to install Funtoo on ZFS (rootfs). It is meant to be an "overlay" over the [[Funtoo_Linux_Installation|Regular Funtoo Installation]]: follow the normal installation and only use this guide for steps 2, 3, and 8.
  
== Introduction to ZFS ==

Since ZFS is a new technology for Linux, it can be helpful to understand some of its benefits, particularly in comparison to BTRFS, another popular next-generation Linux filesystem:

* On Linux, the ZFS code can be updated independently of the kernel to obtain the latest fixes. btrfs is exclusive to Linux and you need to build the latest kernel sources to get the latest fixes.
* ZFS is supported on multiple platforms. The platforms with the best support are Solaris, FreeBSD and Linux. Other platforms with varying degrees of support are NetBSD, Mac OS X and Windows. btrfs is exclusive to Linux.
* ZFS has the Adaptive Replacement Cache replacement algorithm, while btrfs uses the Linux kernel's Least Recently Used replacement algorithm. The former often has an overwhelmingly superior hit rate, which means fewer disk accesses.
* ZFS has the ZFS Intent Log and SLOG devices, which accelerate small synchronous write performance.
* ZFS handles internal fragmentation gracefully, such that you can fill it until 100%. Internal fragmentation in btrfs can make btrfs think it is full at 10%. btrfs has no automatic rebalancing code, so it requires a manual rebalance to correct it.
* ZFS has raidz, which is like RAID 5/6 (or a hypothetical RAID 7 that supports 3 parity disks), except it does not suffer from the RAID write hole issue thanks to its use of CoW and a variable stripe size. btrfs gained integrated RAID 5/6 functionality in Linux 3.9. However, its implementation uses a stripe cache that can only partially mitigate the effect of the RAID write hole.
* The ZFS send/receive implementation supports incremental updates when doing backups. btrfs' send/receive implementation requires sending the entire snapshot.
* ZFS supports data deduplication, which is a memory hog and only works well for specialized workloads. btrfs has no equivalent.
* ZFS datasets have a hierarchical namespace while btrfs subvolumes have a flat namespace.
* ZFS has the ability to create virtual block devices called zvols in its namespace. btrfs has no equivalent and must rely on the loop device for this functionality, which is cumbersome.

The only area where btrfs is ahead of ZFS is small file efficiency. btrfs supports a feature called block suballocation, which enables it to store small files far more efficiently than ZFS. It is possible to use another filesystem (e.g. reiserfs) on top of a ZFS zvol to obtain similar benefits (with arguably better data integrity) when dealing with many small files (e.g. the portage tree).

For a quick tour of ZFS and to get the big picture of its common operations, you can consult the page [[ZFS Fun]].

== ZFS features and limitations ==

ZFS offers an impressive number of features, even putting aside its hybrid nature (both a filesystem and a volume manager -- zvol) covered in detail on [http://en.wikipedia.org/wiki/ZFS Wikipedia]. One of the most fundamental points to keep in mind about ZFS is that it '''targets legendary reliability in terms of preserving data integrity'''. ZFS uses several techniques to detect and repair (self-heal) corrupted data. Simply speaking, it makes aggressive use of checksums and relies on data redundancy; the price to pay is a bit more CPU processing power. However, the [http://en.wikipedia.org/wiki/ZFS Wikipedia article about ZFS] also mentions that it is strongly discouraged to use ZFS on top of classic RAID arrays, as it then cannot control the data redundancy, thus ruining most of its benefits.

In short, ZFS has the following features (not exhaustive):

* Storage pool dividable into one or more logical storage entities.
* Plenty of space:
** 256 zettabytes per storage pool (2^64 storage pools max in a system)
** 16 exabytes max for a single file
** 2^48 entries max per directory
* Virtual block-device support over a ZFS pool (zvol) - extremely cool when used jointly with a RAID-Z volume
* Read-only snapshot support (it is possible to get a read-write copy of a snapshot, these are named clones)
* Encryption support (supported only at ZFS version 30 and later; ZFS version 31 is shipped with Oracle Solaris 11, so that platform is mandatory if you plan to encrypt your ZFS datasets/pools)
* Built-in '''RAID-5-like-on-steroids capabilities known as [http://en.wikipedia.org/wiki/Non-standard_RAID_levels#RAID-Z RAID-Z] and RAID-6-like-on-steroids capabilities known as RAID-Z2'''. RAID-Z3 (triple parity) also exists.
* Copy-on-write transactional filesystem
* Meta-attribute support (properties) allowing you to easily drive the show: "that directory is encrypted", "that directory is limited to 5 GiB", "that directory is exported via NFS" and so on. Depending on what you define, ZFS takes the appropriate actions!
* Dynamic striping to optimize data throughput
* Variable block length
* Data deduplication
* Automatic pool re-silvering
* Transparent data compression
* Transparent encryption (Solaris 11 and later only)
Most notable limitations are:

* Lack of a feature ZFS developers know as "block pointer rewrite functionality" (planned to be developed); without it ZFS currently suffers from not being able to do:
** Pool defragmentation (the COW techniques used in ZFS mitigate the problem)
** Pool resizing
** Data compression (re-applying)
** Adding an additional device to a RAID-Z/Z2/Z3 pool to increase its size (however, it is possible to replace, in sequence, each one of the disks composing a RAID-Z/Z2/Z3)
* '''NOT A CLUSTERED FILESYSTEM''' like Lustre, GFS or OCFS2
* No data healing if used on a single device (corruption can still be detected); the workaround is to force data duplication on the drive
* No TRIM support (SSD devices)
== ZFS on well known operating systems ==

=== Linux ===

Although the source code of ZFS is open, its license (Sun CDDL) is incompatible with the license governing the Linux kernel (GNU GPL v2), thus preventing its direct integration. However, a couple of ports exist, though they suffer from maturity issues and a lack of features. As of writing (February 2014) two known implementations exist:

* [http://zfs-fuse.net ZFS-fuse]: a totally userland implementation relying on FUSE. This implementation can now be considered defunct as of February 2014: the original site of ZFS FUSE seems to have disappeared, nevertheless the source code is still available on [http://freecode.com/projects/zfs-fuse http://freecode.com/projects/zfs-fuse]. ZFS FUSE stalled at version 0.7.0 in 2011 and never really evolved since then.
* [http://zfsonlinux.org ZFS on Linux]: a kernel mode implementation of ZFS which supports a lot of ZFS features. The implementation is not as complete as it is under Solaris and its siblings like OpenIndiana (e.g. SMB integration is still missing, no encryption support...) but a lot of functionality is there. This is the implementation used for this article. As ZFS on Linux is an out-of-tree Linux kernel implementation, patches must be waited for after each Linux kernel release. ZFS on Linux currently supports zpool version 28.

=== Solaris/OpenIndiana ===

* '''Oracle Solaris:''' remains the de facto reference platform for the ZFS implementation: ZFS on this platform is now considered mature and usable on production systems. Solaris 11 uses ZFS even for its "system" pool (aka ''rpool''). A great advantage of this: it is now quite easy to revert the effect of a patch, on the condition that a snapshot has been taken just before applying it. In the "good old" days of Solaris 10 and before, reverting a patch was possible but could be tricky and complex, when possible at all. ZFS is far from being new in Solaris: it took root in 2005 and was then integrated in Solaris 10 6/06, released in June 2006.
* '''[http://openindiana.org OpenIndiana]:''' is based on the Illumos kernel (a derivative of the now defunct OpenSolaris) which aims to provide absolute binary compatibility with Sun/Oracle Solaris. Worth mentioning that the Solaris kernel and the [https://www.illumos.org Illumos kernel] originally shared the same code base; however, they have followed different paths since Oracle announced the discontinuation of OpenSolaris (August 13th, 2010). Like Oracle Solaris, OpenIndiana uses ZFS for its system pool. The Illumos kernel's ZFS support lags a bit behind Oracle: it supports zpool version 28, whereas Oracle Solaris 11 has zpool version 31 support, data encryption being supported at zpool version 30.

=== *BSD ===
  
* '''FreeBSD''': ZFS has been present in FreeBSD since FreeBSD 7 (zpool version 6) and FreeBSD can boot on a ZFS volume (zfsboot). ZFS support has been vastly enhanced in FreeBSD 8.x (8.2 supports zpool version 15, 8.3 supports version 28), FreeBSD 9 and FreeBSD 10 (both support zpool version 28). ZFS in FreeBSD is now considered fully functional and mature. FreeBSD derivatives such as the popular [http://www.freenas.org FreeNAS] take benefit of ZFS and integrate it in their tools. The latter has, for example, support for zvols through its web management interface (FreeNAS >= 8.0.1).
* '''NetBSD''': ZFS started to be ported as a GSoC project in 2007 and has been present in the NetBSD mainstream since 2009 (zpool version 13).
* '''OpenBSD''': No ZFS support yet, and none planned until Oracle changes some policies, according to the project FAQ.

== ZFS alternatives ==

* WAFL seems to have severe limitations [http://unixconsult.org/wafl/ZFS%20vs%20WAFL.html] (document is not dated); an interesting article also lies [http://blogs.netapp.com/dave/2008/12/is-wafl-a-files.html here]
* BTRFS is advancing every week, but it still lacks features like the capability of emulating a virtual block device over a storage pool (zvol), and built-in support for RAID-5/6 is not complete yet (cf. [https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg29169.html Btrfs mailing list]). At the date of writing, it is still experimental, whereas ZFS is used on big production servers.
* VxFS has also been targeted by comparisons like [http://blogs.oracle.com/dom/entry/zfs_v_vxfs_iozone this one] (a bit [http://www.symantec.com/connect/blogs/suns-comparision-vxfs-and-zfs-scalability-flawed controversial]). VxFS has been known in the industry since 1993 and is known for its legendary flexibility. Symantec acquired VxFS and proposes a basic version of it (no clustering, for example) as [http://www.symantec.com/enterprise/sfbasic/index.jsp Veritas Storage Foundation Basic].
* An interesting discussion about modern filesystems can be found on [http://www.osnews.com/story/19665/Solaris_Filesystem_Choices OSNews.com]

== ZFS vs BTRFS at a glance ==

Some key features, in no particular order of importance, between ZFS and BTRFS:
  
{| class="wikitable"
!Feature!!ZFS!!BTRFS!!Remarks
|-
|Transactional filesystem||YES||YES
|-
|Journaling||NO||YES||Not a design flaw, ZFS is robust ''by design''... See page 7 of [http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/zfslast.pdf ''"ZFS The last word on filesystems"''].
|-
|Dividable pool of data storage||YES||YES
|-
|Read-only snapshot support||YES||YES
|-
|Writable snapshot support||YES||YES
|-
|Sending/Receiving a snapshot over the network||YES||YES
|-
|Rollback capabilities||YES||YES||While ZFS knows where and how to roll back the data (on-line), BTRFS requires a bit more work from the system administrator (off-line).
|-
|Virtual block-device emulation||YES||NO
|-
|Data deduplication||YES||YES||Built into ZFS, third-party tool ([https://github.com/g2p/bedup bedup]) in BTRFS
|-
|Data block reoptimization||NO||YES||ZFS is missing the "block pointer rewrite functionality"; true on all known implementations so far. Not a major performance cripple, however. BTRFS can do on-line data defragmentation.
|-
|Built-in data redundancy support||YES||YES||ZFS has a sort of RAID-5/6 capability (but better! RAID-Z{1,2,3}), while BTRFS only fully supports data mirroring at this point; some work remains to be done on parity bit handling in BTRFS.
|-
|Management by attributes||YES||NO||Nearly everything touching ZFS management is related to attribute manipulation (quotas, sharing over NFS, encryption, compression...); BTRFS also retains the concept but it is less aggressively used.
|-
|Production quality code||NO||NO||ZFS support in Linux is not considered production quality (yet), although it is very robust. Several operating systems like Solaris/OpenIndiana have a production quality implementation; Solaris/OpenIndiana is now installed in ZFS datasets by default.
|-
|Integrated within the Linux kernel tree||NO||YES||ZFS is released under the CDDL license...
|}

== Disclaimers ==

{{fancywarning|This guide is a work in progress. Expect some quirks.}}

{{fancyimportant|'''Since ZFS was really designed for 64 bit systems, we are only recommending and supporting 64 bit platforms and installations. We will not be supporting 32 bit platforms'''!}}

== Video Tutorial ==

As a companion to the installation instructions below, a YouTube video tutorial is now available:

{{#widget:YouTube|id=SWyThdxNoP8|width=640|height=360}}

== Downloading the ISO (With ZFS) ==

In order for us to install Funtoo on ZFS, you will need an environment that already provides the ZFS tools. Therefore we will download a customized version of System Rescue CD with ZFS included.

<pre>
Name: sysresccd-4.0.1_zfs_0.6.2.iso  (545 MB)
Release Date: 2014-02-25
md5sum 01f4e6929247d54db77ab7be4d156d85
</pre>
'''[http://ftp.osuosl.org/pub/funtoo/distfiles/sysresccd/ Download System Rescue CD with ZFS]'''<br />

== Creating a bootable USB from ISO (From a Linux Environment) ==

After you download the iso, you can do the following steps to create a bootable USB:

= ZFS resource naming restrictions =

Before going further, you must be aware of the restrictions concerning the names you can use on a ZFS filesystem. The general rule is: you can use all of the alphanumeric characters, plus the following special characters:

* Underscore (_)
* Hyphen (-)
* Colon (:)
* Period (.)

The name used to designate a ZFS pool has no particular restrictions except:

* it can't be one of the reserved words, in particular:
** ''mirror''
** ''raidz'' (''raidz2'', ''raidz3'' and so on)
** ''spare''
** ''cache''
** ''log''
* names must begin with an alphanumeric character (same for ZFS datasets).
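As a quick illustration of the reserved-word rule (a throwaway sketch; ''/dev/sdX'' is just a placeholder device, not one used later in this tutorial), trying to create a pool whose ''name'' is one of the reserved words is simply refused by '''zpool''':

<console>
###i## zpool create mirror /dev/sdX
</console>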
= Some ZFS concepts =
 
Once again with no particular order of importance:
 
{|class="wikitable"
 
|-
 
!ZFS!!What it is!!Counterparts examples
 
|-
 
|zpool||A group of one or many physical storage media (hard drive partitions, files...). A zpool has to be divided into at least one '''ZFS dataset''' or at least one '''zvol''' to hold any data. Several zpools can coexist in a system on the condition that they each hold a unique name. Also note that '''zpools can never be mounted; the only things that can be are the ZFS datasets they hold.'''||
* Volume group (VG) in LVM
* BTRFS volumes
|-
|dataset||A logical subdivision of a zpool, mounted in your host's VFS, where your files and directories reside. Several ZFS datasets can coexist in a single system on the condition that they each own a unique name within their zpool.||
* Logical volumes (LV) in LVM formatted with a filesystem like ext3.
* BTRFS subvolumes
|-
|snapshot||A read-only photo of the state of a ZFS dataset, taken at a precise moment in time. ZFS has no way to cooperate on its own with applications that read and write data on ZFS datasets: if those applications still hold unflushed data at the moment the snapshot is taken, only what has been flushed will be included in the snapshot. Worth mentioning that snapshots take no disk space aside from some metadata at the exact time they are created; their size will grow as more and more data blocks (i.e. files) are deleted or changed on their corresponding live ZFS dataset.||
* No direct equivalent in LVM.
* BTRFS read-only snapshots
|-
|clone||Just what it sounds like: a writable copy of a snapshot.||
* LVM snapshots
* BTRFS snapshots
|-
|zvol||An emulated block device whose data is held behind the scenes in the zpool the zvol has been created in.||No known equivalent, even in BTRFS
 
|-
 
|}
 
 
= Your first contact with ZFS  =
 
== Requirements ==
 
* ZFS userland tools installed (package ''sys-fs/zfs'')
* ZFS kernel modules built and installed (package ''sys-fs/zfs-kmod''); there is a known issue with the kernel 3.13 series, see [http://forums.funtoo.org/viewtopic.php?id=2442 this thread on Funtoo's forum]
* Disk size of 64 MB as a bare minimum (128 MB is the minimum size of a pool). Multiple disks will be simulated through the use of several raw images accessed via the Linux loopback devices.
* At least 512 MB of RAM

== Preparing ==

Once you have emerged ''sys-fs/zfs'' and ''sys-fs/zfs-kmod'' you have two options to start using ZFS at this point:

* either you start ''/etc/init.d/zfs'' (which will load all of the ZFS kernel modules for you, plus a couple of other things)
* or you load the ZFS kernel modules by hand

So:

<console>###i## rc-service zfs start</console>
 
 
Or:
 
 
<console>
 
<console>
###i## modprobe zfs
+
Make a temporary directory
###i## lsmod | grep zfs
+
# ##i##mkdir /tmp/loop
zfs                  874072  0
+
zunicode              328120  1 zfs
+
zavl                  12997  1 zfs
+
zcommon                35739  1 zfs
+
znvpair                48570  2 zfs,zcommon
+
spl                    58011  5 zfs,zavl,zunicode,zcommon,znvpair
+
</console>
+
  
== Your first ZFS pool ==
+
Mount the iso
To start with, four raw disks (2 GB each) are created:
+
# ##i##mount -o ro,loop /root/sysresccd-4.0.1_zfs_0.6.2.iso /tmp/loop
  
<console>
+
Run the usb installer
###i## for i in 0 1 2 3; do dd if=/dev/zero of=/tmp/zfs-test-disk0${i}.img bs=2G count=1; done
+
# ##i##/tmp/loop/usb_inst.sh
0+1 records in
+
0+1 records out
+
2147479552 bytes (2.1 GB) copied, 40.3722 s, 53.2 MB/s
+
...
+
 
</console>
 
</console>
  
Then let's see what loopback devices are in use and which is the first free:
+
That should be all you need to do to get your flash drive working.
  
<console>
+
== Booting the ISO ==
###i## losetup -a
+
###i## losetup -f
+
/dev/loop0
+
</console>
+
  
In the above example nothing is in use and the first available loopback device is /dev/loop0. Now associate each of the disks with a loopback device (/tmp/zfs-test-disk00.img -> /dev/loop0, /tmp/zfs-test-disk01.img -> /dev/loop1 and so on):
+
{{fancywarning|'''When booting into the ISO, Make sure that you select the "Alternate 64 bit kernel (altker64)". The ZFS modules have been built specifically for this kernel rather than the standard kernel. If you select a different kernel, you will get a fail to load module stack error message.'''}}
  
<console>
+
== Creating partitions ==
###i## for i in 0 1 2 3; do losetup /dev/loop${i} /tmp/zfs-test-disk0${i}.img; done
+
There are two ways to partition your disk: You can use your entire drive and let ZFS automatically partition it for you, or you can do it manually.
###i## losetup -a
+
/dev/loop0: [000c]:781455 (/tmp/zfs-test-disk00.img)
+
/dev/loop1: [000c]:806903 (/tmp/zfs-test-disk01.img)
+
/dev/loop2: [000c]:807274 (/tmp/zfs-test-disk02.img)
+
/dev/loop3: [000c]:781298 (/tmp/zfs-test-disk03.img)
+
</console>
+
  
{{Fancynote|ZFS literature often names zpools "tank"; this is not a requirement, you can use whatever name you choose (as we did here...) }}
+
We will be showing you how to partition it '''manually''', because a manual layout lets you create your own partition scheme, gives you your own separate /boot partition (which is nice since not every bootloader supports booting from ZFS pools), and lets you boot into RAID10, RAID5 (RAIDZ) pools and any other layouts, thanks to that separate /boot partition.
  
Every story in ZFS starts with the very first ZFS-related command you will be in touch with: '''zpool'''. '''zpool''', as you might have guessed, manages all ZFS aspects in connection with the physical devices underlying your ZFS storage spaces, and the very first task is to use this command to make what is called a ''pool'' (if you have used LVM before, volume groups can be seen as a counterpart). Basically, what you will do here is tell ZFS to take a collection of physical storage things, which can take several forms like a hard drive partition, a USB key partition or even a file, and consider all of them as a single pool of storage (we will subdivide it in the following paragraphs). No black magic here: ZFS will write some metadata on them behind the scenes to be able to track which physical device belongs to which pool of storage.
+
==== gdisk (GPT Style) ====
  
<console>
+
'''A Fresh Start''':
###i## zpool create myfirstpool /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
+
</console>
+
  
And.. nothing! Nada! The command silently returned but it ''did'' something, the next section will explain what.
+
First lets make sure that the disk is completely wiped from any previous disk labels and partitions.
 +
We will also assume that <tt>/dev/sda</tt> is the target drive.<br />
  
== Your first ZFS dataset ==
 
 
<console>
 
<console>
###i## zpool list
+
# ##i##sgdisk -Z /dev/sda
NAME          SIZE  ALLOC  FREE    CAP  DEDUP  HEALTH  ALTROOT
+
myfirstpool  7.94G  130K  7.94G    0%  1.00x  ONLINE  -
+
 
</console>
 
</console>
  
What does this mean? Several things: first, your zpool is here and has a size of roughly 8 GB minus some space eaten by metadata. Second, it is actually usable because the column ''HEALTH'' says ''ONLINE''. The other columns are not meaningful for us for the moment, just ignore them. If you want more crusty details you can use the zpool command like this:
+
{{fancywarning|This is a destructive operation and the program will not ask you for confirmation! Make sure you really don't want anything on this disk.}}
  
<console>
+
Now that we have a clean drive, we will create the new layout.
###i## zpool status
+
  pool: myfirstpool
+
state: ONLINE
+
  scan: none requested
+
config:
+
  
        NAME        STATE    READ WRITE CKSUM
+
First open up the application:
        myfirstpool  ONLINE      0    0    0
+
          loop0    ONLINE      0    0    0
+
          loop1    ONLINE      0    0    0
+
          loop2    ONLINE      0    0    0
+
          loop3    ONLINE      0    0    0
+
</console>
+
The information is quite intuitive: your pool is seen as being usable (''state'' is similar to ''HEALTH'') and is composed of several devices, each one listed as being in a ''healthy'' state... at least for now, because they will be deliberately damaged for demonstration purposes in a later section. For your information, the columns ''READ'', ''WRITE'' and ''CKSUM'' list the number of operation failures on each of the devices respectively:

* ''READ'' for reading failures. Having a non-zero value is not a good sign... the device is clunky and will soon fail.
* ''WRITE'' for writing failures. Having a non-zero value is not a good sign... the device is clunky and will soon fail.
* ''CKSUM'' for mismatches between the checksum of the data at the time it had been written and the checksum recomputed when it is read again (yes, ZFS uses checksums in an aggressive manner). Having a non-zero value is not a good sign... corruption happened; ZFS will do its best to recover the data on its own, but this is definitely not a sign of a healthy system.
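As an aside (this is not part of the original walkthrough), the two commands you would typically reach for when those counters become non-zero are '''zpool scrub''', which walks every block of the pool and verifies its checksum, and '''zpool clear''', which resets the error counters once the underlying problem has been dealt with. A minimal sketch:

<console>
###i## zpool scrub myfirstpool
###i## zpool status myfirstpool
###i## zpool clear myfirstpool
</console>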

Cool! So far so good: you have a brand new 8 GB of usable storage space on your system. Has it been mounted somewhere?
+
  
 
<console>
 
<console>
###i## mount | grep myfirstpool
+
# ##i##gdisk /dev/sda
/myfirstpool on /myfirstpool type zfs (rw,xattr)
+
 
</console>
 
</console>
  
Remember the tables in the section above? A zpool in itself can '''never be mounted''', never ''ever''. It is just a container where ZFS datasets are created and then mounted. So what happened here? Obscure black magic? No, of course not! Indeed, a ZFS dataset named after the zpool's name has been created automatically for us and then mounted. Is it true? We will check this shortly. For the moment you will be introduced to the second command you will deal with when using ZFS: '''zfs'''. While the '''zpool''' command is used for anything related to zpools, the '''zfs''' command is used for anything related to ZFS datasets '''(a ZFS dataset ''always'' resides in a zpool, ''always'', no exception to that).'''
+
'''Create Partition 1''' (boot):
 
+
{{Fancynote|'''zfs''' and '''zpool''' are the only two commands you will need to remember when dealing with ZFS.}}

So how can we check what ZFS datasets are currently known to the system? As you might have already guessed, like this:
+
 
+
 
<console>
 
<console>
###i## zfs list
+
Command: ##i##n ↵
NAME          USED  AVAIL  REFER  MOUNTPOINT
+
Partition Number: ##i##
myfirstpool  114K  7.81G    30K  /myfirstpool
+
First sector: ##i##↵
 +
Last sector: ##i##+250M ↵
 +
Hex Code: ##i##↵
 
</console>
 
</console>
  
Ta-da! The mystery is solved! The ''zfs'' command tells us that not only has a ZFS dataset named ''myfirstpool'' been created, but also that it has been mounted in the system's VFS for us. If you check with the ''df'' command, you should also see something like this:
+
'''Create Partition 2''' (BIOS Boot Partition):
 
+
<console>Command: ##i##n ↵
<console>
+
Partition Number: ##i##↵
###i## df -h
+
First sector: ##i##
Filesystem      Size  Used Avail Use% Mounted on
+
Last sector: ##i##+32M ↵
(...)
+
Hex Code: ##i##EF02 ↵
myfirstpool    7.9G    0  7.9G  0% /myfirstpool
+
 
</console>
 
</console>
  
The $100 question: ''"what to do with this brand new ZFS /myfirstpool dataset?"''. Copy some files onto it, of course! We used a Linux kernel source tree but you can of course use whatever you want:
+
'''Create Partition 3''' (ZFS):
<console>
+
<console>Command: ##i##n ↵
###i## cp -a /usr/src/linux-3.13.5-gentoo /myfirstpool
+
Partition Number: ##i##
###i## ln -s /myfirstpool/linux-3.13.5-gentoo /myfirstpool/linux
+
First sector: ##i##
###i## ls -lR /myfirstpool
+
Last sector: ##i##↵
/myfirstpool:
+
Hex Code: ##i##bf00 ↵
total 3
+
lrwxrwxrwx  1 root root 32 Mar  2 14:02 linux -> /myfirstpool/linux-3.13.5-gentoo
+
drwxr-xr-x 25 root root 50 Feb 27 20:35 linux-3.13.5-gentoo
+
  
/myfirstpool/linux-3.13.5-gentoo:
+
Command: ##i##p ↵
total 31689
+
-rw-r--r--  1 root root    18693 Jan 19 21:40 COPYING
+
-rw-r--r--  1 root root    95579 Jan 19 21:40 CREDITS
+
drwxr-xr-x 104 root root      250 Feb 26 07:39 Documentation
+
-rw-r--r--  1 root root    2536 Jan 19 21:40 Kbuild
+
-rw-r--r--  1 root root      277 Feb 26 07:39 Kconfig
+
-rw-r--r--  1 root root  268770 Jan 19 21:40 MAINTAINERS
+
(...)
+
</console>
+
  
A ZFS dataset behaves like any other filesystem: you can create regular files, symbolic links, pipes, special device nodes, etc. Nothing mystic here.
+
Number Start (sector)    End (sector)  Size      Code  Name
 +
  1            2048          514047  250.0 MiB  8300  Linux filesystem
 +
  2          514048          579583  32.0 MiB    EF02  BIOS boot partition
 +
  3          579584      1953525134  931.2 GiB  BF00  Solaris root
  
Now that we have some data in the ZFS dataset, let's see what various commands report:
+
Command: ##i##w ↵
<console>
+
###i## df -h
+
Filesystem      Size  Used Avail Use% Mounted on
+
(...)
+
myfirstpool    7.9G  850M  7.0G  11% /myfirstpool
+
 
</console>
 
</console>
<console>
 
###i## zfs list
 
NAME          USED  AVAIL  REFER  MOUNTPOINT
 
myfirstpool  850M  6.98G  850M  /myfirstpool
 
</console>
 
<console>
 
###i## zpool list
 
NAME          SIZE  ALLOC  FREE    CAP  DEDUP  HEALTH  ALTROOT
 
myfirstpool  7.94G  850M  7.11G    10%  1.00x  ONLINE  -
 
</console>
 
{{Fancynote|Notice the various sizes reported by the '''zpool''' and '''zfs''' commands. In this case they are the same; however, they can differ, especially with zpools using RAID-Z.}}
 
  
== Unmounting/remounting a ZFS dataset ==
 
  
 +
=== Format your /boot partition ===
  
{{Fancyimportant|'''Only ZFS datasets can be mounted''' inside your host's VFS, no exception to that! Zpools cannot be mounted, never, never, never... please pay attention to the terminology and keep things clear by not mixing up the terms. We will introduce ZFS snapshots and ZFS clones, but those are ZFS datasets at their core, so they can also be mounted and unmounted.}}
 
 
 
If a ZFS dataset behaves just like any other filesystem, can we unmount it?
 
 
<console>
 
<console>
###i## umount /myfirstpool
+
# ##i##mkfs.ext2 -m 1 /dev/sda1
###i## mount | grep myfirstpool
+
 
</console>
 
</console>
  
No more ''/myfirstpool'' in the line of sight! So yes, it is possible to unmount a ZFS dataset just like you would do with any other filesystem. Is the ZFS dataset still present on the system even though it is unmounted? Let's check:
+
=== Encryption (Optional) ===
 +
If you want encryption, then create your encrypted vault(s) now by doing the following:
  
 
<console>
 
<console>
###i## zfs list
+
# ##i##cryptsetup luksFormat /dev/sda3
NAME          USED  AVAIL  REFER  MOUNTPOINT
+
# ##i##cryptsetup luksOpen /dev/sda3 vault_1
myfirstpool  850M  6.98G  850M  /myfirstpool
+
 
</console>
 
</console>
  
Fortunately, and obviously, it is; else ZFS would not be very useful. Your next concern would certainly be: "How can we remount it then?" Simple! Like this:
+
{{fancywarning|On some machines, a combination of ZFS and LUKS has caused instability and system crashes.}}
<console>
+
###i## zfs mount myfirstpool
+
###i## mount | grep myfirstpool
+
myfirstpool on /myfirstpool type zfs (rw,xattr)
+
</console>
+
  
The ZFS dataset is back! :-)
+
=== Create the zpool ===
 +
We will first create the pool. The pool will be named `tank` and the disk will be aligned to 4096 (using ashift=12)
 +
<console># ##i##zpool create -f -o ashift=12 -o cachefile= -O compression=on -m none -R /mnt/funtoo tank /dev/sda3</console>
  
== Your first contact with ZFS management by attributes or the end of /etc/fstab ==
+
{{fancyimportant|If you are using encrypted root, change '''/dev/sda3 to /dev/mapper/vault_1'''.}}
At this point you might be curious about how the '''zfs''' command knows what it has to mount and ''where'' it has to mount it. You might be familiar with the following syntax of the '''mount''' command which, behind the scenes, scans the file ''/etc/fstab'' and mounts the specified entry:
+
 
<console>
+
{{fancynote| If you have a previous pool that you would like to import, you can do a: '''zpool import -f -R /mnt/funtoo <pool_name>'''.}}
###i## mount /boot
+
</console>
+
  
Does ''/etc/fstab'' contain something related to our ZFS dataset?
+
=== Create the zfs datasets ===
 +
We will now create some datasets. For this installation, we will create a small but future proof amount of datasets. We will have a dataset for the OS (/), and your swap. We will also show you how to create some optional datasets: <tt>/home</tt>, <tt>/var</tt>, <tt>/usr/src</tt>, and <tt>/usr/portage</tt>.
  
 
<console>
 
<console>
###i## cat /etc/fstab | grep myfirstpool
+
Create some empty containers for organization purposes, and make the dataset that will hold /
#
+
# ##i##zfs create -p tank/funtoo
</console>
+
# ##i##zfs create -o mountpoint=/ tank/funtoo/root
  
Doh!!!... Obviously nothing there. Another mystery? Surely not! The answer lies in an extremely powerful feature of ZFS: attributes. Simply speaking, an attribute is a named property of a ZFS dataset that holds a value. Attributes govern various aspects of how the datasets are managed, like: ''"Does the data have to be compressed?"'', ''"Does the data have to be encrypted?"'', ''"Does the data have to be exposed to the rest of the world via NFS or SMB/Samba?"'' and of course... '''"Where does the dataset have to be mounted?"'''. The answer to that latter question can be told by the following command:
+
Optional, but recommended datasets: /home
 +
# ##i##zfs create -o mountpoint=/home tank/funtoo/home
  
<console>
+
Optional datasets: /usr/src, /usr/portage/{distfiles,packages}
###i## zfs get mountpoint myfirstpool
+
# ##i##zfs create -o mountpoint=/usr/src tank/funtoo/src
NAME        PROPERTY    VALUE        SOURCE
+
# ##i##zfs create -o mountpoint=/usr/portage -o compression=off tank/funtoo/portage
myfirstpool  mountpoint /myfirstpool  default
+
# ##i##zfs create -o mountpoint=/usr/portage/distfiles tank/funtoo/portage/distfiles
 +
# ##i##zfs create -o mountpoint=/usr/portage/packages tank/funtoo/portage/packages
 
</console>
 
</console>
  
Bingo! When you remounted the dataset just a few paragraphs ago, ZFS automatically inspected the ''mountpoint'' attribute and saw that this dataset has to be mounted in the directory ''/myfirstpool''.
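To illustrate the point (the target path below is just an example, not a location used elsewhere in this tutorial), relocating a dataset is simply a matter of changing that attribute; ZFS remounts it at the new location for you:

<console>
###i## zfs set mountpoint=/mnt/firstpool myfirstpool
###i## zfs get mountpoint myfirstpool
###i## zfs set mountpoint=/myfirstpool myfirstpool
</console>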
+
=== Create your swap zvol ===
 +
For modern machines that have more than 4 GB of RAM, a swap size of 2 GB should be enough. However, if your machine doesn't have a lot of RAM, the rule of thumb is either 2x the RAM or RAM + 1 GB.
  
= A step forward with ZFS datasets =
+
For this tutorial we will assume that it is a newer machine and make a 2 GB swap.
 
+
So far you were given a quick tour of what ZFS can do for you, and it is very important at this point to distinguish a ''zpool'' from a ''ZFS dataset'' and to call a dataset what it is (a dataset) and not what it is not (a zpool). It is a bit confusing, and an editorial choice, to have chosen a confusing name just to make you familiar with the one and the other.
+
 
+
== Creating datasets ==
+
 
+
Obviously it is possible to have more than one ZFS dataset within a single zpool. Quiz: what command would you use to subdivide a zpool into datasets? '''zfs''' or '''zpool'''? Stop reading for two seconds and try to figure out this little question.

The answer is... '''zfs'''! Although you want to operate on the zpool to logically subdivide it into several datasets, in the end you manage datasets, thus you will use the '''zfs''' command. It is not always easy at the beginning; do not worry too much, you will soon get the habit of when to use one or the other. Creating a dataset in a zpool is easy: just give the '''zfs''' command the name of the pool you want to divide and the name of the dataset you want to create in it. So let's create three datasets named ''myfirstDS'', ''mysecondDS'' and ''mythirdDS'' in ''myfirstpool'' (observe how we use the zpool and datasets' names):
+
  
 
<console>
 
<console>
###i## zfs create myfirstpool/myfirstDS
+
# ##i##zfs create -o sync=always -o primarycache=metadata -o secondarycache=none -o volblocksize=4K -V 2G tank/swap
###i## zfs create myfirstpool/mysecondDS
+
###i## zfs create myfirstpool/mythirdDS
+
 
</console>
 
</console>
  
What happened? Let's check :
+
=== Format your swap zvol ===
 
+
 
<console>
 
<console>
###i## zfs list
+
# ##i##mkswap -f /dev/zvol/tank/swap
NAME                    USED  AVAIL  REFER  MOUNTPOINT
+
# ##i##swapon /dev/zvol/tank/swap
myfirstpool              850M  6.98G  850M  /myfirstpool
+
myfirstpool/myfirstDS    30K  6.98G    30K  /myfirstpool/myfirstDS
+
myfirstpool/mysecondDS    30K  6.98G    30K  /myfirstpool/mysecondDS
+
myfirstpool/mythirdDS    30K  6.98G    30K  /myfirstpool/mythirdDS
+
 
</console>
 
</console>
  
Obviously we got what we asked for. Moreover, if we inspect the contents of ''/myfirstpool'', we notice three new directories bearing the same names as the datasets just created:
+
Now we will continue to install funtoo.
 +
 
 +
== Installing Funtoo ==
 +
 
 +
=== Pre-Chroot ===
  
 
<console>
 
<console>
###i## ls -l /myfirstpool
+
Go into the directory that you will chroot into
total 8
+
# ##i##cd /mnt/funtoo
lrwxrwxrwx  1 root root 32 Mar  2 14:02 linux -> /myfirstpool/linux-3.13.5-gentoo
+
drwxr-xr-x 25 root root 50 Feb 27 20:35 linux-3.13.5-gentoo
+
drwxr-xr-x  2 root root  2 Mar  2 15:26 myfirstDS
+
drwxr-xr-x  2 root root  2 Mar  2 15:26 mysecondDS
+
drwxr-xr-x  2 root root  2 Mar  2 15:26 mythirdDS
+
</console>
+
No surprise here! As you might have guessed, those three new directories serve as mountpoints:
+
  
<console>
+
Make a boot folder and mount your boot drive
###i## mount | grep myfirstpool
+
# ##i##mkdir boot
myfirstpool on /myfirstpool type zfs (rw,xattr)
+
# ##i##mount /dev/sda1 boot
myfirstpool/myfirstDS on /myfirstpool/myfirstDS type zfs (rw,xattr)
+
myfirstpool/mysecondDS on /myfirstpool/mysecondDS type zfs (rw,xattr)
+
myfirstpool/mythirdDS on /myfirstpool/mythirdDS type zfs (rw,xattr)
+
 
</console>
 
</console>
  
As we did before, we can copy some files in the newly created datasets just like they were regular directories:
+
[[Funtoo_Linux_Installation|Now download and extract the Funtoo stage3 ...]]
  
<console>
+
Once you've extracted the stage3, do a few more preparations and chroot into your new funtoo environment:
###i## cp -a /usr/portage /myfirstpool/mythirdDS
+
###i## ls -l /myfirstpool/mythirdDS/*
+
total 697
+
drwxr-xr-x  48 root root  49 Aug 18  2013 app-accessibility
+
drwxr-xr-x  238 root root  239 Jan 10 06:22 app-admin
+
drwxr-xr-x    4 root root    5 Dec 28 08:54 app-antivirus
+
drwxr-xr-x  100 root root  101 Feb 26 07:19 app-arch
+
drwxr-xr-x  42 root root  43 Nov 26 21:24 app-backup
+
drwxr-xr-x  34 root root  35 Aug 18  2013 app-benchmarks
+
drwxr-xr-x  66 root root  67 Oct 16 06:39 app-cdr(...)
+
</console>
+
 
+
Nothing really too exciting here, we have files in ''mythirdDS''. A bit more interesting is the following output:
+
  
 
<console>
 
<console>
###i## zfs list
+
Bind the kernel related directories
NAME                    USED  AVAIL  REFER  MOUNTPOINT
+
# ##i##mount -t proc none proc
myfirstpool            1.81G  6.00G  850M  /myfirstpool
+
# ##i##mount --rbind /dev dev
myfirstpool/myfirstDS    30K  6.00G    30K  /myfirstpool/myfirstDS
+
# ##i##mount --rbind /sys sys
myfirstpool/mysecondDS    30K  6.00G    30K  /myfirstpool/mysecondDS
+
myfirstpool/mythirdDS  1002M  6.00G  1002M  /myfirstpool/mythirdDS
+
</console>
+
<console>
+
###i## df -h
+
Filesystem              Size  Used Avail Use% Mounted on
+
(...)
+
myfirstpool            6.9G  850M  6.1G  13% /myfirstpool
+
myfirstpool/myfirstDS  6.1G    0  6.1G  0% /myfirstpool/myfirstDS
+
myfirstpool/mysecondDS  6.1G    0  6.1G  0% /myfirstpool/mysecondDS
+
myfirstpool/mythirdDS  7.0G 1002M  6.1G  15% /myfirstpool/mythirdDS
+
</console>
+
  
Noticed the size given in the 'AVAIL' column? At the very beginning of this tutorial we had slightly less than 8 GB of available space; it now shows a value of roughly 6 GB. The datasets are just a subdivision of the zpool: they '''compete with each other''' for the available storage within the zpool, no miracle here. Up to what limit? The pool itself, as we never imposed a ''quota'' on the datasets. Happily, '''df''' and '''zfs list''' give a coherent result.
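The converse mechanism also exists, although it is not covered further in this tutorial: the ''reservation'' attribute guarantees a minimum amount of pool space to a dataset so that its siblings cannot starve it. A minimal sketch:

<console>
###i## zfs set reservation=1G myfirstpool/myfirstDS
###i## zfs get reservation myfirstpool/myfirstDS
###i## zfs set reservation=none myfirstpool/myfirstDS
</console>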
+
Copy network settings
 +
# ##i##cp -f /etc/resolv.conf etc
  
== Second contact with attributes: quota management ==
+
Make the zfs folder in 'etc' and copy your zpool.cache
 +
# ##i##mkdir etc/zfs
 +
# ##i##cp /etc/zfs/zpool.cache etc/zfs
  
Remember how painful quota management is under Linux? Now you can say goodbye to '''setquota''', '''edquota''' and other '''quotacheck''' commands: ZFS handles this in a snap of the fingers! Guess with what? A ZFS dataset attribute, of course! ;-) Just to make you drool, here is how a 2 GB limit can be set on ''myfirstpool/mythirdDS'':
+
Chroot into Funtoo
 
+
# ##i##env -i HOME=/root TERM=$TERM chroot . bash -l
<console>
+
###i## zfs set quota=2G myfirstpool/mythirdDS
+
 
</console>
 
</console>
  
''Et voila!'' The '''zfs''' command is a bit silent; however, if we check, we can see that ''myfirstpool/mythirdDS'' is now capped at 2 GB (forget about 'REFER' for the moment): around 1 GB of data has been copied into this dataset, thus leaving a big 1 GB of available space.
+
=== In Chroot ===
  
 
<console>
 
<console>
###i## zfs list
+
Create a symbolic link to your mountpoints
NAME                    USED  AVAIL  REFER  MOUNTPOINT
+
# ##i##ln -sf /proc/mounts /etc/mtab
myfirstpool            1.81G  6.00G  850M  /myfirstpool
+
myfirstpool/myfirstDS    30K  6.00G    30K  /myfirstpool/myfirstDS
+
myfirstpool/mysecondDS    30K  6.00G    30K  /myfirstpool/mysecondDS
+
myfirstpool/mythirdDS  1002M  1.02G  1002M  /myfirstpool/mythirdDS
+
</console>
+
  
Using the '''df''' command:
+
Sync your tree
 
+
# ##i##emerge --sync
<console>
+
###i## df -h                               
+
Filesystem              Size  Used Avail Use% Mounted on
+
(...)
+
myfirstpool            6.9G  850M  6.1G  13% /myfirstpool
+
myfirstpool/myfirstDS  6.1G    0  6.1G  0% /myfirstpool/myfirstDS
+
myfirstpool/mysecondDS  6.1G    0  6.1G  0% /myfirstpool/mysecondDS
+
myfirstpool/mythirdDS  2.0G 1002M  1.1G  49% /myfirstpool/mythirdDS
+
 
</console>
 
</console>
  
Of course you can use this technique for the home directories of your users under /home, this also having the advantage of being much less forgiving than a soft/hard user quota: when the limit is reached, it is reached, period, and no more data can be written to the dataset. The user must do some cleanup and cannot procrastinate anymore :-)
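A minimal sketch of that idea (the dataset names and the 10G figure are made up for illustration, they are not created elsewhere in this tutorial): give each user a dedicated dataset under a common parent and cap it with a quota:

<console>
###i## zfs create myfirstpool/home
###i## zfs create -o quota=10G myfirstpool/home/alice
###i## zfs get quota myfirstpool/home/alice
</console>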
+
=== Add filesystems to /etc/fstab ===
  
To remove the quota:
+
Before we continue to compile and or install our kernel in the next step, we will edit the <tt>/etc/fstab</tt> file because if we decide to install our kernel through portage, portage will need to know where our <tt>/boot</tt> is, so that it can place the files in there.
  
<console>
+
Edit <tt>/etc/fstab</tt>:
###i## zfs set quota=none myfirstpool/mythirdDS
+
</console>
+
  
''none'' is simply the original value for the ''quota'' attribute (we did not demonstrate it, you can check by doing a '''zfs get quota  myfirstpool/mysecondDS''' for example).
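A convenient related trick (not part of the original walkthrough): '''zfs get''' can be restricted to the attributes whose values were explicitly set on a dataset, which is a quick way to audit what you have customized so far:

<console>
###i## zfs get -s local all myfirstpool/mythirdDS
</console>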
+
<pre>
 +
# <fs>                  <mountpoint>    <type>          <opts>          <dump/pass>
  
== Destroying datasets ==
+
/dev/sda1              /boot          ext2            defaults        0 2
{{Fancyimportant|There is no way to resurrect a destroyed ZFS dataset and the data it contained! Once you destroy a dataset the corresponding metadata is cleared and gone forever so be careful when using ''zfs destroy'' notably with the ''-r'' option ... }}
+
/dev/zvol/tank/swap    none            swap            sw              0 0
 +
</pre>
  
 +
== Kernel Configuration ==
 +
To speed up this step, you can install a pre-configured/compiled kernel called '''bliss-kernel'''. This kernel already has the correct configurations for ZFS and a variety of other scenarios. It's a vanilla kernel from kernel.org without any external patches.
  
We have three datasets, but the third is pretty useless and contains a lot of garbage. Is it possible to remove it with a simple '''rm -rf'''? Let's try:
+
To install {{Package|sys-kernel/bliss-kernel}} type the following:
  
 
<console>
 
<console>
###i## rm -rf /myfirstpool/mythirdDS
+
# ##i##emerge bliss-kernel
rm: cannot remove `/myfirstpool/mythirdDS': Device or resource busy
+
 
</console>
 
</console>
  
This is perfectly normal; remember that datasets are indeed something '''mounted''' in your VFS. ZFS might be ZFS and do a lot for you, but it cannot escape the semantics of a mounted filesystem under Linux/Unix. The "ZFS way" to remove a dataset is to use the ''zfs'' command like this, provided that no process holds open files on it (once again, ZFS can do miracles for you, but not that kind of miracle, as it has to unmount the dataset before deleting it):
+
Now make sure that your <tt>/usr/src/linux</tt> symlink is pointing to this kernel by typing the following:
  
 
<console>
 
<console>
###i## zfs destroy myfirstpool/mythirdDS
+
# ##i##eselect kernel list
###i## zfs list
+
Available kernel symlink targets:
NAME                    USED  AVAIL  REFER  MOUNTPOINT
+
[1]   linux-3.12.13-KS.02 *
myfirstpool              444M  7.38G   444M  /myfirstpool
+
myfirstpool/myfirstDS    21K  7.38G    21K  /myfirstpool/myfirstDS
+
myfirstpool/mysecondDS    21K  7.38G    21K  /myfirstpool/mysecondDS
+
 
</console>
 
</console>
  
''Et voila''! No more ''myfirstpool/mythirdDS'' dataset. :-)
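If '''zfs destroy''' ever refuses to run because the dataset is busy, the usual suspects are processes still holding files open inside it. A minimal sketch of how to track them down and unmount by hand (the dataset name below is just an example, and '''fuser''' is a standard Linux tool, nothing ZFS-specific):

<console>
###i## fuser -vm /myfirstpool/mythirdDS
###i## zfs unmount myfirstpool/mythirdDS
</console>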
+
You should see a star next to the version you installed. In this case it was 3.12.13-KS.02. If it's not set, you can type '''eselect kernel set #'''.
  
A slightly more subtle case would be trying to destroy a ZFS dataset while another ZFS dataset is nested in it. Before doing that nasty experiment, ''myfirstpool/mythirdDS'' must be created again, this time with another dataset nested inside it (''myfirstpool/mythirdDS/nestedDS1''):
+
== Installing the ZFS userspace tools and kernel modules ==
 +
Emerge {{Package|sys-fs/zfs}}. This package will bring in {{Package|sys-kernel/spl}}, and {{Package|sys-fs/zfs-kmod}} as its dependencies:
  
 
<console>
 
<console>
###i## zfs create myfirstpool/mythirdDS
+
# ##i##emerge zfs
###i## zfs create myfirstpool/mythirdDS/nestedSD1
+
###i## zfs list
+
NAME                              USED  AVAIL  REFER  MOUNTPOINT
+
myfirstpool                      851M  6.98G  850M  /myfirstpool
+
myfirstpool/myfirstDS              30K  6.98G    30K  /myfirstpool/myfirstDS
+
myfirstpool/mysecondDS            30K  6.98G    30K  /myfirstpool/mysecondDS
+
myfirstpool/mythirdDS            124K  6.98G    34K  /myfirstpool/mythirdDS
+
myfirstpool/mythirdDS/nestedDS1    30K  6.98G    30K  /myfirstpool/mythirdDS/nestedDS1
+
 
</console>
 
</console>
  
Now let's try to destroy ''myfirstpool/mythirdDS'' again:
+
Check to make sure that the zfs tools are working. The <code>zpool.cache</code> file that you copied before should be displayed.
  
 
<console>
 
<console>
###i## zfs destroy myfirstpool/mythirdDS
+
# ##i##zpool status
cannot destroy 'myfirstpool/mythirdDS': filesystem has children
+
# ##i##zfs list
use '-r' to destroy the following datasets:
+
myfirstpool/mythirdDS/nestedDS1
+
 
</console>
 
</console>
  
The zfs command detected the situation and refused to proceed with the deletion without your consent to a recursive destruction (-r parameter). Before going any further, let's create some more nested datasets plus a couple of directories inside ''myfirstpool/mythirdDS'':
+
If everything worked, continue.
  
<console>
+
== Installing & Configuring the Bootloader ==
###i## zfs create myfirstpool/mythirdDS/nestedDS1
+
###i## zfs create myfirstpool/mythirdDS/nestedDS2
+
###i## zfs create myfirstpool/mythirdDS/nestedDS3
+
###i## zfs create myfirstpool/mythirdDS/nestedDS3/nestednestedDS
+
###i## mkdir /myfirstpool/mythirdDS/dir1
+
###i## mkdir /myfirstpool/mythirdDS/dir2
+
###i## mkdir /myfirstpool/mythirdDS/dir3
+
</console>
+
<console>
+
###i## zfs list
+
NAME                                            USED  AVAIL  REFER  MOUNTPOINT
+
myfirstpool                                      851M  6.98G  850M  /myfirstpool
+
myfirstpool/myfirstDS                            30K  6.98G    30K  /myfirstpool/myfirstDS
+
myfirstpool/mysecondDS                            30K  6.98G    30K  /myfirstpool/mysecondDS
+
myfirstpool/mythirdDS                            157K  6.98G    37K  /myfirstpool/mythirdDS
+
myfirstpool/mythirdDS/nestedDS1                  30K  6.98G    30K  /myfirstpool/mythirdDS/nestedDS1
+
myfirstpool/mythirdDS/nestedDS2                  30K  6.98G    30K  /myfirstpool/mythirdDS/nestedDS2
+
myfirstpool/mythirdDS/nestedDS3                  60K  6.98G    30K  /myfirstpool/mythirdDS/nestedDS3
+
myfirstpool/mythirdDS/nestedDS3/nestednestedDS    30K  6.98G    30K  /myfirstpool/mythirdDS/nestedDS3/nestednestedDS
+
</console>
+
 
+
Now what happens if ''myfirstpool/mythirdDS'' is destroyed again with '-r'?
+
  
 +
=== GRUB 2 (Optional if you are using another bootloader) ===
 
<console>
 
<console>
###i## zfs destroy -r myfirstpool/mythirdDS
+
# ##i##emerge grub
###i## zfs list                           
+
NAME                    USED  AVAIL  REFER  MOUNTPOINT
+
myfirstpool              851M  6.98G  850M  /myfirstpool
+
myfirstpool/myfirstDS    30K  6.98G    30K  /myfirstpool/myfirstDS
+
myfirstpool/mysecondDS    30K  6.98G    30K  /myfirstpool/mysecondDS
+
 
</console>
 
</console>
  
''myfirstpool/mythirdDS'' and everything it contained is now gone!
+
You can check that grub is version 2.00 by typing the following command:
 
+
== Snapshotting and rolling back datasets ==
+
 
+
This is, by far, one of the coolest features of ZFS. You can:
+
# take a photo of a dataset (this photo is called a ''snapshot'')
+
# do ''whatever'' you want with the data contained in the dataset
+
# restore (roll back) the dataset to the '''exact''' same state it was in before you made your changes, just as if nothing had ever happened in the middle.
+
 
+
=== Single snapshot ===
+
 
+
{{Fancyimportant|'''Only ZFS datasets''' can be snapshotted and rolled back, not the zpool.}}
+
 
+
 
+
To start with, let's copy some files in ''mysecondDS'':
+
  
 
<console>
 
<console>
###i## cp -a /usr/portage /myfirstpool/mysecondDS
+
# ##i##grub-install --version
###i## ls /myfirstpool/mysecondDS/portage
+
grub-install (GRUB) 2.00
total 672
+
drwxr-xr-x  48 root root  49 Aug 18  2013 app-accessibility
+
drwxr-xr-x  238 root root  239 Jan 10 06:22 app-admin
+
drwxr-xr-x    4 root root    5 Dec 28 08:54 app-antivirus
+
drwxr-xr-x  100 root root  101 Feb 26 07:19 app-arch
+
drwxr-xr-x  42 root root  43 Nov 26 21:24 app-backup
+
drwxr-xr-x  34 root root  35 Aug 18  2013 app-benchmarks
+
(...)
+
drwxr-xr-x  62 root root  63 Feb 20 06:47 x11-wm
+
drwxr-xr-x  16 root root  17 Aug 18  2013 xfce-base
+
drwxr-xr-x  64 root root  65 Dec 14 19:09 xfce-extra
+
 
</console>
 
</console>
  
Now, let's take a snapshot of ''mysecondDS''. What command would be used? '''zpool''' or '''zfs'''? In that case it is '''zfs''', because we manipulate a ZFS dataset (this time you probably got it right!):
+
Now install grub to the drive itself (not a partition):
 
+
 
<console>
 
<console>
###i## zfs snapshot myfirstpool/mysecondDS@Charlie
+
# ##i##grub-install /dev/sda
 
</console>
 
</console>
  
{{fancynote|The syntax is always ''pool/dataset@snapshot''; the snapshot's name is left to your discretion, however '''you must use an at sign (@)''' to separate the snapshot's name from the rest of the path.}}
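As a side note (this is not used elsewhere in this tutorial, and the snapshot name is made up), the same naming syntax also works recursively: '''zfs snapshot -r''' takes one snapshot of a dataset and of every dataset nested below it, all sharing the same snapshot name:

<console>
###i## zfs snapshot -r myfirstpool@beforeUpgrade
###i## zfs list -t snapshot
</console>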
+
You should receive the following message:
  
Let's check what ''/myfirstpool/mysecondDS'' contains after taking the snapshot:
 
 
<console>
 
<console>
###i## ls -la /myfirstpool/mysecondDS   
+
Installation finished. No error reported.
total 9
+
drwxr-xr-x  3 root root  3 Mar  2 18:22 .
+
drwxr-xr-x  5 root root  6 Mar  2 17:58 ..
+
drwx------ 170 root root 171 Mar  2 18:36 portage
+
 
</console>
 
</console>
  
Nothing really new: the ''portage'' directory is here, nothing more ''a priori''. If you have used BTRFS before reading this tutorial, you probably expected to see a ''@Charlie'' lying in ''/myfirstpool/mysecondDS''. So where the heck is ''Charlie''? In ZFS, a dataset snapshot is not visible from within the VFS tree (if you are not convinced, you can search for it with the '''find''' command, but it will never find it). Let's check with the '''zfs''' command:
+
You should now see a grub directory with some files inside your /boot folder:
  
 
<console>
 
<console>
###i## zfs list
+
# ##i##ls -l /boot/grub
###i## zfs list -t all   
+
total 2520
NAME                            USED  AVAIL  REFER  MOUNTPOINT
+
-rw-r--r-- 1 root root    1024 Jan 4 16:09 grubenv
myfirstpool                    1.81G 6.00G  850M  /myfirstpool
+
drwxr-xr-x 2 root root   8192 Jan 12 14:29 i386-pc
myfirstpool/myfirstDS            30K  6.00G   30K  /myfirstpool/myfirstDS
+
drwxr-xr-x 2 root root    4096 Jan 12 14:28 locale
myfirstpool/mysecondDS          1001M 6.00G  1001M  /myfirstpool/mysecondDS
+
-rw-r--r-- 1 root root 2555597 Feb 4 11:50 unifont.pf2
 
</console>
 
</console>
  
Wow... No sign of the snapshot. What you must know is that '''zfs list''' shows only datasets by default and omits snapshots. If the command is invoked with the parameter ''-t'' set to ''all'', it will list everything:
+
=== Extlinux (Optional if you are using another bootloader) ===
 +
To install extlinux, you can follow the guide here: [[Extlinux|Link to Extlinux Guide]].
  
<console>
+
=== LILO (Optional if you are using another bootloader) ===
###i## zfs list
+
To install lilo you can type the following:
###i## zfs list -t all   
+
NAME                            USED  AVAIL  REFER  MOUNTPOINT
+
myfirstpool                    1.81G  6.00G  850M  /myfirstpool
+
myfirstpool/myfirstDS            30K  6.00G    30K  /myfirstpool/myfirstDS
+
myfirstpool/mysecondDS          1001M  6.00G  1001M  /myfirstpool/mysecondDS
+
myfirstpool/mysecondDS@Charlie      0      -  1001M  -
+
</console>
+
 
+
So yes, ''@Charlie'' is here! Also notice the power of copy-on-write filesystems: ''@Charlie'' takes only a couple of kilobytes (some ZFS metadata), just like any ZFS snapshot at the time it is taken. The reason snapshots occupy very little space in the datasets is that their data and metadata blocks are the same as the live dataset's and no physical copy of them is made. As time goes on and more and more changes happen in the original dataset (''myfirstpool/mysecondDS'' here), ZFS will allocate new data and metadata blocks to accommodate the changes but will leave the blocks used by the snapshot untouched, and the snapshot will tend to eat more and more pool space. It seems odd at first glance, because a snapshot is a frozen-in-time copy of a ZFS dataset, but this is the way ZFS manages them. So caveat emptor: remove any unused snapshots so as not to fill your zpool...
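A minimal sketch of that housekeeping (the snapshot name in the second command is hypothetical, not one created in this tutorial): list the snapshots together with the space they hold, then destroy the ones you no longer need:

<console>
###i## zfs list -t snapshot -o name,used,referenced
###i## zfs destroy myfirstpool/mysecondDS@someOldSnapshot
</console>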
+
 
+
Now that we have found Charlie, let's make some changes in ''mysecondDS'':
+
  
 
<console>
 
<console>
###i## rm -rf /myfirstpool/mysecondDS/portage/[a-h]*
+
# ##i##emerge lilo
###i## echo "Hello, world" >  /myfirstpool/mysecondDS/hello.txt
+
###i## cp /lib/firmware/radeon/* /myfirstpool/mysecondDS
+
###i## ls -l  /myfirstpool/mysecondDS
+
/myfirstpool/mysecondDS:
+
total 3043
+
-rw-r--r--  1 root root  8704 Mar  2 19:29 ARUBA_me.bin
+
-rw-r--r--  1 root root  8704 Mar  2 19:29 ARUBA_pfp.bin
+
-rw-r--r--  1 root root  6144 Mar  2 19:29 ARUBA_rlc.bin
+
-rw-r--r--  1 root root  24096 Mar  2 19:29 BARTS_mc.bin
+
-rw-r--r--  1 root root  5504 Mar  2 19:29 BARTS_me.bin
+
(...)
+
-rw-r--r--  1 root root  60388 Mar  2 19:29 VERDE_smc.bin
+
-rw-r--r--  1 root root    13 Mar  2 19:28 hello.txt
+
drwx------ 94 root root    95 Mar  2 19:28 portage
+
 
+
/myfirstpool/mysecondDS/portage:
+
total 324
+
drwxr-xr-x  16 root root  17 Oct 26 07:30 java-virtuals
+
drwxr-xr-x 303 root root  304 Jan 21 06:53 kde-base
+
drwxr-xr-x 117 root root  118 Feb 21 06:24 kde-misc
+
drwxr-xr-x  2 root root  756 Feb 23 08:44 licenses
+
drwxr-xr-x  20 root root  21 Jan  7 06:56 lxde-base
+
(...)
+
 
</console>
 
</console>
  
Now let's check again what the '''zfs''' command gives:
+
=== boot-update ===
 +
boot-update comes as a dependency of grub2, so if you already installed grub, it's already on your system!
  
<console>
+
==== Genkernel ====
###i## zfs list -t all                     
+
If you're using genkernel you must add 'real_root=ZFS=<root>' and 'dozfs' to your params.
NAME                            USED  AVAIL  REFER  MOUNTPOINT
+
Example entry for <tt>/etc/boot.conf</tt>:
myfirstpool                    1.82G  6.00G  850M  /myfirstpool
+
myfirstpool/myfirstDS            30K  6.00G    30K  /myfirstpool/myfirstDS
+
myfirstpool/mysecondDS          1005M  6.00G  903M  /myfirstpool/mysecondDS
+
myfirstpool/mysecondDS@Charlie  102M      -  1001M  -
+
</console>
+
  
Noticed the size increase of ''myfirstpool/mysecondDS@Charlie''? This is mainly due to the changes made in the live dataset: ZFS had to retain the original blocks of data for the snapshot. Now it is time to roll this ZFS dataset back to its original state (if some processes have open files in the dataset to be rolled back, you should terminate them first):
  
<console>
###i## zfs rollback myfirstpool/mysecondDS@Charlie
###i## ls -l /myfirstpool/mysecondDS
total 6
drwxr-xr-x 164 root root 169 Aug 18 18:25 portage
</console>

Again, ZFS handled everything for you, and you now have the contents of ''mysecondDS'' exactly as they were at the time the snapshot ''Charlie'' was taken. Not more complicated than that. Not illustrated here, but if you look at the output given by '''zfs list -t all''' at this point you will notice that the ''Charlie'' snapshot again eats very little space. This is normal: the modified blocks have been dropped, so ''myfirstpool/mysecondDS'' and its ''myfirstpool/mysecondDS@Charlie'' snapshot are the same modulo some metadata (hence the few kilobytes of space taken).
  
=== The .zfs pseudo-directory, or the secret passage to your snapshots ===

Any directory where a ZFS dataset is mounted (whether it has snapshots or not) secretly contains a pseudo-directory named '''.zfs''' (dot-ZFS), and you will not see it even with the option ''-a'' given to an '''ls''' command unless you name it explicitly. This contradicts the Unix and Unix-like philosophy of not hiding anything from the system administrator, but it is not a bug of the ZFS On Linux implementation: the Solaris implementation of ZFS exposes the exact same behaviour. So what is inside this little magic box?
 
 
<console>
###i## cd /myfirstpool/mysecondDS
###i## ls -la | grep .zfs       
###i## ls -lad .zfs             
dr-xr-xr-x 1 root root 0 Mar  2 15:26 .zfs
</console>

<console>
###i## cd .zfs
###i## pwd
/myfirstpool/mysecondDS/.zfs
###i## ls -la
total 4
dr-xr-xr-x 1 root root  0 Mar  2 15:26 .
drwxr-xr-x 3 root root 145 Mar  2 19:29 ..
dr-xr-xr-x 2 root root  2 Mar  2 19:47 shares
dr-xr-xr-x 2 root root  2 Mar  2 18:46 snapshot
</console>
  
We will focus on the ''snapshot'' directory and, since we have not dropped the ''Charlie'' snapshot (yet), let's see what lies there:
 
<console>
###i## cd snapshot
###i## ls -l
total 0
dr-xr-xr-x 1 root root 0 Mar  2 20:16 Charlie
</console>
  
Yes, we found Charlie here (also!). The snapshot is seen as a regular directory, but pay attention to its permissions:

* owning user (root) has read+execute
* owning group (root) has read+execute
* rest of the world has read+execute

Did you notice? Not a single ''write'' permission on this directory; the only action any user can take is to enter the directory and list its contents. This is not a bug but the nature of ZFS snapshots: they are read-only by design. The next question is naturally: can we change something inside it? To find out, we have to enter the ''Charlie'' directory:
<console>
###i## cd Charlie
###i## ls -la
total 7
drwxr-xr-x  3 root root  3 Mar  2 18:22 .
dr-xr-xr-x  3 root root  3 Mar  2 18:46 ..
drwx------ 170 root root 171 Mar  2 18:36 portage
</console>
  
No surprise here: at the time we took the snapshot, ''myfirstpool/mysecondDS'' held a copy of the Portage tree stored in a directory named ''portage''. At first glance this one ''seems'' to be writable by the root user, so let's try to create a file in it:
  
<console>
###i## cd portage
###i## touch test
touch: cannot touch ‘test’: Read-only file system
</console>
  
Things are a bit tricky here: indeed nothing has been mounted (check with the '''mount''' command!); we are walking through a pseudo-directory exposed by ZFS that holds the ''Charlie'' snapshot. ''Pseudo-directory'' because ''.zfs'' has no physical existence, not even in the ZFS metadata as it exists in the zpool. It is just a convenient way provided by the ZFS kernel modules to walk inside the various snapshots' contents. You can look but you cannot touch :-)
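Incidentally, whether ''.zfs'' stays invisible is governed by the ''snapdir'' dataset property (listed further down in this tutorial). A minimal sketch, with illustrative output, if you prefer it to show up in plain directory listings:

<console>
###i## zfs get snapdir myfirstpool/mysecondDS
NAME                    PROPERTY  VALUE   SOURCE
myfirstpool/mysecondDS  snapdir   hidden  default
###i## zfs set snapdir=visible myfirstpool/mysecondDS
###i## ls -la /myfirstpool/mysecondDS | grep .zfs
dr-xr-xr-x   1 root root   0 Mar  2 15:26 .zfs
</console>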
  
=== Backtracking changes between a dataset and its snapshot ===

Is it possible to know what the difference is between a live dataset and its snapshot? The answer to this question is '''yes''', and the '''zfs''' command will help us in this task. Now that we have rolled the ''myfirstpool/mysecondDS'' ZFS dataset back to its original state, we have to botch it again:

<console>
###i## cp -a /lib/firmware/radeon/C* /myfirstpool/mysecondDS
</console>

Now inspect the difference between the live ZFS dataset ''myfirstpool/mysecondDS'' and its snapshot ''Charlie''. This is done via '''zfs diff''', giving only the snapshot's name (with a slight change in parameters the same command can also inspect the difference between two snapshots):
  
<console>
###i## zfs diff myfirstpool/mysecondDS@Charlie
M      /myfirstpool/mysecondDS/
+      /myfirstpool/mysecondDS/CAICOS_mc.bin
+      /myfirstpool/mysecondDS/CAICOS_me.bin
+      /myfirstpool/mysecondDS/CAICOS_pfp.bin
+      /myfirstpool/mysecondDS/CAICOS_smc.bin
+      /myfirstpool/mysecondDS/CAYMAN_mc.bin
+      /myfirstpool/mysecondDS/CAYMAN_me.bin
(...)
</console>

So what do we have here? Two things: first, it shows that we have changed something in ''/myfirstpool/mysecondDS'' (notice the 'M' for Modified); second, it shows the addition of several files (CAICOS_mc.bin, CAICOS_me.bin, CAICOS_pfp.bin...) by putting a plus sign ('+') on their left.

If we botch ''myfirstpool/mysecondDS'' a bit more by removing the file ''/myfirstpool/mysecondDS/portage/sys-libs/glibc/Manifest'':
  
<console>
###i## rm /myfirstpool/mysecondDS/portage/sys-libs/glibc/Manifest
###i## zfs diff myfirstpool/mysecondDS@Charlie
M      /myfirstpool/mysecondDS/
M      /myfirstpool/mysecondDS/portage/sys-libs/glibc
-      /myfirstpool/mysecondDS/portage/sys-libs/glibc/Manifest
+      /myfirstpool/mysecondDS/CAICOS_mc.bin
+      /myfirstpool/mysecondDS/CAICOS_me.bin
+      /myfirstpool/mysecondDS/CAICOS_pfp.bin
+      /myfirstpool/mysecondDS/CAICOS_smc.bin
+      /myfirstpool/mysecondDS/CAYMAN_mc.bin
+      /myfirstpool/mysecondDS/CAYMAN_me.bin
(...)
</console>
  
Obviously deleted content is marked by a minus sign ('-').
  
Now a real butchery:

<console>
###i## rm -rf /myfirstpool/mysecondDS/portage/sys-devel/gcc
###i## zfs diff myfirstpool/mysecondDS@Charlie           
M      /myfirstpool/mysecondDS/
M      /myfirstpool/mysecondDS/portage/sys-devel
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/awk
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/awk/fixlafiles.awk
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/awk/fixlafiles.awk-no_gcc_la
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/c89
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/c99
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/gcc-4.6.4-fix-libgcc-s-path-with-vsrl.patch
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/gcc-spec-env.patch
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/gcc-spec-env-r1.patch
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/gcc-4.8.2-fix-cache-detection.patch
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/fix_libtool_files.sh
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/gcc-configure-texinfo.patch
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/gcc-4.8.1-bogus-error-with-int.patch
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.3.3-r2.ebuild
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/metadata.xml
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.6.4-r2.ebuild
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.6.4.ebuild
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.8.1-r1.ebuild
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.8.1-r2.ebuild
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.6.2-r1.ebuild
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.8.1-r3.ebuild
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.8.2.ebuild
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.8.1-r4.ebuild
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/Manifest
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.7.3-r1.ebuild
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/gcc-4.8.2-r1.ebuild
M      /myfirstpool/mysecondDS/portage/sys-libs/glibc
-      /myfirstpool/mysecondDS/portage/sys-libs/glibc/Manifest
+      /myfirstpool/mysecondDS/CAICOS_mc.bin
+      /myfirstpool/mysecondDS/CAICOS_me.bin
+      /myfirstpool/mysecondDS/CAICOS_pfp.bin
+      /myfirstpool/mysecondDS/CAICOS_smc.bin
+      /myfirstpool/mysecondDS/CAYMAN_mc.bin
+      /myfirstpool/mysecondDS/CAYMAN_me.bin
(...)
</console>
  
No need to explain that digital mayhem! What happens if, in addition, we change the contents of the file ''/myfirstpool/mysecondDS/portage/sys-devel/autoconf/Manifest''?

<console>
###i## zfs diff myfirstpool/mysecondDS@Charlie
M      /myfirstpool/mysecondDS/
M      /myfirstpool/mysecondDS/portage/sys-devel
M      /myfirstpool/mysecondDS/portage/sys-devel/autoconf/Manifest
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/awk
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/awk/fixlafiles.awk
-      /myfirstpool/mysecondDS/portage/sys-devel/gcc/files/awk/fixlafiles.awk-no_gcc_la
(...)
</console>

ZFS shows that the file ''/myfirstpool/mysecondDS/portage/sys-devel/autoconf/Manifest'' has changed. So ZFS can help you track file deletions, creations and modifications. What it does not show is the difference between a file's contents as they exist in the live dataset and in the dataset's snapshot. Not a big issue! You can explore a snapshot's contents via the ''.zfs'' pseudo-directory and use a command like '''/usr/bin/diff''' to examine the difference with the file as it exists on the corresponding live dataset.
  
<console>
###i## diff -u /myfirstpool/mysecondDS/.zfs/snapshot/Charlie/portage/sys-devel/autoconf/Manifest /myfirstpool/mysecondDS/portage/sys-devel/autoconf/Manifest
--- /myfirstpool/mysecondDS/.zfs/snapshot/Charlie/portage/sys-devel/autoconf/Manifest  2013-08-18 08:52:01.742411902 -0400
+++ /myfirstpool/mysecondDS/portage/sys-devel/autoconf/Manifest 2014-03-02 21:36:50.582258990 -0500
@@ -4,7 +4,4 @@
DIST autoconf-2.62.tar.gz 1518427 SHA256 83aa747e6443def0ebd1882509c53f5a2133f50...
DIST autoconf-2.63.tar.gz 1562665 SHA256 b05a6cee81657dd2db86194a6232b895b8b2606a...
DIST autoconf-2.64.tar.bz2 1313833 SHA256 872f4cadf12e7e7c8a2414e047fdff26b517c7...
-DIST autoconf-2.65.tar.bz2 1332522 SHA256 db11944057f3faf229ff5d6ce3fcd819f56545...
-DIST autoconf-2.67.tar.bz2 1369605 SHA256 00ded92074999d26a7137d15bd1d51b8a8ae23...
-DIST autoconf-2.68.tar.bz2 1381988 SHA256 c491fb273fd6d4ca925e26ceed3d177920233c...
DIST autoconf-2.69.tar.xz 1214744 SHA256 64ebcec9f8ac5b2487125a86a7760d2591ac9e1d3...
(...)
</console>
  
=== Dropping a snapshot ===

A snapshot is no more than a dataset frozen in time and thus can be destroyed in exactly the same way as seen in the paragraphs before. Now that we no longer need the ''Charlie'' snapshot, we can remove it. Simple:

<console>
###i## zfs destroy myfirstpool/mysecondDS@Charlie
###i## zfs list -t all
NAME                    USED  AVAIL  REFER  MOUNTPOINT
myfirstpool            1.71G  6.10G  850M  /myfirstpool
myfirstpool/myfirstDS    30K  6.10G    30K  /myfirstpool/myfirstDS
myfirstpool/mysecondDS  903M  6.10G  903M  /myfirstpool/mysecondDS
</console>
  
And Charlie is gone forever ;-)
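As a side note, when you snapshot a whole hierarchy of datasets under a single name, '''zfs snapshot -r''' and '''zfs destroy -r''' let you create and drop all of those snapshots in one go. A minimal sketch (the snapshot name ''beforeupgrade'' is just an example, not taken from this tutorial):

<console>
###i## zfs snapshot -r myfirstpool@beforeupgrade
###i## zfs destroy -r myfirstpool@beforeupgrade
</console>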
=== The time travelling machine, part 1: examining differences between snapshots ===

So far we have only used a single snapshot, just to keep things simple. However, a dataset can hold several snapshots and you can do everything seen so far with them: rolling back, destroying them, or examining the differences not only between a snapshot and its corresponding live dataset but also between two snapshots. For this part we will consider the ''myfirstpool/myfirstDS'' dataset, which should be empty at this point.

<console>
###i## ls -la /myfirstpool/myfirstDS
total 3
drwxr-xr-x 2 root root 2 Mar 2 21:14 .
drwxr-xr-x 5 root root 6 Mar 2 17:58 ..
</console>

Now let's generate some contents, take a snapshot (snapshot-1), add more content, take a snapshot again (snapshot-2), do some modifications again and take a third snapshot (snapshot-3):
  
<console>
###i## echo "Hello, world" > /myfirstpool/myfirstDS/hello.txt
###i## cp -R /lib/firmware/radeon /myfirstpool/myfirstDS
###i## ls -l /myfirstpool/myfirstDS
total 5
-rw-r--r-- 1 root root 13 Mar 3 06:41 hello.txt
drwxr-xr-x 2 root root 143 Mar 3 06:42 radeon
###i## zfs snapshot myfirstpool/myfirstDS@snapshot-1
</console>
<console>
###i## echo "Goodbye, world" > /myfirstpool/myfirstDS/goodbye.txt
###i## echo "Are you there?" >> /myfirstpool/myfirstDS/hello.txt
###i## cp /proc/config.gz /myfirstpool/myfirstDS
###i## rm /myfirstpool/myfirstDS/radeon/CAYMAN_me.bin
###i## zfs snapshot myfirstpool/myfirstDS@snapshot-2
</console>
<console>
###i## echo "Still there?" >> /myfirstpool/myfirstDS/goodbye.txt
###i## mv /myfirstpool/myfirstDS/hello.txt /myfirstpool/myfirstDS/hello_new.txt
###i## cat /proc/version > /myfirstpool/myfirstDS/version.txt
###i## zfs snapshot myfirstpool/myfirstDS@snapshot-3
</console>
<console>
###i## zfs list -t all
NAME                              USED  AVAIL  REFER  MOUNTPOINT
myfirstpool                      1.81G  6.00G  850M  /myfirstpool
myfirstpool/myfirstDS            3.04M  6.00G  2.97M  /myfirstpool/myfirstDS
myfirstpool/myfirstDS@snapshot-1    47K      -  2.96M  -
myfirstpool/myfirstDS@snapshot-2    30K      -  2.97M  -
myfirstpool/myfirstDS@snapshot-3      0      -  2.97M  -
myfirstpool/mysecondDS            1003M  6.00G  1003M  /myfirstpool/mysecondDS
</console>
  
You saw how to use '''zfs diff''' to compare a snapshot with its corresponding "live" dataset in the paragraphs above. Doing the same exercise with two snapshots is not that much different: you just have to tell the command explicitly which datasets are to be compared, and it will output the result in exactly the same manner. So what are the differences between the snapshots ''myfirstpool/myfirstDS@snapshot-1'' and ''myfirstpool/myfirstDS@snapshot-2''? Let's make the '''zfs''' command work for us:
 
<console>
###i## zfs diff myfirstpool/myfirstDS@snapshot-1 myfirstpool/myfirstDS@snapshot-2
M      /myfirstpool/myfirstDS/
M      /myfirstpool/myfirstDS/hello.txt
M      /myfirstpool/myfirstDS/radeon
-      /myfirstpool/myfirstDS/radeon/CAYMAN_me.bin
+      /myfirstpool/myfirstDS/goodbye.txt
+      /myfirstpool/myfirstDS/config.gz
</console>
  
Before digging further, let's think about what we did between the time we created the first snapshot and the second snapshot:

* We modified the file ''/myfirstpool/myfirstDS/hello.txt'', hence the 'M' shown on the left of the second line (and since we changed something under ''/myfirstpool/myfirstDS'', an 'M' is also shown on the left of the first line)
* We deleted the file ''/myfirstpool/myfirstDS/radeon/CAYMAN_me.bin'', hence the minus sign ('-') shown on the left of the fourth line (and the 'M' shown on the left of the third line)
* We added two files, ''/myfirstpool/myfirstDS/goodbye.txt'' and ''/myfirstpool/myfirstDS/config.gz'', hence the plus signs ('+') shown on the left of the fifth and sixth lines (this too is a change happening in ''/myfirstpool/myfirstDS'', another reason to show an 'M' on the left of the first line)

Now the same exercise, this time with snapshots ''myfirstpool/myfirstDS@snapshot-2'' and ''myfirstpool/myfirstDS@snapshot-3'':
<console>
###i## zfs diff myfirstpool/myfirstDS@snapshot-2 myfirstpool/myfirstDS@snapshot-3
M      /myfirstpool/myfirstDS/
R      /myfirstpool/myfirstDS/hello.txt -> /myfirstpool/myfirstDS/hello_new.txt
M      /myfirstpool/myfirstDS/goodbye.txt
+      /myfirstpool/myfirstDS/version.txt
</console>
  
Try to interpret what you see: everything should be familiar except the second line, where an 'R' (standing for "Rename") is shown. ZFS is smart enough to show both the old and the new names!

Why not push the limits and try a few fancy things? First things first: what happens if we ask to compare two snapshots, but in reverse order?
  
 
<console>
###i## zfs diff myfirstpool/myfirstDS@snapshot-3 myfirstpool/myfirstDS@snapshot-2
Unable to obtain diffs:
   Not an earlier snapshot from the same fs
</console>
  
Would ZFS be a bit happier if we asked for the difference between two snapshots with a gap in between (so snapshot 1 against snapshot 3)?
 
<console>
###i## zfs diff myfirstpool/myfirstDS@snapshot-1 myfirstpool/myfirstDS@snapshot-3
M      /myfirstpool/myfirstDS/
R      /myfirstpool/myfirstDS/hello.txt -> /myfirstpool/myfirstDS/hello_new.txt
M      /myfirstpool/myfirstDS/radeon
-      /myfirstpool/myfirstDS/radeon/CAYMAN_me.bin
+      /myfirstpool/myfirstDS/goodbye.txt
+      /myfirstpool/myfirstDS/config.gz
+      /myfirstpool/myfirstDS/version.txt
</console>
  
Amazing! Here again, take a couple of minutes to think about all the operations you did on the dataset between the time you took the first snapshot and the time you took the last one: this summary is the exact reflection of all your previous operations.

To put a conclusion on this subject, let's see the differences between the ''myfirstpool/myfirstDS'' dataset and its various snapshots:
 
<console>
###i## zfs diff myfirstpool/myfirstDS@snapshot-1                                
M      /myfirstpool/myfirstDS/
R      /myfirstpool/myfirstDS/hello.txt -> /myfirstpool/myfirstDS/hello_new.txt
M      /myfirstpool/myfirstDS/radeon
-      /myfirstpool/myfirstDS/radeon/CAYMAN_me.bin
+      /myfirstpool/myfirstDS/goodbye.txt
+      /myfirstpool/myfirstDS/config.gz
+      /myfirstpool/myfirstDS/version.txt
</console>
<console>
###i## zfs diff myfirstpool/myfirstDS@snapshot-2
M      /myfirstpool/myfirstDS/
R      /myfirstpool/myfirstDS/hello.txt -> /myfirstpool/myfirstDS/hello_new.txt
M      /myfirstpool/myfirstDS/goodbye.txt
+      /myfirstpool/myfirstDS/version.txt
</console>
<console>
###i## zfs diff myfirstpool/myfirstDS@snapshot-3
</console>
  
Having nothing reported for the last '''zfs diff''' is normal, as nothing has changed in the dataset since that snapshot was taken.

=== The time travelling machine, part 2: rolling back with multiple snapshots ===

Examining the differences between the various snapshots of a dataset, or between a snapshot and the dataset itself, would be quite useless if we were not able to roll the dataset back to one of its previous states. Now that we have mangled ''myfirstpool/myfirstDS'' a bit, it is time to restore it as it was when the first snapshot was taken:
<console>
###i## zfs rollback myfirstpool/myfirstDS@snapshot-1
cannot rollback to 'myfirstpool/myfirstDS@snapshot-1': more recent snapshots exist
use '-r' to force deletion of the following snapshots:
myfirstpool/myfirstDS@snapshot-3
myfirstpool/myfirstDS@snapshot-2
</console>
  
Err... Well, ZFS just told us that several more recent snapshots exist and it refuses to proceed without dropping them. Unfortunately for us there is no way to circumvent that: once you jump backward you have no way to move forward again. We could demonstrate rolling back to ''myfirstpool/myfirstDS@snapshot-3'', then ''myfirstpool/myfirstDS@snapshot-2'', then ''myfirstpool/myfirstDS@snapshot-1'', but it would be of very little interest as previous sections of this tutorial did that already. So, second attempt:
<console>
###i## zfs rollback -r myfirstpool/myfirstDS@snapshot-1
###i## zfs list -t all                                                         
NAME                              USED  AVAIL  REFER  MOUNTPOINT
myfirstpool                      1.81G  6.00G  850M  /myfirstpool
myfirstpool/myfirstDS            2.96M  6.00G  2.96M  /myfirstpool/myfirstDS
myfirstpool/myfirstDS@snapshot-1    1K      -  2.96M  -
myfirstpool/mysecondDS            1003M  6.00G  1003M  /myfirstpool/mysecondDS
</console>
  
''myfirstpool/myfirstDS'' effectively returned to the desired state (notice the size of ''myfirstpool/myfirstDS@snapshot-1'') and the snapshots ''snapshot-2'' and ''snapshot-3'' vanished. Just to convince you:

<console>
###i## zfs diff myfirstpool/myfirstDS@snapshot-1
###i##
</console>
  
No differences at all!
=== Snapshots and clones ===

A clone and a snapshot are two very close things in ZFS:

* A clone appears as a mounted dataset (i.e. you can read and write data in it) while a snapshot stays apart and is always read-only
* A clone is always spawned from a snapshot

So it is absolutely true to say that a clone is indeed just a writable snapshot. The copy-on-write nature of ZFS plays its role even here: the data blocks held by the snapshot are only duplicated upon modification, so cloning a snapshot holding 20 GB of data does not lead to an additional 20 GB of data being eaten from the pool.

How to make a clone? Simple, once again with the '''zfs''' command, used like this:
<console>
###i## zfs clone myfirstpool/myfirstDS@snapshot-1 myfirstpool/myfirstDS_clone1
###i## zfs list -t all
NAME                              USED  AVAIL  REFER  MOUNTPOINT
myfirstpool                      1.81G  6.00G  850M  /myfirstpool
myfirstpool/myfirstDS            2.96M  6.00G  2.96M  /myfirstpool/myfirstDS
myfirstpool/myfirstDS@snapshot-1    1K      -  2.96M  -
myfirstpool/myfirstDS_clone1        1K  6.00G  2.96M  /myfirstpool/myfirstDS_clone1
myfirstpool/mysecondDS            1003M  6.00G  1003M  /myfirstpool/mysecondDS
</console>

Noticed the value of ''MOUNTPOINT'' for ''myfirstpool/myfirstDS_clone1''? Now we have a dataset that is mounted! Let's check with the '''mount''' command:
  
<console>
###i## mount | grep clone
myfirstpool/myfirstDS_clone1 on /myfirstpool/myfirstDS_clone1 type zfs (rw,xattr)
</console>

In theory we can change or write additional data in the clone, as it is mounted writable (rw). Let it be!
  
<console>
###i## ls /myfirstpool/myfirstDS_clone1
hello.txt  radeon
</console>
<console>
###i## cp /proc/config.gz /myfirstpool/myfirstDS_clone1
###i## echo 'This is a clone!' >> /myfirstpool/myfirstDS_clone1/hello.txt
</console>
<console>
###i## ls /myfirstpool/myfirstDS_clone1
config.gz  hello.txt  radeon
###i## cat /myfirstpool/myfirstDS_clone1/hello.txt                     
Hello, world
This is a clone!
</console>
  
Unfortunately it is not possible to ask for the difference between a clone and a snapshot: '''zfs diff''' expects to see either a snapshot name or two snapshot names. Once spawned, a clone starts its own existence, and the snapshot that served as a seed for it remains attached to its original dataset.
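If you ever want a clone to cut the cord and live on independently of the dataset its origin snapshot belongs to, ZFS provides '''zfs promote''' for exactly that. A minimal sketch, not applied to our running example since we are about to destroy the clone (the output shown is illustrative):

<console>
###i## zfs promote myfirstpool/myfirstDS_clone1
###i## zfs get origin myfirstpool/myfirstDS_clone1
NAME                          PROPERTY  VALUE  SOURCE
myfirstpool/myfirstDS_clone1  origin    -      -
</console>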
  
Because clones are nothing more than ZFS datasets, they can be destroyed just like any other ZFS dataset:

<console>
###i## zfs destroy myfirstpool/myfirstDS_clone1
###i## zfs list -t all                                                       
NAME                              USED  AVAIL  REFER  MOUNTPOINT
myfirstpool                      1.81G  6.00G  850M  /myfirstpool
myfirstpool/myfirstDS            2.96M  6.00G  2.96M  /myfirstpool/myfirstDS
myfirstpool/myfirstDS@snapshot-1    1K      -  2.96M  -
myfirstpool/mysecondDS            1003M  6.00G  1003M  /myfirstpool/mysecondDS
</console>
  
=== Streaming ZFS datasets ===

A ZFS snapshot cannot only be cloned or explored, it can also be streamed to a local file or even over the network, which allows you to back up a dataset or simply make an exact bit-for-bit copy of it on another machine. Snapshots being differential (i.e. incremental) by nature, very little network overhead is induced when consecutive snapshots are streamed over the network. A nifty move from the designers was to use ''stdin'' and ''stdout'' as transmission/reception channels, allowing great flexibility in processing the ZFS stream: you can envisage, for instance, compressing the stream, then encrypting it, then encoding it in base64, then signing it, and so on. It sounds a bit overkill, but it is possible, and in the general case you can use any tool that swallows data from ''stdin'' and spits it out through ''stdout'' in your plumbing.

First things first, just to illustrate some basic concepts, here is how to stream a ZFS dataset snapshot to a local file:

<console>
###i## zfs send myfirstpool/myfirstDS@snapshot-1 > /tmp/myfirstpool-myfirstDS@snapshot-snap1
</console>
  
Now let's stream it back:

<console>
###i## cat /tmp/myfirstpool-myfirstDS@snapshot-snap1 | zfs receive myfirstpool/myfirstDS@testrecv
cannot receive new filesystem stream: destination 'myfirstpool/myfirstDS' exists
must specify -F to overwrite it
</console>

Ouch... ZFS refuses to go any step further because some data would be overwritten. We do not have any critical data on the dataset, so we could destroy it and try again, or use a different name. Nevertheless, just for the sake of the demonstration, let's create another zpool prior to restoring the dataset there:
  
<console>
###i## dd if=/dev/zero of=/tmp/zfs-test-disk04.img bs=2G count=1
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 6.35547 s, 338 MB/s
###i## losetup -f           
/dev/loop4
###i## losetup /dev/loop4 /tmp/zfs-test-disk04.img
###i## zpool create testpool /dev/loop4
###i## zpool list
NAME          SIZE  ALLOC  FREE    CAP  DEDUP  HEALTH  ALTROOT
myfirstpool  7.94G  1.81G  6.12G    22%  1.00x  ONLINE  -
testpool    1.98G  89.5K  1.98G    0%  1.00x  ONLINE  -
</console>
  
Take two:

<console>
###i## cat /tmp/myfirstpool-myfirstDS@snapshot-snap1 | zfs receive testpool/myfirstDS@testrecv
###i## zfs list -t all
NAME                              USED  AVAIL  REFER  MOUNTPOINT
myfirstpool                      1.81G  6.00G  850M  /myfirstpool
myfirstpool/myfirstDS            2.96M  6.00G  2.96M  /myfirstpool/myfirstDS
myfirstpool/myfirstDS@snapshot-1    1K      -  2.96M  -
myfirstpool/mysecondDS            1003M  6.00G  1003M  /myfirstpool/mysecondDS
testpool                          3.08M  1.95G    31K  /testpool
testpool/myfirstDS                2.96M  1.95G  2.96M  /testpool/myfirstDS
testpool/myfirstDS@testrecv          0      -  2.96M  -
</console>
  
Very interesting things happened there! First, the data previously stored in the file ''/tmp/myfirstpool-myfirstDS@snapshot-snap1'' has been copied as a snapshot into the destination zpool (''testpool'' here), named exactly as given on the command line. Second, a clone of this snapshot has been created for you by ZFS, and ''testpool/myfirstDS'' now appears as a live ZFS dataset where data can be read and written! Think for two seconds about the error message we got just above: the reason ZFS protested becomes clear now.

An alternative would have been to use the original zpool, but this time with a different name for the dataset:
  
<console>
###i## cat /tmp/myfirstpool-myfirstDS@snapshot-snap1 | zfs receive myfirstpool/myfirstDS_copy@testrecv
###i## zfs list -t all                                                                               
NAME                                  USED  AVAIL  REFER  MOUNTPOINT
myfirstpool                          1.82G  6.00G  850M  /myfirstpool
myfirstpool/myfirstDS                2.96M  6.00G  2.96M  /myfirstpool/myfirstDS
myfirstpool/myfirstDS@snapshot-1        1K      -  2.96M  -
myfirstpool/myfirstDS_copy          2.96M  6.00G  2.96M  /myfirstpool/myfirstDS_copy
myfirstpool/myfirstDS_copy@testrecv      0      -  2.96M  -
myfirstpool/mysecondDS              1003M  6.00G  1003M  /myfirstpool/mysecondDS
</console>
  
Now something a bit more interesting: instead of using a local file, we will stream the dataset to a Solaris 11 machine (OpenIndiana can be used as well) over the network, using '''netcat''' (''net-analyzer/netcat'') on TCP port 7000. In this case the Solaris host is an x86 machine, but a SPARC machine would have given the exact same result, as ZFS, contrary to UFS, is platform-agnostic.
  
On the Solaris machine:

<console>
###i## nc -l -p 7000 | zfs receive nas/zfs-stream-test@s1
</console>

On the Linux machine:

<console>
###i## zfs send myfirstpool/myfirstDS@snapshot-1 | nc 192.168.1.13 7000
</console>
  
{{Fancyimportant|At the time of writing we found an issue: '''zfs send''' seems to never return, so the '''nc''' command waits forever... A Ctrl-C must be sent manually.}}

After the dataset has been received on the Solaris machine, the ''nas'' zpool now contains the sent snapshot and its corresponding clone, the latter being automatically created:

<console>
###i## zfs list -t snapshot
NAME                                          USED  AVAIL  REFER  MOUNTPOINT
(...)
nas/zfs-stream-test                          3.02M  6.17T  3.02M  /nas/zfs-stream-test
nas/zfs-stream-test@s1                          0      -  3.02M  -
</console>

{{Fancynote|We took only a simple case here: ZFS is able to handle snapshots in a very flexible way. You can ask, for example, to combine several consecutive snapshots and send them as a single snapshot, or you can choose to proceed in incremental steps. A '''man zfs''' will tell you all about the art of streaming your snapshots.}}
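To give a concrete taste of that flexibility, here is a minimal sketch of a full send followed by an incremental one, piped through '''gzip''' and '''ssh''' (the host ''backuphost'' and the destination pool ''backup'' are hypothetical, and the example assumes the dataset still holds both snapshots):

<console>
###i## zfs send myfirstpool/myfirstDS@snapshot-1 | gzip | ssh backuphost "gunzip | zfs receive backup/myfirstDS@snapshot-1"
###i## zfs send -i snapshot-1 myfirstpool/myfirstDS@snapshot-2 | gzip | ssh backuphost "gunzip | zfs receive backup/myfirstDS"
</console>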
== Govern a dataset by attributes ==

So far, most filesystem capabilities have been driven by separate and scattered command line tools (e.g. tune2fs, edquota, rquota, quotacheck...), each with its own way of handling its task, sometimes in tricky ways, especially the quota-related management utilities. Moreover, there was no easy way to put limitations on a directory other than placing it on a dedicated partition or logical volume, implying downtime when additional space had to be added. Quota management is, however, only one of the many facets of disk space management.

In the ZFS world, many of these aspects are managed by simply setting or clearing a property attached to a ZFS dataset through the now well-known '''zfs''' command. You can, for example:

* put a size limit on a dataset
* reserve space for a dataset (that space is ''guaranteed'' to be available in the future, although it is not allocated at the time the reservation is made)
* control whether new files are encrypted and/or compressed
* define a quota per user or group of users
* control checksum usage => '''never turn that property off unless you have very good reasons, which you are likely to never have''' (no checksums = no silent data corruption detection)
* share a dataset via NFS/CIFS
* control automatic data deduplication

Not all dataset properties are settable; some of them are set and managed by the operating system in the background for you and thus cannot be modified.

{{fancynote|Solaris/OpenIndiana users: ZFS has a tight integration with the NFS/CIFS server, so it is possible to share a ZFS dataset by setting the adequate attributes. ZFS on Linux (the native kernel-mode port) also has a tight integration with the built-in Linux NFS server, and the same goes for ZFS-FUSE, although it is still experimental. Under FreeBSD, ZFS integration has been done both with NFS and Samba (CIFS).}}

Like any other action concerning datasets, properties are set and unset via the '''zfs''' command. On our Funtoo box running zfs-Fuse we can, for example, start by viewing the value of all properties for the dataset ''myfirstpool/myfirstDS'':
<pre>
# zfs get all myfirstpool/myfirstDS
NAME                  PROPERTY              VALUE                  SOURCE
myfirstpool/myfirstDS  type                  filesystem              -
myfirstpool/myfirstDS  creation              Sun Sep  4 23:34 2011  -
myfirstpool/myfirstDS  used                  73.8M                  -
myfirstpool/myfirstDS  available            5.47G                  -
myfirstpool/myfirstDS  referenced            73.8M                  -
myfirstpool/myfirstDS  compressratio        1.00x                  -
myfirstpool/myfirstDS  mounted              yes                    -
myfirstpool/myfirstDS  quota                none                    default
myfirstpool/myfirstDS  reservation          none                    default
myfirstpool/myfirstDS  recordsize            128K                    default
myfirstpool/myfirstDS  mountpoint            /myfirstpool/myfirstDS  default
myfirstpool/myfirstDS  sharenfs              off                    default
myfirstpool/myfirstDS  checksum              on                      default
myfirstpool/myfirstDS  compression          off                    default
myfirstpool/myfirstDS  atime                on                      default
myfirstpool/myfirstDS  devices              on                      default
myfirstpool/myfirstDS  exec                  on                      default
myfirstpool/myfirstDS  setuid                on                      default
myfirstpool/myfirstDS  readonly              off                    default
myfirstpool/myfirstDS  zoned                off                    default
myfirstpool/myfirstDS  snapdir              hidden                  default
myfirstpool/myfirstDS  aclmode              groupmask              default
myfirstpool/myfirstDS  aclinherit            restricted              default
myfirstpool/myfirstDS  canmount              on                      default
myfirstpool/myfirstDS  xattr                on                      default
myfirstpool/myfirstDS  copies                1                      default
myfirstpool/myfirstDS  version              4                      -
myfirstpool/myfirstDS  utf8only              off                    -
myfirstpool/myfirstDS  normalization        none                    -
myfirstpool/myfirstDS  casesensitivity      sensitive              -
myfirstpool/myfirstDS  vscan                off                    default
myfirstpool/myfirstDS  nbmand                off                    default
myfirstpool/myfirstDS  sharesmb              off                    default
myfirstpool/myfirstDS  refquota              none                    default
myfirstpool/myfirstDS  refreservation        none                    default
myfirstpool/myfirstDS  primarycache          all                    default
myfirstpool/myfirstDS  secondarycache        all                    default
myfirstpool/myfirstDS  usedbysnapshots      18K                    -
myfirstpool/myfirstDS  usedbydataset        73.8M                  -
myfirstpool/myfirstDS  usedbychildren        0                      -
myfirstpool/myfirstDS  usedbyrefreservation  0                      -
myfirstpool/myfirstDS  logbias              latency                default
myfirstpool/myfirstDS  dedup                off                    default
myfirstpool/myfirstDS  mlslabel              off                    -
</pre>
How can we set a limit that prevents ''myfirstpool/myfirstDS'' from using more than 1 GB of space in the pool? Simple, just set the ''quota'' property:

<pre>
# zfs set quota=1G myfirstpool/myfirstDS
# zfs get quota myfirstpool/myfirstDS
NAME                  PROPERTY  VALUE  SOURCE
myfirstpool/myfirstDS  quota    1G    local
</pre>
Maybe something piqued your curiosity: ''what does "SOURCE" mean?'' "SOURCE" describes how the property has been determined for the dataset and can take several values:
* '''local''': the property has been explicitly set for this dataset
* '''default''': a default value has been assigned by the operating system because the property was not explicitly set by the system administrator (e.g. whether SUID is allowed or not in the above example)
* '''dash (-)''': a non-modifiable intrinsic property (e.g. dataset creation time, whether the dataset is currently mounted or not, dataset space usage in the pool, average compression ratio...)
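As a side note, '''zfs get''' can filter on that SOURCE column with its ''-s'' option, which is handy for listing only the properties you explicitly set yourself; a minimal sketch:

<pre>
# zfs get -s local all myfirstpool/myfirstDS
NAME                  PROPERTY  VALUE  SOURCE
myfirstpool/myfirstDS  quota    1G    local
</pre>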
Before copying some files into the dataset, let's also set a binary (on/off) property:

<pre>
# zfs set compression=on myfirstpool/myfirstDS
</pre>

Now try to put more than 1 GB of data in the dataset:

<pre>
# dd if=/dev/zero of=/myfirstpool/myfirstDS/one-GB-test bs=2G count=1
dd: writing `/myfirstpool/myfirstDS/one-GB-test': Disk quota exceeded
</pre>
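Since compression was just turned on, you may also want to check how well the data already written actually compresses; the ''compressratio'' property from the table above reports it. A minimal sketch (the value depends on what has really been written to the dataset):

<pre>
# zfs get compressratio myfirstpool/myfirstDS
NAME                  PROPERTY       VALUE  SOURCE
myfirstpool/myfirstDS  compressratio  1.00x  -
</pre>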
== Permission delegation ==

ZFS brings a feature known as delegated administration. Delegated administration enables ordinary users to handle administrative tasks on a dataset without being administrators. '''It is however not a sudo replacement, as it covers only ZFS-related tasks''' such as sharing/unsharing, disk quota management and so on. Permission delegation shines in flexibility because such delegations can be inherited through nested datasets. Permission delegation is handled via '''zfs''' through its '''allow''' and '''unallow''' subcommands.
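A minimal sketch of what a delegation looks like, assuming a hypothetical ordinary user named ''alice'' (the listing format is roughly what '''zfs allow''' prints):

<console>
###i## zfs allow alice snapshot,rollback,mount myfirstpool/myfirstDS
###i## zfs allow myfirstpool/myfirstDS
---- Permissions on myfirstpool/myfirstDS ----------------------------
Local+Descendent permissions:
        user alice mount,rollback,snapshot
</console>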
= Data redundancy with ZFS =

Nothing is perfect, and storage media (even datacenter-class equipment) are prone to failure on a regular basis. Having data redundancy is mandatory to help prevent single points of failure (SPoF). Over the past decades RAID technologies have been powerful, however their power is precisely their weakness: since they operate at the block level, they do not care about what is stored in the data blocks and have no way to interact with the filesystems stored on them to ensure data integrity is properly handled.
== Some statistics ==

It is no secret that a general trend in the IT industry is the exponential growth of data quantities. Just think about the amount of data YouTube, Google or Facebook generates every day. Taking the case of the first, [http://www.website-monitoring.com/blog/2010/05/17/youtube-facts-and-figures-history-statistics some statistics] give:
* 24 hours of video uploaded every ''minute'' in March 2010 (May 2009 - 20h / October 2008 - 15h / May 2008 - 13h)
* More than 2 ''billion'' views a day
* More video is produced on YouTube every 60 days than the 3 major US broadcasting networks did in the last 60 years

Facebook is also impressive (Facebook's own stats):

* over 900 million objects that people interact with (pages, groups, events and community pages)
* the average user creates 90 pieces of content each month (750 million active users)
* more than 2.5 million websites have integrated with Facebook

What is true of Facebook and YouTube is also true in many other cases (think for a minute about the amount of data stored in iTunes), especially with the growing popularity of cloud computing infrastructures. Despite the progress of technology, a "bottleneck" remains: storage reliability is nearly the same as it was years ago. If only one organization in the world generated huge quantities of data, it would be [http://public.web.cern.ch CERN] (''Conseil Européen pour la Recherche Nucléaire'', now officially known as the ''European Organization for Nuclear Research''), as their experiments can generate spikes of many terabytes of data within a few seconds. A study done in 2007, quoted by a [http://www.zdnet.com/blog/storage/data-corruption-is-worse-than-you-know/191 ZDNet article], reveals that:

* Even ECC memory cannot always be helpful: 3 double-bit (uncorrectable) errors occurred in 3 months on 1300 nodes. Bad news: it should be '''zero'''.
* RAID systems cannot protect in all cases: monitoring 492 RAID controllers for 4 weeks showed an average error rate of 1 per ~10^14 bits, giving roughly 300 errors for every 2.4 petabytes.
* Magnetic storage is still not reliable, even on high-end datacenter-class drives: 500 errors were found over 100 nodes while writing a 2 GB file to 3000+ nodes every 2 hours, then reading it again and again for 5 weeks.

Overall this means: 22 corrupted files (1 in every 1500 files) for a grand total of 33700 files holding 8.7 TB of data. And this study is 5 years old...
== Sources of silent data corruption ==

A good overview is given in [http://www.zdnet.com/blog/storage/50-ways-to-lose-your-data/168 this ZDNet article]. It is not an exhaustive list, but we can quote:

* a cheap controller or a buggy driver that does not report errors/pre-failure conditions to the operating system;
* "bit-leaking": a hard drive consists of many concentric magnetic tracks. When the hard drive's magnetic head writes bits on the magnetic surface it generates a very weak magnetic field, nevertheless sufficient to "leak" onto the next track and flip some bits. Drives can generally compensate for those situations because they also record some error correction data on the magnetic surface;
* magnetic surface defects (weak sectors);
* hard drive firmware bugs;
* cosmic rays hitting your RAM chips or hard drive cache memory/electronics.
== Building a mirrored pool ==
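A minimal sketch of what creating such a pool looks like, reusing the loopback-device approach seen earlier in this tutorial (device names and output are illustrative):

<console>
###i## zpool create mymirrorpool mirror /dev/loop5 /dev/loop6
###i## zpool status mymirrorpool
  pool: mymirrorpool
 state: ONLINE
(...)
config:

        NAME          STATE     READ WRITE CKSUM
        mymirrorpool  ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            loop5     ONLINE       0     0     0
            loop6     ONLINE       0     0     0
</console>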
== ZFS RAID-Z ==

=== ZFS/RAID-Z vs RAID-5 ===

RAID-5 is very commonly used nowadays because of its simplicity, efficiency and fault-tolerance. Although the technology has proved itself over decades, it has a major drawback known as the "RAID-5 write hole". If you are familiar with RAID-5 you already know that it consists of spreading stripes across all of the disks of the array and interleaving them with a special stripe called the parity. Several schemes of spreading stripes/parity between disks exist in the wild, each one with its own pros and cons; the "standard" one (also known as ''left-asynchronous'') is:

<pre>
Disk_0  | Disk_1  | Disk_2  | Disk_3
[D0_S0] | [D0_S1] | [D0_S2] | [D0_P]
[D1_S0] | [D1_S1] | [D1_P]  | [D1_S2]
[D2_S0] | [D2_P]  | [D2_S1] | [D2_S2]
[D3_P]  | [D3_S0] | [D3_S1] | [D3_S2]
</pre>
The parity is simply computed by XORing the stripes of the same "row", giving the general equation:
* [Dn_S0] XOR [Dn_S1] XOR ... XOR [Dn_Sm] XOR [Dn_P] = 0
This equation can be rewritten in several ways:
* [Dn_S0] XOR [Dn_S1] XOR ... XOR [Dn_Sm] = [Dn_P]
* [Dn_S1] XOR [Dn_S2] XOR ... XOR [Dn_Sm] XOR [Dn_P] = [Dn_S0]
* [Dn_S0] XOR [Dn_S2] XOR ... XOR [Dn_Sm] XOR [Dn_P] = [Dn_S1]
* ...and so on!

Because the equations are combinations of exclusive-ors, it is possible to easily compute any one term if it is missing. Let's say we have 3 stripes plus one parity, composed of 4 bits each, but one of them is missing due to a disk failure:

* D0_S0 = 1011
* D0_S1 = 0010
* D0_S2 = <missing>
* D0_P  = 0110

However, we know that:
* D0_S0 XOR D0_S1 XOR D0_S2 XOR D0_P = 0000, which can be rewritten as:
* D0_S2 = D0_S0 XOR D0_S1 XOR D0_P

Applying boolean algebra, this gives: '''D0_S2 = 1011 XOR 0010 XOR 0110 = 1111'''.
Proof: '''1011 XOR 0010 XOR 1111 = 0110''', which is indeed '''D0_P'''.
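You can verify this reconstruction yourself from any shell; a minimal sketch using bash arithmetic and '''bc''' to print the result back in binary:

<console>
###i## echo "obase=2; $(( 2#1011 ^ 2#0010 ^ 2#0110 ))" | bc
1111
</console>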
'''So what's the deal?'''

Okay, now the funny part: forget the above hypothesis and imagine we have this:

* D0_S0 = 1011
* D0_S1 = 0010
* D0_S2 = 1101
* D0_P  = 0110

Applying boolean algebra magic gives 1011 XOR 0010 XOR 1101 => 0100. Problem: this is different from D0_P (0110). Can you tell which one (or which ''ones'') of the four terms lies? If you find a mathematically acceptable solution, found your own company, because you have just solved a big computer science problem. If humans can't answer the question, imagine how hard it is for the poor little RAID-5 controller to determine which stripe is right and which one lies, and the resulting "datageddon" (i.e. massive data corruption on the RAID-5 array) when the RAID-5 controller detects the error and starts to rebuild the array.

This is not science fiction, this is pure reality, and the weakness lies in RAID-5's very simplicity. Here is how it can happen: an urban legend about RAID-5 arrays is that they update stripes in an atomic transaction (all of the stripes+parity are written, or none of them). Too bad, this is just not true: the data is written on the fly, and if for one reason or another the machine hosting the RAID-5 array has a power outage or crashes, the RAID-5 controller will simply have no idea what it was doing, which stripes are up to date and which ones are not. Of course, RAID controllers in servers do have a replaceable on-board battery, and most of the time the server they reside in is connected to an auxiliary power source like a battery-based UPS or a diesel/gas generator. However, Murphy's law or unpredictable hazards can, sometimes, strike...

Another funny scenario: imagine a machine with a RAID-5 array (on a UPS this time) but with non-ECC memory. The RAID-5 controller splits the data buffer into stripes, computes the parity stripe and starts to write them on the different disks of the array. But... but... but... For some odd reason, only one bit in one of the stripes flips (cosmic rays, RFI...) after the parity calculation. Too bad, too sad: one of the written stripes contains corrupted data and it is silently written to the array. Datageddon in sight!

Not to freak you out: storage units have sophisticated error correction capabilities (a magnetic surface or an optical recording surface is not perfect, and reading/writing errors occur) masking most of the cases. However, some established statistics estimate that even with error correction mechanisms, one bit out of every 10^16 bits transferred is incorrect. 10^16 is really huge, but unfortunately, at this beginning of the XXIst century, with datacenters brewing massive amounts of data across several hundreds, not to say thousands, of servers, this number starts to give headaches: '''a big datacenter can face silent data corruption every 15 minutes''' (Wikipedia). No typo here: a potential disaster may silently appear 4 times an hour, every single day of the year. Detection techniques exist, but traditional RAID-5 arrays in themselves can be a problem. Ironic for such a popular and widely used solution :)

If RAID-5 was an acceptable trade-off in past decades, it has simply had its time. RAID-5 is dead? '''*Hooray!*'''
= More advanced topics =

== ZFS Intent Log (ZIL) ==

= Final words and lessons learned =

ZFS surpasses by far (as of September 2011) all of the well-known filesystems out there: none of them proposes such an integration of features, and certainly not with this simplicity of management and robustness. However, in the Linux world it is definitely a no-go in the short term, especially for production systems. The two known implementations are not ready for production environments and lack some important features or behave in a clunky manner; this is absolutely fair, as none of them pretends to be at that level of maturity, and the licensing incompatibility between the code opened by Sun Microsystems some years ago and the GNU GPL does not help the cause. However, both look '''very promising''' once their rough corners have been rounded off.

For a Linux system, the nearest plan B, if you seek a filesystem covering some of the functionality offered by ZFS, is BTRFS (still considered experimental, so be prepared for a disaster sooner or later, although BTRFS has been used by some Funtoo core team members for 2 years and has proved to be quite stable in practice). BTRFS however does not push the limits as far as ZFS does: it has no built-in snapshot-differencing tool, it does not implement built-in filesystem streaming capabilities, and rolling back a BTRFS subvolume is a bit more manual than in ''"the ZFS way of life"''.
= Footnotes & references =

Source: [http://docs.huihoo.com/opensolaris/solaris-zfs-administration-guide/html/index.html solaris-zfs-administration-guide]

<references/>

[[Category:HOWTO]]
[[Category:Labs]]
[[Category:Articles]]
[[Category:Filesystems]]
[[Category:Featured]]

Revision as of 03:20, 5 March 2014


Introduction

This tutorial will show you how to install Funtoo on ZFS (rootfs). This tutorial is meant to be an "overlay" over the Regular Funtoo Installation. Follow the normal installation and only use this guide for steps 2, 3, and 8.

Introduction to ZFS

Since ZFS is a new technology for Linux, it can be helpful to understand some of its benefits, particularly in comparison to BTRFS, another popular next-generation Linux filesystem:

  • On Linux, the ZFS code can be updated independently of the kernel to obtain the latest fixes. btrfs is exclusive to Linux and you need to build the latest kernel sources to get the latest fixes.
  • ZFS is supported on multiple platforms. The platforms with the best support are Solaris, FreeBSD and Linux. Other platforms with varying degrees of support are NetBSD, Mac OS X and Windows. btrfs is exclusive to Linux.
  • ZFS uses the Adaptive Replacement Cache (ARC) algorithm while btrfs uses the Linux kernel's Least Recently Used (LRU) replacement algorithm. The former often has an overwhelmingly superior hit rate, which means fewer disk accesses.
  • ZFS has the ZFS Intent Log and SLOG devices, which accelerate small synchronous write performance.
  • ZFS handles internal fragmentation gracefully, such that you can fill it up to 100%. Internal fragmentation in btrfs can make btrfs think it is full at 10%. btrfs has no automatic rebalancing code, so it requires a manual rebalance to correct it.
  • ZFS has raidz, which is like RAID 5/6 (or a hypothetical RAID 7 that supports 3 parity disks), except it does not suffer from the RAID write hole issue thanks to its use of CoW and a variable stripe size. btrfs gained integrated RAID 5/6 functionality in Linux 3.9. However, its implementation uses a stripe cache that can only partially mitigate the effect of the RAID write hole.
  • ZFS send/receive implementation supports incremental update when doing backups. btrfs' send/receive implementation requires sending the entire snapshot.
  • ZFS supports data deduplication, which is a memory hog and only works well for specialized workloads. btrfs has no equivalent.
  • ZFS datasets have a hierarchical namespace while btrfs subvolumes have a flat namespace.
  • ZFS has the ability to create virtual block devices called zvols in its namespace. btrfs has no equivalent and must rely on the loop device for this functionality, which is cumbersome.

The only area where btrfs is ahead of ZFS is in the area of small file efficiency. btrfs supports a feature called block suballocation, which enables it to store small files far more efficiently than ZFS. It is possible to use another filesystem (e.g. reiserfs) on top of a ZFS zvol to obtain similar benefits (with arguably better data integrity) when dealing with many small files (e.g. the portage tree).

For a quick tour of ZFS and a big-picture view of its common operations, you can consult the ZFS Fun page.

Disclaimers

Warning: This guide is a work in progress. Expect some quirks.
Important: Since ZFS was really designed for 64 bit systems, we are only recommending and supporting 64 bit platforms and installations. We will not be supporting 32 bit platforms!

Video Tutorial

As a companion to the installation instructions below, a YouTube video tutorial is now available:

Downloading the ISO (With ZFS)

In order for us to install Funtoo on ZFS, you will need an environment that already provides the ZFS tools. Therefore we will download a customized version of System Rescue CD with ZFS included.

Name: sysresccd-4.0.1_zfs_0.6.2.iso  (545 MB)
Release Date: 2014-02-25
md5sum 01f4e6929247d54db77ab7be4d156d85


Download System Rescue CD with ZFS
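
Once the download finishes, it is worth verifying the image before writing it to a USB stick (a minimal check; adjust the path to wherever you saved the ISO, and compare the output with the md5sum listed above):

Verify the ISO checksum
# md5sum /root/sysresccd-4.0.1_zfs_0.6.2.iso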

Creating a bootable USB from ISO (From a Linux Environment)

After you download the iso, you can do the following steps to create a bootable USB:

Make a temporary directory
# mkdir /tmp/loop

Mount the iso
# mount -o ro,loop /root/sysresccd-4.0.1_zfs_0.6.2.iso /tmp/loop

Run the usb installer
# /tmp/loop/usb_inst.sh

That should be all you need to do to get your flash drive working.
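
Optionally, once usb_inst.sh has finished, you can unmount the ISO again (a small cleanup step, assuming nothing else is still using /tmp/loop):

Unmount the iso
# umount /tmp/loop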

Booting the ISO

Warning: When booting into the ISO, make sure that you select the "Alternate 64 bit kernel (altker64)". The ZFS modules have been built specifically for this kernel rather than the standard kernel. If you select a different kernel, you will get a "failed to load module stack" error message.

Creating partitions

There are two ways to partition your disk: You can use your entire drive and let ZFS automatically partition it for you, or you can do it manually.

We will show you how to partition the disk manually, because doing so lets you create your own layout, gives you a separate /boot partition (which is nice since not every bootloader supports booting from ZFS pools), and lets you boot from RAID10, RAID5 (RAIDZ) or any other pool layout precisely because /boot lives on its own partition.

gdisk (GPT Style)

A Fresh Start:

First, let's make sure that the disk is completely wiped of any previous disk labels and partitions. We will also assume that /dev/sda is the target drive.

# sgdisk -Z /dev/sda
Warning: This is a destructive operation and the program will not ask you for confirmation! Make sure you really don't want anything on this disk.

Now that we have a clean drive, we will create the new layout.

First open up the application:

# gdisk /dev/sda

Create Partition 1 (boot):

Command: n ↵
Partition Number: 
First sector: 
Last sector: +250M ↵
Hex Code: 

Create Partition 2 (BIOS Boot Partition):

Command: n ↵
Partition Number: 
First sector: 
Last sector: +32M ↵
Hex Code: EF02 ↵

Create Partition 3 (ZFS):

Command: n ↵
Partition Number: 
First sector: 
Last sector: 
Hex Code: bf00 ↵

Command: p ↵

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048          514047   250.0 MiB   8300  Linux filesystem
   2          514048          579583   32.0 MiB    EF02  BIOS boot partition
   3          579584      1953525134   931.2 GiB   BF00  Solaris root

Command: w ↵


Format your /boot partition

# mkfs.ext2 -m 1 /dev/sda1

Encryption (Optional)

If you want encryption, then create your encrypted vault(s) now by doing the following:

# cryptsetup luksFormat /dev/sda3
# cryptsetup luksOpen /dev/sda3 vault_1
Warning: On some machines, a combination of ZFS and LUKS has caused instability and system crashes.

Create the zpool

We will first create the pool. The pool will be named `tank` and the disk will be aligned to 4096-byte sectors (using ashift=12).

# zpool create -f -o ashift=12 -o cachefile= -O compression=on -m none -R /mnt/funtoo tank /dev/sda3
Important: If you are using encrypted root, change /dev/sda3 to /dev/mapper/vault_1.
Note: If you have a previous pool that you would like to import, you can do a: zpool import -f -R /mnt/funtoo <pool_name>.

Create the zfs datasets

We will now create some datasets. For this installation, we will create a small but future-proof set of datasets. We will have a dataset for the OS (/) and one for swap. We will also show you how to create some optional datasets: /home, /var, /usr/src, and /usr/portage.

Create some empty containers for organization purposes, and make the dataset that will hold /
# zfs create -p tank/funtoo
# zfs create -o mountpoint=/ tank/funtoo/root

Optional, but recommended datasets: /home
# zfs create -o mountpoint=/home tank/funtoo/home

Optional datasets: /usr/src, /usr/portage/{distfiles,packages}
# zfs create -o mountpoint=/usr/src tank/funtoo/src
# zfs create -o mountpoint=/usr/portage -o compression=off tank/funtoo/portage
# zfs create -o mountpoint=/usr/portage/distfiles tank/funtoo/portage/distfiles
# zfs create -o mountpoint=/usr/portage/packages tank/funtoo/portage/packages

Create your swap zvol

For modern machines that have more than 4 GB of RAM, a swap size of 2G should be enough. However, if your machine doesn't have a lot of RAM, the rule of thumb is either 2x the RAM or RAM + 1 GB.

For this tutorial we will assume that it is a newer machine and make a 2 GB swap.

# zfs create -o sync=always -o primarycache=metadata -o secondarycache=none -o volblocksize=4K -V 2G tank/swap
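
As a worked example of the rule of thumb above: a machine with only 2 GB of RAM would instead get a 3 GB (RAM + 1 GB) to 4 GB (2x RAM) swap zvol; only the -V value changes (the size below is illustrative):

# zfs create -o sync=always -o primarycache=metadata -o secondarycache=none -o volblocksize=4K -V 4G tank/swap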

Format your swap zvol

# mkswap -f /dev/zvol/tank/swap
# swapon /dev/zvol/tank/swap

Now we will continue to install funtoo.

Installing Funtoo

Pre-Chroot

Go into the directory that you will chroot into
# cd /mnt/funtoo

Make a boot folder and mount your boot drive
# mkdir boot
# mount /dev/sda1 boot

Now download and extract the Funtoo stage3 ...
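
As a sketch of that step (the exact stage3 URL depends on your architecture and the current build, so treat the one below as a placeholder and adjust it for your system):

Download and extract the stage3 tarball (illustrative URL)
# wget http://build.funtoo.org/funtoo-current/x86-64bit/generic_64/stage3-latest.tar.xz
# tar xpf stage3-latest.tar.xz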

Once you've extracted the stage3, do a few more preparations and chroot into your new funtoo environment:

Bind the kernel related directories
# mount -t proc none proc
# mount --rbind /dev dev
# mount --rbind /sys sys

Copy network settings
# cp -f /etc/resolv.conf etc

Make the zfs folder in 'etc' and copy your zpool.cache
# mkdir etc/zfs
# cp /etc/zfs/zpool.cache etc/zfs

Chroot into Funtoo
# env -i HOME=/root TERM=$TERM chroot . bash -l

In Chroot

Create a symbolic link to your mountpoints
# ln -sf /proc/mounts /etc/mtab

Sync your tree
# emerge --sync

Add filesystems to /etc/fstab

Before we compile and/or install our kernel in the next step, we will edit the /etc/fstab file, because if we decide to install our kernel through portage, portage will need to know where our /boot is so that it can place the files in there.

Edit /etc/fstab:

# <fs>                  <mountpoint>    <type>          <opts>          <dump/pass>

/dev/sda1               /boot           ext2            defaults        0 2
/dev/zvol/tank/swap     none            swap            sw              0 0

Kernel Configuration

To speed up this step, you can install a pre-configured/compiled kernel called bliss-kernel. This kernel already has the correct configurations for ZFS and a variety of other scenarios. It's a vanilla kernel from kernel.org without any external patches.

To install sys-kernel/bliss-kernel type the following:

# emerge bliss-kernel

Now make sure that your /usr/src/linux symlink is pointing to this kernel by typing the following:

# eselect kernel list
Available kernel symlink targets:
[1]   linux-3.12.13-KS.02 *

You should see a star next to the version you installed. In this case it was 3.12.13-KS.02. If it's not set, you can type eselect kernel set #.
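
For example, to point the symlink at the first entry listed above:

# eselect kernel set 1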

Installing the ZFS userspace tools and kernel modules

Emerge sys-fs/zfs. This package will bring in sys-kernel/spl, and sys-fs/zfs-kmod as its dependencies:

# emerge zfs

Check to make sure that the zfs tools are working. Thanks to the zpool.cache file that you copied before, your pool should be displayed.

# zpool status
# zfs list

If everything worked, continue.

Installing & Configuring the Bootloader

GRUB 2 (Optional if you are using another bootloader)

# emerge grub

You can check that grub is version 2.00 by typing the following command:

# grub-install --version
grub-install (GRUB) 2.00

Now install grub to the drive itself (not a partition):

# grub-install /dev/sda

You should receive the following message:

Installation finished. No error reported.

You should now see a grub directory with some files inside your /boot folder:

# ls -l /boot/grub
total 2520
-rw-r--r-- 1 root root    1024 Jan  4 16:09 grubenv
drwxr-xr-x 2 root root    8192 Jan 12 14:29 i386-pc
drwxr-xr-x 2 root root    4096 Jan 12 14:28 locale
-rw-r--r-- 1 root root 2555597 Feb  4 11:50 unifont.pf2

Extlinux (Optional if you are using another bootloader)

To install extlinux, you can follow the guide here: Link to Extlinux Guide.
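
In short, the installation itself is a single emerge (a sketch; the package name is an assumption here, since extlinux ships as part of the syslinux package on Funtoo/Gentoo):

# emerge syslinux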

LILO (Optional if you are using another bootloader)

To install lilo you can type the following:

# emerge lilo

boot-update

boot-update comes as a dependency of grub2, so if you already installed grub, it's already on your system!

Genkernel

If you're using genkernel, you must add 'real_root=ZFS=<root>' and 'dozfs' to your params. Example entry for /etc/boot.conf:

"Funtoo ZFS" {
        kernel vmlinuz[-v]
        initrd initramfs-genkernel-x86_64[-v]
        params real_root=ZFS=tank/funtoo/root
        params += dozfs=force
        # Also add 'params += crypt_root=/dev/sda3' if you used encryption
        # Adjust the above setting to your system if needed
}

Bliss Initramfs Creator

If you used Bliss Initramfs Creator then all you need to do is add 'root=<root>' to your params. Example entry for /etc/boot.conf:

"Funtoo ZFS" {
        kernel vmlinuz[-v]
        initrd initrd[-v]
        params root=tank/funtoo/root quiet
        # If you have an encrypted device with a regular passphrase,
        # you can add the following line
        params += enc_root=/dev/sda3 enc_type=pass
}

After editing /etc/boot.conf, you just need to run boot-update to update grub.cfg

# boot-update

bliss-boot

This is a new program that is designed to generate a simple, human-readable/editable, configuration file for a variety of bootloaders. It currently supports grub2, extlinux, and lilo.

You can install it via the following command:

# emerge bliss-boot

Bootloader Configuration

In order to generate our bootloader configuration file, we will first configure bliss-boot so that it knows what we want. The 'bliss-boot' configuration file is located in /etc/bliss-boot/conf.py. Open that file and make sure that the following variables are set appropriately:

# This should be set to the bootloader you installed earlier: (grub2, extlinux, and lilo are the available options)
bootloader = "grub2"

# This should be set to the kernel you installed earlier
default = "3.12.13-KS.02" 

Scroll all the way down until you find 'kernels'. You will need to add the kernels and the options you want for these kernels here. Below are a few configuration options depending on whether you are using bliss-initramfs or genkernel.

Genkernel
kernel = {
    '3.12.13-KS.02' : 'real_root=ZFS=tank/funtoo/root dozfs=force quiet',
}

If you are using encryption you can add the crypt_root option:

kernel = {
    '3.12.13-KS.02' : 'real_root=ZFS=tank/funtoo/root dozfs=force crypt_root=/dev/sda3 quiet',
}
Bliss Initramfs Creator
kernel = {
    '3.12.13-KS.02' : 'root=tank/funtoo/root quiet',
}

If you are using encryption then you would let the initramfs know:

  1. "What type of encryption authentication you want to use? (enc_type=)
  • pass = will ask for passphrase directly
  • key = a plain unencrypted key file
  • key_gpg = an encrypted key file
  1. "Where is the encrypted drive?" (enc_root=)
  2. "Where is the root pool after it has been decrypted?" (root=)
kernel = {
    '3.12.13-KS.02' : 'root=tank/funtoo/root enc_root=/dev/sda3 enc_type=pass quiet',
}

Generate the configuration

Now that we have configured our /etc/bliss-boot/conf.py file, we can generate our config. Simply run the following command:

# bliss-boot

This will generate, in your current directory, a configuration file for the bootloader you specified previously. You can check your config file beforehand to make sure it doesn't have any errors. Simply open either grub.cfg, extlinux.conf, or lilo.conf.
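
If you generated a grub2 config, you can also sanity-check its syntax with the grub-script-check tool that ships with GRUB 2 (assuming you run it from the directory containing the generated file):

# grub-script-check grub.cfg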

Once you have checked it for errors, place this file in the correct directory:

  • grub2 = /boot/grub/
  • extlinux = /boot/extlinux/
  • lilo = /etc/lilo.conf
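
For example, if you chose grub2, that just means moving the generated file into place (assuming it was generated in your current directory):

# mv grub.cfg /boot/grub/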

LILO (Optional if you are using another bootloader)

Now that bliss-boot generated the lilo.conf file, move that config file to its appropriate location and install lilo to the MBR:

# mv lilo.conf /etc
# lilo

You should see the following:

Warning: LBA32 addressing assumed
Added Funtoo + *
One warning was issued

Create the initramfs

There are two ways to do this: you can use "genkernel" or "bliss-initramfs". Both will be shown.

genkernel

Install genkernel and run it:

# emerge genkernel

You only need to add --luks if you used encryption
# genkernel --zfs --luks initramfs

Bliss Initramfs Creator

If you are encrypting your drives, then add the "luks" use flag to your package.use before emerging:

# echo "sys-kernel/bliss-initramfs luks" >> /etc/portage/package.use

Now install the program and run it:

# emerge bliss-initramfs

You can either run it without any parameters to get an interactive menu
or you can pass the parameters directly. 1 = zfs, 6 = encrypted zfs, and the kernel name.
# bliss-initramfs 1 3.12.13-KS.02
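
Following the mapping above, the encrypted-ZFS variant would be (same kernel name, different first parameter):

# bliss-initramfs 6 3.12.13-KS.02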

Moving into the correct location

Place the file that was generated by the above applications into either your /boot folder (If you are using boot-update) or into your /boot/kernels/3.12.13-KS.02 folder (If you are using bliss-boot). For bliss-boot, the file needs to be called 'initrd' rather than 'initrd-3.12.13-KS.02'.

boot-update

# mv initrd-3.12.13-KS.02 /boot

bliss-boot

# mv initrd-3.12.13-KS.02 /boot/kernels/3.12.13-KS.02/initrd

Final configuration

Add the zfs tools to openrc

# rc-update add zfs boot

Clean up and reboot

We are almost done; we just need to clean up, set our root password, unmount whatever we mounted, and get out.

Delete the stage3 tarball that you downloaded earlier so it doesn't take up space.
# cd /
# rm stage3-latest.tar.xz

Set your root password
# passwd
>> Enter your password, you won't see what you are writing (for security reasons), but it is there!

Get out of the chroot environment
# exit

Unmount all the kernel filesystem stuff and boot (if you have a separate /boot)
# umount -l proc dev sys boot

Turn off the swap
# swapoff /dev/zvol/tank/swap

Export the zpool
# cd /
# zpool export tank

Reboot
# reboot
Important: Don't forget to set your root password as stated above before exiting chroot and rebooting. If you don't set the root password, you won't be able to log into your new system.

and that should be enough to get your system to boot on ZFS.

After reboot

Forgot to reset password?

System Rescue CD

If you aren't using bliss-initramfs, then you can reboot back into your sysresccd and reset through there by mounting your drive, chrooting, and then typing passwd.

Example:

# zpool import -f -R /mnt/funtoo tank
# chroot /mnt/funtoo bash -l
# passwd
# exit
# zpool export -f tank
# reboot

Using bliss-initramfs

If you forgot to reset your password and are using bliss-initramfs, you can add the su option to your bootloader parameters and the initramfs will throw you into the rootfs of your drive. In there you can run 'passwd' and then type 'exit'. Once you type 'exit', the initramfs will continue to boot your system as normal.
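
With the /etc/boot.conf layout shown earlier, that could look like the following (a sketch only; the 'su' parameter name is taken from the description above):

"Funtoo ZFS" {
        kernel vmlinuz[-v]
        initrd initrd[-v]
        params root=tank/funtoo/root quiet
        params += su
}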

Create initial ZFS Snapshot

Continue to set up anything you need in terms of /etc configurations. Once you have everything the way you like it, take a snapshot of your system. You will be using this snapshot to revert back to this state if anything ever happens to your system down the road. The snapshots are cheap, and almost instant.

To take the snapshot of your system, type the following:

# zfs snapshot -r tank@install

To see if your snapshot was taken, type:

# zfs list -t snapshot

If your machine ever fails and you need to get back to this state, just type (This will only revert your / dataset while keeping the rest of your data intact):

# zfs rollback tank/funtoo/root@install
Important: For a detailed overview, presentation of ZFS' capabilities, as well as usage examples, please refer to the ZFS Fun page.

Troubleshooting

Starting from scratch

If your installation has gotten screwed up for whatever reason and you need a fresh start, you can do the following from sysresccd:

Destroy the pool and any snapshots and datasets it has
# zpool destroy -R -f tank

This wipes the filesystem on /dev/sda1 so that even after we zap the disk, recreating a partition at the exact same sector
position and size will not give us access to the old files in that partition.
# mkfs.ext2 /dev/sda1
# sgdisk -Z /dev/sda

Now start the guide again :).