User:Mrl5/Btrfs

From Funtoo
Revision as of 21:05, September 1, 2022 by Mrl5

Btrfs is a file system based on the copy-on-write (COW) principle, initially designed at Oracle Corporation for use in Linux. The development of btrfs began in 2007, and since August 2014 the file system's on-disk format has been marked as stable.

The Funtoo Linux project recommends btrfs as a next-generation filesystem, particularly for use in production.

Btrfs is intended to address the lack of pooling, snapshots, checksums, and integral multi-device spanning in Linux file systems.

Btrfs is easy to set up and use. This introduction sets up btrfs under Funtoo Linux using an existing debian-sources or debian-sources-lts kernel, such as the pre-built one that ships with Funtoo Linux. The btrfs storage pool will hold data that is not part of the Funtoo Linux installation itself: Funtoo Linux boots from a non-btrfs filesystem and, as part of the initialization process, initializes our btrfs storage and mounts it at the location of our choice.

Installation

Enabling btrfs support is as simple as enabling the btrfs mix-in and running a world update:

root # epro mix-in +btrfs
root # emerge -uDN @world

Btrfs is now ready for use.

Btrfs Concepts

Btrfs manages the physical disks it uses directly: disks are added to a btrfs volume, and btrfs then creates subvolumes from that volume, in which files are stored.

Unlike traditional Linux filesystems, btrfs filesystems will allocate storage on-demand from the underlying volume.

In the btrfs world, the word volume corresponds to a storage pool (ZFS) or a volume group (LVM).

  • devices - one or more underlying physical volumes.
  • volume - one large storage pool comprising all the space of the devices; it can support different redundancy levels.
  • subvolumes - these are what get mounted and where you store files.
  • snapshots - a read-only or read-write (aka clone) copy of a subvolume at a given point in time.

Creating a Volume

To create a basic btrfs volume, you will need an extra empty disk. Perform the following steps:

root #  mkfs.btrfs /dev/sdxy
btrfs-progs v4.17.1 
See http://btrfs.wiki.kernel.org for more information.

Detected a SSD, turning off metadata duplication.  Mkfs with -m dup if you want to force metadata duplication.
Performing full device TRIM /dev/sdxy (223.57GiB) ...
Label:              (null)
UUID:               d6bcba6e-8fd5-41fc-9bb4-79628c5c928c
Node size:          16384
Sector size:        4096
Filesystem size:    223.57GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         single            8.00MiB
  System:           single            4.00MiB
SSD detected:       yes
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1   223.57GiB  /dev/sdxy

/dev/sdxy should be an unused disk. If the disk already contains a filesystem, you may need to force creation with the following command, which overwrites the existing data:

root #  mkfs.btrfs -f /dev/sdxy

Now you can mount the created volume as you would mount any other Linux filesystem.

root #  mkdir /mnt/btrfs-top-level
root #  mount /dev/sdxy /mnt/btrfs-top-level
root #  mount
...
/dev/sdxy on /mnt/btrfs-top-level type btrfs (rw,relatime,ssd,space_cache,subvolid=5,subvol=/)
   Important

It is recommended not to store anything directly in the root directory of this top-level volume (ID 5).

Creating Subvolumes

Btrfs has a concept of subvolumes. A subvolume is an independently mountable POSIX file tree (but not a block device). There are several basic schemes for laying out subvolumes (including snapshots), as well as mixtures thereof.

Let's create children of the top-level subvolume (ID 5). We will have:

  • @data - this will be mounted at /data
  • .snapshots - snapshots will be stored here

root #  cd /mnt/btrfs-top-level
root #  btrfs subvolume create @data
root #  btrfs subvolume create .snapshots
root #  btrfs subvolume list /mnt/btrfs-top-level
ID 256 gen 322338 top level 5 path @data
ID 257 gen 322275 top level 5 path .snapshots

The Default Subvolume

   Note

Changing the default subvolume with btrfs subvolume set-default will make the top level of the filesystem accessible only when the subvol or subvolid mount option is specified.

When a btrfs block device is mounted without specifying a subvolume, the default one is used. To check the default subvolume, run:

root #  btrfs subvolume get-default /mnt/btrfs-top-level
ID 5 (FS_TREE)

For convenience, let's make the @data subvolume the default one. It's good to double-check the subvolume ID first. Either btrfs subvolume list or btrfs subvolume show can be used for that:

root #  btrfs subvolume show /mnt/btrfs-top-level/@data
...
	Subvolume ID: 		256
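
When this step is scripted, the ID can be extracted from the output of btrfs subvolume list rather than read by eye. The following is a minimal sketch, not part of btrfs-progs: the function name subvol_id_from_list is hypothetical, and it assumes the list format shown earlier and subvolume paths that contain no whitespace.

```shell
#!/bin/sh
# Hypothetical helper: read `btrfs subvolume list` output on stdin and
# print the ID of the subvolume whose path equals $1.
# Assumes lines of the form "ID <n> gen <n> top level <n> path <path>"
# and paths without whitespace.
subvol_id_from_list() {
    awk -v want="$1" '$1 == "ID" && $NF == want { print $2 }'
}
```

Once the function is defined in the current shell, it could be used like this:

root #  btrfs subvolume list /mnt/btrfs-top-level | subvol_id_from_list @data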

Now you can make this subvolume the default one:

root #  btrfs subvolume set-default 256 /mnt/btrfs-top-level
root #  btrfs subvolume get-default /mnt/btrfs-top-level
ID 256 gen 322336 top level 5 path @data

At this point you can stop working on the top-level subvolume (ID 5) and instead mount the @data subvolume directly.

root #  cd /mnt
root #  umount /mnt/btrfs-top-level
root #  mkdir /data
root #  mount /dev/sdxy /data
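
Because @data is now the default, a plain mount of the device no longer shows the top level. To reach it again later (for example to create further top-level subvolumes), the subvolid option has to be passed explicitly, as the note above says. A minimal sketch, in which the function name mount_top_level is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: mount the top-level subvolume (ID 5) of a btrfs
# device, regardless of which subvolume is currently the default.
#   $1 = block device, $2 = existing mount point
mount_top_level() {
    mount -o subvolid=5 "$1" "$2"
}
```

For the setup in this article, the call would be:

root #  mount_top_level /dev/sdxy /mnt/btrfs-top-level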

Nested Subvolumes

   Note

Nested subvolumes are not part of snapshots created from their parent subvolume. A typical reason to use them is therefore to exclude certain parts of the filesystem from snapshots.

Let's create a separate nested subvolume for /data/independent.

root #  btrfs subvolume create /data/independent
root #  btrfs subvolume list /data
ID 258 gen 161 top level 256 path independent

Usually you will want to "split" areas which are "complete" and/or "consistent" in themselves. Examples of this finer-grained partitioning could be /var/log, /var/www or /var/lib/postgresql.

/etc/fstab

To automatically mount the @data subvolume after a reboot, you need to modify /etc/fstab:

   /etc/fstab - fstab for btrfs
/dev/sdxy	/data	btrfs	subvolid=256,defaults	0 0
   Warning

According to the btrfs docs, most mount options apply to the whole filesystem, and only the options of the first mounted subvolume take effect. This means that (for example) you can't set nodatacow, nodatasum, or compress per subvolume.
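
The fstab entry above refers to the disk by device name, which can change between boots. A common alternative is to reference the filesystem by UUID. The sketch below only prints a candidate line; the helper name btrfs_fstab_line is hypothetical, and obtaining the UUID assumes that blkid from util-linux is installed.

```shell
#!/bin/sh
# Hypothetical helper: print an fstab line for a btrfs subvolume.
#   $1 = filesystem UUID, $2 = mount point, $3 = subvolume ID
# Nothing is written to /etc/fstab; the line goes to stdout for review.
btrfs_fstab_line() {
    printf 'UUID=%s\t%s\tbtrfs\tsubvolid=%s,defaults\t0 0\n' "$1" "$2" "$3"
}
```

A possible invocation, with the printed line reviewed before it is added to /etc/fstab:

root #  btrfs_fstab_line "$(blkid -s UUID -o value /dev/sdxy)" /data 256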

Now let's verify that these changes are correct:

root #  cd /
root #  umount /data
root #  mount /data
root #  ls /data
independent

Notice that although we mounted only the @data subvolume, the nested subvolume @data/independent is also present.

Snapshots

To try out this btrfs feature, let's first populate our filesystem with some example data:

root #  echo 'btrfs' > /data/foo.txt
root #  echo 'fun' > /data/independent/bar.txt

As you probably remember, you also created the .snapshots subvolume on the top level (next to the @data subvolume). You can mount it now to create some snapshots:

root #  mkdir /mnt/snapshots
root #  mount /dev/sdxy /mnt/snapshots -o subvolid=257

A snapshot is a subvolume like any other, with given initial content. By default, snapshots are created read-write. File modifications in a snapshot do not affect the files in the original subvolume. Let's create a read-write snapshot of /data and a read-only snapshot of /data/independent:

root #  btrfs subvolume snapshot /data /mnt/snapshots/data_$(date -u -Iseconds)
Create a snapshot of '/data' in '/mnt/snapshots/data_2022-08-30T22:04:57+00:00'
root #  btrfs subvolume snapshot -r /data/independent /mnt/snapshots/independent_$(date -u -Iseconds)
Create a readonly snapshot of '/data/independent' in '/mnt/snapshots/independent_2022-08-30T22:05:29+00:00'
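
If snapshots are taken regularly, the naming scheme used above can be wrapped in a small helper. This is only a sketch: the function names snapshot_path and snapshot_readonly are hypothetical, and the timestamp is built with an explicit date format string so that it matches the names shown above without relying on the -Iseconds option.

```shell
#!/bin/sh
# Hypothetical helper: build a timestamped snapshot path such as
# /mnt/snapshots/data_2022-08-30T22:04:57+00:00 (always UTC).
#   $1 = snapshot directory, $2 = name prefix
snapshot_path() {
    printf '%s/%s_%s\n' "$1" "$2" "$(date -u '+%Y-%m-%dT%H:%M:%S+00:00')"
}

# Hypothetical wrapper: take a read-only snapshot of subvolume $1,
# stored in directory $2 under a name starting with prefix $3.
snapshot_readonly() {
    btrfs subvolume snapshot -r "$1" "$(snapshot_path "$2" "$3")"
}
```

With these functions defined, the second snapshot above could be taken with:

root #  snapshot_readonly /data/independent /mnt/snapshots independent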

Once again, nested subvolumes are not part of snapshots created from their parent subvolume. So you shouldn't be surprised when you compare the contents of /data with the contents of /mnt/snapshots:

root #  tree /data
/data
├── foo.txt
└── independent
    └── bar.txt
root #  tree /mnt/snapshots
/mnt/snapshots
├── data_2022-08-30T22:04:57+00:00
│   └── foo.txt
└── independent_2022-08-30T22:05:29+00:00
    └── bar.txt

At this point you might be interested in the btrfs send and receive features.
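
As a rough illustration of those two features: btrfs send serializes a read-only snapshot into a stream, and btrfs receive recreates it on another btrfs filesystem. The sketch below assumes a second btrfs filesystem mounted at /mnt/backup, which is not part of the setup in this article; the function name replicate_snapshot is likewise hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy a snapshot to another btrfs filesystem.
#   $1 = read-only snapshot to send
#   $2 = directory on the receiving btrfs filesystem
# btrfs send only accepts read-only snapshots, so a snapshot created
# without -r cannot be sent as-is.
replicate_snapshot() {
    btrfs send "$1" | btrfs receive "$2"
}
```

For the read-only snapshot created earlier, the call could look like this:

root #  replicate_snapshot /mnt/snapshots/independent_2022-08-30T22:05:29+00:00 /mnt/backup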

   Note

According to the btrfs docs, a snapshot is not a backup: snapshots work by use of the btrfs copy-on-write behaviour. A snapshot and the original it was taken from initially share all of the same data blocks. If that data is damaged in some way (cosmic rays, bad disk sector, accident with dd to the disk), then the snapshot and the original will both be damaged.

Wrap up

   Important

It is recommended to run btrfs scrub periodically, for example once a month.

Scrub is the online check and repair functionality that verifies the integrity of data and metadata, assuming the tree structure is fine. You can run it on a mounted file system; it runs as a background process during normal operation.

To start a (background) scrub on the filesystem that contains /data, run:

root #  btrfs scrub start /data
scrub started on /data, fsid 40f8b94f-07ee-4f7e-beb1-8e686abc246d (pid=5525)

To check the status of a running scrub:

root #  btrfs scrub status /data
UUID:             40f8b94f-07ee-4f7e-beb1-8e686abc246d
Scrub started:    Tue Aug 30 00:38:54 2022
Status:           running
Duration:         0:00:15
Time left:        0:00:34
ETA:              Tue Aug 30 00:39:44 2022
Total to scrub:   149.06GiB
Bytes scrubbed:   44.79GiB  (30.04%)
Rate:             2.99GiB/s
Error summary:    no errors found
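
For the periodic run recommended above, a scheduled job usually wants the scrub to stay in the foreground so that the job ends only when the scrub does. A minimal sketch, in which the function name scrub_foreground is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: scrub the filesystem that contains mount point $1.
# -B keeps btrfs scrub in the foreground and prints statistics when it
# finishes, so the call returns only once the scrub is done.
scrub_foreground() {
    btrfs scrub start -B "$1"
}
```

A script that calls scrub_foreground /data could then be run from a monthly cron job; how such a job is registered depends on the cron daemon in use.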

You should now be at the point where you can begin to use btrfs for a variety of tasks. While there is a lot more to btrfs than what is covered in this short introduction, you should now have a good understanding of the fundamental concepts on which btrfs is based.