LVM Fun

From Funtoo Linux
Revision as of 02:52, 28 December 2011 by 404 Error (Talk)

Introduction

LVM (Logical Volume Management) offers great flexibility in managing your storage and significantly reduces server downtime by allowing online disk space management. The great idea beneath LVM is to make the data and its storage loosely coupled through several layers of abstraction. You (the system administrator) have control over each of those layers, which makes the entire space management process simple and flexible through a coherent set of commands.

Several well-known binary Linux distributions make heavy use of LVM, and several Unices, including HP-UX, AIX and Solaris, have long offered similar functionality, modulo the commands to be used. LVM is not mandatory, but using it can bring you additional flexibility and make your everyday life much simpler.

Concepts

As usual, having a good idea of the underlying concepts is mandatory. LVM is not very complicated, but it is easy to become confused because it is a multi-layered system. Fortunately, the LVM designers had the good idea of keeping command names consistent across all LVM command sets, which makes your life easier.

LVM mainly consists of three things:

  • Physical volumes (or PV): nothing more than physical storage space. A physical volume can be anything: a partition on a local hard disk, a partition located on a remote SAN disk, a USB key or whatever else can offer storage space (so yes, technically it would be possible to use an optical storage device accessed in packet writing mode). The storage space on a physical volume is divided (and managed) in small units called Physical Extents (or PE). To give an analogy, if you are a bit familiar with RAID, PE are a bit like RAID stripes.
  • Volume Groups (or VG): a group of at least one PV. VG are named entities and will appear in the system via the device mapper as /dev/volume-group-name.
  • Logical Volumes (or LV): a named division of a volume group in which a filesystem is created and that can be mounted in the VFS. For the record, just as a PV is divided into PE, a LV is managed in chunks known as Logical Extents (or LE). Most of the time those LE are hidden from the system administrator thanks to a 1:1 mapping between them and the PE lying just beneath, but a cool fact to know about LE is that they can be spread over several PV, just like RAID stripes in a RAID-0 volume. However, research done on the Web tends to show that system administrators prefer to build RAID volumes with mdadm and then use LVM on top of them, for performance reasons.

In short: LVM logical volumes (LV) are containers that can each hold a single filesystem. They are created inside a volume group (VG), which is itself an aggregation of one or more physical volumes (PV) stored on various media (USB key, hard disk partition and so on). The data is stored in chunks spread over the various PV.

Remember what PV, VG and LV mean, as we will use those abbreviations in the rest of this article.


Your first tour of LVM

Physical volumes creation

We give the same size to all volumes for the sake of the demonstration. This is not mandatory: it is possible to mix PV of different sizes inside the same VG.


To start with, just create three raw disk images:

# dd if=/dev/zero of=/tmp/hdd1.img bs=1M count=2048
# dd if=/dev/zero of=/tmp/hdd2.img bs=1M count=2048
# dd if=/dev/zero of=/tmp/hdd3.img bs=1M count=2048

and associate each of them with a loopback device:

# losetup -f
/dev/loop0 
# losetup /dev/loop0 /tmp/hdd1.img
# losetup /dev/loop1 /tmp/hdd2.img
# losetup /dev/loop2 /tmp/hdd3.img
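
As an aside on the image-creation step above: the same 2 GiB files can also be created as sparse files, which avoids writing 2 GiB of zeros per image. The following is a minimal sketch, assuming GNU coreutils (truncate and stat -c); the make_images helper and the scratch directory are hypothetical names used only for illustration, while the article itself keeps using /tmp/hddN.img.

```shell
# make_images DIR: create three sparse 2 GiB image files in DIR.
# DIR is a hypothetical scratch directory; the article uses /tmp directly.
make_images() {
    dir=$1
    for i in 1 2 3; do
        # truncate sets the apparent size without writing any data blocks.
        truncate -s 2G "$dir/hdd$i.img" || return 1
    done
}

scratch=$(mktemp -d)
make_images "$scratch"

# Print the name and apparent size in bytes of each image (GNU stat).
stat -c '%n %s' "$scratch"/hdd*.img
```

Sparse images should behave the same once attached with losetup; disk blocks are only allocated as data is written to them.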

Okay, nothing really exciting there, but wait, the fun is coming! First check that sys-fs/lvm2 is present on your system and emerge it if not. At this point, we must tell you a secret: although many articles and authors use the term "LVM", nowadays it denotes "LVM version 2" or "LVM 2". You should know that LVM had, in the good old days (RHEL 3.x and earlier), a previous revision known as "LVM version 1". LVM 1 is now considered an extinct species and is not compatible with LVM 2, although the LVM 2 tools maintain backward compatibility.

The very first step in LVM is to create the physical volumes or PV. "Wait, create what?! Aren't the loopback devices already present on the system?" Yes, they are present, but they are empty: we must initialize them with some metadata to make them usable by LVM. This is simply done by:

# pvcreate /dev/loop0
  Physical volume "/dev/loop0" successfully created
# pvcreate /dev/loop1
  Physical volume "/dev/loop1" successfully created
# pvcreate /dev/loop2
  Physical volume "/dev/loop2" successfully created

It is absolutely normal that nothing more than a short confirmation is printed by each command, but we assure you: you now have LVM PVs. You can check them by issuing:

# pvs
  PV         VG   Fmt  Attr PSize PFree
  /dev/loop0      lvm2 a-   2.00g 2.00g
  /dev/loop1      lvm2 a-   2.00g 2.00g
  /dev/loop2      lvm2 a-   2.00g 2.00g


Some good information there:

  • PV: indicates the physical path the PV lies on
  • VG: indicates the VG the PV belongs to. At this time, we have not created any VG yet, so the column remains empty.
  • Fmt: indicates the format of the PV (here it says we have an LVM version 2 PV)
  • Attr: indicates some status information; the 'a' here just says that the PV is allocatable.
  • PSize and PFree: indicate the PV size and the amount of remaining space on this PV. Here we have three empty PV, so it basically says "2 gigabytes large, 2 out of 2 gigabytes free"

It is now time to introduce you to another command: pvdisplay. Just run it without any arguments:

# pvdisplay
  "/dev/loop0" is a new physical volume of "2.00 GiB"
  --- NEW Physical volume ---
  PV Name               /dev/loop0
  VG Name               
  PV Size               2.00 GiB
  Allocatable           NO
  PE Size               0   
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               b9i1Hi-llka-egCF-2vU2-f7tp-wBqh-qV4qEk
   
  "/dev/loop1" is a new physical volume of "2.00 GiB"
  --- NEW Physical volume ---
  PV Name               /dev/loop1
  VG Name               
  PV Size               2.00 GiB
  Allocatable           NO
  PE Size               0   
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               i3mdBO-9WIc-EO2y-NqRr-z5Oa-ItLS-jbjq0E
   
  "/dev/loop2" is a new physical volume of "2.00 GiB"
  --- NEW Physical volume ---
  PV Name               /dev/loop2
  VG Name               
  PV Size               2.00 GiB
  Allocatable           NO
  PE Size               0   
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               dEwVuO-a5vQ-ipcH-Rvlt-5zWt-iAB2-2F0XBf

The first three fields of each PV entry show:

  • the storage device beneath the PV
  • the VG it is tied to
  • the size of this PV.

Allocatable indicates whether the PV can be used to store data. As the PV is not yet a member of a VG, it cannot be used, hence the "NO" shown. Another set of information is the lines starting with PE. PE stands for Physical Extent (a chunk of data, comparable to a stripe) and is the finest granularity LVM can manipulate. The PE size is "0" here because we have a blank PV; the extent size is fixed once the PV joins a VG and defaults to 4 MiB. Following PE Size are Total PE, which shows the total number of PE available on this PV, and Free PE, the number of PE remaining available for use. Allocated PE is simply the difference between Total PE and Free PE.
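
To make the extent arithmetic concrete, here is a small sketch of how many PE fit on one of our PVs. It assumes the values that pvdisplay reports later in this article once the PV belongs to a VG (a 4 MiB PE size and 4 MiB flagged as "not usable"); the pe_count helper is a hypothetical name, not an LVM command.

```shell
# pe_count USABLE_MIB PE_MIB: number of whole extents that fit.
pe_count() {
    echo $(( $1 / $2 ))
}

# A 2 GiB PV is 2048 MiB; subtract the 4 MiB reported as not usable.
pe_count $(( 2048 - 4 )) 4
```

The result, 511, is the Total PE value shown for each PV once the volume group exists.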

The last line (PV UUID) is a unique identifier used internally by LVM to name the PV. You should know that it exists because it is sometimes useful when recovering from corruption or doing unusual things with PV, but most of the time you don't have to worry about it.

It is possible to force how LVM aligns data on the physical storage. This is useful when dealing with 4K-sector drives that lie about their physical sector size. Refer to the pvcreate manual page.



Volume group creation

We have blank PV at this point, but to make them usable for storage we must tell LVM how they are grouped to form a VG (a storage pool) in which LV will be created. A nice aspect of VGs is that they are not set in stone once created: you can still add, remove or exchange PV inside a VG at a later time (for example, when the device a PV is stored on fails). To create our first volume group, named vgtest:

# vgcreate vgtest /dev/loop0 /dev/loop1 /dev/loop2
  Volume group "vgtest" successfully created

Just like we did before with PV, we can get a list of the VG known by the system. This is done through the command vgs:

# vgs
  VG     #PV #LV #SN Attr   VSize VFree
  vgtest   3   0   0 wz--n- 5.99g 5.99g

vgs shows you a tabular view of information:

  • VG: the name of the VG
  • #PV: the number of PV composing the VG
  • #LV: the number of logical volumes (LV) located inside the VG
  • #SN: the number of snapshots located inside the VG
  • Attr: a status field. w, z and n here mean that the VG is:
    • w: writable
    • z: resizable
    • n: using the allocation policy normal (tweaking allocation policies is beyond the scope of this article, we will use the default value normal in the rest of this article)
  • VSize and VFree: give statistics on how full a VG is versus its size

Note the dashes in Attr; they mean that the corresponding attribute is not active:

  • First dash (3rd position): indicates whether the VG has been exported (an 'x' would be shown at this position in that case).
  • Second dash (4th position): indicates whether the VG is partial (a 'p' would be shown at this position in that case).
  • Third dash (rightmost position): indicates whether the VG is clustered (a 'c' would be shown at this position in that case).

Exported and clustered VGs are more advanced aspects of LVM and won't be covered here, especially clustered VGs, which are used when a storage space is shared by a cluster of machines. Covering clustered VG management would require an entire article in itself. For now, the only detail you have to check in Attr is that the 4th position shows a dash rather than a p. Seeing p there would be bad news: the VG would have missing parts (PV), making it unusable.
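
The position-by-position reading described above can be sketched as a small shell function. This is only an illustration: decode_vg_attr and char_at are hypothetical helper names, and the function recognizes only the flags discussed in this article (w, z, x, p, n, c), reporting anything else generically.

```shell
# char_at STRING POS: print the character at 1-based position POS.
char_at() {
    printf '%s' "$1" | cut -c"$2"
}

# decode_vg_attr ATTR: explain the six-character Attr field printed by vgs.
decode_vg_attr() {
    attr=$1
    case $(char_at "$attr" 1) in
        w) echo "writable" ;;
        *) echo "not writable" ;;
    esac
    case $(char_at "$attr" 2) in
        z) echo "resizable" ;;
        *) echo "not resizable" ;;
    esac
    case $(char_at "$attr" 3) in
        x) echo "exported" ;;
        *) echo "not exported" ;;
    esac
    case $(char_at "$attr" 4) in
        p) echo "PARTIAL: one or more PVs are missing" ;;
        *) echo "complete" ;;
    esac
    case $(char_at "$attr" 5) in
        n) echo "allocation policy: normal" ;;
        *) echo "allocation policy: other ($(char_at "$attr" 5))" ;;
    esac
    case $(char_at "$attr" 6) in
        c) echo "clustered" ;;
        *) echo "not clustered" ;;
    esac
}

decode_vg_attr "wz--n-"
```

For the wz--n- value of vgtest, the fourth line of output reads "complete", which is the one to keep an eye on.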

In exactly the same way that you can see detailed information about physical volumes with pvdisplay, you can see detailed information about a volume group with vgdisplay. We will demonstrate the latter command in the next paragraphs.


Before leaving the subject of volume groups, do you remember the pvs command shown in the previous paragraphs? Try it again:

# pvs
  PV         VG     Fmt  Attr PSize PFree
  /dev/loop0 vgtest lvm2 a-   2.00g 2.00g
  /dev/loop1 vgtest lvm2 a-   2.00g 2.00g
  /dev/loop2 vgtest lvm2 a-   2.00g 2.00g

Now it shows the VG our PVs belong to :-)

Logical volumes creation

Now the final steps: we will create the storage areas (logical volumes or LV) inside the VG, on which we will then create filesystems. Just as a VG has a name, a LV also has a name, which is unique within the VG.

Two LV can be given the same name as long as they are located in different VGs.


To divide our VG as follows:

  • lvdata1: 2 GB
  • lvdata2: 1 GB
  • lvdata3: 10% of the VG size
  • lvdata4: all of the remaining free space in the VG

we use the following commands (notice the capital 'L' for absolute sizes and the small 'l' for relative sizes):

# lvcreate -n lvdata1 -L 2GB vgtest
  Logical volume "lvdata1" created
# lvcreate -n lvdata2 -L 1GB vgtest
  Logical volume "lvdata2" created
# lvcreate -n lvdata3 -l 10%VG vgtest
  Logical volume "lvdata3" created

What is going on so far? Let's check with the pvs/vgs counterpart known as lvs:

# lvs
  LV      VG     Attr   LSize   Origin Snap%  Move Log Copy%  Convert
  lvdata1 vgtest -wi-a-   2.00g                                      
  lvdata2 vgtest -wi-a-   1.00g                                      
  lvdata3 vgtest -wi-a- 612.00m
# 

Notice the size of lvdata3: it is roughly 600 MB (10% of 6 GB). How much free space remains in the VG? Time to see what vgs and vgdisplay return:

# vgs
  VG     #PV #LV #SN Attr   VSize VFree
  vgtest   3   3   0 wz--n- 5.99g 2.39g
# vgdisplay 
  --- Volume group ---
  VG Name               vgtest
  System ID             
  Format                lvm2
  Metadata Areas        3
  Metadata Sequence No  4
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                3
  Open LV               0
  Max PV                0
  Cur PV                3
  Act PV                3
  VG Size               5.99 GiB
  PE Size               4.00 MiB
  Total PE              1533
  Alloc PE / Size       921 / 3.60 GiB
  Free  PE / Size       612 / 2.39 GiB
  VG UUID               baM3vr-G0kh-PXHy-Z6Dj-bMQQ-KK6R-ewMac2

Basically it says we have 1533 PE (chunks) available, for a total size of 5.99 GiB. Of those 1533, 921 are used (for a size of 3.60 GiB) and 612 remain free (for a size of 2.39 GiB). So we expect lvdata4 to have an approximate size of 2.4 GiB. Before creating it, have a look at some statistics at the PV level:
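
The figures reported by vgdisplay can be cross-checked with a bit of shell arithmetic. This is a sketch under two assumptions: that the 10%VG request is rounded down to whole extents (which is consistent with the 612.00m shown by lvs), and that sizes are displayed rounded to two decimals. The variable and function names are illustrative only.

```shell
PE_MIB=4        # PE Size from vgdisplay
TOTAL_PE=1533   # Total PE from vgdisplay

# extents_to_gib EXTENTS: size in GiB, rounded to two decimals.
extents_to_gib() {
    awk -v n="$1" -v pe="$PE_MIB" 'BEGIN { printf "%.2f\n", n * pe / 1024 }'
}

lvdata1=$(( 2048 / PE_MIB ))          # 2 GiB
lvdata2=$(( 1024 / PE_MIB ))          # 1 GiB
lvdata3=$(( TOTAL_PE * 10 / 100 ))    # 10%VG, assumed to round down

allocated=$(( lvdata1 + lvdata2 + lvdata3 ))
free=$(( TOTAL_PE - allocated ))

echo "lvdata3:   $lvdata3 PE, $(( lvdata3 * PE_MIB )) MiB"
echo "allocated: $allocated PE, $(extents_to_gib "$allocated") GiB"
echo "free:      $free PE, $(extents_to_gib "$free") GiB"
```

This gives 153 PE (612 MiB) for lvdata3, 921 allocated PE (3.60 GiB) and 612 free PE (2.39 GiB), matching the lvs and vgdisplay output above.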

# pvs
  PV         VG     Fmt  Attr PSize PFree  
  /dev/loop0 vgtest lvm2 a-   2.00g      0 
  /dev/loop1 vgtest lvm2 a-   2.00g 404.00m
  /dev/loop2 vgtest lvm2 a-   2.00g   2.00g

# pvdisplay
  --- Physical volume ---
  PV Name               /dev/loop0
  VG Name               vgtest
  PV Size               2.00 GiB / not usable 4.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              511
  Free PE               0
  Allocated PE          511
  PV UUID               b9i1Hi-llka-egCF-2vU2-f7tp-wBqh-qV4qEk
   
  --- Physical volume ---
  PV Name               /dev/loop1
  VG Name               vgtest
  PV Size               2.00 GiB / not usable 4.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              511
  Free PE               101
  Allocated PE          410
  PV UUID               i3mdBO-9WIc-EO2y-NqRr-z5Oa-ItLS-jbjq0E
   
  --- Physical volume ---
  PV Name               /dev/loop2
  VG Name               vgtest
  PV Size               2.00 GiB / not usable 4.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              511
  Free PE               511
  Allocated PE          0
  PV UUID               dEwVuO-a5vQ-ipcH-Rvlt-5zWt-iAB2-2F0XBf

Quite interesting! Did you notice? The first PV is full, the second is mostly full and the third is empty. This is due to the allocation policy used for the VG: it fills its first PV, then its second PV, and then its third PV (this, by the way, gives you a chance to recover from a dead physical storage device if, by luck, none of your PE were stored on it).

It is now time to create our last LV. Again, notice the small 'l' used to specify a relative size:
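
The fill order can be illustrated with a deliberately simplified model: hand out extents to the PVs strictly in order until the requests are satisfied. This is not LVM's actual allocator, only a sketch that happens to reproduce the pvdisplay numbers above; simulate_fill is a hypothetical name.

```shell
# simulate_fill CAPACITY NUM_PVS LV_EXTENTS...
# Each PV holds CAPACITY extents; LVs are laid out one after another.
simulate_fill() {
    capacity=$1
    num_pvs=$2
    shift 2

    # Total extents requested by all LVs.
    remaining=0
    for lv in "$@"; do
        remaining=$(( remaining + lv ))
    done

    # Fill each PV in order until the request is exhausted.
    pv=0
    while [ "$pv" -lt "$num_pvs" ]; do
        if [ "$remaining" -gt "$capacity" ]; then
            used=$capacity
        else
            used=$remaining
        fi
        remaining=$(( remaining - used ))
        echo "PV$pv allocated=$used free=$(( capacity - used ))"
        pv=$(( pv + 1 ))
    done
}

# Three PVs of 511 PE; lvdata1, lvdata2 and lvdata3 need 512, 256 and 153 PE.
simulate_fill 511 3 512 256 153
```

The model reports 511, 410 and 0 allocated extents for the three PVs, which are the Allocated PE values shown for /dev/loop0, /dev/loop1 and /dev/loop2.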

# lvcreate -n lvdata4 -l 100%FREE vgtest
  Logical volume "lvdata4" created
# lvs
  LV      VG     Attr   LSize   Origin Snap%  Move Log Copy%  Convert
  lvdata1 vgtest -wi-a-   2.00g                                      
  lvdata2 vgtest -wi-a-   1.00g                                      
  lvdata3 vgtest -wi-a- 612.00m                                      
  lvdata4 vgtest -wi-a-   2.39g

Now everything is in place. If you want, check again with vgs/pvs/vgdisplay/pvdisplay and you will notice that the VG is now 100% full and that all of the underlying PV are also 100% full.

Filesystems creation
