Difference between pages "Xen" and "Sed by Example, Part 1"

'''Funtoo Xen Fun'''
+
{{Article
This page covers Xen on Funtoo Linux and how to set up Xen virtualization properly.
+
|Author=Drobbins
In particular, we are going to show you how much fun it is to work with Xen hosts and domUs and
+
|Next in Series=Sed by Example, Part 2
setting up a Funtoo Xen server without clicky GUIs or other frontends. This is a true hardcore Xen setup, especially suited to NOC server systems, headless servers, etc.
+
 
+
= Funtoo Xen Server with paravirt funtoo domU =
+
'''Assumptions'''
+
''We build a rock-stable, rocket-fast 64-bit headless Xen hypervisor with a headless 64-bit paravirtualized Funtoo domU.''
+
We are '''not''' building Xen with pvgrub or HVM (which is slower and adds overhead, and is only worth it if you want to install Windows).
+
 
+
== Building Funtoo Xen Host Dom0 ==
+
Most of the necessary steps are covered in the Installation Tutorial.
+
Here we only outline the steps that are necessary for an easy and successful Dom0 setup, or that differ from the normal installation tutorial.
+
 
+
Please open the [[Installation (Tutorial)|Installation Tutorial]] in a second tab and carefully follow the next steps in both!
+
 
+
=== Basic Funtoo Xen Host Dom0 setup ===
+
 
+
I recommend you use only stable packages for the host dom0!
+
 
+
Please consider this decision carefully. I can't stress this enough: you will avoid a lot of problems by using the stable distribution as dom0.
+
The domU guests can be either unstable or hardened, as you wish! That's where the true fun begins ;-)
+
That's why I edit my make.conf first, before building anything!
+
 
+
Here is how I set up the system basics:
+
Disk is <tt>/dev/sda</tt>
+
 
+
<pre>
+
/dev/sda1 is our / partition ca 20GB ext4
+
/dev/sda2 is our swap partition ca 4GB
+
/dev/sda3 holds the lvm volume group vgxen
+
</pre>
+
 
+
I am using volume groups over RAID, which I strongly advise for everybody.
+
 
+
Where the Xen files are stored:
+
<pre>/etc/xen/ --> xend configuration files
+
/xen/configs/ --> my xen domU configuration files folder
+
/xen/kernel/ --> my xen domU kernel folder
+
/xen/disks/ --> my xen domU image files folder
+
</pre>
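
The folder layout listed above can be created in one go. This is only a minimal sketch: the <tt>XEN_ROOT</tt> variable is an assumption introduced here for illustration, and it defaults to a scratch directory so nothing under <tt>/xen</tt> is touched until you set it explicitly.

```shell
# Create the domU storage folders named in the list above.
# XEN_ROOT is a hypothetical variable used only in this sketch; it falls
# back to a temporary directory so the snippet is safe to try out.
XEN_ROOT="${XEN_ROOT:-$(mktemp -d)}"

for dir in configs kernel disks; do
    mkdir -p "$XEN_ROOT/$dir"
done

# Show what was created.
ls "$XEN_ROOT"
```

On the dom0 you would run it with <tt>XEN_ROOT=/xen</tt> set in the environment.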
+
 
+
Edit <tt>/etc/rc.conf</tt> and uncomment the line at the bottom for rc_sys
+
<pre>rc_sys="xen0"</pre>
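
If you prefer not to open an editor, the same change can be sketched with sed. This works on a scratch copy, and the commented-out line it starts from is an assumed example of what the stock file might contain, so compare it with your own <tt>/etc/rc.conf</tt> before adapting it.

```shell
# Work on a scratch file so the real /etc/rc.conf is left untouched.
# The '#rc_sys=""' line is an assumption about the stock file's content.
conf="$(mktemp)"
printf '%s\n' '# some other setting' '#rc_sys=""' > "$conf"

# Replace the commented-out rc_sys line with the active xen0 setting.
sed -e 's/^#[[:space:]]*rc_sys=.*/rc_sys="xen0"/' "$conf" > "$conf.new"

cat "$conf.new"
```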
+
 
+
== Configure and Build Xen Dom0 Kernel ==
+
<console>
+
###i## emerge gentoo-sources
+
###i## cd /usr/src/linux
+
###i## make menuconfig
+
</console>
+
 
+
These settings are current as of 3.2.1-gentoo-r2, other versions may vary:
+
 
+
{{kernelop
+
|title=
+
|desc=
+
General setup  --->
+
  <*> Kernel .config support
+
      [*]  Enable access to .config through /proc/config.gz
+
 
+
Processor type and features  --->
+
  [*] Paravirtualized guest support  --->
+
      [*]  Xen guest support
+
 
+
Bus options (PCI etc.)  --->
+
  [*]  Xen PCI Frontend 
+
 
+
[*] Networking support  --->
+
  Networking options  --->
+
      <*> 802.1d Ethernet Bridging
+
 
+
Device Drivers  --->
+
  [*] Block devices (NEW)  --->
+
      <M>  DRBD Distributed Replicated Block Device support
+
      < >  Xen virtual block device support
+
      <*>  Xen block-device backend driver
+
 
+
Device Drivers  --->
+
  [*] Network device support  --->
+
      < >  Xen network device frontend driver
+
      <*>  Xen backend network device
+
 
+
Device Drivers  --->
+
  Graphics support  --->
+
      -*- Support for frame buffer devices  ---
+
        < >  Xen virtual frame buffer support
+
 
+
Device Drivers  --->
+
  Xen driver support  --->
+
      [*] Xen memory balloon driver (NEW)
+
      [*]  Scrub pages before returning them to system (NEW)
+
      <*> Xen /dev/xen/evtchn device (NEW)
+
      [*] Backend driver support (NEW)
+
      <*> Xen filesystem (NEW)
+
      [*]  Create compatibility mount point /proc/xen (NEW)
+
      [*] Create xen entries under /sys/hypervisor (NEW)
+
      <M> userspace grant access device driver (NEW)
+
      <M> User-space grant reference allocator driver (NEW)
+
      <M> xen platform pci device driver (NEW)
+
 
+
File systems  --->
+
  < > Ext3 journalling file system support
+
  <*> The Extended 4 (ext4) filesystem
+
  [*]  Use ext4 for ext2/ext3 file systems (NEW)
+
  [*]  Ext4 extended attributes (NEW)
+
 
}}
 
}}
{{Fancyimportant|Don't forget to add the required drivers for your networking and sata cards. If you use RAID, make sure to add the correct CONFIG_MD_RAID* entries to your config.}}
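
Before building, it can help to double-check that the options selected above really ended up in the kernel configuration. The following is a rough sketch, not an authoritative list: the <tt>CONFIG_*</tt> symbol names are my assumption of what the menu entries map to and may differ between kernel versions, and <tt>check_xen_config</tt> is a helper name made up for this example.

```shell
# check_xen_config FILE: report whether each dom0-related option is enabled
# (built in "=y" or as a module "=m") in a kernel .config-style file.
# The symbol names are assumptions; verify them against your kernel version.
check_xen_config() {
    config_file="$1"
    missing=0
    for opt in CONFIG_XEN CONFIG_XEN_DOM0 CONFIG_XEN_BLKDEV_BACKEND \
               CONFIG_XEN_NETDEV_BACKEND CONFIG_BRIDGE; do
        if grep -Eq "^${opt}=(y|m)\$" "$config_file"; then
            echo "ok      $opt"
        else
            echo "MISSING $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Example run against a small sample file (not a real kernel config).
sample="$(mktemp)"
printf '%s\n' 'CONFIG_XEN=y' 'CONFIG_XEN_DOM0=y' 'CONFIG_BRIDGE=y' \
    'CONFIG_XEN_BLKDEV_BACKEND=y' \
    '# CONFIG_XEN_NETDEV_BACKEND is not set' > "$sample"
check_xen_config "$sample" || echo "some options are missing"
```

Against a real tree you would pass <tt>/usr/src/linux/.config</tt>; on a running kernel with the <tt>/proc/config.gz</tt> option enabled, unpack it with <tt>zcat</tt> into a temporary file first.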
+
== Get to know the powerful UNIX editor ==
 
+
<console>
+
###i## make
+
###i## make modules_install
+
</console>
+
 
+
{{Fancynote| If you experience issues with connecting to the console ensure the module "xen_gntdev" (userspace grant access device driver) is loaded before the xenconsoled process is started (you may have to restart it after loading the module).}}
+
 
+
== Configuring Grub ==
+
Work has been completed to automatically enable Xen Grub entries, so after you copy your dom0 kernel edit your /etc/boot.conf as follows:
+
 
+
<pre>
+
"Funtoo on Xen" {
+
  type xen
+
  xenkernel xen.gz
+
  xenparams loglvl=all guest_loglvl=all xsave=1 iommu=1 iommu_inclusive_mapping=1 dom0_max_vcpus=2 dom0_vcpus_pin dom0_mem=4096M
+
  kernel kernel[-v]
+
  params += quiet
+
}
+
</pre>
+
 
+
{{Fancynote| The iommu options enable IOMMU (VT-d) support; if your motherboard or CPU does not support VT-d, do not enable them. xsave enables the XSAVE CPU feature; without it your dom0 kernel may not boot. dom0_vcpus_pin permanently pins the dom0 vCPUs to physical CPUs, which increases performance.}}
+
 
+
== Basic Networking with the Dom0 ==
+
Funtoo Linux offers its own modular, template-based network configuration system. This system offers a lot of flexibility for configuring network interfaces, essentially serving as a "network interface construction kit."
+
 
+
We are going to set eth0 as the default interface to the outside world for now. eth1 will be part of a bridge (xenbr0) that is going to be used by various domU guests.
+
 
+
Construct the interfaces:
+
<console>
+
###i## cd /etc/init.d/
+
###i## ln -s netif.tmpl netif.xenbr0
+
###i## ln -s netif.tmpl netif.extbr0
+
###i## ln -s netif.tmpl netif.eth0
+
###i## ln -s netif.tmpl netif.eth1
+
###i## rc-update add netif.xenbr0 sysinit
+
###i## rc-update add netif.extbr0 sysinit
+
</console>
+
 
+
Make sure dhcpcd, eth0 and eth1 don't start at boot:
+
<console>
+
###i## rc-update del dhcpcd sysinit
+
###i## rc-update del netif.eth0 sysinit
+
###i## rc-update del netif.eth1 sysinit
+
</console>
+
 
+
Configure the slave interfaces:
+
<console>
+
###i## cd /etc/conf.d/
+
###i## echo 'template="interface-noip"' > netif.eth0
+
###i## echo 'template="interface-noip"' > netif.eth1
+
</console>
+
Now, we prepare the bridges:
+
<console>
+
###i## nano netif.xenbr0
+
</console>
+
Here we set up the internal Xen bridge by editing <tt>/etc/conf.d/netif.xenbr0</tt>:
+
 
+
<pre>
+
template="bridge"
+
ipaddr="10.0.1.200/24"
+
gateway="10.0.1.1"
+
nameservers="10.0.1.1 10.0.1.2"
+
domain="funtoo.org"
+
slaves="netif.eth0"
+
</pre>
+
 
+
Then, we set up the external interface:
+
<console>
+
###i## nano netif.extbr0
+
</console>
+
{{Fancynote| This will look quite similar. Please watch out for the correct slave setting!}}
+
 
+
Now, edit <tt>/etc/conf.d/netif.extbr0</tt>:
+
 
+
 
+
<pre>
+
template="bridge"
+
ipaddr="10.0.1.201/24"
+
gateway="10.0.1.1"
+
nameservers="10.0.1.1 10.0.1.2"
+
domain="funtoo.org"
+
slaves="netif.eth1"
+
</pre>
+
 
+
This lets us play around with various setups later; it's modular and easy to tweak and change.
+
 
+
{{Fancytip| It is probably a good idea to try starting the interfaces with rc before rebooting.}}
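
One way to try this is sketched below. It assumes the two bridge services created earlier and that <tt>brctl</tt> (from bridge-utils) is installed; the <tt>run</tt> helper and the <tt>DRY_RUN</tt> variable are inventions of this example. By default the commands are only printed, so the sketch can be read and tested without touching the network.

```shell
# DRY_RUN defaults to 1, so commands are printed instead of executed.
# Set DRY_RUN=0 on the dom0 to really start the bridges.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: hypothetical helper that either prints or executes a command.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

start_bridges() {
    for br in xenbr0 extbr0; do
        run rc-service "netif.$br" start
    done
    # List the bridges and their slave interfaces afterwards.
    run brctl show
}

start_bridges
```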
+
 
+
== Basic Networking with domU ==
+
 
+
The easiest way is to let Xen set up the networking. But once everything is up and running, it is no longer possible to change the routing, etc.
+
Letting Xen create the bridges will be obsolete in the near future, so this is no longer the recommended way. As we already set up the bridges in the previous section, it may be enough to comment out everything network-related. If not, just uncomment the last lines.
+
 
+
We edit <tt>/etc/xen/xend-config.sxp</tt>:
+
 
+
<pre>
+
#### Xen config from maiwald.tk - Xen 4.x Network in bridge mode
+
 
+
(logfile /var/log/xen/xend.log)
+
(loglevel DEBUG)
+
 
+
(xend-relocation-server no)
+
(xend-relocation-hosts-allow '^localhost$ ^localhost\\.localdomain$')
+
 
+
# The limit (in kilobytes) on the size of the console buffer
+
(console-limit 1024)
+
 
+
(dom0-min-mem 384)
+
(enable-dom0-ballooning no)
+
 
+
(total_available_memory 0)
+
(dom0-cpus 0)
+
 
+
(vncpasswd 'geheim')
+
 
+
# let xen create the net
+
# (network-script    network-bridge)
+
# (vif-script        vif-bridge)
+
 
+
# we create the net - new default in Xen 4
+
#
+
#(network-script 'network-bridge netdev=eth0 bridge=xenbr0 vifnum=0')
+
#(vif-script vif-bridge bridge=xenbr0)
+
</pre>
+
 
+
= Building the Funtoo Xen DomU Container =
+
 
+
We are going to build the DomU now, preparing first from outside the domU.
+
 
+
=== create lvm volume or partition or image file ===
+
 
+
''This is a stub, please help completing this guide here!''
+
 
+
<console>
+
###i## vgcreate vgxen /dev/sda3
+
###i## lvcreate -L10G -n funtoo_root vgxen
+
###i## lvcreate -L1G -n funtoo_swap vgxen
+
###i## vgchange -a y
+
###i## mkfs.ext4 -L funtoo_root /dev/vgxen/funtoo_root
+
###i## mkswap -L funtoo_swap /dev/vgxen/funtoo_swap
+
###i## rc-update add lvm boot
+
</console>
+
== Basic DomU System setup ==
+
=== mount domU lvm volume or physical partition or image file===
+
<console>
+
###i## mkdir /mnt/domu1
+
###i## mount /dev/vgxen/funtoo_root /mnt/domu1
+
###i## cd /mnt/domu1
+
</console>
+
 
+
=== get stage3 ===
+
Get it from a Funtoo mirror near you; I suggest you look at the Funtoo homepage:
+
 
+
<console>
+
###i## links http://www.funtoo.org/wiki/Download </console>
+
Then choose a mirror near you (I use HEAnet in the EU) and look for the right stage3. I use Xeon CPUs, so I take the core2 distribution:
+
+
<console>
+
###i## wget -cv http://ftp.heanet.ie/mirrors/funtoo/funtoo-stable/x86-64bit/core2_64/stage3-latest.tar.xz </console>
+
Unfortunately, I can't find md5sums or similar, which is really unpleasant.
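
Should your mirror publish a digest file after all (an assumption, since none was found when this was written), the check itself is short. The sketch below demonstrates it on a stand-in file; <tt>stage3-demo.tar.xz</tt> and its <tt>.sha256</tt> companion are created by the snippet and are not real downloads.

```shell
# Stand-in files so the sketch is self-contained; with a real download you
# would skip straight to the "sha256sum -c" step.
workdir="$(mktemp -d)"
printf 'not a real stage3\n' > "$workdir/stage3-demo.tar.xz"

# A mirror would publish this digest file; here we generate it ourselves.
( cd "$workdir" && sha256sum stage3-demo.tar.xz > stage3-demo.tar.xz.sha256 )

# Verify the file against the digest; the exit status is 0 on a match.
( cd "$workdir" && sha256sum -c stage3-demo.tar.xz.sha256 )
```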
+
 
+
=== Get latest portage tree from the snapshots directory ===
+
  
<console>
+
=== Pick an editor ===
###i## wget -cv http://ftp.heanet.ie/mirrors/funtoo/funtoo-stable/snapshots/portage-current.tar.xz </console>
+
In the UNIX world, we have a lot of options when it comes to editing files. Think of it -- vi, emacs, and jed come to mind, as well as many others. We all have our favorite editor (along with our favorite keybindings) that we have come to know and love. With our trusty editor, we are ready to tackle any number of UNIX-related administration or programming tasks with ease.
=== Extract the stage3 ===
+
<console>
+
###i## tar xpf stage3-latest.tar.xz
+
</console>
+
  
=== Extract Portage ===
+
While interactive editors are great, they do have limitations. Though their interactive nature can be a strength, it can also be a weakness. Consider a situation where you need to perform similar types of changes on a group of files. You could instinctively fire up your favorite editor and perform a bunch of mundane, repetitive, and time-consuming edits by hand. But there's a better way.
  
<console>
+
=== Enter sed ===
###i## cd usr
+
It would be nice if we could automate the process of making edits to files, so that we could "batch" edit files, or even write scripts with the ability to perform sophisticated changes to existing files. Fortunately for us, for these types of situations, there is a better way -- and the better way is called sed.
###i## tar xf ../portage-current.tar.xz
+
</console>
+
  
== Preparing the chroot environment ==
+
sed is a lightweight stream editor that's included with nearly all UNIX flavors, including Linux. sed has a lot of nice features. First of all, it's very lightweight, typically many times smaller than your favorite scripting language. Secondly, because sed is a stream editor, it can perform edits to data it receives from stdin, such as from a pipeline. So, you don't need to have the data to be edited stored in a file on disk. Because data can just as easily be piped to sed, it's very easy to use sed as part of a long, complex pipeline in a powerful shell script. Try doing that with your favorite editor.
=== Editing the make.conf ===
+
Copy <tt>/etc/portage/make.conf</tt> from dom0 and adjust it:
+
  
<console>
+
=== GNU sed ===
###i## cp /etc/portage/make.conf /mnt/domu1/etc/portage/
+
Fortunately for us Linux users, one of the nicest versions of sed out there happens to be GNU sed. Every Linux distribution has GNU sed, or at least should. GNU sed is popular not only because its sources are freely distributable, but because it happens to have a lot of handy, time-saving extensions to the POSIX sed standard. GNU sed also doesn't suffer from many of the limitations that earlier and proprietary versions of sed had, such as a limited line length -- GNU sed handles lines of any length with ease.
</console>
+
  
Make sure to adjust MAKEOPTS to the CPUs assigned to the domU (rule of thumb: CPU cores + 1; yes, even in Xen).
+
=== The right sed ===
<console>
+
In this series, we will be using GNU sed. Some (but very few) of the most advanced examples you'll find in my upcoming, follow-on articles in this series will not work with GNU sed 3.02 or 3.02a and will require a modern version. If you're using a non-GNU sed, your results may vary. Why not take some time to install GNU sed now (see [[#Resources|Resources]] for source code)? Then, not only will you be ready for the rest of the series, but you'll also be able to use arguably the best sed in existence!
###i## nano -w /mnt/domu1/etc/portage/make.conf
+
</console>
+
Set the MAKEOPTS variable in there:
+
<pre>
+
MAKEOPTS="-j2"
+
</pre>
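
The "cores + 1" rule of thumb can be computed rather than typed by hand. A small sketch follows; <tt>makeopts_for</tt> is a helper name made up for this example, and <tt>getconf _NPROCESSORS_ONLN</tt> is used here as one common way to ask how many CPUs are online.

```shell
# makeopts_for N: print a MAKEOPTS line for N assigned cores (cores + 1).
makeopts_for() {
    echo "MAKEOPTS=\"-j$(( $1 + 1 ))\""
}

# One vCPU gives the -j2 value shown above.
makeopts_for 1

# Whatever the current machine reports as online CPUs.
makeopts_for "$(getconf _NPROCESSORS_ONLN)"
```

Run inside the domU, the second call reflects the vCPUs actually assigned to the guest.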
+
  
=== Copy <tt>/etc/resolv.conf</tt> ===  
+
=== Sed examples ===
<console>
+
Sed works by performing any number of user-specified editing operations ("commands") on the input data. Sed is line-based, so the commands are performed on each line in order. And, sed writes its results to standard output (stdout); it doesn't modify any input files.
###i## cp -L /etc/resolv.conf /mnt/domu1/etc/
+
</console>
+
  
=== mount proc and dev ===
+
Let's look at some examples. The first several are going to be a bit weird because I'm using them to illustrate how sed works rather than to perform any useful task. However, if you're new to sed, it's very important that you understand them. Here's our first example:
<console>
+
###i## mount -t proc none /mnt/domu1/proc
+
###i## mount --rbind /dev /mnt/domu1/dev
+
</console>
+
  
== Building Funtoo Xen Guest(s) DomU ==
+
<console>$##i## sed -e 'd' /etc/services</console>
  
== Final DomU System setup ==
+
If you type this command, you'll get absolutely no output. Now, what happened? In this example, we called sed with one editing command, <span style="color:green">d</span>. Sed opened the '''/etc/services''' file, read a line into its pattern buffer, performed our editing command ("delete line"), and then printed the pattern buffer (which was empty). It then repeated these steps for each successive line. This produced no output, because the <span style="color:green">d</span> command zapped every single line in the pattern buffer!
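
To watch this behaviour without relying on the contents of '''/etc/services''', here is a small sketch that feeds sed three sample lines instead. The sample text and variable names are made up for illustration.

```shell
# Three sample lines stand in for /etc/services.
sample="$(printf 'one\ntwo\nthree\n')"

# 'd' deletes every line from the pattern buffer, so nothing is printed.
deleted="$(printf '%s\n' "$sample" | sed -e 'd')"
echo "output of d: [$deleted]"

# The input itself is unchanged, because sed only reads it.
echo "input still has $(printf '%s\n' "$sample" | grep -c '') lines"
```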
=== chroot ===
+
<console>
+
###i## chroot /mnt/domu1 /bin/bash
+
###i## env-update
+
###i## source /etc/profile
+
###i## export PS1="(domU-chroot) $PS1"
+
</console>
+
  
=== sync portage ===
+
There are a couple of things to notice in this example. First, '''/etc/services''' was not modified at all. This is because, again, sed only reads from the file you specify on the command line, using it as input -- it doesn't try to modify the file. The second thing to notice is that sed is line-oriented. The <span style="color:green">d</span> command didn't simply tell sed to delete all incoming data in one fell swoop. Instead, sed read each line of /etc/services one by one into its internal buffer, called the pattern buffer. Once a line was read into the pattern buffer, it performed the <span style="color:green">d</span> command and printed the contents of the pattern buffer (nothing in this example). Later, I'll show you how to use address ranges to control which lines a command is applied to -- but in the absence of addresses, a command is applied to all lines.
<console>  
+
###i## emerge --sync
+
</console>
+
  
=== set locales ===
+
The third thing to notice is the use of single quotes to surround the d command. It's a good idea to get into the habit of using single quotes to surround your sed commands, so that shell expansion is disabled.
<console>
+
###i## nano -w /etc/locale.gen
+
###i## locale-gen
+
</console>
+
  
=== Set your timezone ===  
+
=== Another sed example ===
(choose your timezone in <tt>/usr/share/zoneinfo</tt>)
+
Here's an example of how to use sed to remove the first line of the '''/etc/services''' file from our output stream:
<console>
+
###i## ln -v -sf /usr/share/zoneinfo/Europe/Amsterdam /etc/localtime
+
</console>
+
  
=== Edit <tt>/etc/fstab</tt> (see also gentoo handbook as reference) ===
+
<console>$##i## sed -e '1d' /etc/services | more</console>
We assume that we name our root partition <tt>xvda1</tt> and the swap partition <tt>xvda2</tt> in our <tt>domU-xen-</tt> config (we will do that later)
+
<console>
+
###i## nano -w /etc/fstab
+
</console>
+
  
<pre>
+
As you can see, this command is very similar to our first <span style="color:green">d</span> command, except that it is preceded by a 1. If you guessed that the 1 refers to line number one, you're right. While in our first example, we used d by itself, this time we use the <span style="color:green">d</span> command preceded by an optional numerical address. By using addresses, you can tell sed to perform edits only on a particular line or lines.
/dev/xvda1      /              ext4    noatime 0 1
+
/dev/xvda2      none          swap    sw      0 0
+
shm            /dev/shm      tmpfs  nodev,nosuid,noexec    0 0
+
</pre>
+
  
=== The most important stuff ===  
+
=== Address ranges ===
Copy this into your terminal:
+
Now, let's look at how to specify an address range. In this example, sed will delete lines 1-10 of the output:
  
<pre>
+
<console>$##i## sed -e '1,10d' /etc/services | more</console>
echo '
+
                        Larry loves Funtoo
+
                      _________________________
+
                      < Have you mooed today? >
+
                      -------------------------
+
                        \  ^__^
+
                        \  (oo)\_______
+
                            (__)\      )\/\
+
                                ||----w |
+
                                ||    ||
+
.::::::::::::::: WELCOME TO ^^^^^^^^^^^^^^^^^^^:::::::::::::..
+
...............................................................
+
:########:'##::::'##:'##::: ##:'########::'#######:::'#######::.
+
:##.....:: ##:::: ##: ###:: ##:... ##..::'##.... ##:'##.... ##::
+
:##::::::: ##:::: ##: ####: ##:::: ##:::: ##:::: ##: ##:::: ##::
+
:######::: ##:::: ##: ## ## ##:::: ##:::: ##:::: ##: ##:::: ##::
+
:##...:::: ##:::: ##: ##. ####:::: ##:::: ##:::: ##: ##:::: ##::
+
:##::::::: ##:::: ##: ##:. ###:::: ##:::: ##:::: ##: ##:::: ##::
+
:##:::::::. #######:: ##::. ##:::: ##::::. #######::. #######::′
+
.::::::::::.......:::..::::..:::::..::::::.......::::.......::´
+
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
+
'> /etc/motd
+
</pre>
+
We are using echo instead of "emerge --moo", as Larry still moos in Gentoo-ish.
+
  
So that's it - almost.
+
When we separate two addresses by a comma, sed will apply the following command to the range that starts with the first address, and ends with the second address. In this example, the <span style="color:green">d</span> command was applied to lines 1-10, inclusive. All other lines were ignored.
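
Single-line and range addresses are easier to follow on numbered input. The sketch below uses <tt>seq</tt> to generate the lines 1 to 5 as sample data; the variable names are only for illustration.

```shell
# seq prints the numbers 1..5, one per line, as easy-to-follow sample input.
first_removed="$(seq 1 5 | sed -e '1d')"     # drops line 1 only
range_removed="$(seq 1 5 | sed -e '2,4d')"   # drops lines 2 through 4

# The unquoted expansions join the remaining lines with spaces for display.
echo "after 1d:   $(echo $first_removed)"
echo "after 2,4d: $(echo $range_removed)"
```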
  
==== Adding networking to the domU: ====
+
=== Addresses with regular expressions ===
 +
Now, it's time for a more useful example. Let's say you wanted to view the contents of your '''/etc/services''' file, but you aren't interested in viewing any of the included comments. As you know, you can place comments in your '''/etc/services''' file by starting the line with the '#' character. To avoid comments, we'd like sed to delete lines that start with a '#'. Here's how to do it:
  
<console>
+
<console>$##i## sed -e '/^#/d' /etc/services | more</console>
(domU-chroot) ###i## cd /etc/init.d/
+
(domU-chroot) ###i## ln -sf netif.tmpl netif.eth0
+
(domU-chroot) ###i## rc-update add netif.eth0
+
* service netif.eth0 added to runlevel sysinit
+
</console>
+
  
==== Now we are ready for the final setups ====
+
Try this example and see what happens. You'll notice that sed performs its desired task with flying colors. Now, let's figure out what happened.
<console>
+
(domU-chroot) ###i## emerge eix
+
(domU-chroot) ###i## eix-update
+
Reading Portage settings ..
+
Building database (/var/cache/eix) ..
+
[0] "gentoo" /usr/portage/ (cache: metadata-md5-or-flat)
+
    Reading category 154|154 (100%) Finished           
+
Applying masks ..
+
Calculating hash tables ..
+
Writing database file /var/cache/eix ..
+
Database contains 15729 packages in 154 categories.
+
  
(domU-chroot) # exit
+
To understand the '/^#/d' command, we first need to dissect it. First, let's remove the 'd' -- we're using the same delete line command that we've used previously. The new addition is the '/^#/' part, which is a new kind of regular expression address. Regular expression addresses are always surrounded by slashes. They specify a pattern, and the command that immediately follows a regular expression address will only be applied to a line if it happens to match this particular pattern.
exit
+
</console>
+
  
From here you have to decide how you want to run your domU: with unprivileged users and sudo, with a root account enabled, or as a webserver or firewall.
+
So, '/^#/' is a regular expression. But what does it do? Obviously, this would be a good time for a regular expression refresher.
  
I always install the openssh server and just place my ssh keys in there. From there the steps differ.
+
=== Regular expression refresher ===
 +
We can use regular expressions to express patterns that we may find in the text. If you've ever used the '*' character on the shell command line, you've used something that's similar, but not identical to, regular expressions. Here are the special characters that you can use in regular expressions:
 +
{| border=1
 +
|-
 +
|'''Character'''
 +
|'''Description'''
 +
|-
 +
|^
 +
|Matches the beginning of the line
 +
|-
 +
|$
 +
|Matches the end of the line
 +
|-
 +
|.
 +
|Matches any single character
 +
|-
 +
|*
 +
|Will match zero or more occurrences of the previous character
 +
|-
 +
|[ ]
 +
|Matches any one of the characters inside the [ ]
 +
|}
  
<console>
+
Probably the best way to get your feet wet with regular expressions is to see a few examples. All of these examples will be accepted by sed as valid addresses to appear on the left side of a command. Here are a few:
(dom0-xen) ###i## cp /root/.ssh/authorized_keys /mnt/domu1/root/.ssh/
+
{| border=1
</console>
+
|-
Also, don't forget to enable <tt>PubkeyAuthentication</tt> in the domU's sshd_config and set <tt>PermitRootLogin</tt> to yes!
+
|'''Regular expression'''
 +
|'''Description'''
 +
|-
 +
|/./
 +
|Will match any line that contains at least one character
 +
|-
 +
|/../
 +
|Will match any line that contains at least two characters
 +
|-
 +
|/^#/
 +
|Will match any line that begins with a '#'
 +
|-
 +
|/^$/
 +
|Will match all blank lines
 +
|-
 +
|/}$/
 +
|Will match any line that ends with '}' (no spaces)
 +
|-
 +
|/} *$/
 +
|Will match any line ending with '}' followed by zero or more spaces
 +
|-
 +
|/[abc]/
 +
|Will match any line that contains a lowercase 'a', 'b', or 'c'
 +
|-
 +
|/^[abc]/
 +
|Will match any line that begins with an 'a', 'b', or 'c'
 +
|}
 +
I encourage you to try several of these examples. Take some time to get familiar with regular expressions, and try a few regular expressions of your own creation. You can use a regexp this way:
  
'''Double checking''': Does your domU use kernel modules or not? If you haven't built a monolithic kernel, you should copy the modules from the dom0 to the domU now:
+
<console>$##i## sed -e '/regexp/d' /path/to/my/test/file | more</console>
<console>
+
(dom0-xen) ###i## mkdir /mnt/domu1/lib/modules
+
(dom0-xen) ###i## rsync -aP /lib/modules/2.6.38-xen-maiwald.tk-dom0 /mnt/domu1/lib/modules/
+
</console>
+
  
Don't forget to clean up the mounts!
+
This will cause sed to delete any matching lines. However, it may be easier to get familiar with regular expressions by telling sed to print regexp matches, and delete non-matches, rather than the other way around. This can be done with the following command:
  
<console>
+
<console>$##i## sed -n -e '/regexp/p' /path/to/my/test/file | more</console>
(dom0-xen) ###i## cd
+
(dom0-xen) ###i## umount -l /mnt/domu1/proc
+
(dom0-xen) ###i## umount -l /mnt/domu1/dev
+
(dom0-xen) ###i## umount -l /mnt/domu1
+
</console>
+
  
=== Booting the Xen DomU Guest ===
+
Note the new '-n' option, which tells sed to not print the pattern space unless explicitly commanded to do so. You'll also notice that we've replaced the <span style="color:green">d</span> command with the <span style="color:green">p</span> command, which as you might guess, explicitly commands sed to print the pattern space. Voila, now only matches will be printed.
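
The two approaches, deleting matches versus printing only matches, can be compared side by side. This sketch applies the <tt>/^#/</tt> address to a tiny made-up stand-in for '''/etc/services'''.

```shell
# A tiny stand-in for /etc/services with two comment lines.
sample="$(printf '# header\nssh 22/tcp\n# note\nsmtp 25/tcp\n')"

# '/^#/d' deletes the comment lines and prints everything else.
without_comments="$(printf '%s\n' "$sample" | sed -e '/^#/d')"

# With -n nothing is printed by default; 'p' prints only the matches.
only_comments="$(printf '%s\n' "$sample" | sed -n -e '/^#/p')"

printf 'without comments:\n%s\n' "$without_comments"
printf 'only comments:\n%s\n' "$only_comments"
```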
  
Ok, let's try the first boot of the newly created Xen DomU in Funtoo!
+
=== More on addresses ===
 +
Up till now, we've taken a look at line addresses, line range addresses, and regexp addresses. But there are even more possibilities. We can specify two regular expressions separated by a comma, and sed will match all lines starting from the first line that matches the first regular expression, up to and including the line that matches the second regular expression. For example, the following command will print out a block of text that begins with a line containing "BEGIN", and ending with a line that contains "END":
  
<console>
+
<console>$##i## sed -n -e '/BEGIN/,/END/p' /my/test/file | more</console>
(dom0-xen) ###i## cd /xen
+
(dom0-xen) ###i## xm create -c configs/funtoo.cfg
+
</console>
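
For the sed regular-expression range address (<tt>/BEGIN/,/END/</tt>) described above, a quick sketch on five made-up sample lines shows that both the BEGIN and END lines are included in the printed block.

```shell
# Sample text with one BEGIN/END block surrounded by other lines.
sample="$(printf 'before\nBEGIN\ninside\nEND\nafter\n')"

# Print from the first line matching BEGIN up to and including the line
# matching END; -n suppresses everything else.
block="$(printf '%s\n' "$sample" | sed -n -e '/BEGIN/,/END/p')"

printf '%s\n' "$block"
```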
+
Huuuuiiiii.....
+
<pre>
+
Using config file "./configs/funtoo.cfg".
+
Started domain funtoo (id=4)
+
[    0.000000] Linux version 2.6.38-xen-maiwald.tk-domU (root@xen) (gcc version 4.4.5 (Gentoo 4.4.5 p1.0, pie-0.4.5) ) #4 SMP Wed Feb 8 17:30:33 CET 2012
+
[    0.000000] Command line: root=/dev/xvda1 ro ip=217.x.x.211:127.0.255.255:217.x.x.1:255.255.255.0:domU:eth0:off xencons=tty console=xvc0 raid=noautodetect
+
[    0.000000] Xen-provided physical RAM map:
+
[    0.000000]  Xen: 0000000000000000 - 0000000040800000 (usable)
+
[    0.000000] NX (Execute Disable) protection: active
+
[    0.000000] last_pfn = 0x40800 max_arch_pfn = 0x80000000
+
[    0.000000] init_memory_mapping: 0000000000000000-0000000040800000
+
[    0.000000] Zone PFN ranges:
+
[    0.000000]  DMA      0x00000000 -> 0x00001000
+
[    0.000000]  DMA32    0x00001000 -> 0x00100000
+
[    0.000000]  Normal  empty
+
[    0.000000] Movable zone start PFN for each node
+
[    0.000000] early_node_map[2] active PFN ranges
+
[    0.000000]    0: 0x00000000 -> 0x00040000
+
[    0.000000]    0: 0x00040800 -> 0x00040800
+
[    0.000000] setup_percpu: NR_CPUS:16 nr_cpumask_bits:16 nr_cpu_ids:1 nr_node_ids:1
+
[    0.000000] PERCPU: Embedded 18 pages/cpu @ffff88003efc0000 s42304 r8192 d23232 u73728
+
[    0.000000] Swapping MFNs for PFN 6d6 and 3efc7 (MFN 15deb0 and 1223bf)
+
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 256109
+
[    0.000000] Kernel command line: root=/dev/xvda1 ro ip=217.171.190.211:127.0.255.255:217.171.190.1:255.255.255.0:alyx1:eth0:off xencons=tty console=xvc0 raid=noautodetect
+
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
+
[    0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
+
[    0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
+
[    0.000000] Software IO TLB disabled
+
[    0.000000] Memory: 1022732k/1056768k available (3657k kernel code, 8192k absent, 25844k reserved, 1261k data, 264k init)
+
[    0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
+
[    0.000000] Hierarchical RCU implementation.
+
[    0.000000] NR_IRQS:96
+
[    0.000000] Xen reported: 2992.570 MHz processor.
+
[    0.000000] Console: colour dummy device 80x25
+
[    0.000000] console [tty-1] enabled
+
[    0.150003] Calibrating delay using timer specific routine.. 6018.63 BogoMIPS (lpj=30093193)
+
[    0.150008] pid_max: default: 32768 minimum: 301
+
[    0.150034] Mount-cache hash table entries: 256
+
[    0.150173] SMP alternatives: switching to UP code
+
[    0.170232] Freeing SMP alternatives: 20k freed
+
[    0.170342] Brought up 1 CPUs
+
[    0.170377] devtmpfs: initialized
+
[    0.170601] xor: automatically using best checksumming function: generic_sse
+
[    0.220004]    generic_sse:  7325.200 MB/sec
+
[    0.220008] xor: using function: generic_sse (7325.200 MB/sec)
+
[    0.220091] NET: Registered protocol family 16
+
[    0.220186] Brought up 1 CPUs
+
[    0.220217] bio: create slab <bio-0> at 0
+
[    0.390014] raid6: int64x1  2353 MB/s
+
[    0.560003] raid6: int64x2  2964 MB/s
+
[    0.730026] raid6: int64x4  2357 MB/s
+
[    0.900012] raid6: int64x8  2116 MB/s
+
[    1.070007] raid6: sse2x1    5349 MB/s
+
[    1.240009] raid6: sse2x2    5404 MB/s
+
[    1.410005] raid6: sse2x4    8597 MB/s
+
[    1.410008] raid6: using algorithm sse2x4 (8597 MB/s)
+
[    1.410022] suspend: event channel 6
+
[    1.410022] xen_mem: Initialising balloon driver.
+
[    1.410096] Switching to clocksource xen
+
[    1.410125] FS-Cache: Loaded
+
[    1.410152] CacheFiles: Loaded
+
[    1.410268] NET: Registered protocol family 2
+
[    1.410288] IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
+
[    1.410391] TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
+
[    1.410951] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
+
[    1.411180] TCP: Hash tables configured (established 131072 bind 65536)
+
[    1.411183] TCP reno registered
+
[    1.411186] UDP hash table entries: 512 (order: 2, 16384 bytes)
+
[    1.411192] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
+
[    1.411229] NET: Registered protocol family 1
+
[    1.411290] platform rtc_cmos: registered platform RTC device (no PNP device found)
+
[    1.411401] Intel AES-NI instructions are not detected.
+
[    1.411437] audit: initializing netlink socket (disabled)
+
[    1.411444] type=2000 audit(1330014455.606:1): initialized
+
[    1.412612] fuse init (API version 7.16)
+
[    1.412674] msgmni has been set to 2048
+
[    1.412990] NET: Registered protocol family 38
+
[    1.413018] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
+
[    1.413024] io scheduler noop registered (default)
+
[    1.413026] io scheduler deadline registered
+
[    1.413049] io scheduler cfq registered
+
[    1.413079] Non-volatile memory driver v1.3
+
[    1.413088] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, margin is 60 seconds).
+
[    1.413090] Hangcheck: Using getrawmonotonic().
+
[    1.419520] Switched to NOHz mode on CPU #0
+
[    1.423394] brd: module loaded
+
[    1.423665] loop: module loaded
+
[    1.423771] nbd: registered device at major 43
+
[    1.426180] Xen virtual console successfully installed as tty1
+
[    1.426216] Event-channel device installed.
+
[    1.441658] netfront: Initialising virtual ethernet driver.
+
[    1.444972] xen-vbd: registered block device major 202
+
[    1.444988] blkfront: xvda1: barriers enabled
+
[    1.450287] Setting capacity to 20971520
+
[    1.450294] xvda1: detected capacity change from 0 to 10737418240
+
[    1.450677] blkfront: xvda2: barriers enabled
+
[    1.451661] Setting capacity to 2097152
+
[    1.451665] xvda2: detected capacity change from 0 to 1073741824
+
[    1.452020] bonding: Ethernet Channel Bonding Driver: v3.7.0 (June 2, 2010)
+
[    1.452023] bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
+
[    1.453016] i8042: No controller found
+
[    1.453066] mousedev: PS/2 mouse device common for all mice
+
[    1.453113] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
+
[    1.453145] rtc_cmos: probe of rtc_cmos failed with error -38
+
[    1.453155] md: linear personality registered for level -1
+
[    1.453158] md: raid0 personality registered for level 0
+
[    1.453161] md: raid1 personality registered for level 1
+
[    1.453163] md: raid6 personality registered for level 6
+
[    1.453166] md: raid5 personality registered for level 5
+
[    1.453168] md: raid4 personality registered for level 4
+
[    1.453224] device-mapper: uevent: version 1.0.3
+
[    1.453273] device-mapper: ioctl: 4.19.1-ioctl (2011-01-07) initialised: dm-devel@redhat.com
+
[    1.453340] device-mapper: multipath: version 1.2.0 loaded
+
[    1.453343] device-mapper: multipath round-robin: version 1.0.0 loaded
+
[    1.453345] device-mapper: multipath queue-length: version 0.1.0 loaded
+
[    1.453347] device-mapper: multipath service-time: version 0.2.0 loaded
+
[    1.453396] Netfilter messages via NETLINK v0.30.
+
[    1.453410] nf_conntrack version 0.5.0 (8192 buckets, 32768 max)
+
[    1.453478] ctnetlink v0.93: registering with nfnetlink.
+
[    1.453486] IPv4 over IPv4 tunneling driver
+
[    1.453548] TCP westwood registered
+
[    1.453550] TCP highspeed registered
+
[    1.453552] TCP htcp registered
+
[    1.453553] TCP vegas registered
+
[    1.453555] Initializing XFRM netlink socket
+
[    1.453630] NET: Registered protocol family 10
+
[    1.453803] IPv6 over IPv4 tunneling driver
+
[    1.453863] NET: Registered protocol family 17
+
[    1.453868] NET: Registered protocol family 15
+
[    1.453870] Registering the dns_resolver key type
+
[    1.550094] /usr/src/linux-2.6.38-xen/drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
+
[    3.070104] IP-Config: Complete:
+
[    3.070109]      device=eth0, addr=217.171.190.211, mask=255.255.255.0, gw=217.171.190.1,
+
[    3.070116]      host=alyx1, domain=, nis-domain=(none),
+
[    3.070119]      bootserver=127.0.255.255, rootserver=127.0.255.255, rootpath=
+
[    3.070212] md: Skipping autodetection of RAID arrays. (raid=autodetect will force)
+
[    3.107309] EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts: (null)
+
[    3.107321] VFS: Mounted root (ext2 filesystem) readonly on device 202:1.
+
[    3.140059] devtmpfs: mounted
+
[    3.140239] Freeing unused kernel memory: 264k freed
+
INIT: version 2.88 booting
+
  
  OpenRC 0.8.3 is starting up Funtoo Linux (x86_64)
+
If "BEGIN" isn't found, no data will be printed. And, if "BEGIN" is found, but no "END" is found on any line below it, all subsequent lines will be printed. This happens because of sed's stream-oriented nature -- it doesn't know whether or not an "END" will appear.
  
* Mounting /proc ...
+
=== C source example ===
[ ok ]
+
If you want to print out only the main() function in a C source file, you could type:
* WARNING: rc_sys not defined in rc.conf. Falling back to automatic detection
+
* Caching service dependencies ...
+
[ ok ]
+
* Mounting /sys ...
+
[ ok ]
+
* udev: /dev already mounted, skipping...
+
* Mounting /dev/pts ...
+
[ ok ]
+
* Mounting /dev/shm ...
+
[ ok ]
+
* Bringing up network interface lo ...
+
RTNETLINK answers: File exists
+
[ ok ]
+
* Bringing up network interface lo ...
+
RTNETLINK answers: File exists
+
RTNETLINK answers: File exists
+
[ ok ]
+
* Starting udevd daemon ...
+
* Populating /dev with existing devices through uevents ...
+
[ ok ]
+
* Autoloaded 0 module(s)
+
* Checking local filesystems  ...
+
funtoo_root: Superblock last write time is in the future.
+
        (by less than a day, probably due to the hardware clock being incorrectly set).  FIXED.
+
funtoo_root: clean, 173796/655360 files, 436917/2621440 blocks
+
[ ok ]
+
* Remounting root filesystem read/write ...
+
[ ok ]
+
* Updating /etc/mtab ...
+
[ ok ]
+
* Mounting local filesystems ...
+
[ ok ]
+
* Configuring kernel parameters ...
+
[ ok ]
+
* Creating user login records ...
+
[ ok ]
+
* Cleaning /var/run ...
+
[ ok ]
+
* Wiping /tmp directory ...
+
[ ok ]
+
* Setting hostname to localhost ...
+
[ ok ]
+
* Activating swap devices ...
+
[ ok ]
+
* udev: storing persistent rules ...
+
[ ok ]
+
* Initializing random number generator ...
+
[ ok ]
+
INIT: Entering runlevel: 3
+
* Mounting network filesystems ...
+
[ ok ]
+
* Generating dsa host key ...
+
Generating public/private dsa key pair.
+
Your identification has been saved in /etc/ssh/ssh_host_dsa_key.
+
Your public key has been saved in /etc/ssh/ssh_host_dsa_key.pub.
+
The key fingerprint is:
+
25:e0:a8:05:xxxxxxxxxxxx:1c:1f:ba root@localhost
+
The key's randomart image is:
+
+--[ DSA 1024]----+
+
|  ooo.B.o        |
+
| o o *.B o .    |
+
|  . + + = =      |
+
|  o  + *      |
+
|  .  E S        |
+
|                |
+
|                |
+
|                |
+
|                |
+
+-----------------+
+
[ ok ]
+
* Generating rsa host key ...
+
Generating public/private rsa key pair.
+
Your identification has been saved in /etc/ssh/ssh_host_rsa_key.
+
Your public key has been saved in /etc/ssh/ssh_host_rsa_key.pub.
+
The key fingerprint is:
+
22:e3:46:28:67:xxxxxxxxxxxxxxxxxxxxx:e5:c3 root@localhost
+
The key's randomart image is:
+
+--[ RSA 2048]----+
+
|.    o. ..      |
+
|oo  o ..o        |
+
|=oo  o  E      |
+
|.*oo.    .      |
+
|o *.+ . S        |
+
| + o o .        |
+
|    o            |
+
|  .            |
+
|                |
+
+-----------------+
+
[ ok ]
+
* Starting sshd ...
+
[ ok ]
+
* Starting local
+
[ ok ]
+
  
 +
<console>$##i## sed -n -e '/main[[:space:]]*(/,/^}/p' sourcefile.c | more</console>
  
                        Larry loves Funtoo
+
This command has two regular expressions, <nowiki>'/main[[:space:]]*(/' and '/^}/'</nowiki>, and one command, <span style="color:green">p</span>. The first regular expression will match the string "main" followed by any number of spaces or tabs, followed by an open parenthesis. This should match the start of your average ANSI C main() declaration.
                      _________________________
+
                      < Have you mooed today? >
+
                      -------------------------
+
                          ^__^
+
                          (oo)_______
+
                            (__)      )/
+
                                ||----w |
+
                                ||    ||
+
.::::::::::::::::::::: WELCOME TO ::::::::::::::::::::::::::..
+
...............................................................
+
:########:'##::::'##:'##::: ##:'########::'#######:::'#######::.
+
:##.....:: ##:::: ##: ###:: ##:... ##..::'##.... ##:'##.... ##::
+
:##::::::: ##:::: ##: ####: ##:::: ##:::: ##:::: ##: ##:::: ##::
+
:######::: ##:::: ##: ## ## ##:::: ##:::: ##:::: ##: ##:::: ##::
+
:##...:::: ##:::: ##: ##. ####:::: ##:::: ##:::: ##: ##:::: ##::
+
:##::::::: ##:::: ##: ##:. ###:::: ##:::: ##:::: ##: ##:::: ##::
+
:##:::::::. #######:: ##::. ##:::: ##::::. #######::. #######::′
+
.::::::::::.......:::..::::..:::::..::::::.......::::.......::´
+
This is localhost.unknown_domain (Linux x86_64 2.6.38-xen-maiwald.tk-domU) 17:27:40
+
  
localhost login:  
+
<nowiki>In this particular regular expression, we encounter the '[[:space:]]' character class. This is simply a special keyword that tells sed to match a whitespace character, such as a TAB or a space. If you wanted, instead of typing '[[:space:]]', you could have typed '[', then a literal space, then Control-V, then a literal tab and a ']' -- the Control-V tells bash that you want to insert a "real" tab rather than perform command completion. It's clearer, especially in scripts, to use the '[[:space:]]' character class.</nowiki>
</pre>
+
  
=== Finalizing the setup ===
+
OK, now on to the second regexp. '/^}/' will match a '}' character that appears at the beginning of a new line. If your code is formatted nicely, this will match the closing brace of your main() function. If it's not, it won't -- one of the tricky things about performing pattern matching.
Now we test if we can reach the DomU from our Desktop:
+
<console>
+
(2034)-~% ssh -lroot 217.x.x.211 
+
The authenticity of host '217.x.x.211 (217.x.x.211)' can't be established.
+
RSA key fingerprint is 22:e3:xxxxxxxx:b0:3c:xxxxx:d6:e5:c3.
+
Are you sure you want to continue connecting (yes/no)? yes
+
Warning: Permanently added '217.x.x.211' (RSA) to the list of known hosts.
+
Enter passphrase for key '/home/mm/.ssh/id_rsa':
+
localhost ~ # uname -a
+
Linux localhost 2.6.38-xen-maiwald.tk-domU #4 SMP Wed Feb 8 17:30:33 CET 2012 x86_64 Intel(R) Xeon(R) CPU E3110 @ 3.00GHz GenuineIntel GNU/Linux
+
localhost ~ #
+
</console>
+
  
Now switch back to the Funtoo [[Installation (Tutorial)|Installation Tutorial]] and go on with setting up your new domU guest like a normal funtoo linux system!
+
The <span style="color:green">p</span> command does what it always does, explicitly telling sed to print out the line, since we are in '-n' quiet mode. Try running the command on a C source file -- it should output the entire main() { } block, including the initial "main()" and the closing '}'.
  
'''Please consider supporting this Wiki by editing this page and keeping it current!'''
+
=== Next time ===
 +
Now that we've touched on the basics, we'll be picking up the pace for the next two articles. If you're in the mood for some meatier sed material, be patient -- it's coming! In the meantime, you might want to check out the following sed and regular expression resources.
  
Funtoo is a perfect Xen Host and I recommend it to everybody as an alternative to .deb/.rpm Systems.
+
== Resources ==
 +
* Read Daniel's other sed articles: Sed by example, [[Sed by Example, Part 2|Part 2]] and [[Sed by Example, Part 3|Part 3]].
 +
* Check out Eric Pement's excellent [http://sed.sourceforge.net/sedfaq.html sed FAQ].
 +
* You can find the sources to sed at ftp://ftp.gnu.org/pub/gnu/sed.
 +
* Eric Pement also has a handy list of [http://sed.sourceforge.net/sed1line.txt sed one-liners] that any aspiring sed guru should definitely look at.
 +
* If you'd like a good old-fashioned book, [http://www.oreilly.com/catalog/sed2/ O'Reilly's sed & awk, 2nd Edition] would be a wonderful choice.
 +
* See the regular expressions [http://docs.python.org/dev/howto/regex.html how-to document] from [http://python.org/ python.org].
 +
* Refer to an [http://www.uky.edu/ArtsSciences/Classics/regex.html overview of regular expressions] from the University of Kentucky.
  
Have fun!
+
__NOTOC__
[[Category:Virtualization]]
+
[[Category:Linux Core Concepts]]
 +
[[Category:Articles]]
 +
{{ArticleFooter}}

Latest revision as of 09:21, December 28, 2014


Next in series: Sed by Example, Part 2

Support Funtoo and help us grow! Donate $15 per month and get a free SSD-based Funtoo Virtual Container. 3 spots left.

Get to know the powerful UNIX editor

Pick an editor

In the UNIX world, we have a lot of options when it comes to editing files. Think of it -- vi, emacs, and jed come to mind, as well as many others. We all have our favorite editor (along with our favorite keybindings) that we have come to know and love. With our trusty editor, we are ready to tackle any number of UNIX-related administration or programming tasks with ease.

While interactive editors are great, they do have limitations. Though their interactive nature can be a strength, it can also be a weakness. Consider a situation where you need to perform similar types of changes on a group of files. You could instinctively fire up your favorite editor and perform a bunch of mundane, repetitive, and time-consuming edits by hand. But there's a better way.

Enter sed

It would be nice if we could automate the process of making edits to files, so that we could "batch" edit files, or even write scripts with the ability to perform sophisticated changes to existing files. Fortunately for us, there is a tool built for exactly these situations, and it is called sed.

sed is a lightweight stream editor that's included with nearly all UNIX flavors, including Linux. sed has a lot of nice features. First of all, it's very lightweight, typically many times smaller than your favorite scripting language. Secondly, because sed is a stream editor, it can perform edits to data it receives from stdin, such as from a pipeline. So, you don't need to have the data to be edited stored in a file on disk. Because data can just as easily be piped to sed, it's very easy to use sed as part of a long, complex pipeline in a powerful shell script. Try doing that with your favorite editor.
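To make the pipeline point concrete, here is a minimal sketch that feeds sed from printf instead of from a file. The three sample lines and the variable name result are made up for illustration; the '2d' editing command (delete line 2) uses the address syntax covered later in this article.

```shell
# Feed three lines to sed on stdin; no file on disk is involved.
# '2d' deletes the second line of the stream and passes the rest through.
result=$(printf 'one\ntwo\nthree\n' | sed -e '2d')

# Show what came out of the pipeline.
printf '%s\n' "$result"
```

Because sed reads stdin when no file name is given, the same command can sit in the middle of a longer pipeline without any other changes.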

GNU sed

Fortunately for us Linux users, one of the nicest versions of sed out there happens to be GNU sed. Every Linux distribution has GNU sed, or at least should. GNU sed is popular not only because its sources are freely distributable, but because it happens to have a lot of handy, time-saving extensions to the POSIX sed standard. GNU sed also doesn't suffer from many of the limitations that earlier and proprietary versions of sed had, such as a limited line length -- GNU sed handles lines of any length with ease.

The right sed

In this series, we will be using GNU sed. Some (but very few) of the most advanced examples you'll find in my upcoming, follow-on articles in this series will not work with GNU sed 3.02 or 3.02a and will require a modern version. If you're using a non-GNU sed, your results may vary. Why not take some time to install GNU sed now (see Resources for source code)? Then, not only will you be ready for the rest of the series, but you'll also be able to use arguably the best sed in existence!

Sed examples

Sed works by performing any number of user-specified editing operations ("commands") on the input data. Sed is line-based, so the commands are performed on each line in order. And, sed writes its results to standard output (stdout); it doesn't modify any input files.

Let's look at some examples. The first several are going to be a bit weird because I'm using them to illustrate how sed works rather than to perform any useful task. However, if you're new to sed, it's very important that you understand them. Here's our first example:

$ sed -e 'd' /etc/services

If you type this command, you'll get absolutely no output. Now, what happened? In this example, we called sed with one editing command, d. Sed opened the /etc/services file, read a line into its pattern buffer, performed our editing command ("delete line"), and then printed the pattern buffer (which was empty). It then repeated these steps for each successive line. This produced no output, because the d command zapped every single line in the pattern buffer!

There are a couple of things to notice in this example. First, /etc/services was not modified at all. This is because, again, sed only reads from the file you specify on the command line, using it as input -- it doesn't try to modify the file. The second thing to notice is that sed is line-oriented. The d command didn't simply tell sed to delete all incoming data in one fell swoop. Instead, sed read each line of /etc/services one by one into its internal buffer, called the pattern buffer. Once a line was read into the pattern buffer, it performed the d command and printed the contents of the pattern buffer (nothing in this example). Later, I'll show you how to use address ranges to control which lines a command is applied to -- but in the absence of addresses, a command is applied to all lines.

The third thing to notice is the use of single quotes to surround the d command. It's a good idea to get into the habit of using single quotes to surround your sed commands, so that shell expansion is disabled.
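If you'd rather not experiment on /etc/services, the sketch below repeats the first example on a throwaway file. The file contents and the variable names (sample, output, lines_after) are assumptions chosen for illustration, not part of the original example.

```shell
# Create a throwaway sample file (hypothetical contents standing in for /etc/services).
sample=$(mktemp)
printf 'ftp 21/tcp\nssh 22/tcp\nsmtp 25/tcp\n' > "$sample"

# 'd' empties the pattern buffer for every line, so nothing reaches stdout.
output=$(sed -e 'd' "$sample")
echo "sed printed ${#output} characters"

# The input file itself was only read, never modified.
lines_after=$(wc -l < "$sample" | tr -d ' ')
echo "file still has $lines_after lines"

rm -f "$sample"
```

Note that the editing command is wrapped in single quotes here as well, so the shell hands it to sed untouched.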

Another sed example

Here's an example of how to use sed to remove the first line of the /etc/services file from our output stream:

$ sed -e '1d' /etc/services | more

As you can see, this command is very similar to our first d command, except that it is preceded by a 1. If you guessed that the 1 refers to line number one, you're right. While in our first example, we used d by itself, this time we use the d command preceded by an optional numerical address. By using addresses, you can tell sed to perform edits only on a particular line or lines.

Address ranges

Now, let's look at how to specify an address range. In this example, sed will delete lines 1-10 of the output:

$ sed -e '1,10d' /etc/services | more

When we separate two addresses by a comma, sed will apply the following command to the range that starts with the first address, and ends with the second address. In this example, the d command was applied to lines 1-10, inclusive. All other lines were left alone by d and printed as usual.
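Here is a small sketch of both address forms on a five-line file, which makes it easy to see exactly which lines survive. The file contents and variable names are invented for this illustration, and I use the range 2,4 rather than 1,10 only because the sample is short.

```shell
# Build a five-line sample file (hypothetical contents).
sample=$(mktemp)
printf 'line1\nline2\nline3\nline4\nline5\n' > "$sample"

# Single address: the d command applies to line 1 only.
first_removed=$(sed -e '1d' "$sample")

# Address range: the d command applies to lines 2 through 4, inclusive.
range_removed=$(sed -e '2,4d' "$sample")

printf '%s\n' "$first_removed"
echo "---"
printf '%s\n' "$range_removed"

rm -f "$sample"
```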

Addresses with regular expressions

Now, it's time for a more useful example. Let's say you wanted to view the contents of your /etc/services file, but you aren't interested in viewing any of the included comments. As you know, you can place comments in your /etc/services file by starting the line with the '#' character. To avoid comments, we'd like sed to delete lines that start with a '#'. Here's how to do it:

$ sed -e '/^#/d' /etc/services | more

Try this example and see what happens. You'll notice that sed performs its desired task with flying colors. Now, let's figure out what happened.

To understand the '/^#/d' command, we first need to dissect it. First, let's remove the 'd' -- we're using the same delete line command that we've used previously. The new addition is the '/^#/' part, which is a new kind of regular expression address. Regular expression addresses are always surrounded by slashes. They specify a pattern, and the command that immediately follows a regular expression address will only be applied to a line if it happens to match this particular pattern.
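To see the regular expression address at work on something smaller than /etc/services, consider this sketch. The sample lines are made up; the last one carries a trailing comment to show that only lines that begin with '#' are affected.

```shell
# A tiny stand-in for /etc/services (hypothetical contents).
sample=$(mktemp)
printf '# services sample\nftp 21/tcp\n# secure shell\nssh 22/tcp\nsmtp 25/tcp # mail\n' > "$sample"

# Delete only the lines whose first character is '#'.
no_comments=$(sed -e '/^#/d' "$sample")
printf '%s\n' "$no_comments"

rm -f "$sample"
```

The "smtp" line keeps its trailing comment, because the '#' there is not at the start of the line.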

So, '/^#/' is a regular expression. But what does it do? Obviously, this would be a good time for a regular expression refresher.

Regular expression refresher

We can use regular expressions to express patterns that we may find in the text. If you've ever used the '*' character on the shell command line, you've used something that's similar, but not identical to, regular expressions. Here are the special characters that you can use in regular expressions:

Character   Description
^           Matches the beginning of the line
$           Matches the end of the line
.           Matches any single character
*           Matches zero or more occurrences of the previous character
[ ]         Matches any one of the characters listed inside the [ ]

Probably the best way to get your feet wet with regular expressions is to see a few examples. All of these examples will be accepted by sed as valid addresses to appear on the left side of a command. Here are a few:

Regular expression   Description
/./                  Will match any line that contains at least one character
/../                 Will match any line that contains at least two characters
/^#/                 Will match any line that begins with a '#'
/^$/                 Will match all blank lines
/}$/                 Will match any line that ends with '}' (no spaces)
/} *$/               Will match any line ending with '}' followed by zero or more spaces
/[abc]/              Will match any line that contains a lowercase 'a', 'b', or 'c'
/^[abc]/             Will match any line that begins with an 'a', 'b', or 'c'

I encourage you to try several of these examples. Take some time to get familiar with regular expressions, and try a few regular expressions of your own creation. You can use a regexp this way:

$ sed -e '/regexp/d' /path/to/my/test/file | more

This will cause sed to delete any matching lines. However, it may be easier to get familiar with regular expressions by telling sed to print regexp matches, and delete non-matches, rather than the other way around. This can be done with the following command:

$ sed -n -e '/regexp/p' /path/to/my/test/file | more

Note the new '-n' option, which tells sed to not print the pattern space unless explicitly commanded to do so. You'll also notice that we've replaced the d command with the p command, which as you might guess, explicitly commands sed to print the pattern space. Voila, now only matches will be printed.
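The two approaches are mirror images of each other, which the following sketch tries to show with the '/^[abc]/' address from the table above. The three sample lines and the variable names are assumptions for illustration only.

```shell
# Three sample lines (hypothetical): two begin with a, b or c, one does not.
sample=$(mktemp)
printf 'apple\nzebra }\ncab\n' > "$sample"

# With -n, nothing is printed automatically; p prints only the matching lines.
printed=$(sed -n -e '/^[abc]/p' "$sample")

# Without -n, every line is printed except the ones d removes.
deleted=$(sed -e '/^[abc]/d' "$sample")

echo "kept by p:"
printf '%s\n' "$printed"
echo "left over after d:"
printf '%s\n' "$deleted"

rm -f "$sample"
```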

More on addresses

Up till now, we've taken a look at line addresses, line range addresses, and regexp addresses. But there are even more possibilities. We can specify two regular expressions separated by a comma, and sed will match all lines starting from the first line that matches the first regular expression, up to and including the line that matches the second regular expression. For example, the following command will print out a block of text that begins with a line containing "BEGIN", and ending with a line that contains "END":

$ sed -n -e '/BEGIN/,/END/p' /my/test/file | more

If "BEGIN" isn't found, no data will be printed. And, if "BEGIN" is found, but no "END" is found on any line below it, all subsequent lines will be printed. This happens because of sed's stream-oriented nature -- it doesn't know whether or not an "END" will appear.
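Both cases can be tried out with a sketch like the one below, first with a terminated block and then with the "END" line missing. The file contents and variable names are made up for the purpose of the demonstration.

```shell
sample=$(mktemp)

# Case 1: the block is properly terminated by a line containing END.
printf 'header\nBEGIN\npayload\nEND\nfooter\n' > "$sample"
block=$(sed -n -e '/BEGIN/,/END/p' "$sample")

# Case 2: no END line exists, so the range runs to the end of the input.
printf 'header\nBEGIN\npayload\nfooter\n' > "$sample"
unterminated=$(sed -n -e '/BEGIN/,/END/p' "$sample")

printf '%s\n' "$block"
echo "---"
printf '%s\n' "$unterminated"

rm -f "$sample"
```

In the second case "footer" is printed too, since sed cannot know in advance that no closing line will arrive.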

C source example

If you want to print out only the main() function in a C source file, you could type:

$ sed -n -e '/main[[:space:]]*(/,/^}/p' sourcefile.c | more

This command has two regular expressions, '/main[[:space:]]*(/' and '/^}/', and one command, p. The first regular expression will match the string "main" followed by any number of spaces or tabs, followed by an open parenthesis. This should match the start of your average ANSI C main() declaration.

In this particular regular expression, we encounter the '[[:space:]]' character class. This is simply a special keyword that tells sed to match a whitespace character, such as a TAB or a space. If you wanted, instead of typing '[[:space:]]', you could have typed '[', then a literal space, then Control-V, then a literal tab and a ']' -- the Control-V tells bash that you want to insert a "real" tab rather than perform command completion. It's clearer, especially in scripts, to use the '[[:space:]]' character class.

OK, now on to the second regexp. '/^}/' will match a '}' character that appears at the beginning of a new line. If your code is formatted nicely, this will match the closing brace of your main() function. If it's not, it won't -- one of the tricky things about performing pattern matching.

The p command does what it always does, explicitly telling sed to print out the line, since we are in '-n' quiet mode. Try running the command on a C source file -- it should output the entire main() { } block, including the initial "main()" and the closing '}'.
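If you don't have a C source file handy, the sketch below writes a small one to a temporary file and runs the same command on it. The C program, the helper() function, and the shell variable names are all invented for this illustration; the only assumption about formatting is that main's closing brace sits at the start of its own line.

```shell
# Write a small, nicely formatted C file (hypothetical contents).
src=$(mktemp)
cat > "$src" <<'EOF'
#include <stdio.h>

static int helper(void)
{
    return 42;
}

int main (void)
{
    printf("%d\n", helper());
    return 0;
}

/* trailing comment */
EOF

# Print from the line containing "main (" through the next line starting with '}'.
main_block=$(sed -n -e '/main[[:space:]]*(/,/^}/p' "$src")
printf '%s\n' "$main_block"

rm -f "$src"
```

The closing brace of helper() is skipped because it appears before the range has started, and the trailing comment is skipped because the range has already ended.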

Next time

Now that we've touched on the basics, we'll be picking up the pace for the next two articles. If you're in the mood for some meatier sed material, be patient -- it's coming! In the meantime, you might want to check out the following sed and regular expression resources.

Resources

Read the next article in this series: Sed by Example, Part 2

About the Author

Daniel Robbins is best known as the creator of Gentoo Linux and author of many IBM developerWorks articles about Linux. Daniel currently serves as Benevolent Dictator for Life (BDFL) of Funtoo Linux. Funtoo Linux is a Gentoo-based distribution and continuation of Daniel's original Gentoo vision.
