A Quick Introduction to sed(1)

The sed(1) stream editor is one of the most powerful tools in the classic Unix toolbox. It is derived from the ed(1) line editor, which makes it a close cousin of ex(1), the command line mode of vi(1). In this article I’ll show a few idioms that I frequently use in practice.

The basic operation of sed is very simple. It reads its input line by line, either from stdin or from the files named on the command line, applies a command to each line, and echoes the result to stdout. There are quite a few commands, but the most useful ones replace text, print or delete lines, or stop processing. You can optionally specify addresses; when addressing is used, a command is only executed on lines matching the addresses.

This is the basic syntax:

sed 'addr1,addr2 command' files...

Most people don’t specify addresses and use sed for replacing text using the substitution command “s”:

sed 's/regex/replacement/g' files...

The “g” flag tells sed to replace all occurrences on an input line, not just the first one.
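To make the difference concrete, here is a minimal sketch that runs the same substitution with and without “g”. The sample line and the variable names are made up for illustration.

```shell
# Hypothetical sample line, used only for illustration.
line='one fish, two fish, red fish'

# Without "g", only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/fish/cat/')

# With "g", every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/fish/cat/g')

printf '%s\n' "$first_only" "$all_matches"
```

The first result still contains two occurrences of “fish”, the second contains none.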

Addresses can be left out completely, or you can specify just one. In the following example, we implement head(1) using the quit command “q”:

sed 10q files...

This addresses line 10 and stops processing there, so only the first ten lines are printed. Another common use is deleting the first line of a file with the delete command “d”, for instance to strip the header from a CSV file:

sed 1d file
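Both single-address idioms can be tried on a small, made-up CSV. This is only a sketch: the data and variable names are assumptions, and I use “2q” instead of “10q” so the effect is visible on four lines.

```shell
# Hypothetical CSV data: a header row followed by three records.
csv=$(printf 'name,qty\napple,3\npear,5\nplum,7\n')

# "2q" prints lines 1 and 2, then quits (the same idea as "sed 10q").
first_two=$(printf '%s\n' "$csv" | sed 2q)

# "1d" deletes line 1, the header, and prints everything else.
records=$(printf '%s\n' "$csv" | sed 1d)

printf '%s\n' "$first_two" '--' "$records"
```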

We can extend this by adding another address, this time deleting lines 1 through 5:

sed 1,5d file
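Address ranges are not tied to “d”; they work with the other commands as well. As a rough sketch, with made-up input, this restricts a substitution to lines 2 and 3:

```shell
# Hypothetical four-line input.
input=$(printf 'x=1\nx=2\nx=3\nx=4\n')

# The range 2,3 limits the "s" command to lines 2 and 3;
# lines 1 and 4 pass through unchanged.
result=$(printf '%s\n' "$input" | sed '2,3 s/x/y/')

printf '%s\n' "$result"
```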

The last line of a file is represented by the special “$” address. A more roundabout way of stripping the header combines the “-n” flag, which suppresses the default output, with the print command “p”:

sed -n '2,$p' files...

Combining “-n” and “p” is used frequently to gain closer control over what gets echoed to stdout.
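The effect of “-n” is easiest to see by running the same “p” command with and without it. A small sketch, again with made-up input and variable names:

```shell
# Hypothetical four-line input.
input=$(printf 'a\nb\nc\nd\n')

# Without -n, "p" prints the addressed lines in addition to the
# default output, so lines 2 and 3 show up twice.
doubled=$(printf '%s\n' "$input" | sed '2,3p')

# With -n, the default output is suppressed and only "p" prints.
selected=$(printf '%s\n' "$input" | sed -n '2,3p')

printf '%s\n' "$doubled" '--' "$selected"
```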

Instead of using numbers for addressing, sed also supports regular expressions. The following example strips the body from an email where an empty line separates header from body:

sed '/^$/q' email.txt

We address the empty line and stop processing. This is how you would strip the header, combining regexp with line addressing:

sed -n '/^$/,$p' email.txt
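Here is a sketch of both email idioms on a made-up message, fed through a pipe instead of email.txt. Note that the second command keeps the empty separator line, so I add an optional second sed to drop it; that extra step is my own addition.

```shell
# Hypothetical message: two header lines, an empty line, two body lines.
msg=$(printf 'From: alice@example.com\nSubject: hi\n\nHello Bob,\nsee you soon.\n')

# Header: print up to and including the first empty line, then quit.
# (The $(...) capture trims the trailing empty line.)
header=$(printf '%s\n' "$msg" | sed '/^$/q')

# Body: print from the first empty line through the last line.
body=$(printf '%s\n' "$msg" | sed -n '/^$/,$p')

# Optionally drop the separator line with a second sed.
body_only=$(printf '%s\n' "$msg" | sed -n '/^$/,$p' | sed 1d)

printf '%s\n' "$header" '--' "$body_only"
```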

Implementations that go beyond classic sed, such as GNU sed, extend the command language and offer more powerful regular expressions. Typically, I use sed as a pipeline component, combining it with other tools like find(1), grep(1), or awk(1). For more advanced use cases, I tend to write Python scripts.

However, here’s one feature of GNU sed (BSD sed offers it as well) that is both useful and somewhat dangerous, namely replacing a pattern inside a bunch of files:

sed -i.bak 's/regexp/replacement/g' files...

Unlike sed’s usual mode of operation, this actually edits files, replacing their contents! In this case, I tell sed to keep backups of the files it modifies (with “.bak” suffix), which is strongly encouraged. Arguably, in-place editing is an ill-fitting use case for a stream editor, but it’s extremely useful in practice.
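Since in-place editing is destructive, it is worth trying on throwaway files first. The following sketch does so in a temporary directory; the file names and contents are made up, and it assumes a sed that accepts “-i.bak” (GNU and BSD sed both do).

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf 'color: red\ncolor: blue\n' > "$workdir/a.txt"
printf 'background color: none\n' > "$workdir/b.txt"

# Edit both files in place, keeping the originals as a.txt.bak and b.txt.bak.
sed -i.bak 's/color/colour/g' "$workdir/a.txt" "$workdir/b.txt"

# Capture the edited file, its backup, and the number of backups.
edited=$(cat "$workdir/a.txt")
backup=$(cat "$workdir/a.txt.bak")
backups=$(ls "$workdir" | grep -c '\.bak$')

# diff exits non-zero when the files differ, hence the "|| true".
changes=$(diff "$workdir/a.txt.bak" "$workdir/a.txt" || true)
printf '%s\n' "$changes"

# Clean up the throwaway directory.
rm -rf "$workdir"
```

Reviewing the diff against the backups before deleting them is a cheap safety net.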
