Understanding the find(1) Utility

The Unix find(1) utility is a powerful tool, yet few people really understand how it works. It may be a bit confusing at first, but a programmer who knows Boolean algebra should be able to grasp the basic concepts without much trouble.

You've probably seen simple find constructs like this:

find . -type f -name "*.c" -print

This prints all regular files (-type f) in and below the current working directory whose names end with .c (-name "*.c"). Easy. But what about this one:

find . \( -type d -name .svn -prune \) -o -print

Before you can fully understand this, we'll have to cover a few basics.

Find traverses a given list of directories and evaluates a user-provided expression for each file or directory it encounters. Like many programming languages, it uses short-circuit evaluation for the expression. The catch is that the result of the expression does not trigger any action by itself. Instead, actions like -print or -ls are ordinary predicates that always evaluate to true! Most predicates check whether a file has a specified property (for example, whether its name matches a shell pattern), but -print is only evaluated for its side effect: it prints the file name every time it is evaluated. (One exception: if the expression contains no action at all, such as -print, -exec or -ok, find behaves as if the whole expression were followed by -print.)
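Because an action fires whenever it is evaluated, its position in the expression matters. The following is a minimal sketch, not a definitive recipe: it assumes mktemp -d is available (it is on GNU and BSD systems, but it is not part of POSIX), and the file names are made up for illustration.

```shell
# Scratch tree (hypothetical names): three regular files, one subdirectory.
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/a.c" "$tmp/b.h" "$tmp/sub/c.c"

# -type f is tested first; -print is only evaluated where it was true.
files_only=$(find "$tmp" -type f -print | wc -l | tr -d ' ')

# -print comes first, so it fires for every entry (including the two
# directories) before -type f is even evaluated.
everything=$(find "$tmp" -print -type f | wc -l | tr -d ' ')

echo "type-then-print: $files_only, print-then-type: $everything"
rm -rf "$tmp"
```

The first count covers only the three regular files; the second also includes the scratch directory and sub, because -print has already done its work by the time -type f rejects them.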

Use the following commands to check if you understood the concepts:

find . -type f -print -print
find . -print -o -print

The first command prints the name of every regular file twice. That's because predicates written next to each other are implicitly ANDed together, so both -print predicates are evaluated whenever -type f is true. In the second command, the predicates are combined using the OR operator (-o in find syntax). Since the first -print already evaluates to true, there's no point in evaluating the second one. So each name is only printed once.
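If you'd rather check this on a tree of known size, here is a small sketch that counts the output lines of both commands. It assumes mktemp -d is available (GNU and BSD, not POSIX) and uses made-up file names.

```shell
# Scratch tree (hypothetical names): 3 regular files + 2 directories = 5 entries.
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/a.c" "$tmp/b.h" "$tmp/sub/c.c"

# AND: both -print predicates run for each regular file -> 3 x 2 = 6 lines.
and_lines=$(find "$tmp" -type f -print -print | wc -l | tr -d ' ')

# OR: the first -print is true, so the second is skipped -> 5 x 1 = 5 lines.
or_lines=$(find "$tmp" -print -o -print | wc -l | tr -d ' ')

echo "AND: $and_lines lines, OR: $or_lines lines"
rm -rf "$tmp"
```

Note that the second command has no -type f, so its count includes the directories as well.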

Let's go back to the more advanced example:

find . \( -type d -name .svn -prune \) -o -print

Here we've got an expression consisting of two parts connected using the OR operator. The -print is only evaluated if the first part of the expression (the one in parentheses) returns false.

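Inside the parentheses, there's the -prune predicate we haven't talked about yet. When -prune is evaluated, it has the side effect of skipping the subtree below the current directory: after -prune is evaluated, find doesn't descend into the directory it is currently looking at. Like -print, -prune always evaluates to true, so for a directory named .svn the parenthesized part is true and the -print after -o is never evaluated.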

So, what does the command actually do? It recursively traverses a directory tree, printing every file and directory except .svn directories and their contents.
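To see the pruning in action, the sketch below runs the command on a throwaway tree. The layout and file names are hypothetical, and mktemp -d is assumed to be available (GNU and BSD, not POSIX).

```shell
# Scratch tree (hypothetical layout) with .svn directories at two levels.
tmp=$(mktemp -d)
mkdir -p "$tmp/.svn" "$tmp/src/.svn"
touch "$tmp/.svn/entries" "$tmp/src/main.c" "$tmp/src/.svn/entries"

# Run find from inside the tree so the paths are relative; sort them
# because find's traversal order is not guaranteed.
kept=$(cd "$tmp" && find . \( -type d -name .svn -prune \) -o -print | LC_ALL=C sort)

printf '%s\n' "$kept"
# .
# ./src
# ./src/main.c
rm -rf "$tmp"
```

Neither the .svn directories nor their entries files show up: the directories themselves are suppressed because -prune evaluates to true, and their contents are never visited at all.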
