find is a long-standing UNIX® utility. Its role is to recursively scan one or more directories and find files which match a certain set of criteria in those directories. Even though it is very useful, the syntax is truly obscure, and using it requires a little practice. The general syntax is:
find [directories] [options] [criterion1] ... [criterionN] [action]
If you do not specify any directory, find will search the current directory. If you do not specify criteria, this is equivalent to “true”, thus all files will be found. The options, criteria and actions are so numerous that we will only mention a few of each here. Here are some options:
-xdev: do not search directories located on other file systems.
-mindepth <n>: descend at least n levels below the specified directory before searching for files.
-maxdepth <n>: search for files which are located at most n levels below the specified directory.
-follow: follow symbolic links if they link to directories. By default, find does not follow links.
-daystart: when using tests related to time (see below), take the beginning of the current day as a time stamp instead of the default (24 hours before current time).
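The depth options are easiest to understand on a small tree. The following is a minimal sketch, assuming GNU find; the directory and file names are invented for the illustration, and everything is created in a temporary directory. Note that with GNU find these options are written after the starting directory.

```shell
#!/bin/sh
# Build a throwaway tree (hypothetical names) with one file at each level.
demo=$(mktemp -d)
mkdir -p "$demo/a/b/c"
touch "$demo/top.txt" "$demo/a/one.txt" "$demo/a/b/two.txt" "$demo/a/b/c/three.txt"

# The starting directory is depth 0, so -maxdepth 2 stops at ./a/*.
shallow=$(cd "$demo" && find . -maxdepth 2 -type f | LC_ALL=C sort)

# -mindepth 3 ignores everything located above depth 3.
deep=$(cd "$demo" && find . -mindepth 3 -type f | LC_ALL=C sort)

printf 'maxdepth 2:\n%s\n' "$shallow"
printf 'mindepth 3:\n%s\n' "$deep"

rm -rf "$demo"
```

The first list should contain only top.txt and a/one.txt, the second only the two deepest files.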
A criterion may consist of one or more atomic tests. Some useful tests are:
-type <file_type>: search for a given type of file. file_type can be one of: f (regular file), d (directory), l (symbolic link), s (socket), b (block mode file), c (character mode file) or p (named pipe).
-name <pattern>: find files whose names match the given pattern. With this option, the pattern is treated as a shell globbing pattern (see Section 3, “Shell Globbing Patterns”).
-atime <n>, -amin <n>: find files which were last accessed n days ago (-atime) or n minutes ago (-amin). You can also specify <+n> or <-n>, in which case the search will be done for files accessed more than n (+n) or less than n (-n) days/minutes ago.
-anewer <a_file>: find files which have been accessed more recently than file a_file was modified.
-ctime <n>, -cmin <n>, -cnewer <file>: same as for -atime, -amin and -anewer, but applies to the last time that the status of the file (owner, permissions, link count and so on) was changed. To test the last time that the contents of the file were modified, use -mtime <n>, -mmin <n> and -newer <file> instead.
-regex <pattern>: same as -name, but pattern is treated as a regular expression, and it must match the whole path of the file, not only its name.
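A few of these tests can be tried safely in a scratch directory. This is a minimal sketch, assuming GNU find and GNU touch (for the "5 days ago" date syntax); the file names are made up for the example. It shows that -name looks at the base name only while -regex is matched against the whole path, and how +n and -n behave with -atime.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/docs"
touch "$demo/docs/notes.txt" "$demo/docs/todo.txt" "$demo/readme.md"
# Backdate the access time of one file (GNU touch date syntax).
touch -a -d "5 days ago" "$demo/docs/todo.txt"

# -name compares the globbing pattern against the base name only.
by_name=$(cd "$demo" && find . -type f -name "*.txt" | LC_ALL=C sort)

# -regex has to match the whole path, hence the leading ".*".
by_regex=$(cd "$demo" && find . -type f -regex ".*/docs/[^/]*\.txt" | LC_ALL=C sort)

# No path is exactly "notes.txt", so this finds nothing.
no_match=$(cd "$demo" && find . -type f -regex "notes\.txt")

# +2: last accessed more than 2 days ago; -1: accessed within the last day.
stale=$(cd "$demo" && find . -type f -atime +2)
fresh=$(cd "$demo" && find . -type f -atime -1 | LC_ALL=C sort)

printf 'stale: %s\nfresh:\n%s\n' "$stale" "$fresh"
rm -rf "$demo"
```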
There are many other tests; refer to find(1) for more details. To combine tests, you can use one of:
<c1> -a <c2>: true if both c1 and c2 are true; -a is implicit, therefore you can type <c1> <c2> <c3> if you want all c1, c2 and c3 tests to match.
<c1> -o <c2>: true if either c1 or c2 is true, or both. Note that -o has a lower precedence than -a, therefore if you want to match files which match criteria c1 or c2 and also match criterion c3, you will have to use parentheses and write ( <c1> -o <c2> ) -a <c3>. You must escape (deactivate) parentheses, as otherwise they will be interpreted by the shell!
-not <c1>: inverts test c1, therefore -not <c1> is true if c1 is false.
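The effect of precedence is easy to miss, so here is a small sketch of it, assuming GNU find; the file names are hypothetical and live in a temporary directory. A directory whose name ends in .htm is included on purpose to show what slips through when the parentheses are left out.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/a.htm" "$demo/b.html" "$demo/c.txt"
mkdir "$demo/dir.htm"

# Without parentheses, -a binds tighter, so this is read as
#   -name "*.htm"  -o  ( -name "*.html" -a -type f )
# and the directory dir.htm is reported as well.
loose=$(cd "$demo" && find . -name "*.htm" -o -name "*.html" -a -type f | LC_ALL=C sort)

# With escaped parentheses, -type f applies to both name tests.
strict=$(cd "$demo" && find . \( -name "*.htm" -o -name "*.html" \) -a -type f | LC_ALL=C sort)

# -not inverts a test: regular files whose names do not match "*.htm*".
others=$(cd "$demo" && find . -type f -not -name "*.htm*")

printf 'loose:\n%s\nstrict:\n%s\nothers: %s\n' "$loose" "$strict" "$others"
rm -rf "$demo"
```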
Finally, you can specify an action for each file found. The most frequently used are:
-print: just prints the name of each file on the standard output. This is the default action.
-ls: prints on the standard output the equivalent of ls -ilds for each file found.
-exec <command_line>: executes command command_line on each file found. The command line command_line must end with a ;, which you must escape so that the shell does not interpret it. The file position is marked with {}. See the usage examples.
-ok <command>: same as -exec but asks for confirmation for each command.
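As an illustration of -exec, the sketch below copies every matching file into a backup directory and then counts lines per file. It assumes a POSIX shell with find, cp, wc and awk available; the src and backup directory names are invented for the example.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/backup"
printf 'alpha\n' > "$demo/src/one.txt"
printf 'beta\n'  > "$demo/src/two.txt"

# {} is replaced by each path found; the escaped ; ends the command line.
find "$demo/src" -type f -name "*.txt" -exec cp {} "$demo/backup" \;
copied=$(ls "$demo/backup" | LC_ALL=C sort)

# The command is run once per file, and its output can be piped as usual.
total=$(find "$demo/src" -type f -exec wc -l {} \; | awk '{ n += $1 } END { print n }')

printf 'copied:\n%s\ntotal lines: %s\n' "$copied" "$total"
rm -rf "$demo"
```

Replacing -exec with -ok in the first command would make find ask for confirmation before each copy.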
The best way to consolidate all of
the options and parameters is with some examples. We want to find
all directories in the /usr/share
directory. We would type:
find /usr/share -type d
Suppose you have an HTTP
server. All your HTML files are in
/var/www/html, which is also your current
directory. You want to find all files whose contents have not been
modified for a month. Because you have pages from several writers,
some files have the html extension and some
have the htm extension. You want to link
these files in the /var/www/obsolete
directory. You would type[27]:
find \( -name "*.htm" -o -name "*.html" \) -a -mtime +30 \
-exec ln {} /var/www/obsolete \;
This is a fairly complex example, and requires a little explanation. The criterion is this:
\( -name "*.htm" -o -name "*.html" \) -a -mtime +30
which does what we want: it finds all files
whose names end either in .htm or
.html (“\( -name "*.htm" -o -name "*.html" \)”),
and (-a) which have not been
modified in the last 30 days, which is roughly a month (-mtime
+30). We use -mtime because it tests the last time the contents were
modified, and +30 because we want files older than 30 days.
Note the parentheses: they are necessary here, because
-a has a higher precedence. If there weren't any, all
files ending with .htm would have been found, plus
all files ending with .html and which haven't been
modified for a month, which is not what we want. Also note that
parentheses are escaped from the shell: if we had put
( .. ) instead of
\( .. \), the shell would have
interpreted them and tried to execute -name
"*.htm" -o -name "*.html" in
a sub-shell. Another solution would have been to put parentheses between
double quotes or single quotes, but a backslash here is preferable as we
only have to isolate one character.
And finally, there is the command to be executed for each file:
-exec ln {} /var/www/obsolete \;
Here too you have to escape the
; character from the shell. Otherwise
the shell would interpret it as a command separator. If you happen
to forget, find will complain that -exec
is missing an argument.
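To try this kind of command without touching a real web tree, the scenario can be rebuilt in a temporary directory. This is a sketch under stated assumptions: GNU find and GNU touch (for the "45 days ago" syntax), invented file names, and -mtime +30 as the age test, since -mtime looks at the content modification time and +30 selects files older than 30 days.

```shell
#!/bin/sh
site=$(mktemp -d)
mkdir "$site/html" "$site/obsolete"
touch "$site/html/new.html" "$site/html/old.htm" \
      "$site/html/old.html" "$site/html/old.txt"
# Backdate three of the files by 45 days (GNU touch date syntax).
touch -d "45 days ago" "$site/html/old.htm" "$site/html/old.html" "$site/html/old.txt"

# Run from the html directory, as in the text; ln creates hard links.
(
    cd "$site/html" || exit 1
    find . \( -name "*.htm" -o -name "*.html" \) -a -mtime +30 \
        -exec ln {} "$site/obsolete" \;
)

linked=$(LC_ALL=C ls "$site/obsolete")
printf 'linked:\n%s\n' "$linked"
rm -rf "$site"
```

Only old.htm and old.html should be linked: new.html is too recent, and old.txt does not match either name pattern.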
A last example: you have a huge
directory (/shared/images) containing all
kinds of images. You regularly use the touch command to
update the times of a file named stamp in
this directory, so that you have a time reference. You want to
find all JPEG images which are newer than
the stamp file, but because you got the
images from various sources, these files have extensions
jpg, jpeg,
JPG or JPEG. You also
want to avoid searching in the old
directory. You want this file list to be mailed to you, and your
user name is peter:
find /shared/images -newer \
/shared/images/stamp \
-a -iregex ".*\.jpe?g" \
-a -not -regex ".*/old/.*" \
| mail -s "New images" peter
Of course, this command is not very useful if you have to type it each time, and you would like it to be executed regularly. A simple way to have the command run periodically is to use the cron daemon as shown in the next section.
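Before scheduling such a command, it can be tried on a disposable tree. The following is a minimal sketch, assuming GNU find and GNU touch; the image names are hypothetical, the mail step is replaced by a shell variable, and -newer (which compares modification times) is used for the comparison with the stamp file.

```shell
#!/bin/sh
img=$(mktemp -d)
mkdir "$img/old"
touch "$img/before.jpg" "$img/stamp"
# Backdate the stamp and an older image so the ordering is unambiguous.
touch -d "2 days ago" "$img/stamp"
touch -d "3 days ago" "$img/before.jpg"
# Files that arrived after the stamp was last updated.
touch "$img/a.jpg" "$img/b.JPEG" "$img/c.png" "$img/old/d.jpg"

# -iregex ignores case; the second regex filters out paths under old/.
recent=$(find "$img" -newer "$img/stamp" \
    -a -iregex ".*\.jpe?g" \
    -a -not -regex ".*/old/.*" | LC_ALL=C sort)

printf 'new images:\n%s\n' "$recent"
rm -rf "$img"
```

Note that the -not -regex test only removes matching paths from the output; find still walks through the old directory.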
[27] Note that this example requires that /var/www/html and /var/www/obsolete be on the same file system, because ln creates hard links here!