Project 15. Find Files by Name

"How do I search the file system for a file named my lost letter.txt?"

This project shows you how to search the file system for specific files based on filename and pathname. It uses the locate command to find public files quickly and the find command to find any file. See Projects 17 and 18 when you need to employ more advanced file-finding techniques. Project 20 gives some handy find tips.

Locate with locate

The locate command searches the file system for files of a given name. Here are the results we get locating the file httpd.conf.

$ locate httpd.conf
/private/etc/httpd/httpd.conf
/private/etc/httpd/httpd.conf.applesaved
/private/etc/httpd/httpd.conf.bak
/private/etc/httpd/httpd.conf.default


(If you don't get any results, see "The Locate Database" a little later in this project.) To confine the search to named directories, we include a partial pathname. The following example locates files named system.log in any directory named log.

$ locate log/system.log
/private/var/log/system.log
/private/var/log/system.log.0.gz
/private/var/log/system.log.1.gz


From the examples above, you'll notice that

  • The locate command matches against the entire pathname, not just the filename. You can search for files, directories, and pathnames.

  • Pathnames match if they contain the search term; they don't have to equal it. To use shell globbing terminology, the search term is effectively *search-term*.

  • The locate command is fast (although that's not apparent from reading this book).

The Locate Database

The locate command is fast because it doesn't search the file system; instead, it searches a prebuilt database of all public files. Public files are those for which "others" are granted read permission, ensuring that locate won't reveal the names (or the existence) of your private files. To be completely accurate, locate reveals only files that are accessible to the user nobody; see the sidebar "User Nobody" for more details.

The locate command will not find my private file template.rtf.

$ ls Documents/Letters
template.rtf
$ locate template.rtf
<no results - this is to be expected>
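
Whether a file counts as public comes down to the read bit for "others" in its mode. The following is only a rough sketch of that test: the helper name is_public and the temporary files are made up for illustration, and the real database build also depends on whether nobody can reach the enclosing directories.

```shell
# is_public is a hypothetical helper, not part of locate.
# It succeeds when "others" have read permission on the given file.
is_public() {
    # Character 8 of the ls -l mode string is the read bit for others.
    [ "$(ls -ld "$1" | cut -c8)" = "r" ]
}

private_file=$(mktemp)
public_file=$(mktemp)
chmod 600 "$private_file"    # readable by the owner only
chmod 644 "$public_file"     # readable by everyone

if is_public "$public_file"; then public_result=public; else public_result=private; fi
if is_public "$private_file"; then private_result=public; else private_result=private; fi
echo "$public_result $private_result"

rm -f "$private_file" "$public_file"
```

A file that fails this test, like template.rtf above, would be left out of the database.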


So how is the locate database built in the first place? It's (re)built automatically as part of weekly maintenance during the early hours of Saturday. If you've recently added new public files, you may have to rebuild the locate database manually. The best way to do this is to run the weekly maintenance scripts by typing

$ sudo periodic weekly


User Nobody

The user nobody is a special user with a UID of -2. It has a primary group called nobody that has a GID of -2. This user is an unprivileged user who does not (usually) own any files and is restricted to the permissions assigned to others. Refer to Projects 7 and 8 for more information on users, groups, and permissions.

$ id nobody
uid=4294967294(nobody) gid=4294967294(nobody) groups=4294967294(nobody)


If you are curious, or even if you are not, 4294967294 is also -2 interpreted as a 32-bit unsigned number.
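
The arithmetic can be checked in the shell itself. This sketch assumes a shell whose arithmetic uses 64-bit integers (bash does); the variable names are chosen here for illustration.

```shell
# -2 stored in 32 bits and read back as an unsigned number is 2^32 - 2.
unsigned_uid=$(( (1 << 32) - 2 ))
echo "$unsigned_uid"    # 4294967294

# Masking -2 down to its low 32 bits gives the same value.
masked_uid=$(( -2 & 0xFFFFFFFF ))
echo "$masked_uid"      # 4294967294
```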


Use Globbing with locate

The locate command lets you specify a filename pattern as the search term, using the same pattern-matching operators as for shell globbing, namely [^*?]. In fact, locate automatically converts a bare search term such as httpd.conf to *httpd.conf*. If you use any pattern-matching operator in your search term, you override this default behavior, and no star characters are added.

The next example locates all the Unix man-page directories. A search term such as */man/man? matches pathnames .../man/man1, .../man/man2, and so on. The query character must be escaped from the shell; otherwise, it will be expanded before it is passed to locate. Also, because the search term now includes a pattern-matching operator (the query symbol), a star is not implicitly added to the start and so must be added explicitly (and escaped from the shell).

$ locate '*/man/man?'
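
The matching rule described above can be imitated with the shell's own pattern matching. This is a sketch of the rule as the text states it, not of locate's actual implementation; the function name locate_match is hypothetical, and the pathnames are examples rather than a real database.

```shell
# locate_match PATHNAME TERM
# Hypothetical helper: succeeds if PATHNAME would match TERM under the
# rule "wrap a bare term in stars, use a term with operators as given".
locate_match() {
    path=$1
    term=$2
    case $term in
        *[\*\?\[]*) pattern=$term ;;        # has a globbing operator: use as is
        *)          pattern="*${term}*" ;;  # bare term: becomes *term*
    esac
    case $path in
        $pattern) return 0 ;;               # unquoted, so treated as a pattern
        *)        return 1 ;;
    esac
}

# A bare term matches anywhere in the pathname.
locate_match /private/etc/httpd/httpd.conf.bak httpd.conf && echo "bare term matches"

# With operators present, the pattern must cover the whole pathname.
locate_match /usr/share/man/man1 '*/man/man?' && echo "man directory matches"
locate_match /usr/share/man/man1/ls.1 '*/man/man?' || echo "man page does not match"
```

Note that a star in these patterns also matches slashes, which is why */man/man? can reach any depth of the pathname.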


Tip

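To avoid automatic globbing when searching for a filename such as httpd.conf, use the search term '*/httpd.conf' (quoted, so that the shell does not expand the star). Because the term contains a pattern-matching operator, no stars are added, and only pathnames ending in /httpd.conf match.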


Rebuild Manually

You may build the locate database manually and without having to run periodic maintenance scripts. It's not that easy, though; in particular, you must do it as the user nobody to avoid revealing the names of private files. Here are step-by-step instructions for doing so. Even if you don't need to perform a manual rebuild, the procedure is interesting, as it includes some clever "geekery."

Learn More

Refer to Project 11 to learn about globbing.


First, become nobody (not as sad as it sounds). This is not so simple, because the user nobody does not have a password. In fact, the password is not even nothing. It has a hash value of star, for which there is no possible plain-text password.

$ su nobody
Password:
su: Sorry


The trick is to become root (the ultimate Somebody). Root can become any other user without the need for a password, neatly sidestepping the problem.

$ sudo -s
Password:
#


Periodic Maintenance

Your Mac runs daily, weekly, and monthly maintenance scripts in the early hours. If your Mac isn't running at that time, the scripts will be run later, but only in Mac OS X 10.4 (Tiger). Before Tiger, missed schedules were simply skipped.


Let's rebuild the database now. First, we must ensure that the database exists by touching it (which has the effect of creating it, if it does not exist).

# touch /var/db/locate.database


Learn More

See Project 71 to learn more about periodic maintenance, and Project 70 to learn more about execution of scheduled commands.


Then we make it writable by user nobody by making the owner nobody and giving the owner write permission.

# chown nobody /var/db/locate.database
# chmod 644 /var/db/locate.database


Learn More

Refer to Projects 7 and 8 to learn about users, groups, and permissions.


Now switch to user nobody. This doesn't seem to do anything. Why? Because user nobody is not permitted to log in.

# su nobody
# whoami
root


Learn More

Refer to Project 6 if you need to brush up on redirection and pipelining.


The trick is to use option -m, which switches users without going through the normal login process.

# su -m nobody
$ whoami
nobody


Now we can run the update script, located at /usr/libexec/locate.updatedb. It's wise to throw away "Permission denied" errors; there'll be lots of them, because user nobody cannot read private directories. To do this, we merge standard error into standard output (2>&1), pipe the combined stream to the grep command, and filter out the offending lines.

$ /usr/libexec/locate.updatedb 2>&1 | grep -v "Permission denied"
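
The redirection can be tried safely without rebuilding anything. In this sketch the function noisy is a made-up stand-in for the update script; it writes two ordinary lines to standard output and one error to standard error.

```shell
# noisy is a hypothetical stand-in for a command that mixes
# useful output with "Permission denied" errors.
noisy() {
    echo "/usr/share/man/man1"
    echo "find: /private/secret: Permission denied" >&2
    echo "/usr/share/man/man2"
}

# 2>&1 sends standard error down the same pipe as standard output,
# so grep -v sees the error line and can drop it.
filtered=$(noisy 2>&1 | grep -v "Permission denied")
echo "$filtered"
```

Without 2>&1 the error line would bypass the pipe and appear on the terminal regardless of the grep filter.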


Learn More

See Project 23 to learn more about the grep command.


Finally, we tighten the permissions and exit from users nobody and then root.

$ chmod 444 /var/db/locate.database
$ exit
exit
# exit
exit
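
To see what the two modes used in this procedure mean, the same chmod calls can be applied to a scratch file. The file here is a temporary one created for illustration, not the real database.

```shell
scratch=$(mktemp)

chmod 644 "$scratch"                              # owner may write; everyone may read
mode_open=$(ls -l "$scratch" | cut -c1-10)

chmod 444 "$scratch"                              # read-only for everyone
mode_tight=$(ls -l "$scratch" | cut -c1-10)

echo "$mode_open"     # -rw-r--r--
echo "$mode_tight"    # -r--r--r--

rm -f "$scratch"
```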


Find with find

Unlike the locate command, which searches a prebuilt database, the find command does a live search of the file system. The find command is very powerful and able to search for files based on many criteria. This project shows searches based only on filename and pathname.

Learn More

Projects 17 and 18 cover the advanced features of find, showing how it can be combined with other commands to process a list of files or the files themselves.


Find by Name

Let's find all files named letter.txt located in (rooted in) our home directory. The find command is inherently recursive, so you need to specify only the root of the search. The command requires a search root as its first argument; it can't be omitted, because find won't assume the current directory. To search by name, specify the primary -name and follow it with the search term. For example:

$ find ~ -name letter.txt
/Users/saruman/Documents/Letters/letter.txt
...


Primaries

A primary introduces a new search criterion. The find command uses primaries combined with operators such as AND and OR to build complex criteria specifying which files should match and which should not. (Projects 17 and 18 cover primaries and operators in greater depth.)

Read the man page for find for a full list of primaries and operators.


To search the current directory, simply replace tilde with dot. To search the entire file system, use forward slash and the sudo command. A search of the entire file system can take a while to complete.

$ sudo find / -name letter.txt
Password:
/Users/saruman/Documents/Letters/letter.txt
...


The find command uses a case-sensitive comparison against the search term. To ignore case, use the primary -iname.
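
The difference between -name and -iname is easy to try in a scratch directory. The directory layout below is invented for the purpose; only the find primaries come from the text.

```shell
root=$(mktemp -d)
mkdir -p "$root/Documents/Letters" "$root/Notes"
touch "$root/Documents/Letters/letter.txt" \
      "$root/Notes/Letter.TXT" \
      "$root/Notes/letter.txt.bak"

# -name is case sensitive and must equal the whole filename.
exact=$(find "$root" -name letter.txt)

# -iname ignores case, so Letter.TXT is found as well.
anycase=$(find "$root" -iname letter.txt | LC_ALL=C sort)

echo "$exact"
echo "$anycase"

rm -rf "$root"
```

Neither search returns letter.txt.bak, because the search term has to match the complete filename.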

Tip

If you get bored waiting for find to complete and want to do something more interesting, simply press Control-c to abort it.


Use Globbing with find

You may specify a filename pattern as the search term by using the same pattern-matching operators as in shell globbing, namely [^*?]. Unlike with the locate command, a bare search term like httpd.conf is not automatically converted to *httpd.conf*.

The next example finds all .txt files rooted in the current directory. The star character must be escaped from the shell; otherwise, it will be expanded before it is passed to find.

$ find . -iname "*.txt"
./Letters/letter.txt
...
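
A short experiment shows that find adds no stars of its own. The scratch directory and filenames are made up for illustration.

```shell
root=$(mktemp -d)
mkdir -p "$root/etc"
touch "$root/etc/httpd.conf" "$root/etc/httpd.conf.bak" "$root/etc/notes.txt"

# A bare term must equal the filename exactly.
bare=$(find "$root" -name httpd.conf)

# A quoted star is needed to pick up httpd.conf.bak as well.
starred=$(find "$root" -name "httpd.conf*" | LC_ALL=C sort)

echo "$bare"
echo "$starred"

rm -rf "$root"
```

The quotes keep the shell from expanding the star against the current directory before find ever sees it.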


Running find in the Background

You may want to run a lengthy find operation in a new Terminal window or in the background. The example below demonstrates a search of the root disk run in the background (the -b option applied to sudo does this). Output is redirected to the file search.out, and the whole command is niced (run at a lower priority) to avoid hogging resources.

$ nice -n 10 sudo -b find / -name "search-term" &> ~/search.out
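
The same idea can be rehearsed on a small scale without sudo. This sketch searches a scratch directory rather than the root disk, uses the shell's own & to run in the background (sudo -b is left out), and spells the redirection as > file 2>&1, which is the portable equivalent of bash's &> file. All paths are invented for the example.

```shell
root=$(mktemp -d)
mkdir -p "$root/a/b"
touch "$root/a/b/letter.txt"
out="$root/search.out"

# Run the search at a lower priority, in the background,
# with both output streams captured in a file.
nice -n 10 find "$root" -name letter.txt > "$out" 2>&1 &
find_pid=$!

# ... carry on with other work here ...

wait "$find_pid"        # block until the background search has finished
result=$(cat "$out")
echo "$result"

rm -rf "$root"
```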



Search by Pathnames

Use the primary -path (or -ipath, to be case insensitive) to search against the whole pathname, not just the filename. Compare the following two examples. The first finds all files named test anywhere. The second finds files named test rooted in directories named test.

First, match filenames. Only pathnames ending in test will match.

$ find . -iname test
./test
./test/test
./Trial/One/Test
./Trial2/test
./Trial2/version1/test
./Trial2/version1/test/a/test


Second, match pathnames. Any pathname containing the sequence /test/ and ending test will match.

$ find . -ipath "*/test/*test"
./test/anewtest
./test/test
./Trial2/version1/test/a/test


Tip

Combine many primaries to suit your search requirements. In particular, primaries can be combined with AND and OR. See Project 17 to learn more about complex conditions.


The second example fails on a file called anewtest. We wanted to match only files named exactly test. The easiest way to correct this is to combine primaries. Use -ipath to match pathnames and -iname to match filenames. Both primaries must match. Because anewtest does not match the filename test, it is eliminated from the search results.

$ find . -ipath "*/test/*" -iname test
./test/test
./Trial2/version1/test/a/test
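
To experiment with these searches, a directory tree like the one in the examples can be recreated in a scratch location. The layout below is reconstructed from the output shown above; the results are sorted so that the order is predictable.

```shell
root=$(mktemp -d)
mkdir -p "$root/test" "$root/Trial/One" "$root/Trial2/version1/test/a"
touch "$root/test/test" "$root/test/anewtest" "$root/Trial/One/Test" \
      "$root/Trial2/test" "$root/Trial2/version1/test/a/test"

# Pathname match only: anewtest slips through.
by_path=$(cd "$root" && find . -ipath "*/test/*test" | LC_ALL=C sort)

# Pathname AND filename must both match: anewtest is eliminated.
combined=$(cd "$root" && find . -ipath "*/test/*" -iname test | LC_ALL=C sort)

echo "$by_path"
echo "---"
echo "$combined"

rm -rf "$root"
```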


Use Regular Expressions

You may search pathnames by employing regular expressions instead of the pattern-matching operators used in shell globbing. Use the primary -regex (or -iregex, to be case insensitive) to search against the pathname. Notice the use of the regular expression operator dot-star versus the shell globbing operator star from previous examples.

$ find . -iregex ".*/test/.*test"
./test/anewtest
./test/test
./Trial2/version1/test/a/test


Learn More

Projects 77 and 78 cover regular expressions in detail.


This simple example is equivalent to using the primary -ipath. Regular expressions are more powerful than globbing, however, so you may need to consider this alternative for more complex requirements. If you're familiar with regular expressions, take note that matching is against the whole pathname. Without the leading dot-star, the file ./Trial2/version1/test/a/test would not match.

Option -E enables extended regular expressions, telling the primary -regex to interpret its argument as an extended regular expression. Without this option, -regex would assume a basic regular expression.
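
The equivalence between the globbing and regular-expression forms, and the need for the leading dot-star, can be checked in a scratch directory. The tree below is a cut-down version of the one used in the earlier examples. On the BSD find shipped with Mac OS X, placing -E before the search root switches -regex to extended syntax; that option is left out of this sketch because not every find implementation accepts it.

```shell
root=$(mktemp -d)
mkdir -p "$root/test" "$root/Trial2/version1/test/a"
touch "$root/test/test" "$root/test/anewtest" \
      "$root/Trial2/version1/test/a/test"

# Globbing form and regular-expression form of the same search.
with_glob=$(cd "$root" && find . -ipath "*/test/*test" | LC_ALL=C sort)
with_regex=$(cd "$root" && find . -iregex ".*/test/.*test" | LC_ALL=C sort)

# Without the leading dot-star nothing matches, because the expression
# must cover the whole pathname and every pathname here starts with ./
unanchored=$(cd "$root" && find . -iregex "/test/.*test")

echo "$with_regex"

rm -rf "$root"
```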




Mac OS X UNIX 101 Byte-Sized Projects
ISBN: 0321374118
EAN: 2147483647
Year: 2003
Pages: 153
Authors: Adrian Mayo