
19.10. Processing All Files in a Directory

19.10.1. Problem

You want to do something to all the files in a directory and in any subdirectories.

19.10.2. Solution

Use the pc_process_dir( ) function, shown in Example 19-1, which returns a list of all files in and beneath a given directory.

Example 19-1. pc_process_dir( )

function pc_process_dir($dir_name,$max_depth = 10,$depth = 0) {
    if ($depth >= $max_depth) {
        error_log("Reached max depth $max_depth in $dir_name.");
        // Return an empty array, not false: the caller passes this to array_merge( )
        return array();
    }
    $subdirectories = array();
    $files = array();
    if (is_dir($dir_name) && is_readable($dir_name)) {
        $d = dir($dir_name);
        while (false !== ($f = $d->read())) {
            // skip . and .. 
            if (('.' == $f) || ('..' == $f)) {
                continue;
            }
            if (is_dir("$dir_name/$f")) {
                array_push($subdirectories,"$dir_name/$f");
            } else {
                array_push($files,"$dir_name/$f");
            }
        }
        $d->close();
        foreach ($subdirectories as $subdirectory) {
            $files = array_merge($files,pc_process_dir($subdirectory,$max_depth,$depth+1));
        }
    } 
    return $files;
}

19.10.3. Discussion

Here's an example: if /tmp contains the files a and b, as well as the directory c, and /tmp/c contains files d and e, pc_process_dir('/tmp') returns an array with elements /tmp/a, /tmp/b, /tmp/c/d, and /tmp/c/e. To perform an operation on each file, iterate through the array:

$files = pc_process_dir('/tmp');
foreach ($files as $file) {
  // strftime( ) is deprecated as of PHP 8.1; date( ) prints a similar timestamp
  print "$file was last accessed at ".date('D M j H:i:s Y',fileatime($file))."\n";
}

Instead of returning an array of files, you can also write a function that processes them as it finds them. The pc_process_dir2( ) function, shown in Example 19-2, does this by taking an additional argument, the name of the function to call on each file found.

Example 19-2. pc_process_dir2( )

function pc_process_dir2($dir_name,$func_name,$max_depth = 10,$depth = 0) {
    if ($depth >= $max_depth) {
        error_log("Reached max depth $max_depth in $dir_name.");
        return false;
    }
    $subdirectories = array();
    $files = array();
    if (is_dir($dir_name) && is_readable($dir_name)) {
        $d = dir($dir_name);
        while (false !== ($f = $d->read())) {
            // skip . and ..
            if (('.' == $f) || ('..' == $f)) {
                continue;
            }
            if (is_dir("$dir_name/$f")) {
                array_push($subdirectories,"$dir_name/$f");
            } else {
                $func_name("$dir_name/$f");
            }
        }
        $d->close();
        foreach ($subdirectories as $subdirectory) {
            pc_process_dir2($subdirectory,$func_name,$max_depth,$depth+1);
        }
    } 
}

The pc_process_dir2( ) function doesn't return a list of files; instead, the function named in $func_name is called once for each file, with the file's path as its only argument. Here's how to print out the last access times:

function printatime($file) {
    // strftime( ) is deprecated as of PHP 8.1; date( ) prints a similar timestamp
    print "$file was last accessed at ".date('D M j H:i:s Y',fileatime($file))."\n";
}

pc_process_dir2('/tmp','printatime');

Although the two functions produce the same results, the second version uses less memory because potentially large arrays of files aren't passed around.
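To make the memory argument concrete, here is a minimal sketch in which the callback is a closure (available since PHP 5.3) that keeps running totals instead of building a list. The names pc_walk_files( ) and the pc_demo_ directory are hypothetical; pc_walk_files( ) is only a condensed stand-in for pc_process_dir2( ), so that the example runs on its own.

```php
<?php
// Condensed stand-in for pc_process_dir2( ); pc_walk_files is a hypothetical
// name used only so this sketch is self-contained.
function pc_walk_files($dir_name, $func_name, $max_depth = 10, $depth = 0) {
    if ($depth >= $max_depth || !is_dir($dir_name) || !is_readable($dir_name)) {
        return;
    }
    $subdirectories = array();
    $d = dir($dir_name);
    while (false !== ($f = $d->read())) {
        if (('.' == $f) || ('..' == $f)) {
            continue;
        }
        if (is_dir("$dir_name/$f")) {
            $subdirectories[] = "$dir_name/$f";
        } else {
            $func_name("$dir_name/$f");   // works for a function name or a closure
        }
    }
    $d->close();
    foreach ($subdirectories as $subdirectory) {
        pc_walk_files($subdirectory, $func_name, $max_depth, $depth + 1);
    }
}

// Build a small throwaway tree: a (5 bytes), b (7 bytes), c/d (11 bytes).
$root = sys_get_temp_dir() . '/pc_demo_' . uniqid();
mkdir("$root/c", 0777, true);
file_put_contents("$root/a", str_repeat('x', 5));
file_put_contents("$root/b", str_repeat('x', 7));
file_put_contents("$root/c/d", str_repeat('x', 11));

// The closure updates two counters by reference; no file list is ever built.
$count = 0;
$bytes = 0;
pc_walk_files($root, function ($file) use (&$count, &$bytes) {
    $count++;
    $bytes += filesize($file);
});

print "$count files, $bytes bytes\n";

// Remove the throwaway tree again.
unlink("$root/c/d");
unlink("$root/b");
unlink("$root/a");
rmdir("$root/c");
rmdir($root);
```

However many files the tree holds, the only state that grows is the pair of counters, which is the saving the callback style is meant to provide.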

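The pc_process_dir( ) and pc_process_dir2( ) functions use what this recipe calls a breadth-first search: they handle all the files in the current directory, and only then recurse into each subdirectory. (Strictly speaking, a breadth-first search visits every directory at one level before descending to the next; these functions finish one subdirectory's entire subtree before moving on to its sibling, so "files first" is the more precise description.) In the alternative, which the recipe calls a depth-first search, the functions recurse into a subdirectory as soon as it is found, whether or not there are files remaining in the current directory. The files-first approach uses fewer resources: each directory handle is closed (with $d->close( )) before the function recurses into subdirectories, so only one directory handle is open at a time, whereas the alternative holds one open handle for every level of nesting.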
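For comparison, the recurse-as-soon-as-found alternative described above could look like the following sketch. The function name pc_process_dir_eager( ) and the pc_eager_ directory are hypothetical and not part of the recipe; the point is only to show where the directory handle stays open during recursion.

```php
<?php
// Hypothetical variant for comparison: recurses as soon as a subdirectory
// is read, while the parent's directory handle is still open.
function pc_process_dir_eager($dir_name, $max_depth = 10, $depth = 0) {
    $files = array();
    if ($depth >= $max_depth) {
        error_log("Reached max depth $max_depth in $dir_name.");
        return $files;
    }
    if (is_dir($dir_name) && is_readable($dir_name)) {
        $d = dir($dir_name);
        while (false !== ($f = $d->read())) {
            // skip . and ..
            if (('.' == $f) || ('..' == $f)) {
                continue;
            }
            $path = "$dir_name/$f";
            if (is_dir($path)) {
                // $d remains open for the whole recursive call, so a tree
                // nested N levels deep holds N handles open at once.
                $files = array_merge($files,
                    pc_process_dir_eager($path, $max_depth, $depth + 1));
            } else {
                $files[] = $path;
            }
        }
        $d->close();
    }
    return $files;
}

// Throwaway tree: a, c/d and c/e.
$root = sys_get_temp_dir() . '/pc_eager_' . uniqid();
mkdir("$root/c", 0777, true);
touch("$root/a");
touch("$root/c/d");
touch("$root/c/e");

$found = pc_process_dir_eager($root);
sort($found);   // directory read order is not guaranteed, so sort for display
print implode("\n", $found) . "\n";
```

Both approaches find the same set of files; only the order of discovery and the number of simultaneously open directory handles differ.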

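Because is_dir( ) returns true when passed a symbolic link that points to a directory, both versions of the function follow symbolic links as they descend the directory tree. A link that points back to one of its own parent directories therefore creates a loop, which only the $max_depth limit stops. If you don't want to follow links, change the line: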

if (is_dir("$dir_name/$f")) {

to:

if (is_dir("$dir_name/$f") && (! is_link("$dir_name/$f"))) {
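The difference between the two tests can be checked on a small throwaway directory. This is a sketch only: pc_is_real_dir( ) and the pc_link_ directory are hypothetical names, and symlink( ) may fail on systems where creating links needs extra privileges (Windows, for example), so the link checks are skipped in that case.

```php
<?php
// Hypothetical helper wrapping the modified condition:
// true for a real directory, false for a symbolic link to one.
function pc_is_real_dir($path) {
    return is_dir($path) && !is_link($path);
}

$root = sys_get_temp_dir() . '/pc_link_' . uniqid();
mkdir("$root/real", 0777, true);

// symlink( ) can fail without the right privileges, hence the @ and the check.
$linked = @symlink("$root/real", "$root/alias");

$real_plain  = is_dir("$root/real");           // a real directory
$real_strict = pc_is_real_dir("$root/real");   // still accepted

print "real:  is_dir=" . var_export($real_plain, true)
    . " strict=" . var_export($real_strict, true) . "\n";

if ($linked) {
    $alias_plain  = is_dir("$root/alias");          // is_dir( ) follows the link
    $alias_strict = pc_is_real_dir("$root/alias");  // the extra test rejects it
    print "alias: is_dir=" . var_export($alias_plain, true)
        . " strict=" . var_export($alias_strict, true) . "\n";
}
```

With the stricter test, a linked directory falls through to the else branch and is treated like a file, so its path is still reported (or passed to the callback) but never descended into.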

19.10.4. See Also

Recipe 6.10 for a discussion of variable functions; documentation on is_dir( ) at http://www.php.net/is-dir and is_link( ) at http://www.php.net/is-link.




Copyright © 2003 O'Reilly & Associates. All rights reserved.