Perl Cookbook

7.17. Caching Open Output Filehandles

Problem

You need more output files open simultaneously than your system allows.

Solution

Use the standard FileCache module:

use FileCache;
cacheout($path);          # call before every write; (re)opens $path if needed
print $path "output";     # the path string itself serves as the filehandle

Discussion

FileCache's cacheout function lets you work with more output files than your operating system allows you to have open at once. The first time FileCache sees a path, it opens the file for writing, so an existing file is truncated to zero length, no questions asked. As cacheout closes and reopens files behind the scenes, though, it remembers which ones it has already opened and reopens those in append mode rather than overwriting them. cacheout does not create directories for you: if you ask it to open /usr/local/dates/merino.ewe and the directory /usr/local/dates doesn't exist, cacheout will die.
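One way to guard against the missing-directory failure is to create the parent directory before the first call to cacheout. The following is a minimal sketch, assuming the File::Basename, File::Path, and File::Temp modules from the standard distribution; the temporary directory is a hypothetical stand-in for /usr/local/dates so the script can run anywhere.

```perl
#!/usr/bin/perl
# Sketch: make sure the target directory exists before calling cacheout.
use strict;
use warnings;
no strict 'refs';                 # cacheout uses the path string as a filehandle name
use FileCache;
use File::Basename qw(dirname);
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical location: a temp dir stands in for /usr/local.
my $base = tempdir(CLEANUP => 1);
my $path = "$base/dates/merino.ewe";

make_path(dirname($path));        # cacheout itself will not create this directory

cacheout $path;                   # first use: creates (or truncates) the file
print $path "first line\n";
cacheout $path;                   # handle is still cached; nothing is reopened
print $path "second line\n";
close $path;                      # flush so the file can be read back
```

Calling cacheout before every print is deliberate: it costs little when the handle is still cached and reopens the file when it is not.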

The cacheout() function checks the C-level constant NOFILE from the standard system include file sys/param.h to determine how many files can be open concurrently on your system. This value is incorrect on some systems and missing on a few (for instance, those where the maximum number of open file descriptors is a per-process resource limit set with the limit or ulimit commands). If cacheout() can't get a usable value for NOFILE, set $FileCache::maxopen to four less than the correct value, or choose a reasonable number by trial and error.
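The effect of the open-file limit can be observed by setting it artificially low. This sketch assumes a reasonably recent FileCache, whose documentation describes a maxopen import argument; that interface is not mentioned in the text above and may be absent from older releases. The limit of 4 and the file names are illustrative only.

```perl
#!/usr/bin/perl
# Sketch: write to more files than the cache is allowed to keep open.
use strict;
use warnings;
no strict 'refs';                 # paths double as filehandle names
use FileCache maxopen => 4;       # assumed interface: cap the cache at 4 open handles
use File::Temp qw(tempdir);

my $dir   = tempdir(CLEANUP => 1);
my @paths = map { "$dir/out$_.txt" } 1 .. 10;   # hypothetical output files

for my $round (1 .. 3) {
    for my $path (@paths) {
        cacheout $path;           # reopens for appending if the handle was evicted
        print $path "round $round\n";
    }
}

close $_ for @paths;              # flush the handles that are still open
```

With ten files and room for four handles, most files are closed and reopened between rounds, yet each should end up with all three lines because reopened files are appended to.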

Example 7.8 splits an xferlog file created by the popular wuftpd FTP server into files named after the authenticated user. The fields in xferlog files are space-separated, and the fourth-from-last field is the authenticated username.

Example 7.8: splitwulog

#!/usr/bin/perl
# splitwulog - split wuftpd log by authenticated user
use FileCache;
$outdir = '/var/log/ftp/by-user';
while (<>) {
    unless (defined ($user = (split)[-4])) {
       warn "Invalid line: $.\n";
       next;
    }
    $path = "$outdir/$user";
    cacheout $path;
    print $path $_;
}
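The field extraction used in splitwulog can be tried in isolation. In this sketch, user_from_line is a hypothetical helper and the sample log line is made up for illustration; it assumes the layout described above, with the username as the fourth-from-last field.

```perl
#!/usr/bin/perl
# Sketch: pull the fourth-from-last whitespace-separated field from a line.
use strict;
use warnings;

# Hypothetical helper: returns the field, or undef if the line is too short.
sub user_from_line {
    my ($line) = @_;
    my $user = (split ' ', $line)[-4];   # negative index counts from the end
    return $user;
}

# Made-up line in the assumed layout (not taken from a real log).
my $sample = "Mon Jan  1 00:00:00 2001 1 host.example.com 1024 "
           . "/pub/file.txt b _ o r alice ftp 0 *";

my $user = user_from_line($sample);
print "user: $user\n" if defined $user;

my $short = user_from_line("too short");
print "short line rejected\n" unless defined $short;
```

A list slice with an out-of-range index yields undef, which is why the defined test in splitwulog is enough to catch truncated lines.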

See Also

Documentation for the standard FileCache module (also in Chapter 7 of Programming Perl); the open function in perlfunc(1) and in Chapter 3 of Programming Perl



Copyright © 2002 O'Reilly & Associates. All rights reserved.