Learning Perl, 3rd Edition

13.8. Using Simple Modules

Suppose that you've got a long filename like /usr/local/bin/perl in your program, and you need to find out the basename. That's easy enough, since the basename is everything after the last slash (it's just "perl" in this case):

my $name = "/usr/local/bin/perl";
(my $basename = $name) =~ s#.*/##;  # Oops!

As we saw earlier, first Perl will do the assignment inside the parentheses, then it will do the substitution. The substitution is supposed to replace any string ending with a slash (that is, the directory name portion) with an empty string, leaving just the basename.

And if you try this, it seems to work. Well, it seems to, but actually, there are three problems.

First, a Unix file or directory name could contain a newline character. (It's not something that's likely to happen by accident, but it's permitted.) So, since the regular expression dot (".") can't match a newline, a filename like the string "/home/fred/flintstone\n/brontosaurus" won't work right -- that code would think the basename is "flintstone\n/brontosaurus". You could fix that with the /s option to the pattern (if you remembered about this subtle and infrequent case), making the substitution look like this: s#.*/##s
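To see the newline problem concretely, here is a minimal sketch that runs the substitution both ways on the filename from the paragraph above. The variable names $wrong and $right are ours, chosen just for this illustration.

```perl
use strict;
use warnings;

# A legal (if unusual) Unix filename with a newline inside a directory name
my $name = "/home/fred/flintstone\n/brontosaurus";

# Without /s, the dot cannot cross the newline, so the match ends at "/home/fred/"
(my $wrong = $name) =~ s#.*/##;

# With /s, the dot matches any character, so the match runs to the last slash
(my $right = $name) =~ s#.*/##s;

print "without /s: ", ($wrong eq "flintstone\n/brontosaurus" ? "still has a slash in it" : "ok"), "\n";
print "with /s:    $right\n";    # prints "with /s:    brontosaurus"
```

Either way, $name itself is left untouched, because the substitution works on the copy made inside the parentheses.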

The second problem is that this is Unix-specific. It assumes that the forward slash will always be the directory separator, as it is on Unix, and not the backslash or colon that some systems use.

And the third (and biggest) problem with this is that we're trying to solve a problem that someone else has already solved. Perl comes with a number of modules, which are smart extensions to Perl that add to its functionality. And if those aren't enough, there are many other useful modules available on CPAN, with new ones being added every week. You (or, better yet, your system administrator) can install them if you need their functionality.

In the rest of this section, we'll show you how to use some of the features of a couple of simple modules that come with Perl. (There's more that these modules can do; this is just an overview to illustrate the general principles of how to use a simple module.)

Alas, we can't show you everything you'd need to know about using modules in general, since you'd have to understand advanced topics like references and objects in order to use some modules.[298] But this section should prepare you for using many simple modules. Further information on some interesting and useful modules is included in Appendix B, "Beyond the Llama".

[298]As we'll see in the next few pages, though, you may be able to use a module that uses objects and references without having to understand those advanced topics.

13.8.1. The File::Basename Module

In the previous example, we found the basename of a filename in a way that's not portable. We showed that something that seemed straightforward was susceptible to subtle mistaken assumptions (here, the assumption was that newlines would never appear in file or directory names). And we were re-inventing the wheel, solving a problem that others have solved (and debugged) many times before us.

Here's a better way to extract the basename of a filename. Perl comes with a module called File::Basename. With the command perldoc File::Basename, or with your system's documentation system, you can read about what it does. That's the first step when using a new module. (It's often the third and fifth step, as well.)

Soon you're ready to use it, so you declare it with a use directive near the top of your program:[299]

use File::Basename;

[299]It's traditional to declare modules near the top of the file, since that makes it easy for the maintenance programmer to see which modules you'll be using. That greatly simplifies matters when it's time to install your program on a new machine, for example.

During compilation, Perl sees that line and loads up the module. Now, it's as if Perl has some new functions that you may use in the remainder of your program.[300] The one we wanted in the earlier example is the basename function itself:

my $name = "/usr/local/bin/perl";
my $basename = basename $name;  # gives 'perl'

[300]You guessed it: there's more to the story, having to do with packages and fully qualified names. When your programs are growing beyond a few hundred lines in the main program (not counting code in modules), which is quite large in Perl, you should probably read up about these advanced features. Start with the perlmod manpage.

Well, that worked for Unix. What if our program were running on MacPerl or Windows or VMS, to name a few? There's no problem -- this module can tell which kind of machine you're using, and it uses that machine's filename rules by default. (Of course, you'd have that machine's kind of filename string in $name, in that case.)

There are some related functions also provided by this module. One is the dirname function, which pulls the directory name from a full filename. The module also lets you separate a filename from its extension, or change the default set of filename rules.[301]

[301]You might need to change the filename rules if you were trying to work with a Unix machine's filenames from a Windows machine -- perhaps while sending commands over an FTP connection, for example.
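As a rough sketch of those related functions, the following tries dirname along with two others that the module's documentation describes: fileparse, which can split off an extension, and fileparse_set_fstype, which changes the filename rules. The sample paths are made up for illustration, and the comments showing results assume the program runs on a Unix-like system.

```perl
use strict;
use warnings;
use File::Basename;   # the default import list supplies all four functions used here

my $name = "/home/rootbeer/ice-2.1.txt";

my $dir  = dirname($name);    # "/home/rootbeer"
my $base = basename($name);   # "ice-2.1.txt"

# fileparse returns (name, directory, suffix); the second argument
# is a pattern describing what counts as a suffix
my ($stem, $path, $suffix) = fileparse($name, qr/\.[^.]*/);
print "$path | $stem | $suffix\n";   # prints "/home/rootbeer/ | ice-2.1 | .txt"

# Temporarily switch to Windows filename rules, remembering the old setting
my $old_type = fileparse_set_fstype("MSWin32");
my $win_base = basename('C:\Perl\bin\perl.exe');   # "perl.exe"
fileparse_set_fstype($old_type);                   # put the original rules back

print "$dir\n$base\n$win_base\n";
```

Note that fileparse keeps the trailing slash on the directory part, while dirname drops it. Restoring the old filename rules afterward keeps the change from leaking into the rest of the program.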

13.8.2. Using Only Some Functions from a Module

Suppose that when you went to add the File::Basename module to your existing program, you discovered that you already had a subroutine called &dirname -- that is, a subroutine with the same name as one of the module's functions.[302] Now there's trouble, because the new dirname is also implemented as a Perl subroutine (inside the module). What do you do?

[302]Well, it's not likely that you would already have a &dirname subroutine that you use for another purpose, but this is just an example. Some modules offer hundreds (really!) of new functions, making a name collision that much more likely.

Simply give File::Basename, in your use declaration, an import list showing exactly which function names it should give you, and it'll supply those and no others. Here, we'll get nothing but basename:

use File::Basename qw/ basename /;

And here, we'll ask for no new functions at all:

use File::Basename qw/ /;

Why would you want to do that? Well, this directive tells Perl to load up File::Basename, just as before, but not to import any function names. Importing lets us use the short, simple function names like basename and dirname. But even if we don't import those names, we can still use the functions. When they're not imported, though, we have to call them by their full names:

use File::Basename qw/ /;                     # import no function names

my $betty = &dirname($wilma);                 # uses our own subroutine &dirname (not shown)

my $name = "/usr/local/bin/perl";
my $dirname = File::Basename::dirname $name;  # dirname from the module

As you see, the full name of the dirname function from the module is File::Basename::dirname. We can always use the function's full name (once we've loaded the module) whether we've imported the short name dirname or not.
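Putting the pieces of this section together, here is a small self-contained sketch. The local dirname subroutine is hypothetical, standing in for whatever subroutine of that name your program might already have; the results in the comments assume Unix-style filename rules.

```perl
use strict;
use warnings;
use File::Basename qw/ basename /;   # import basename and nothing else

# Our own (hypothetical) subroutine that happens to share a name with the module's
sub dirname {
    my $path = shift;
    return "my own dirname got: $path";
}

my $name = "/usr/local/bin/perl";

my $ours   = dirname($name);                   # calls our subroutine
my $theirs = File::Basename::dirname($name);   # the module's function, by its full name
my $base   = basename($name);                  # the imported short name

print "$ours\n";     # prints "my own dirname got: /usr/local/bin/perl"
print "$theirs\n";   # prints "/usr/local/bin"
print "$base\n";     # prints "perl"
```

Because dirname was left off the import list, the two subroutines never collide, and each remains reachable under its own name.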

Most of the time, you'll want to use a module's default import list. But you can always override that with a list of your own, if you want to leave out some of the default items. Another reason to supply your own list would be if you wanted to import some function not on the default list, since most modules include some (infrequently needed) functions that are not on the default import list.

As you'd guess, some modules will, by default, import more symbols than others. Each module's documentation should make it clear which symbols it imports, if any, but you are always free to override the default import list by specifying one of your own, just as we did with File::Basename. Supplying an empty list imports no symbols.

13.8.3. The File::Spec Module

Now you can find out a file's basename. That's useful, but you'll often want to put that together with a directory name to get a full filename. For example, here we want to take a filename like /home/rootbeer/ice-2.1.txt and add a prefix to the basename:

use File::Basename;

print "Please enter a filename: ";
chomp(my $old_name = <STDIN>);

my $dirname = dirname $old_name;
my $basename = basename $old_name;

$basename =~ s/^/not/;  # Add a prefix to the basename
my $new_name = "$dirname/$basename";

rename($old_name, $new_name)
  or warn "Can't rename '$old_name' to '$new_name': $!";

Do you see the problem here? Once again, we're making the assumption that filenames will follow the Unix conventions and use a forward slash between the directory name and the basename. Fortunately, Perl comes with a module to help with this problem, too.

The File::Spec module is used for manipulating file specifications, which are the names of files, directories, and the other things that are stored on filesystems. Like File::Basename, it understands what kind of system it's running on, and it chooses the right set of rules every time. But unlike File::Basename, File::Spec is an object-oriented (often abbreviated "OO") module.

If you've never caught the fever of OO, don't let that bother you. If you understand objects, that's great; you can use this OO module. If you don't understand objects, that's okay, too. You just type the symbols as we show you, and it works just as if you knew what you were doing.

In this case, we learn from reading the documentation for File::Spec that we want to use a method called catfile. What's a method? It's just a different kind of function, as far as we're concerned here. The difference is that you'll always call the methods from File::Spec with their full names, like this:

use File::Spec;

# ... get the values for $dirname and $basename as above ...

my $new_name = File::Spec->catfile($dirname, $basename);

rename($old_name, $new_name)
  or warn "Can't rename '$old_name' to '$new_name': $!";

As you can see, the full name of a method is the name of the module (called a class, here), a small arrow, and the short name of the method. It is important to use the small arrow, rather than the double-colon that we used with File::Basename.

Since we're calling the method by its full name, though, what symbols does the module import? None of them. That's normal for OO modules. So you don't have to worry about having a subroutine with the same name as one of the many methods of File::Spec.
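For a version you can run from start to finish, here is a sketch that combines File::Basename and File::Spec. To keep it harmless, it uses a fixed sample filename (an assumption for illustration, rather than one read from the user) and only reports the new name instead of calling rename.

```perl
use strict;
use warnings;
use File::Basename;
use File::Spec;

my $old_name = "/home/rootbeer/ice-2.1.txt";   # sample name, not read from STDIN

my $dirname  = dirname($old_name);
my $basename = basename($old_name);

$basename =~ s/^/not/;   # "ice-2.1.txt" becomes "notice-2.1.txt"

# catfile joins the pieces using the current system's own separator,
# so no slash is hardcoded here
my $new_name = File::Spec->catfile($dirname, $basename);

print "Would rename '$old_name' to '$new_name'\n";
```

On a Unix system the new name should come out as /home/rootbeer/notice-2.1.txt; on another system, the separator that catfile chooses may differ.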

Should you bother using modules like these? It's up to you, as always. If you're sure your program will never be run anywhere but on a Unix machine, say, and you're sure you completely understand the rules for filenames on Unix,[303] then you may prefer to hardcode your assumptions into your programs. But these modules give you an easy way to make your programs more robust in less time -- and more portable at no extra charge.

[303]If you didn't know that filenames and directory names could contain newline characters, as we mentioned earlier in this section, then you don't know all the rules, do you?




Copyright © 2002 O'Reilly & Associates. All rights reserved.