LJ Archive

Ruby

Thomas Østerlie

Issue #95, March 2002

Thomas demonstrates the power and flexibility of Ruby.

Ruby is a full-fledged, modern, pure, object-oriented programming language. Its syntax is terse and consistent, making Ruby both easy to read and learn, and it's flexible and expressive as well. If you're coming from a background in an API-bloated language, you will be surprised by Ruby's small but powerful core API. That Ruby is tightly integrated with the underlying operating system, and that it is ridiculously simple to extend, makes it both a powerful and versatile programming language.

Bold assertions? Let's uncover the truths behind these claims. For demonstration, I have included a simple Ruby script that purges a temp directory of files older than a given number of days. The application lets me demonstrate both basic Ruby syntax and some of the language's more important features. The entire script is included in Listing 1 [available at ftp.linuxjournal.com/pub/lj/listings/issue95/4834.tgz]. It is invoked by

./purge.rb [tmp_dir] [max_file_age_in_days]

where age determines how old a file needs to be before it is purged from the temp directory. You can add a call to this script in your crontab.

Ruby Basics

Ruby, an object-oriented language, offers encapsulation of data and methods within objects, allows inheritance from one class to another and supports polymorphism. Everything, including primitive data types like strings and integers, is represented as an object. Even constants and classes are represented as objects. This makes Ruby a pure object-oriented language. The only exception here is the control structure, a handful of expressions such as for, if, while, etc. These are not objects.

As shown in Listing 2 [available at ftp.linuxjournal.com/pub/lj/listings/issue95/4834.tgz], the delete_older method contains the top-level program logic: traverse a given directory to check for files to delete.

To those used to typed languages like Java or C++, the method parameters' missing type declarations may seem strange. But Ruby is dynamically typed. That is, a variable has no type, but the object it holds a reference to does, hence the lack of types in the declaration. Dynamic typing favors object composition over class inheritance. There is no controlling the type of objects passed as parameters in method calls, alleviating the need to worry about complex inheritance hierarchies, as we no longer depend on polymorphism to pass objects into methods. This leads to simpler, more reusable code.

Ruby's method declaration should look familiar to Python programmers. The two languages declare methods in practically the same way, including the use of optional parameters. An optional parameter can be left out when calling a method. Leaving out the parameter is the same as invoking the method with the optional parameter's default value.

Ruby's method declaration also lacks a return value. Since the language is dynamically typed, there is no need to declare a return type. Unless a return object is explicitly specified with the return statement, the last expression evaluated will be returned, as in Lisp.

A method is invoked by sending the target object a message. This is the Smalltalk way. The target.message(parameterlist) message-passing notation should be familiar to all object-oriented programmers. Sending an object a message invokes the corresponding method on the target object. All inter-object communication is handled by message passing.

Ruby operates with the notion of two kinds of methods: class methods and what is simply called methods, or instance methods. Instance methods are invoked on instantiated classes, more commonly known as objects. Class methods are called on uninstantiated classes and are like static methods in Java and C++. As a class method is called on an uninstantiated class, it may be considered a library method. It does not operate on the object's member variables.

Containers

Consider the following code the script is invoked with, which processes the command-line parameters:

path = ARGV.shift or raise "Missing path to delete"
age  = ARGV.shift or raise "Missing age in days"

ARGV is an array object containing the command-line options from the invocation of the script. Calling “shift” returns and removes array's first element. Ruby has an advanced array class. The array is dynamic; it resizes itself. It is an object, so you need not worry about memory issues and walking off its end either. Methods allowing you to process the array by index, by element and as if it were a stack, a set or a queue, are also included with the class. Arrays may be reversed and they may be sorted. For table lookups, use the Hash class.

The following line from Listing 1 shows how elegant Ruby's array is:

Dir.entries(full_name) - ['.', '..']).empty?

Dir.entries(full_name) returns an array containing all files in the directory. The array ['.', '..'] is then subtracted from the directory listing by using with the - operator. We can then see if the directory is empty by calling isEmpty? on the directory listing. If the array is empty, i.e., isEmpty? returns true, no other files are left in the directory.

Error Handling

Once the invocation parameters have been processed, it is time to call delete_older:

delete_older(path, age_in_seconds)
rescue puts "Error: #$!"

Errors may occur during execution. If the script is invoked with the path to a nonexistent directory, for instance, an error will occur the first time Ruby invokes a method on the Dir class. The code above not only invokes delete_older, it also handles possible errors that occur during execution. The key here is the rescue expression. When an error occurs, the Ruby interpreter packages the error in an exception object. This object propagates back up the call stack until it reaches some code that explicitly declares it knows how to handle this particular type of exception. Exceptions that are never caught propagate through the call stack, ending up with an abnormal program termination; the stack trace is printed to stderr. This is opposed to returning error codes like shell scripts and C do, leading to less-nested statements, less spaghetti logic and simply better error handling.

Including an ensure statement in connection with the rescue expression ensures that a code block is run no matter what else happens. Combine this with the possibility of writing your own exceptions, making your own code throw exceptions (with the raise expression as shown in the program listing in Containers), and the ability to actually recover from an exception by running some code and retrying the code that caused the failure, and you have one of the neatest error-handling mechanisms I've ever used.

Advanced Features

Let's return to delete_older to look at some of Ruby's more advanced features (see Listing 2 [available at ftp.linuxjournal.com/pub/lj/listings/issue95/4834.tgz]). Line two sees “foreach” being invoked on the Dir class; foreach is an implementation of the iterator design pattern. If you are doing object-oriented programming, but have not read the Gang of Four's groundbreaking book Design Patterns, you'd better run out and buy a copy. Iterator is not the only pattern implemented in core Ruby. Singleton, publisher/subscriber, visitor and delegation patterns also are implemented. Other patterns also can be implemented simply if required, but the listed patterns are shipped with Ruby.

foreach iterates over the files in a directory. Following the call to Dir's foreach is a block of code with a start and end very much resembling that of a regular Java or C++ code block. The code contained within the curly braces is called a block, which is like a method within a method. A block is never executed at the time it is encountered. Whenever the foreach method has read a single file from the directory, it yields control to the block. The code within the block is executed, and control returns to foreach, which reads a new file from the directory repeating the procedure over again until no more files are left to iterate over.

Instead of having to write helper classes to make iterators work, as you have to in Java or C++, Ruby includes the yield expression that makes it possible to implement iterators as methods. This is a prime example of the language's expressiveness and flexibility. Instead of writing the scaffolding to make it happen, Ruby lets you concentrate on doing the job.

As mentioned earlier, a method is invoked by sending a message to the target object on the target.message form. Only methods local to the class may be called without specifying a target. My script calls “puts”, which is a method belonging to the kernel, without specifying any target. How does the interpreter know which method I'm calling when puts is not local to the object and no target is specified?

It is the magic of mix-ins. Mix-ins basically allow you to mix methods implemented elsewhere into a class without the use of inheritance (for more on mix-ins, please refer to the article “Using Mix-ins with Python”, Linux Journal, April 2001). Mix-ins aren't a new idea, nor is Ruby the only language to support it. But it is most definitely one of the features that gives Ruby that nice and clean syntax.

I could never hope to deal with all of Ruby's features in this article. Instead I'll refer you to David Thomas and Andrew Hunt's book Programming Ruby for more details on issues like modules, aliasing, namespaces, reflection, dynamical method calls, system hooks, program distribution and networking. It is worth mentioning that Ruby also supports regular expressions that are just as good as Perl's and supports CGI, in addition to having its own Apache module, mod_ruby.

Conclusion

Is Ruby yet another scripting language? No, it is not. It is something more, something new and exciting coming out of the Japanese open-source scene. Ruby is the programmer's best friend. While Ruby is presented as a scripting language, it has proven equally suited for large projects. It includes some exciting features that other alternative languages are only beginning to implement. Ruby is therefore well worth checking out.

Acknowledgements

A special thanks goes out to my technical reviewers: Armin Roehrl, of approximity.com, for reviewing the draft manuscript and guidance in editing the final version. David Thomas, of pragmaticprogrammer.com, for massively improving the original sample script and reviewing the draft manuscript. Kent Dahl and Sean Chittenden for reviewing the draft manuscript. Last, but not least, Magnus Lie Hetland, Python guru, for invaluable assistance.

Resources

Thomas Østerlie is a consultant with Norwegian-based consulting company ConsultIT A/S, where he works with server-side systems development for UNIX platforms and with computer security. He has been an avid Linux user since 1995, after having been forced to install Windows 95 on his office computer. He can be reached at thomas.osterlie@consultit.no.

LJ Archive