LJ Archive

Drupal Is a Framework: Why Everyone Needs to Understand This

Diana Montalion Dupuis

Issue #888, April 2068

Everyone planning and building Web solutions with Drupal benefits from understanding what a “hook” is—and why Drupal is not a CMS.

One of the greatest challenges that Drupal adopters face, whether they are new site owners or beginning developers, is figuring out what is easy and what is hard to do with Drupal. As a developer, solution architect, technical strategist and even as the friend who knows stuff about Web sites, 60% of my discussions revolve around three questions: how long will it take, how much will it cost, and can my site do [insert cool new thing]?

Sometimes, these are easy questions to answer. Many content-related tasks can be accomplished simply by logging in to Drupal, visiting the /admin page and clicking on menu links until you land on the necessary administration page.

More often, though, these are complicated questions to answer. Some tasks can be accomplished by adding contributed modules that easily “plug in” to Drupal core, as it comes “out of the box”, and expand a site's functionality. Contributed modules are created and shared by the Drupal community and can be added to any Drupal site.

Some tasks require writing custom code, and new modules must be built. Layers of potential functionality are involved in custom features. Some features require communicating back and forth with other sites via an application programming interface (API). Bigger Web sites often require the creation of small applications that accomplish tasks in the background, outside Drupal's usual workflow. In many cases, multiple solutions exist, and choosing one involves giving something up to get something else. For developers and stakeholders alike, finding the best solution that meets business goals and stays in scope depends upon cooperative discussions.

That is where communication often breaks down. Developers are speaking one language while site owners, project and account managers, stakeholders and others involved in the decision-making process speak another language. When people first learn about Drupal, their initiation often focuses on what a node is, what blocks, content types and views are, and how to create SEO-friendly URLs. These concepts are important, but they frequently fail to answer the essential “how hard is this to do” question or provide a strong foundation for collaborative planning of more complex functionality. Everyone involved needs to understand that they can architect a Drupal site that offers a more-sophisticated set of features than a WordPress site, because Drupal is not a content management system (CMS); it is a content management framework.

Conceptualizing Drupal as a framework does not require years of programming experience; rather, it simply requires understanding what a “hook” is and finding out whether the one you need exists and already is able to do the thing you want done.

To understand hooks, it's necessary to understand how dynamic Web pages, delivered by Web applications, differ from static pages. Most tech-savvy people take this knowledge for granted, especially Linux aficionados and those whose first desktop computer had a flashing cursor at a C: prompt. But many people don't know how Web sites do what they do. (Why would they?) Here is how I explain the difference in layman's terms.

In the olden days, static pages were single text documents containing everything you saw on the page, except for images, in one text file. The file included HTML tags describing the type of content being displayed—for example, <p> denotes a paragraph, and <h1> is a big headline. Browsers (which took ten hours to download) translated this markup and presented pages with a readable structure at a Web address dictated by the filename. The document would be uploaded to the server and saved in the Web site's primary folder. The filename page.html then could be viewed using the browser at yoursitename.com/page.html.

If you wanted to change the Web page's content, you edited that file. If you wanted to change something in the header that appeared on all of the site's pages, you had to edit every page. Whether linking content together or displaying a similar sidebar, content was laid out individually on each and every page by hand.

Nowadays, most sites are dynamic. Small programs, called Web applications, are uploaded and stored on the server. Instead of delivering a static page to view, the program runs when the browser lands on the page, applying logic to the page creation process. This logic dictates how the page is built each time a page is requested (also called “on page load”). For example: the program gets the header, gets the main menu, gets the page's unique content, gets the footer and delivers the whole page to the browser. As a result, now there can be one editable header, one footer and one menu shared among all Web pages.
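
To make the "one header, one footer, one menu" idea concrete, here is a minimal Python sketch of a page being assembled on request. It is an illustration only, not Drupal code (Drupal itself is written in PHP), and the names `SHARED`, `PAGES` and `build_page` are hypothetical.

```python
# Shared fragments are stored once and reused on every request.
SHARED = {
    "header": "<header>My Site</header>",
    "menu": "<nav>Home | About</nav>",
    "footer": "<footer>Contact us</footer>",
}

# Hypothetical per-page content, keyed by page name.
PAGES = {
    "home": "<p>Welcome!</p>",
    "about": "<p>We grow vegetables in containers.</p>",
}


def build_page(name):
    """Assemble a full page at request time from shared and unique parts."""
    parts = [SHARED["header"], SHARED["menu"], PAGES[name], SHARED["footer"]]
    return "\n".join(parts)


print(build_page("about"))
```

Editing `SHARED["header"]` once changes the header of every page the function builds, which is exactly what static pages could not do.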

What about the page's unique content? How does the application “get” that? Imagine a spreadsheet where each row represents each page's unique content. Dynamic Web sites store content in this way. They use a database, which can be imagined as a collection of spreadsheets, called tables. Each table, like a spreadsheet, has columns and rows. Each row has a unique ID. When a page is displayed, the content associated with that page—an article about container gardening, for example—is retrieved from the database table and output to the page.
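
The spreadsheet analogy can be tried out directly. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a real database server; the table name `content`, its columns and the sample row are assumptions made for illustration and do not reflect Drupal's actual schema.

```python
import sqlite3

# An in-memory database stands in for the site's database server.
conn = sqlite3.connect(":memory:")

# One table, like one spreadsheet: columns, rows and a unique ID per row.
conn.execute(
    "CREATE TABLE content (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)
conn.execute(
    "INSERT INTO content (id, title, body) VALUES (?, ?, ?)",
    (1234, "Growing a Container Garden", "Start with a pot that drains well."),
)


def load_content(content_id):
    """Return the (title, body) row for this ID, or None if no row matches."""
    cursor = conn.execute(
        "SELECT title, body FROM content WHERE id = ?", (content_id,)
    )
    return cursor.fetchone()
```

Calling `load_content(1234)` retrieves the container gardening article's row; an ID with no matching row returns `None`.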

In Drupal's case, the programming language PHP supplies the logic and MySQL provides the database. Usually, the operating system installed on the server to power this process is Linux, and Apache is the software that handles the requests for pages and delivers them once they are built. This software bundle is called the LAMP stack.

Without static filenames like about.html, how does a dynamic Web site know which row from the content table to display? Drupal, like other Web applications, uses a query string to match the content to the page address. Query strings look like this: ?q=1234, and they are attached to the end of the URL—for example, yoursitename.com/?q=1234. Drupal uses a modified (no less mystifying) address structure: yoursitename.com/node/1234. In both cases, the unique ID, the row number of the page's content, is there: 1234.

Web pages displaying semantic URLs, like yoursitename.com/growing-a-container-garden, have included logic that pairs the unique ID with the words. But for each page, a unique ID still exists and is associated with the content in a database table.
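
The pairing of words with a unique ID can be sketched in a few lines of Python. This is a simplified model, not Drupal's own path-handling code; the `ALIASES` table and the `resolve` function are hypothetical names.

```python
# Hypothetical alias table: friendly words on the left, internal path on the right.
ALIASES = {"growing-a-container-garden": "node/1234"}


def resolve(path):
    """Return the numeric content ID behind a URL path, or None if there is none."""
    path = path.strip("/")
    internal = ALIASES.get(path, path)  # fall back to the path as given
    kind, _, raw_id = internal.partition("/")
    if kind == "node" and raw_id.isdigit():
        return int(raw_id)
    return None
```

Both `resolve("/growing-a-container-garden")` and `resolve("/node/1234")` lead to the same ID, 1234.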

With the advent of dynamic Web applications, the continual development of the programming languages and databases needed to drive them, and the world's voracious need for more and more content-rich sites, voilà—the Content Management System (CMS) was born. Drupal is a CMS insofar as it is an application that saves content to a database and displays it to a page using logic that is written into its core or added by programmers. But Drupal is not (really) a CMS; it is a framework that does “CMSey” stuff. Drupal provides the structure for Web applications, far more complex than a CMS, that do all the things Web sites can do: expand the functionality (using contributed or custom code), communicate with other Web applications, run applications written in PHP and other languages behind the scenes, provide responsive pages or integrate front-end languages, scale to handle large traffic numbers by making use of server technologies and provide the foundation for other as-yet-unthought-of innovations.

Here's where the process gets ingenious. But there is one more conceptual step to take before it's clear that hard or easy depends on hooks: bootstrapping. Again, this is a concept that may seem like common knowledge to the tech-focused reader, but it can be tongue-twisting to explain. Here is my layman's version, which is an oversimplification, but a deeper understanding isn't a prerequisite to understanding hooks.

When a browser hits a Web page, Drupal asks a series of questions. The question process is called bootstrapping. The questions (Q) trigger actions (A).

  • Q: Who are you (generally) and what do you want? A: Initialize and store general info.

  • Q: Can I just give you a stored copy? A: Serve cached data (content stored in memory).

  • Q: Can I connect to the database? A: Do so or die.

  • Q: Do I need anything from there to work? A: Get it.

  • Q: Who are you (specifically)? A: Start a session.

  • Q: What are your requirements? A: Create server/browser page headers (the parameters for further relating).

  • Q: Where are you? A: Select language.

Finally, Drupal delivers the content:

  • Q: Which page? A: Serve up the page.

This is the sweet spot, the place where most (but not all) of the hook magic happens.
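
The question-and-answer sequence above can be sketched as an ordered series of steps with one early exit for cached pages. The Python below is as much an oversimplification as the list it mirrors: the step labels paraphrase the questions, and `bootstrap` and `page_cache` are hypothetical names, not Drupal's API.

```python
def bootstrap(path, page_cache):
    """Walk the steps in order; return the page and the steps that ran."""
    steps = ["initialize general info"]

    # "Can I just give you a stored copy?" If so, stop early.
    if path in page_cache:
        steps.append("serve cached copy")
        return page_cache[path], steps

    # Otherwise continue through the remaining questions, in order.
    steps += [
        "connect to the database",
        "load what Drupal needs to work",
        "start a session",
        "create page headers",
        "select language",
        "serve the page",
    ]
    return f"page built for {path}", steps
```

A cached path finishes after two steps; every other request walks the whole list and ends at the final "which page?" step.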

Hooks are little blocks of functionality, called functions, that contain PHP code. These blocks of code run when they are called upon. During the bootstrapping process, especially when the final “which page?” question is asked, hooks are called. Whenever an event happens in Drupal, like deleting a page, hooks are called. Inside those hooks, there is code that alters functionality, and it runs as soon as the hook is called. Almost anything you want Drupal to do has a hook doing it.

Drupal relies on naming conventions to call hooks when the time is right for them to run. While building the menu, Drupal looks for hooks with “_menu” in the name. When a page is deleted, hooks with the name “_delete” are called.

Drupal modules override existing hooks or add new ones. For example, if I want to change the way a form is displayed, I put the code for that change inside a function called mymodulename_form_alter. When the form's page is built, Drupal will look for any “_form_alter” function to see if there is more to do. I also can create new hooks in custom code that can be called by other hooks, mymodulename_myhook.
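
The naming convention is easier to see in code. The following Python sketch imitates the idea of finding functions named "module name, underscore, hook name" and running each one. It is a model of the concept, not Drupal's PHP implementation, and the modules `blog` and `shop`, the `FUNCTIONS` lookup table and `invoke_all` are all invented for illustration.

```python
def blog_menu():
    """The hypothetical blog module's answer to the "menu" hook."""
    return ["blog"]


def shop_menu():
    """The hypothetical shop module's answer to the "menu" hook."""
    return ["cart", "checkout"]


def shop_form_alter(form):
    """The shop module changes a form in place when "form_alter" is called."""
    form["submit_label"] = "Buy now"


ENABLED_MODULES = ["blog", "shop"]

# Look functions up by name, the way the naming convention implies.
FUNCTIONS = {f.__name__: f for f in (blog_menu, shop_menu, shop_form_alter)}


def invoke_all(hook, *args):
    """Call <module>_<hook> for every enabled module that defines it."""
    results = []
    for module in ENABLED_MODULES:
        func = FUNCTIONS.get(f"{module}_{hook}")
        if func is not None:
            result = func(*args)
            if result is not None:
                results.extend(result)
    return results
```

In this sketch, `invoke_all("menu")` collects menu items from both modules, while `invoke_all("form_alter", form)` lets the shop module change the form before it is displayed. A hook that no module implements simply does nothing.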

Hooks don't just govern behavior. The theme, which is the collection of files specifically dictating the site's look and feel, also includes hooks. The front end of a Drupal site (the presentation rather than the behavior) is not simply painted on; it relies on hooks as well, all being called when Drupal delivers a page.

Remember our three original questions: how long will this take, how much will that cost, and can my site do [insert cool new thing]? The answers, and whether something is easy (quick, cheap and already possible) or hard (time consuming, expensive and innovative), depend on hooks. “I would like my site to do X. Is that easy or hard?” The scale from easy to hard looks like this:

  • Drupal already does what you want it to do because the necessary hooks, with the necessary code, run by default.

  • Drupal provides an administrative interface for you to turn it on or change it.

  • A module or theme already has been written, calling or adding the hooks (with the necessary code inside them) that you need.

  • Custom code must be written (using or creating hooks and adding code to them). The time and effort required here varies widely, from three quickly written lines of code to months of programming, creating multiple contributable modules.

  • Custom database tables must be created. At this level of complexity, the code still will rely on hooks but begins to run outside of what Drupal does natively; therefore, it is (sometimes) more complex than adding code alone.

  • Necessary data comes from other Web sites, or your site's new feature requires communicating with other sites (for example, credit-card processing). The time and effort to do this also varies widely and can be as easy as adding a module (that already handles this communicating) or as hard as writing a separate application that runs when the appropriate hook is called. What your site will do with the data and the load it puts on the system greatly influences the complexity as well.

  • Your tasks can't run on page load; a special process has to be written to accomplish them. Sometimes, this is a quick addition (a simple cron job using hook_cron), and sometimes this is complicated. Often, this approach is used when data processing would slow page load down (or take down the site), so it is handled out of sync and saved (cached), serving the cached version when the page loads and the question is asked.
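
The last item on the scale, doing slow work out of sync and serving a cached result, follows a pattern that fits in a few lines. This Python sketch assumes a scheduled job that calls `cron_run` periodically; `CACHE`, `expensive_report`, `cron_run` and `page_load` are hypothetical names, and the arithmetic merely stands in for slow processing.

```python
CACHE = {}


def expensive_report():
    """Stand-in for slow processing that must not run during page load."""
    return sum(n * n for n in range(1000))


def cron_run():
    """Run on a schedule, outside any page request, and store the result."""
    CACHE["report"] = expensive_report()


def page_load():
    """Run on every request: only read the stored copy, never recompute."""
    return CACHE.get("report", "report not ready yet")
```

Until the scheduled job has run once, the page shows a placeholder; after that, every page load reads the stored value without paying for the computation.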

Does Drupal already include the necessary hooks running the necessary code and does it provide an admin interface to set up what you want to accomplish? Easy! Do you need to get mega-amounts of data from elsewhere, process and save it out of sync with page load, and create new database tables that interact with existing data? Hard!

Drupal core and many contributed modules are primarily designed to manage content, to power a CMS—which is why it is right to say, from one point of view, that Drupal is a CMS. Out of the box, users can create any content type imaginable—book reviews, recipes, scholarly paper submissions, press releases, blog posts and so on. An admin interface in Drupal 7 makes creating nodes (the foundational content type with a title and body) and adding fields of related data to them, like the author and publisher in a book review, a code-free task. Creating book reviews that include a cover image, author, publisher, publication date and a link to Powell's City of Books is quick. Adding a five-star rating to each review involves adding one contributed module and turning it on.

How hard it is to make the review look like the design depends on how much the design varies from the way Drupal presents the content to the page. If the author, publisher and so on will be displayed in the order they were created administratively and styled according to the site's general style guide, creating the look and feel involves adding some CSS to the theme's CSS file(s). Easy! But if the page will distribute the fields in a unique order or include custom behavior (like also displaying other books the user has rated), custom work needs to be done. Hooks in the modules and in the theme enable this work to happen, allowing the page load process to be interrupted and edited.

Ironically, the fact that Drupal enables the creation of a book review content type is also what makes it a framework. In the words of Larry Garfield, Drupal core contributor and member of the Drupal Association Advisory Board:

What Drupal is today is a tool for building a content management system for a variety of different needs. That's an important distinction for someone looking to build a Drupal site to understand. Drupal is not a CMS. It is the framework with which you build your own CMS, to your specifications, to suit your needs. It is a Content Management Framework.

Diving into the syntax of hooks does require programming knowledge and is, in my experience, where the discussion between developers and product owners should end. My developer cohorts and I discuss the technical aspects of implementing hooks: which to use, where to put them, when to call them, how to simplify the code they run, performance issues and caching plans, the decision to use contributed code or write our own. Once the decision to, for example, pull in feeds and display them is made, the “how” discussions begin. (Node.js, anyone?) Communicating the issues that arise as the “how” is being implemented and making collaborative decisions with site owners is the fine art of managing development, the place where the conversations begin again. This process is made easier, every time, when everyone involved understands how Drupal works (framework!) and trusts the easy/hard assessments made by the development team.

Hooks create, define and override the Drupal tools used to build information architecture—associating content with other content and creating navigable structure. Nobody wants a Web site to spit out all of the content in one big blob. The primary tools to build information architecture are content types, menus, blocks, taxonomies and views.

  • Nodes, content types and fields give structure to the content. Fields (like author or publisher) make for easy content creation and visual continuity for the user.

  • Menus enable navigation by creating a structure of associations. Menus create a content geography and reveal the paths for exploring it without getting lost.

  • Blocks are boxes of content that can be associated with and displayed in a region, such as the sidebar or footer. These boxes can be filled with any kind of content: nodes and content types, menus, lists, text with markup, output like feeds or unique code-created lists like “most recommended”. There are hooks, of course, to create, control, edit and override blocks, although most blocks are built administratively.

  • Taxonomies are lists of terms that can be associated with content. Most users are familiar with the idea of tagging, associating a blog post (for example) with a list of terms like “coding, biking, cooking or hiking”. In Drupal, taxonomies can provide the foundation for more-complex use cases, but associating content is the most common.

  • Views is a list-maker, provided by a contributed module. Many complex tasks can be handled by Views, but at its most basic, it's the way to create a list of content in Drupal, such as “all book reviews posted in the last three months”. Views also can display content using associations, such as “all posts tagged with the taxonomy terms 'apple' and 'spinach', sorted alphabetically”. The lists often are created using the Views administrative interface, but custom code can override the output (hooks!), and entire views can be created in code.
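
At its core, the kind of list Views makes is a filter followed by a sort. The Python sketch below imitates the two example lists with plain data structures; it does not use or resemble the Views API, and the sample content in `NODES` and the function `make_list` are invented for illustration.

```python
from datetime import date

# Hypothetical content: each item has a type, a posting date and tags.
NODES = [
    {"title": "Zucchini Basics", "type": "book_review",
     "posted": date(2013, 3, 1), "tags": {"spinach"}},
    {"title": "Apple Orchards", "type": "book_review",
     "posted": date(2013, 2, 15), "tags": {"apple", "spinach"}},
    {"title": "Old Favorites", "type": "book_review",
     "posted": date(2012, 6, 1), "tags": {"apple"}},
    {"title": "Site News", "type": "blog_post",
     "posted": date(2013, 3, 5), "tags": {"apple", "spinach"}},
]


def make_list(nodes, content_type=None, since=None, tags=frozenset()):
    """Filter by type, date and required tags; return titles alphabetically."""
    matches = [
        n for n in nodes
        if (content_type is None or n["type"] == content_type)
        and (since is None or n["posted"] >= since)
        and set(tags) <= n["tags"]  # every requested tag must be present
    ]
    return [n["title"] for n in sorted(matches, key=lambda n: n["title"])]
```

Passing `content_type="book_review"` and a `since` date reproduces the "recent book reviews" list, and passing `tags={"apple", "spinach"}` reproduces the tagged, alphabetized list.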

The Drupal framework is a kitchen where, yes, there already are tools in the drawer and ingredients in the pantry. But those tools and ingredients do not define the meals that can be made there. Teams of site owners, stakeholders, project managers, business-goal definers and developers can cook better meals together when Drupal is understood as a framework. Approaching Drupal as a CMS often means bending it to your will: “I want zucchini muffins like my mother used to make; do that.” As a framework, Drupal encourages creating the best, most-elegant recipe within the scope of the endeavor: “Here's some zucchini, what can we do with this?”

Drupal's flexibility may make answering our three questions (how long, how much and can it be done) more time consuming. But in the end, the outcome is far more satisfying.

Diana Montalion Dupuis is a software developer, Web strategist, writer, trainer and hiker who doesn't spend enough time in the mountains. She lives in Austin, Texas, where it is too hot in the summer, and is Director of Development and Professional Services at Four Kitchens.
