Zack's Kernel News



The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching ten thousand messages in a given week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.

Our regular monthly column keeps you abreast of the latest discussions and decisions, selected and summarized by Zack. Zack has been publishing a weekly online digest, the Kernel Traffic newsletter, for over five years now. Even reading Kernel Traffic alone can be a time-consuming task.

Linux Magazine now provides you with the quintessence of Linux Kernel activities, straight from the horse's mouth.

Get Ready for ext4

In a massive flame war spanning hundreds of mailing list posts, the Linux kernel developers considered how to proceed with the development of the ext3 filesystem. The catalyst was a post from Mingming Cao, announcing an effort by Red Hat, ClusterFS, IBM, and Bull to convert ext3 from a 32-bit filesystem to a 48-bit filesystem, thus increasing the maximum size of an ext3 filesystem from 8 terabytes to 1024 petabytes.
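The size figures above follow from simple arithmetic on the width of the block number. The following is a minimal sketch, assuming a 4 KiB block size and assuming that the 8 terabyte figure reflects 31 usable bits (a signed 32-bit block number). Neither assumption is stated in the discussion, and the helper name is hypothetical.

```python
# Hypothetical helper: largest filesystem addressable with a given
# number of block-number bits. The 4 KiB block size is an assumption.
BLOCK_SIZE = 4096  # bytes per block (assumed)

TIB = 2 ** 40  # bytes in one tebibyte
PIB = 2 ** 50  # bytes in one pebibyte


def max_fs_bytes(block_bits, block_size=BLOCK_SIZE):
    """Return the byte capacity of 2**block_bits blocks."""
    return (2 ** block_bits) * block_size


if __name__ == "__main__":
    print(max_fs_bytes(31) // TIB, "TiB with 31 usable bits")
    print(max_fs_bytes(32) // TIB, "TiB with a full 32 bits")
    print(max_fs_bytes(48) // PIB, "PiB with 48 bits")
```

Under these assumptions, 31 bits gives the 8 terabyte limit and 48 bits gives 1024 petabytes, matching the numbers quoted in the announcement.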

In the course of discussion, it came out that other developers were working on implementing extents for ext3. Extents describe a file's data as contiguous runs of blocks, rather than as a list of individual blocks, which helps keep data contiguous on disk instead of spreading it through disparate blocks.
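The idea behind extents can be sketched in a few lines. This is an illustration only, not the ext3 on-disk format or any proposed patch; the function name and the (start, length) tuple layout are assumptions made for the example.

```python
def blocks_to_extents(blocks):
    """Collapse an ordered list of block numbers into (start, length) runs.

    Each run of consecutive block numbers becomes a single extent.
    """
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # The block directly follows the last extent: grow it by one.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # A gap (or the first block): start a new extent.
            extents.append((block, 1))
    return extents


if __name__ == "__main__":
    # Six individual block numbers shrink to two extents.
    print(blocks_to_extents([100, 101, 102, 103, 500, 501]))
```

A file stored in a few long runs needs only a handful of extents, while a per-block list grows with every block the file occupies.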

All of these large low-level changes coming from various sources have raised questions about whether it is appropriate to call this work "ext3", or if perhaps it is time to fork off an "ext4" project.

On the one hand, these are potentially very good and useful features. Users want them, and if done correctly, they would have a dramatic impact on speed and size, two issues that matter a lot when it comes to filesystems.

On the other hand, ext3 has become a bloated mess, difficult to maintain, difficult to fix, and saddled with questionable features like large inodes that are only valuable to a small percentage of ext3 users with extremely large datasets. Piling additional features on top of this already unsteady structure may only worsen its drawbacks.

Linus has said that he favors the "ext4" idea, and this has inspired a massive outcry from many quarters. At first blush, you might not think there would be so much at stake. Linus has said several times that he has nothing to say against any particular proposed ext3 feature. Development can continue along its current path, and everything stays the same, except that instead of modifying ext3, the developers would copy the code to a new "ext4" tree and continue development from there. So why the outcry?

First of all, there are no actual ext4 users out there testing the code. Once the fork occurs, the ext4 developers will lose their entire user base, consisting by now of millions of systems all over the world.

Another issue has to do with how to port all of ext4's enhancements and bug fixes back to ext3 at the proper time. Who will do the work? And why should so much work be necessary? And wouldn't this involve an inevitable divergence between two very similar code bases? How could any shared ext3/ext4 code be maintained without it becoming ever more complex?

In fact, it seems that all of the ext3 maintainers opposed forking off a next version. So why do it?

Linus's answer is similar to things he's said before about other projects, and it has to do with a basic change in how a project and its source code are viewed. In fact, his answer is really just to throw away the questions. Backporting is not necessary, he says: the important bugs will be fixed in ext3, and filesystem users won't care whether they have the absolute latest features.

Patch Formatting Policies

Eric W. Biederman attempted to modify the git code management system to allow a patch's true author to be expressed in a more relaxed way. For instance, this change to git would let the user express the patch's author within the body of the email, instead of only at the very top of the email message. But Linus Torvalds brought the hammer down on this proposed modification of git, saying, "From the very beginning of git, I tried to make it extremely clear that there is never any guessing going on. We don't use 'heuristics' except as a pure optimization: i.e., a heuristic can have a performance impact, but it must never EVER have semantic impact."

He added, "If the new git-applymbox just takes random lines from the body of the email, and decides that they may be authorship information, then that is a BUG. The `From: ` line in the middle of an email may well be about somebody having discovered the bug, and we're quoting him as part of the explanation. It does NOT mean that it's about authorship."
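The "no guessing" rule Linus describes can be sketched as follows. This is a hypothetical illustration, not the actual git-applymbox code. It assumes that "the very top" means the first non-blank line of the message body, and that the email's own From: header is the fallback. The function name and parameters are invented for the example.

```python
def patch_author(header_from, body):
    """Pick a patch author without guessing.

    An in-body 'From: ' line counts only if it is the first non-blank
    line of the body (an assumption for this sketch). Any later
    'From: ' line is treated as ordinary text, and the author falls
    back to the email header.
    """
    for line in body.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.startswith("From: "):
            return line[len("From: "):].strip()
        break  # first real line is not 'From: ', so stop looking
    return header_from


if __name__ == "__main__":
    sender = "Maintainer <maint@example.com>"
    forwarded = "From: Alice <alice@example.com>\n\nFix a locking bug.\n"
    quoted = "Fix a locking bug.\n\nFrom: Bob <bob@example.com> reported it.\n"
    print(patch_author(sender, forwarded))
    print(patch_author(sender, quoted))
```

In the second message the `From: ` line merely quotes the bug reporter, so the sketch ignores it and keeps the header's sender as the author.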

Eric agreed to handle things in the way that Linus described, but he was slightly annoyed that an earlier patch of his, which had been heading in this direction, had slipped past Linus's radar and made it into the git tree, thus causing Eric to waste more time on the current work.

Linux and IRC

Linus Torvalds recently asked folks to prefer the git mailing list git@vger.kernel.org over the IRC channel. He said, "I hate IRC," and pointed out that the mailing list had an open posting policy, meaning anyone could post their questions whether they were subscribed or not. And if the git mailing list follows the traditions of the linux-kernel mailing list, anyone responding to questions will automatically CC the original poster.

The tradition of having an open development mailing list is one that has been fought for. If anyone can post their questions, that means anyone can post their spam as well. For the most part, the mailing list administrators for git, linux-kernel, and the hundreds of other mailing lists hosted on the "vger" servers do an unbelievable job of preventing spam from getting through to the list. But it takes a lot of work and constant vigilance, and it is not perfect, so every once in a while someone (or a group of someones) suggests allowing only subscribed members to post to linux-kernel. But the list administrators and Linus are adamant about the policy that everyone in the world should be able to easily voice their bug reports and other experiences on linux-kernel. And now, as we've seen, the same applies to git as well.

Not all Linux mailing lists follow this practice. Spam is a huge problem, and not all mailing list administrators have the time to fight it, especially when there is such a simple solution as only allowing list members to post to the list. But also, and more fundamentally, not all Linux contributors agree with Linus about the best way to conduct a development mailing list. Because the Linux development model is itself so diverse, these other ideas are not only put into practice, but also wait in the wings as working alternatives, in case spam gets out of hand under Linus's preferred method and some change needs to be made.

Inode Slimming

Theodore Y. Ts'o decided to see about shrinking the kernel's inode struct. This would have an impact not just on ext3's memory usage, but on that of all filesystems, because the inode struct is used by them all. Ted came up with several pieces of data that could reasonably be taken out of the inode struct, though he acknowledged that, "sweeping through all of the code which uses these variables to move them would be a major code change." He invited folks to help him do this, and a number of people, including Alexander Viro, jumped on board.

Ten days later, with support from Linus, he posted some large and invasive patches, accomplishing a number of the items he had presented earlier. Not all of his changes were precisely on target, and a number of folks had technical comments, but the overall response was very favorable. There was no dancing and jubilation in the streets, but folks pitched in to help clean up and improve the patches.

Ultra-Wide-Band Support

Inaky Perez-Gonzalez, on behalf of Intel, has announced their project to implement support for future hardware that is compliant with the WiMedia Ultra Wide Band (UWB) and the Wireless USB standards. UWB is a very close range wireless networking technology, optimized for in-room use. While the hardware is still unavailable or hard to come by, Inaky did invite Linux people to join the project and help code it up.

Kernel Colors

Some standards require large official bodies to debate over years. Other standards, on the other hand, are completed after just a couple of emails on linux-kernel.

The decision of whether to standardize the in-kernel spelling of the word as "color" or "colour" was made by David Woodhouse, when Andrew Morton pointed out that one of his patches led to the use of both spellings in the same block of code.

In his fix to the color/colour question, David chose to go with the "color" spelling, because the code in question used the form "color" in a public API, while the alternative "colour" spelling only appeared as an internal variable name.

The most rigid interface won the day in this case. Or at least, it's winning the day so far. As of this writing, there are still hundreds of occurrences of "colour" in the kernel, versus over 2000 occurrences of "color."