Zack's Kernel News

The Linux kernel mailing list is the core of Linux development activity. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.

Eliminating Time-Consuming Boot Messages

Mike Travis pointed out, "with 4096 processors in a system, and the console baudrate set to 56K, the startup messages will take about 84 minutes to clear the serial port." A lot of those messages, he went on to say, were redundant, and he posted a patch to eliminate a big pile of them.
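To get a feel for the scale Mike describes, the figures can be turned around to estimate how much text is involved. The following is only a back-of-the-envelope sketch: it assumes that "56K" means 56,000 bits per second and that each character costs 10 bits on the wire (8 data bits plus a start and a stop bit), neither of which is stated in the original post. Under those assumptions, 84 minutes corresponds to roughly 28 million characters, or about 6,900 characters per CPU.

```python
# Back-of-the-envelope check of the boot-message figures.
# Assumptions (not from the original post): "56K" means 56,000 bits per
# second, and each character costs 10 bits (8 data + start + stop).

CPUS = 4096
BAUD_BITS_PER_SEC = 56_000
BITS_PER_CHAR = 10
BOOT_MINUTES = 84


def chars_sent(minutes, baud=BAUD_BITS_PER_SEC, bits_per_char=BITS_PER_CHAR):
    """Number of characters a serial line can carry in `minutes`."""
    return minutes * 60 * baud / bits_per_char


def minutes_to_send(total_chars, baud=BAUD_BITS_PER_SEC,
                    bits_per_char=BITS_PER_CHAR):
    """Minutes needed to push `total_chars` characters through the line."""
    return total_chars * bits_per_char / baud / 60


total = chars_sent(BOOT_MINUTES)
per_cpu = total / CPUS

print(f"total characters in {BOOT_MINUTES} minutes: {total:,.0f}")
print(f"characters per CPU: {per_cpu:,.0f}")

# For comparison: a single 80-character line per CPU (a hypothetical
# line length chosen only for illustration).
one_line_each = minutes_to_send(CPUS * 80)
print(f"minutes for one 80-character line per CPU: {one_line_each:.2f}")
```

The last figure shows why trimming the output to about one line per CPU makes such a difference: under the same assumptions, it takes around a minute rather than more than an hour.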

No one had any objections to Mike's idea, although Andi Kleen and Ingo Molnar disagreed somewhat on which information should be displayed. In particular, Ingo felt that giving a full message set for the first CPU and only one line of text per CPU thereafter was fine; anyone who ran into trouble could just boot in debug mode and find the problem from the more verbose output.

But Mike protested, "One problem with just setting the loglevel high enough to output debug messages, is you get literally 100's of thousands of lines of meaningless information. We waited over 8 hours for a system with 2k cpus to boot in debug mode, and it never made it all the way up."

Nevertheless, he coded a compromise patch that Ingo liked and that tried to at least include information relevant to any failure that might occur during boot.

Kernel Development Policy Growth

Some folks were recently debating how best to use git, and part of the debate focused on whether patches that had already been incorporated into a git tree should be revised to add "Signed-off-by," "Acked-by," and other such lines.

Various folks advocated doing that for completeness, but Linus Torvalds weighed in, saying he didn't like that practice: "It's wrong to even give credit to some late-comer that pipes in after the patch has already made it into somebody's tree. If they didn't comment on it while it was passed around as a patch on mailing lists, what's the point? By the time it's in somebody else's published tree, any `ack' is worthless, and that person should simply _not_ get credit for being late to the party."

This came as a bit of a surprise to folks, especially those who had been advocating keeping kernel changelogs as complete as possible, even after the changes had been incorporated into a tree.

Linus did not specifically forbid people from continuing to edit changelogs, but the general attitude among the participants after he'd stated his preference was that, from then on, no one should feed patches that had been modified in that way up to his kernel.

I address this issue because it's interesting to me how nuanced and open-ended these process decisions can be. Linus may have intended everyone to start doing it his way, or he may not. Everyone may have gotten the idea that they should do it that way from now on, or they may not.

However, if the issue ever comes up again, certain people probably will quietly inform the transgressor that it is best not to edit changelogs after the fact, and that is how it will be until someone thinks of a really good reason why the other way is better; then they'll fight about it.

Status Of The Big Kernel Lock

The big kernel lock (BKL) continues to be expunged from various parts of the kernel in a multi-year effort that seems to be making a lot of progress. In some cases this involves simply recognizing that the BKL isn't needed in a given piece of code. Thomas Gleixner took it out of the x86 microcode, saying it "is a worthless exercise as there is nothing to wait for." Thomas has also been combing through a lot of other code and removing the BKL hand over fist. He's not alone. Jan Blunck has been examining what looks like every single filesystem, looking for places where the BKL isn't needed anymore. And a bunch of other folks, including Alan Cox and other mighty kernel people, are participating.

Why are there suddenly so many places where the BKL can simply be removed because it serves no purpose? The original debate over the BKL was controversial, in part because people said it would be insanely difficult to root it out of wherever it happened to be. It couldn't be replaced by equivalent code, because the whole point was to use a softer, gentler locking mechanism instead of locking up the whole kernel for as long as the BKL was held. And the softer, gentler mechanisms had to accomplish different things, depending on what they were locking, so it was impossible to just run through the code and replace the BKL with a drop-in fix.

It has become a lot easier because of the ongoing effort to "push" the BKL down into small, specific areas of the kernel, where each occurrence can then be replaced by any one of a variety of alternatives. For instance, Jan's effort was to push the BKL out of the generic filesystem code and into each individual filesystem, where it would be clearer how removing it would affect that specific code. At the rate of the current assault against the BKL, it's hard to imagine the fight lasting much longer, but I think some parts of the kernel are still particularly resistant to taking it out.
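The pushdown idea can be illustrated with a toy model. The sketch below is not kernel code: it uses Python threads, with a single recursive lock standing in for the BKL, and the two filesystem classes and all function names are hypothetical. It shows the three stages described above: the generic code holding the global lock, the lock pushed down into each filesystem, and finally each filesystem either replacing it with a narrower lock or dropping it.

```python
import threading

# Stand-in for the BKL: one recursive lock shared by all code.
bkl = threading.RLock()


class CounterFS:
    """Hypothetical filesystem whose open() updates shared state."""

    def __init__(self):
        self.opens = 0
        self._lock = threading.Lock()  # private lock used in stage 3

    def do_open(self):
        # The actual work; not safe to run concurrently without a lock.
        self.opens += 1

    def open_stage2(self):
        # Stage 2: the global lock has been pushed down into the filesystem.
        with bkl:
            self.do_open()

    def open_stage3(self):
        # Stage 3: a lock covering only this object replaces the global one.
        with self._lock:
            self.do_open()


class StatelessFS:
    """Hypothetical filesystem whose open() touches nothing shared."""

    def do_open(self):
        return "ok"

    def open_stage2(self):
        with bkl:
            return self.do_open()

    def open_stage3(self):
        # Stage 3: nothing to protect, so the lock is simply dropped.
        return self.do_open()


def generic_open_stage1(fs):
    # Stage 1: generic code wraps every filesystem in the global lock.
    with bkl:
        return fs.do_open()


def hammer(fn, threads=8, calls=1000):
    """Call fn() from several threads and wait for all of them to finish."""
    def worker():
        for _ in range(calls):
            fn()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()


fs = CounterFS()
hammer(fs.open_stage3)
print(fs.opens)
```

Once the lock lives inside each filesystem (stage 2), it becomes possible to look at one filesystem at a time and decide whether it needs a narrower lock or none at all, which is the point of the pushdown.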

Dual-Licensing Some Of The Kernel Code

Mathieu Desnoyers wanted to relicense some of the tracepoint code so that instead of the whole being GPLed, part would be dual-licensed under the GPL and LGPL and part would be dual-licensed under the GPL and BSD. He asked all contributors for permission to license their code under this new scheme, the purpose of which was to allow the code to be used for debugging non-GPL applications.

That Mathieu can identify and ask every contributor is a direct result of the "Signed-off-by" protocol that's been in place for a few years, ever since various intellectual property issues forced Linus Torvalds and others to comb through the kernel sources and mailing list history to try to prove that various elements of the kernel had been developed cleanly and without violating other people's intellectual property. With the advent of these patch-tracing protocols, it's now a trivial matter to identify exactly where any given piece of code came from.
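As a rough illustration of why such lines make tracing easy, the sketch below pulls "Signed-off-by" and "Acked-by" lines out of a commit message. The sample message, the names, and the addresses are invented for the example, and the function name is hypothetical; it is a minimal sketch, not the tooling anyone in the discussion actually used.

```python
import re

# Matches lines such as "Signed-off-by: Some Name <address>".
TRAILER_RE = re.compile(
    r"^(Signed-off-by|Acked-by):[ \t]*(.+?)[ \t]*<([^>\n]+)>[ \t]*$",
    re.MULTILINE,
)

# Invented commit message, used only for illustration.
SAMPLE_MESSAGE = """\
tracing: hypothetical example commit

Body text describing the change.

Signed-off-by: Alice Example <alice@example.com>
Acked-by: Bob Example <bob@example.com>
Signed-off-by: Carol Example <carol@example.com>
"""


def collect_trailers(message):
    """Return {tag: [(name, email), ...]} for the tags found in `message`."""
    found = {}
    for tag, name, email in TRAILER_RE.findall(message):
        found.setdefault(tag, []).append((name, email))
    return found


trailers = collect_trailers(SAMPLE_MESSAGE)
for tag, people in trailers.items():
    for name, email in people:
        print(f"{tag}: {name} <{email}>")
```

Run over the history of a set of files, something along these lines would yield the list of people whose permission a relicensing effort needs.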

A number of people approved Mathieu's plan immediately, but Ingo Molnar did not approve, saying that because the code had been developed as part of the kernel as a whole and licensed GPL version 2 only, it wouldn't be possible to simply add licenses, regardless of author approval. But it was pointed out in the discussion that the authors of the code can release their contributions under alternative licenses, so long as they also release them under the GPL version 2. The GPL doesn't prevent anyone from using additional licenses if they want. Nevertheless, by the end of the discussion, Ingo still had not given his assent, and Mathieu ended the thread with a plea for Ingo to grant his permission.

In theory, Mathieu could bypass the need for Ingo's approval by taking out all of his contributions and recoding them as cleanly as possible. Then Ingo would no longer have copyrighted material in that portion of the code and the relicensing could proceed. In practice, this could be a tall order, depending on how much of Ingo's code is in there.

A git-quilt Hybrid Tool From Space

Catalin Marinas and some other folks have implemented StGit (stacked git), a Python script that attempts to layer some of the features of quilt on top of git and thus merge the power of the two tools. Quilt grew out of patch-management scripts that Andrew Morton wrote to preserve his own high-efficiency workflow patterns rather than reshape those habits around a revision control system. Quilt essentially lets users push and pop patches onto and off of a stack, so they can be applied in an appropriate order and so any given patch can be removed easily if problems start to show up after other patches have already been applied on top. StGit takes that same push/pop approach but implements the patches as git commits. Now, in addition to the quilt-like behavior, the patches can also be managed via the full power of the git revision control system. StGit is not yet a fully robust, standalone tool so much as an auxiliary set of features that can be used in conjunction with existing git repositories.
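The push/pop model is easy to picture as a small data structure. The sketch below is a toy model only: the class and method names are hypothetical and it is not how quilt or StGit is implemented. In particular, it ignores the conflicts that can arise when patches are reapplied. It shows how a patch in the middle of the stack can be removed by popping down to it, discarding it, and pushing the rest back.

```python
class PatchStack:
    """Toy model of a quilt-style patch stack (names are hypothetical)."""

    def __init__(self, patches):
        self.applied = []               # bottom of the stack first
        self.unapplied = list(patches)  # next patch to push is at index 0

    def push(self):
        """Apply the next unapplied patch on top of the stack."""
        if not self.unapplied:
            raise IndexError("no unapplied patches")
        patch = self.unapplied.pop(0)
        self.applied.append(patch)
        return patch

    def pop(self):
        """Unapply the topmost patch; it becomes the next one to push."""
        if not self.applied:
            raise IndexError("no applied patches")
        patch = self.applied.pop()
        self.unapplied.insert(0, patch)
        return patch

    def drop(self, name):
        """Remove an applied patch, keeping everything above it applied."""
        if name not in self.applied:
            raise ValueError(f"{name} is not applied")
        above = 0
        while self.applied[-1] != name:
            self.pop()
            above += 1
        self.pop()                   # unapply the problem patch itself
        self.unapplied.remove(name)  # and discard it
        for _ in range(above):
            self.push()              # reapply what was on top, in order


stack = PatchStack(["fix-a", "feature-b", "cleanup-c", "tweak-d"])
for _ in range(4):
    stack.push()

stack.drop("feature-b")
print(stack.applied)
```

In StGit, according to the description above, each applied patch in such a stack would correspond to a git commit, which is what lets the usual git machinery operate on the same patches.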