Zack's Kernel News

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.

Spamming the Kernel Mailing List

Spammers recently figured out a clever way to get through the linux-kernel mailing list's careful filters. The majordomo configuration on the server involved various private aliases with special posting privileges, so the spammers realized it would be possible to post to the list using those special aliases - and it worked! The spam was sent directly through to the subscribers, without going through any filtering. Once Gene Heskett pointed out the problem, Matti Aarnio was able to ditch all those back-end aliases and plug the hole.

One interesting element is that the list is open - anyone can post to it - so the barrier for any regular user who wants to submit a bug report is low. The flip side is that spammers can post to it too. A common technique to minimize spam on a mailing list is to allow only subscribers to post. Because linux-kernel doesn't do that, Matti and the other server administrators have to find other means to reduce the amount of spam that shows up on the list. A variety of "secret sauce" mechanisms, as Matti puts it, including Bayesian filters, seem to work fairly well most of the time. Some spam still gets through, but probably just a tiny fraction of the attempts made.

ioctls versus SysFS

There was some interesting advice recently on when to use ioctls versus a SysFS interface. Neshama Parhoti was working with some folks to write a new kernel driver, and they knew that SysFS was the preferred way to let users configure the running kernel, but they were concerned that a SysFS interface might have a significant performance effect. Their driver might use this interface as often as every five milliseconds, so Neshama asked whether ioctl should be preferred in that instance.

Arnd Bergmann said that the general question of which interface to use was a complex one, but he did say there was no speed penalty for choosing SysFS; so in this case, Neshama's group should choose the standard interface and not worry about taking a performance hit. In more general terms, he also said, "One rule of thumb is that if you already require a character device, using ioctl is the right answer, but you shouldn't create a character device if all you want to do over it is a single ioctl operation."
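To make the access pattern concrete, here is a minimal user-space sketch of polling a SysFS-style attribute every five milliseconds. A SysFS attribute is read like a small text file holding a single value; everything else here is an assumption for illustration. The helper names are hypothetical, and a temporary file stands in for a real attribute (on a real system the path would be something under /sys, which depends on the driver). The sketch says nothing about performance; it only shows what the interface looks like from the caller's side.

```python
import os
import tempfile
import time


def read_attribute(path):
    """Read one SysFS-style attribute: a small text file holding a single value."""
    with open(path, "r") as f:
        return f.read().strip()


def poll_attribute(path, interval_s=0.005, count=5):
    """Read the attribute `count` times, sleeping `interval_s` between reads."""
    values = []
    for _ in range(count):
        values.append(read_attribute(path))
        time.sleep(interval_s)
    return values


def make_stand_in_attribute(value):
    """Create a temporary file that stands in for a real SysFS attribute."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w") as f:
        f.write(f"{value}\n")
    return path


if __name__ == "__main__":
    # Hypothetical demo: the stand-in file plays the role of a driver attribute.
    path = make_stand_in_attribute(42)
    try:
        print(poll_attribute(path))
    finally:
        os.remove(path)
```

An ioctl-based interface would instead require opening a character device and issuing a request code on it, which is why Arnd's rule of thumb ties the choice to whether the driver already needs a character device.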

Extending versus Forking a Linux Driver

Atul Mukker said that the LSI MegaRAID team (including himself) wanted to enhance the MegaRAID driver to support the next generation of host bus adapters (HBAs). HBAs connect a system to storage devices, so the enhanced driver would cover the storage devices that rely on those HBAs. Atul estimated that up to 80% of the driver would be rewritten to provide the interfaces needed. Almost no new functionality would be added; only the interfaces would change. The response was interesting: Christoph Hellwig said that the best thing would be to just fork a new MegaRAID driver off of the old one and factor out common code in the future if they wanted to. Matthew Wilcox added, "My biggest concern is that you'll do something to fix a bug in the new hardware and inadvertently create a bug for some old piece of hardware."

How To Get a Filesystem into Linux

Sage Weil asked that Ceph, the distributed filesystem, be merged into the official tree. He said that even though it wasn't "production-ready," other filesystems like Btrfs and Exofs had been merged early, and having Ceph in the main kernel would encourage new users and testers.

Linus Torvalds gave his explanation of why he didn't want to pull Ceph just yet. He said his default response to a pull request from a new filesystem was to say no. If the pressure wasn't coming from a wide array of users, or from actual distributions, that meant there just wasn't enough interest to justify it.

Identifying the True Kernel Version Number

One relatively recent feature is the ability to tell the version number of the running kernel. Paul E. McKenney recently added to this feature, with some scripting that would not only show the official version number but also identify exactly where your kernel falls between versions if you've built a kernel directly from the Git repository. It would also show whether you built from a pristine tree or a tree that you'd modified on your own.

While not terribly valuable to regular users, this feature is very valuable to developers, who might boot with any number of kernels and sometimes forget exactly which kernel they are running at any given time. One interesting aspect of this code is how tricky it was to implement. Config data is something that gets picked over by various code and scripts, including unpublished scripts written by regular users for their own purposes. It's important that the configuration information not be mistaken for corrupt data by those other tools. So, the new data will look like this: 2.6.33-01836-g90a6501-dirty. At James Cloos's request, Paul also added a feature to make the -dirty text - the portion indicating that you've edited the sources yourself before compiling - optional.
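For anyone writing one of those private scripts, a version string of this shape can be picked apart with a short regular expression. The sketch below is only an assumption based on the single example above: it reads the middle field as a count of commits since the tagged release and the g-prefixed field as an abbreviated Git commit ID, which matches the usual output of git describe. The function name and dictionary keys are made up for illustration.

```python
import re

# Pattern assumed from the one example in the text:
# <version>-<commits since tag>-g<abbreviated commit id>[-dirty]
VERSION_RE = re.compile(
    r"^(?P<base>\d+(?:\.\d+)+)"
    r"(?:-(?P<commits>\d+)-g(?P<commit>[0-9a-f]+))?"
    r"(?P<dirty>-dirty)?$"
)


def parse_kernel_version(text):
    """Split a version string into its parts, or return None if it doesn't fit."""
    m = VERSION_RE.match(text)
    if m is None:
        return None
    commits = m.group("commits")
    return {
        "base": m.group("base"),
        # A plain release string has no commit count; treat that as zero.
        "commits_since_tag": int(commits) if commits is not None else 0,
        "commit": m.group("commit"),
        "dirty": m.group("dirty") is not None,
    }


if __name__ == "__main__":
    print(parse_kernel_version("2.6.33-01836-g90a6501-dirty"))
```

On a running system, the string to parse could come from platform.release() in the Python standard library. Distribution kernels often append their own suffixes, in which case this deliberately strict pattern returns None rather than guessing.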

Identifying ARM CPUs on a Running System

Tomasz Fujak posted some patches to make performance events on ARM architectures available in human-readable form in SysFS. This could be interesting for users, but could also be used by automated tools such as Perf.

In the course of discussion, it came up that different CPUs might use different names for essentially the same event; and Peter Zijlstra suggested that /proc/cpuinfo could be used to identify the CPU, in order to give events generic names in Tomasz's SysFS files. But Russell King broke in with, "CPU identification has become a fairly murky business on ARM that the information exported from /proc/cpuinfo can no longer precisely identify the CPU itself. For example, we just treat Cortex A8 and A9 as `ARMv7' because, from the kernel's point of view, they're the same."

Peter suggested fixing /proc/cpuinfo so that it accurately reported the different CPUs. But Russell replied that modern ARM CPUs no longer identified themselves in clearly distinct ways. Instead, they had a set of registers that identified specific features available on that CPU. Not only that, but the meaning of these registers changed between ARMv6k and ARMv7, and Russell fully expected the meanings to change again in the future.

But, he said, identifying the specific CPU in use shouldn't be the real concern to users in general. Instead, users need to know practical information, like which ARM ISA level is supported or which debug model is implemented - all of which can be determined, regardless of the specific CPU being used.

Hopefully he's right. He gave a more comprehensive explanation of the ARM CPU identification situation at the end of the thread:

"What you have is a main ID register up until ARMv6, which has about four different encodings. On some CPUs, this is the only ID register offered, and within that subset, some different CPUs (e.g., implemented by different manufacturers, or indeed the same manufacturer) have the same ID register value, despite being rather different.

"From ARMv6k and later, we have a different ID scheme, where we have about 10 32-bit registers giving detailed information about various aspects of the CPU - including five 32-bit registers for details about the instruction set.

"We know that some of the meanings of these registers have changed their meaning - and I don't think there's a way to identify which meaning should be applied to the registers (it seems to require reading lots of different documents to sort out what CPUs implement which method.)

"Frankly, it's a mess, and when you look at implementations, it turns out to be unreliable."
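To give a feel for what decoding the main ID register involves, here is a minimal sketch that splits a 32-bit value into fields. It assumes the field layout that ARM documents for recent cores (implementer, variant, architecture, part number, revision); as Russell points out, older CPUs use other encodings, so this is not a general decoder. The function name is hypothetical, and the sample value is an illustrative one of the kind a Cortex-A8 would report.

```python
def decode_main_id(midr):
    """Split a 32-bit main ID register value into fields.

    Assumes the layout used by recent ARM cores:
    bits 31-24 implementer, 23-20 variant, 19-16 architecture,
    15-4 part number, 3-0 revision. Older encodings differ.
    """
    return {
        "implementer": (midr >> 24) & 0xFF,
        "variant": (midr >> 20) & 0xF,
        "architecture": (midr >> 16) & 0xF,
        "part": (midr >> 4) & 0xFFF,
        "revision": midr & 0xF,
    }


if __name__ == "__main__":
    # Illustrative sample value, not read from real hardware.
    fields = decode_main_id(0x413FC082)
    for name, value in fields.items():
        print(f"{name}: {value:#x}")
```

In this sample, the implementer field is 0x41 (ASCII "A", for ARM) and the architecture field is 0xF, which does not name an architecture version at all. It signals that the details live in the separate set of feature registers, which is exactly the scheme Russell describes for ARMv6k and later.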