Zack's Kernel News


Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.

By Zack Brown

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.

Correcting Spontaneous RAM Errors

Brian Gordon said that he was interested in adapting Linux for the aerospace field. One problem there is single event upsets (SEUs), in which RAM bits change state because high-energy particles pass through the device. This can happen in various types of aircraft and spacecraft. He wanted to find an error-correction technique that would detect SEUs and keep them from disrupting the system.

Andi Kleen pointed out that this was a problem for regular servers too, not just systems operating in space. Those servers use ECC (error-correcting code) RAM, which, Andi said, corrects single-bit errors and detects multi-bit errors. He added that other, more sophisticated solutions were also available.

He also said, "Lower end systems which are optimized for cost generally ignore the problem, though, and any flipped bit in memory will result in a crash (if you're lucky) or silent data corruption (if you're unlucky)."

Chris Friesen said that his own field, telecommunications, relied on ECC RAM as well; he also suggested checksumming data and validating it on use. He recommended that Brian look through the kernel's EDAC code. EDAC is a project hosted on SourceForge [1] specifically intended to make use of ECC RAM.
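Chris's checksum-and-validate suggestion is easy to sketch in ordinary C. The fragment below is purely illustrative; the guarded_u32 structure and its helper functions are invented for this example and come neither from the thread nor from EDAC. It wraps a critical value with a simple integrity word and refuses to hand the value back if the check fails:

#include <stdint.h>
#include <stdio.h>

/* Illustrative only: protect a critical value with a checksum and
 * validate it on every use. This detects a flipped bit; it cannot
 * correct one. */

struct guarded_u32 {
    uint32_t value;
    uint32_t check;   /* simple integrity word, not real ECC */
};

static uint32_t mix(uint32_t v)
{
    /* Cheap diffusion so a flipped bit in 'value' is very unlikely
     * to still match 'check'. */
    v ^= v >> 16;
    v *= 0x45d9f3b;
    v ^= v >> 16;
    return v;
}

static void guarded_store(struct guarded_u32 *g, uint32_t v)
{
    g->value = v;
    g->check = mix(v);
}

static int guarded_load(const struct guarded_u32 *g, uint32_t *out)
{
    if (mix(g->value) != g->check)
        return -1;        /* corruption detected; caller must recover */
    *out = g->value;
    return 0;
}

int main(void)
{
    struct guarded_u32 g;
    uint32_t v;

    guarded_store(&g, 42);
    if (guarded_load(&g, &v) == 0)
        printf("value intact: %u\n", v);
    else
        printf("bit flip detected\n");
    return 0;
}

A scheme like this only detects corruption; recovering the data still requires redundancy of some kind.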

Brian replied that EDAC definitely looked good but that he was primarily interested in the "optimized for cost" systems Andi had mentioned. He thought that even on those systems, Linux should do as much as it could to correct these spontaneous RAM errors.

Andi replied that, as far as he knew, Linux had no implementation of this - all of its RAM error-correction features depended on ECC hardware. Even if Brian coded something up in software, he said, the userspace application itself would probably have to be aware of the problem and be written to cooperate with the kernel to fix it. So there would be no general solution that simply made everything happy.
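One way an application could cooperate along the lines Andi describes, borrowed from radiation-tolerant software rather than from anything proposed in the thread, is triple modular redundancy: keep three copies of each critical value and take a bitwise majority vote on every read. A minimal sketch:

#include <stdint.h>
#include <stdio.h>

/* Triple modular redundancy: keep three copies of a value and
 * majority-vote on read, rewriting all copies to scrub a flipped
 * bit. Purely illustrative; not from the mailing list thread. */

struct tmr_u32 {
    uint32_t copy[3];
};

static void tmr_store(struct tmr_u32 *t, uint32_t v)
{
    t->copy[0] = t->copy[1] = t->copy[2] = v;
}

static uint32_t tmr_load(struct tmr_u32 *t)
{
    /* A bit is set in the result if it is set in at least two of
     * the three copies. */
    uint32_t v = (t->copy[0] & t->copy[1]) |
                 (t->copy[1] & t->copy[2]) |
                 (t->copy[0] & t->copy[2]);

    tmr_store(t, v);   /* scrub whichever copy was corrupted */
    return v;
}

int main(void)
{
    struct tmr_u32 t;

    tmr_store(&t, 0xdeadbeef);
    t.copy[1] ^= 1u << 7;              /* simulate a single event upset */
    printf("recovered: 0x%x\n", tmr_load(&t));
    return 0;
}

The overhead of keeping every value in triplicate is one illustration of why, as Andi said, no general solution would just make everything happy.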

Brian seemed somewhat discouraged by Andi's assessment, although he said he was continuing his research. It does seem as if native Linux RAM error correction that is independent of specialized hardware could be a long time coming.

Improving Hibernation

Nigel Cunningham had an idea for a new way to read and write the memory image used for hibernation. He wrote up the algorithm and posted it, hoping to uncover any major flaws before putting a lot of work into an implementation.

The whole issue is a lot more controversial than it might appear at first glance. The ideal hibernation code would shut the system down as quickly as possible and bring it back up again in an identical state, also as quickly as possible. The problem is that system state is tricky to handle, and so is figuring out how to do everything quickly. Developers can hold opposing viewpoints on the best approach, and these can be very difficult or impossible to resolve without attempting an actual implementation.

In this case, although Nigel thought his approach was relatively simple, Pavel Machek, who co-maintains the current hibernation code with Rafael J. Wysocki, thought it was "way too complex to be practical."

Rafael also thought Nigel's approach was overly complicated, because of a nuance in the timing of how the page cache was preserved. The page cache speeds up file access by keeping file data in memory once it has been read, in case it is needed again soon. Nigel and Rafael agreed that the page cache had to be preserved, or the system would be less responsive after resuming from hibernation. But Rafael felt that saving the page cache later than Nigel had planned would be far simpler. His main point was that Nigel shouldn't try to preserve absolutely everything in RAM; saving 80 percent of RAM's contents would yield a fast, simple solution. The quest to save absolutely everything was the reason Nigel had timed the page cache preservation the way he had. Giving that up would mean the state of RAM could not be preserved completely, but it would bring plenty of other benefits.

Nigel argued that preserving the full state of RAM was very important for speed. This is a good example of the way different developers' conceptions of a problem lead them to very different conclusions. How can such disputes be resolved without simply writing the code? Many important breakthroughs, including the Git revision control system, have resulted from one developer thinking something was possible when other developers thought it wasn't.

Splitting and Joining Files on Low-Resource Systems

Nikanth Karthikesan posted some code to add a couple of system calls. He wanted to split a file into smaller files without requiring any extra space on the device. With most filesystems, he could just punch holes in the larger file and use the holes as the boundaries of the new files; it was just "moving metadata around," as he put it. With the FAT filesystem, though, this method wouldn't work: FAT required extra space on the device to copy the data from the larger file into the smaller pieces. Nikanth's new system calls, sys_split() and sys_join(), would alert FAT to the fact that only metadata changes are needed.
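On filesystems that support sparse files, the hole punching Nikanth refers to can be done on current kernels through fallocate(2) and FALLOC_FL_PUNCH_HOLE, an interface that arrived after this discussion. The sketch below demonstrates only that primitive; sys_split() and sys_join() themselves never reached the mainline kernel, so the split operation is not shown:

#define _GNU_SOURCE          /* exposes FALLOC_FL_* via <fcntl.h> */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Minimal sketch: punch a hole in an existing file with fallocate(2).
 * This is the kind of metadata-only operation Nikanth's proposal built
 * on; it needs a filesystem that supports sparse files (not FAT). */

int main(int argc, char *argv[])
{
    if (argc != 4) {
        fprintf(stderr, "usage: %s <file> <offset> <length>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    off_t offset = atoll(argv[2]);
    off_t length = atoll(argv[3]);

    /* FALLOC_FL_PUNCH_HOLE must be combined with FALLOC_FL_KEEP_SIZE:
     * the blocks are freed, but the file's size is unchanged. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, length) < 0) {
        perror("fallocate");
        close(fd);
        return 1;
    }

    close(fd);
    return 0;
}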

Hirofumi Ogawa said the patch needed to be reworked to account for cache management. He also said that, although he was fine with adding the features, he thought they should only go in if they would have actual, real live users.

Nikanth pointed out that one big use for this feature would be multimedia editing, where individual frames could be assembled into video, and full video could be cut into pieces more easily than is currently possible. It would also, he said, make it easy to grow and shrink files in the middle, instead of just at the end.

David Pottage was very enthusiastic about the prospects for video editing, saying:

Video files are very big, so a simple edit of removing a few minutes here and there in an hour-long HD recording will involve copying many gigabytes from one file to another. Imagine the time and disc space saved, if you could just make a COW copy of your source file(s), and then cut out the portions you don't want, and join the parts you do want together.

Your final edited file would take no extra disc space compared with your source files, and though it would be fragmented, the fragments would still be large compared with most files, so the performance penalty to read the file sequentially to play it would be small. Once you decide you are happy with the final cut, you can delete the source files and let some background defrag demon tidy up the final file.

But David added that he'd proposed this kind of feature in the past and been "shouted down" by folks arguing that it naturally belonged in userspace. Not everyone seemed convinced by those arguments, however, and the discussion trailed off shortly afterwards, so it's unclear whether Nikanth's code will be adopted.
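For what it's worth, the COW copy David describes is possible today on reflink-capable filesystems such as Btrfs and XFS. The sketch below uses the FICLONE ioctl, which arrived well after this discussion, to make the destination file share the source file's blocks without copying any data:

#include <fcntl.h>
#include <linux/fs.h>      /* FICLONE */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Sketch of a COW copy: FICLONE makes the new file share the source
 * file's blocks, so no data is copied and no extra space is used
 * until one of the files is modified. Requires a reflink-capable
 * filesystem such as Btrfs or XFS. */

int main(int argc, char *argv[])
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <source> <destination>\n", argv[0]);
        return 1;
    }

    int src = open(argv[1], O_RDONLY);
    int dst = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (src < 0 || dst < 0) {
        perror("open");
        return 1;
    }

    if (ioctl(dst, FICLONE, src) < 0) {   /* the actual COW clone */
        perror("ioctl(FICLONE)");
        return 1;
    }

    close(src);
    close(dst);
    return 0;
}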

Compiling over sshfs

Dmitry Torokhov complained that a recent kbuild fix had made any kernel compile over sshfs painfully slow. He'd been compiling over sshfs for years with no problem, but after that one patch, compile times were in the toilet.

Michal Marek replied that the fix had exposed a flaw in the way Make handled some of its data, causing it to be recalculated over and over. He said he'd look into it.

INFO
[1] EDAC Project: http://bluesmoke.sourceforge.net/