When Linux crashes, users don't get a Blue Screen like they do on Windows. Instead, Linux generates an "oops" -- a crash signature that can help developers to figure out what went wrong.
The feature may have a silly name, but it's increasingly serious business.
"Linux calls it 'oops,' but it's basically equivalent to a Windows 'Blue Screen,'" Arjan van de Ven, of Intel's Open Source Technology Center, told InternetNews.com "It's kind of the same thing in terms of what causes it and what it does, except we don't make it blue -- we just print the message."
Van de Ven runs the Kerneloops.org project site himself, although oops detection, collection and reporting are mostly automated. Kerneloops chiefly collects oops records from a client that is available to Fedora, OpenSUSE and Debian users.
Such features are growing ever more useful as the market for Linux grows and the OS continues finding its way into the hands of non-technical business and consumer users. Red Hat's Fedora Linux includes the oops client by default, for instance.
"A lot of people don't know how to detect a problem, what to send, or where to send it too," Van de Ven explained. "If you can just click a button, it's so much easier and people are more inclined to do it."
Fedora Project leader Paul Frields told InternetNews.com that the Kerneloops package automatically delivers the messages the kernel dumps into a repository the kernel maintainers can use to prioritize, diagnose and fix problems.
"Fedora is involved because we track the kernel very aggressively," Frields said. "The Kerneloops capability also supports our dedication to a healthy cooperation with upstream software providers like the kernel developer community. It leverages the widespread use of Fedora for the direct benefit of that community, who can see measurable results of their work and shift resources as needed to target frequent or important issues."
Kerneloops also collects records from the Linux Kernel Mailing List (LKML) -- the key technical discussion list for kernel bugs and design. The project also sends out a list of the top problems to the LKML on a periodic basis.
Van de Ven noted that if there is enough data about a specific problem, kernel developers tend to go after it and fix it.
"In general, kernel developers are open to Kerneloops, since the more reports I have, the more data they have," Van de Ven said. "If you have one report, it could just be a fluke, but if you've got 500 reports of the same pattern, you know it's a real bug."
As a result, Van de Ven sees Linux developers fixing bugs thanks to those reports -- thereby making an impact on overall kernel quality. The exact numbers are difficult to quantify, however, as the number of reports that Kerneloops.org gets on any given kernel release varies, as does the occurrence of repeating oops reports.
"We are fixing the bugs that have a lot of people hitting them," Van de Ven said. "If you look at the number of unique bugs, the numbers can be confusing. For the 2.6.25 kernel, there were 1,300 bugs, only half of which only happened once. We do fix a lot of bugs, but if you look at what we fix, it's the ones that actually matter."
On the current 2.6.27 Linux kernel release candidate, Van de Ven is already seeing some early trends on the top oopses. The big one right now involves a problem when a USB drive is removed while still in use, he said -- a condition currently responsible for up to five of the top 20 oopses.
"At this point, it's the hottest bug," he said. "In a few months from now it, might be something different."
The effort to improve overall Linux kernel quality has increasingly found itself in the limelight. Recent efforts by the Linux Foundation aimed to simplify the task of contributing to the kernel in the hopes of improving the quality of driver code from vendors, among other things.