2011-12-08

the lay of the (kernel-)land

the main question that's been flying around in my research-head over the last couple months is, "what's actually so different about kernel-space that makes this research, instead of just re-implementing user-space stuff that people have already done?"

if I could write a comprehensive answer to this question, I'd have my thesis, so of course my answer isn't very put-together, but I think the scattered parts I've figured out (mostly thanks to talking to people at the PDL retreat) are concrete enough to talk about here.

environment

it should be no surprise that wrapping a program in a dynamic testing framework is a lot harder when there's no "magic kernel" to sit on the other side of said framework from the thing being tested (in short, the kernel is itself the thing being tested). there are a few possibilities I've thought about for approaching this:

1. running the guest kernel natively on the hardware itself. the testing framework would need to live inside the kernel itself, which presents some nasty challenges: for one, making sure that changing the thing being tested doesn't change the way its bugs show up; for two, it would take a bunch of complicated details to make it run at all (such as, how is interrupt delivery controlled? certainly not automatically by the hardware anymore). (another possibility would be to use something like system management mode, which I haven't thought enough about.)

2. simulating the guest kernel. this is what I'm actually doing for the project, using Simics, because it's easiest to actually write code for. controlling the kernel is as simple as asking Simics to call into Landslide once per instruction executed, and also asking it to trigger a timer interrupt when we decide it's time to preempt the currently running kernel thread (a rough sketch of what such a hook might look like appears after this list). the downside, of course, is that simulation is slow, and running code once per simulated instruction makes it even slower.

3. virtualisation. this would be sort of a happy middle-ground, where the guest kernel is running (for the most part) directly on the hardware, so there would be much less performance hit than with full simulation, but also the VMM holds some degree of control over the kernel that the testing framework could make use of. nevertheless, without per-instruction control, it would still require extra trickery both to track exactly what the guest kernel is doing (e.g. when does somebody get added to the runqueue?) and also to decide precisely when to trigger a nondeterminism event. requiring annotations inside the guest kernel would be one approach, but that would have its own limitations in terms of tracking "shared memory accesses" as potential decision points.
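
(to make the simulation approach concrete, here's a rough sketch of the kind of per-instruction hook I mean. all the names here - ls_state, ls_is_decision_point, sim_request_timer_interrupt, and so on - are made-up stand-ins for illustration, not the real Simics or Landslide API.)

    /* hypothetical per-instruction entry point: the simulator calls this
     * once for every instruction the guest kernel executes. */
    struct ls_state {
        unsigned int eip;  /* current guest instruction pointer */
        int current_tid;   /* which guest thread is running */
        void *cpu;         /* handle for the simulated cpu */
    };

    extern void ls_update_sched_state(struct ls_state *ls);
    extern int  ls_is_decision_point(struct ls_state *ls);
    extern int  ls_choose_thread(struct ls_state *ls);
    extern void sim_request_timer_interrupt(void *cpu);

    void ls_on_instruction(struct ls_state *ls, unsigned int eip)
    {
        ls->eip = eip;

        /* keep our model of the guest scheduler up to date
         * (runqueue changes, context switches, ...) */
        ls_update_sched_state(ls);

        /* a "decision point" is, e.g., a shared-memory access outside
         * the scheduler where a preemption could matter. */
        if (ls_is_decision_point(ls) &&
            ls_choose_thread(ls) != ls->current_tid) {
            /* fire a timer interrupt so the guest's own context
             * switcher takes over and runs the thread we chose. */
            sim_request_timer_interrupt(ls->cpu);
        }
    }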

exploration strategies

for the most part, the "decision tree" looks about the same as in multithreaded user-space programs, but some challenges arise that are inherent in having the concurrency implementation itself be part of what's being tested. for one, detecting when one thread switches to another is no longer a trivial thing that simply falls out of knowing which process is currently running (the kernel is "the process"), and for two, once we decide which thread we want to run, causing that thread to run can't just be done by calling "yield to this thread"; you have to understand what the scheduler needs to have happen to cause that thread to start running. (both of these named problems are already solved in Landslide; one possible shape of a solution is sketched below.)
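
(here's a hedged sketch of what solving those two problems might take - with made-up names, not necessarily how Landslide actually does it. the detection half can be handled by tiny annotations in the guest scheduler, and the forcing half by repeatedly firing timer preemptions until the guest's scheduler lands on the thread we want:)

    /* detection: one-line annotations at the spots the framework can't
     * otherwise see (all names hypothetical). */
    struct tcb { int tid; /* ... registers, stack, etc. ... */ };

    extern void annotate_switching_to(int tid);
    extern void annotate_runnable(int tid);

    void context_switch(struct tcb *from, struct tcb *to)
    {
        annotate_switching_to(to->tid); /* "TO is about to run" */
        /* ... save FROM's registers, restore TO's ... */
    }

    void runqueue_add(struct tcb *t)
    {
        annotate_runnable(t->tid); /* "T just became runnable" */
        /* ... the actual list insertion ... */
    }

    /* forcing: keep preempting until the guest's scheduler happens to
     * pick the thread the exploration engine asked for (reusing the
     * made-up names from the earlier sketch). */
    extern void sim_run_until_next_context_switch(struct ls_state *ls);

    void cause_thread_to_run(struct ls_state *ls, int target_tid)
    {
        while (ls->current_tid != target_tid) {
            sim_request_timer_interrupt(ls->cpu);
            sim_run_until_next_context_switch(ls);
        }
    }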

another interesting related problem arises when considering how to implement partial order reduction (the main algorithm that dBug uses). it works by determining that certain "steps" in each interleaving pattern are "independent" of each other, by virtue of their memory accesses not being in conflict (read/write or write/write). however, since each "step" both starts and ends in the context switcher / runqueue code (which is no longer a magic black box), there will necessarily be memory accesses to the same locations in every step. some accuracy will need to be conceded to be able to say anything is independent of anything else - this is one of the things I intend to implement over the coming months (a sketch follows).
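
(here's a hedged sketch of what that concession might look like in the independence check - the struct layouts and the in_scheduler() predicate are invented for illustration:)

    #include <stdbool.h>
    #include <stddef.h>

    struct mem_access {
        unsigned int eip;   /* instruction that made the access */
        unsigned int addr;  /* location accessed */
        bool write;
    };

    struct transition {
        struct mem_access *accesses;
        size_t num_accesses;
    };

    /* does eip fall inside the context switcher / runqueue code? */
    extern bool in_scheduler(unsigned int eip);

    /* two transitions are independent if their (non-scheduler) memory
     * accesses don't conflict. skipping scheduler accesses is the
     * accuracy concession: every transition touches the runqueue, so
     * counting those would make nothing independent of anything. */
    bool transitions_independent(const struct transition *a,
                                 const struct transition *b)
    {
        for (size_t i = 0; i < a->num_accesses; i++) {
            const struct mem_access *x = &a->accesses[i];
            if (in_scheduler(x->eip))
                continue;
            for (size_t j = 0; j < b->num_accesses; j++) {
                const struct mem_access *y = &b->accesses[j];
                if (in_scheduler(y->eip))
                    continue;
                if (x->addr == y->addr && (x->write || y->write))
                    return false; /* read/write or write/write conflict */
            }
        }
        return true;
    }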

possible applications

one major perspective I gained at the PDL retreat was that the things people want to debug kernels for are different from the things people want to debug userspace programs for. for the most part, the core bits of professional kernels (Linux, ...) are correct, since that's where all the talented programmers look most often; it's where the less talented programmers work - i.e., device drivers - that problems tend to show up.

(the reason I'm getting away with not thinking about this in my project itself is because the case study is Pebbles, the kernel that students implement for 15-410.)

so thinking about device drivers is a big avenue for future development of my subject. instead of just saying, "the timer interrupt is the only source of nondeterminism we need; we'll test for races between threads", the races we look for would be between interrupt handlers, bottom-halves, and threads executing related system calls. in this application, it's less meaningful to try to use a bunch of threads, and more meaningful to focus on the code paths themselves, so the timer interrupt would play second fiddle to device interrupts (and the data that the device communicates with those interrupts, such as network packets or disk read buffers). a sketch of the flavour of bug this would target follows.
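
(here's a toy driver race - all names invented - where the window between checking and copying the receive buffer is exactly the kind of place a tool would want to fire the device interrupt:)

    #include <string.h>

    /* toy NIC driver state shared between the interrupt handler and the
     * syscall path, with no locking - that's the bug. */
    static char rx_buf[1500];
    static int  rx_len;  /* 0 = no packet waiting */

    extern int copy_packet_from_device(char *buf); /* hypothetical */

    /* top half: runs whenever the device raises its interrupt. */
    void nic_interrupt_handler(void)
    {
        rx_len = copy_packet_from_device(rx_buf);
    }

    /* syscall path: some thread reading a packet. */
    int sys_read_packet(char *dest)
    {
        int len = rx_len;
        if (len == 0)
            return -1; /* nothing to read */
        /* <-- if the device interrupt fires right here, the handler
         *     overwrites rx_buf/rx_len while we copy them: a race that
         *     no amount of timer-interrupt testing between threads
         *     would ever find. */
        memcpy(dest, rx_buf, len);
        rx_len = 0;
        return len;
    }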

another possible avenue to think about is kernels that run on "non-CPU" processors - for example, controller firmware for storage devices (the most striking example I remember discussing) is structured in a concurrent way to service parallel i/o requests. such "kernels" are already tested in simulated environments (so said the guy I was talking with about this), so instrumenting those simulators with a dynamic race detection framework would not be too big a step up.

(in summary)

for the most part, these are all "pipe dream"-type thoughts, since I have all of a year to do the project itself and show that it works at all in kernel environments. but of course part of research is justifying its potential broader applications, so I consider this stuff to be the building blocks of my future work section.

2 comments:

  1. I'm not sure if you know Antti Kantee or have been following his work, but he has been musing around your exact same question: "what's actually so different about kernel-space that makes this research, instead of just re-implementing user-space stuff that people have already done?"

    In practice, he reimplemented the NetBSD kernel API as a set of userspace libraries. What this allows is running *unmodified* kernel code as a user process, which in turn simplifies programming and debugging (think about writing unit tests for example, or avoiding reboots to try changes!).

    Additionally, this also allows running complete kernel subsystems as userspace processes, of which filesystems, device drivers and the TCP/IP stack are just a few examples. For example, NetBSD has unit tests of the traceroute(1) tool that run on top of a fake user-space TCP/IP stack; this stack is exactly the same code as the kernel later uses. Or, another example: NetBSD also has Fuse-like file systems, without having to reimplement the file system code for each of them.

    For more details:

    http://www.netbsd.org/docs/rump/
    http://www.netbsd.org/docs/puffs/

  2. interesting stuff. In the past I've worked with user-mode linux, which seems like a lot of the same stuff (kernel threads run as unix processes), and yeah - I wonder what it'd be like. I've had my eyes on virtualisation for a while as a future avenue, which would probably be about the same.
