Winning Race Conditions

2015-10-15

PLDI evaluation

Running experiments for PLDI has begun in earnest. My evaluation plan calls for 800 CPU-days of testing:

80 P2 thread libraries * 6 test cases
79(?) Pintos kernels * 2 test cases

638 codebase+testcase pairs total

For each one, a 10-hour control experiment, a 10-cpu * 1-hour "live" experiment, and a 10-cpu * 1-hour "data race false negative" experiment (don't worry, the paper will explain it... should it get published!).

(80*6+79*2)*3*10 = 19,140 cpu-hours = 797.5 cpu-days.
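
For anyone who wants to replay that arithmetic, here is a small sketch in present-day Rust. The function and parameter names are made up for illustration, and the wall-clock figure assumes all 200 CPUs stay perfectly busy, which is an idealisation.

```rust
/// Total CPU-hours for the evaluation plan described above.
fn total_cpu_hours(p2_libs: u32, p2_tests: u32, pintos: u32, pintos_tests: u32) -> u32 {
    let pairs = p2_libs * p2_tests + pintos * pintos_tests; // codebase+testcase pairs
    let experiments_per_pair = 3; // control, "live", data race false negative
    let cpu_hours_per_experiment = 10; // 10 hours on 1 CPU, or 1 hour on 10 CPUs
    pairs * experiments_per_pair * cpu_hours_per_experiment
}

fn main() {
    let hours = total_cpu_hours(80, 6, 79, 2);
    let cpu_days = f64::from(hours) / 24.0;
    // Idealised: 200 CPUs with no idle time and no scheduling overhead.
    let wall_clock_days = cpu_days / 200.0;
    println!("{hours} cpu-hours = {cpu_days} cpu-days, roughly {wall_clock_days:.1} days on 200 CPUs");
}
```

So even in the best case the full run occupies the whole cluster for about four days.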

And 200 CPUs to do it with.

2015-09-09

big green button

Hello internet, it's been a while.

Tonight I'm having a "1% moment" of research. That is, 99% of the time, I either have my nose to the grindstone, or am endlessly worrying and guilting myself about not getting enough done, being an impostor, etc.; but that other 1% is why I'm still a grad student. Because sometimes at 5 in the morning I finish writing mindless automation glue code, finish patching horribly broken anonymous student code, and finish debugging the bugs in my bug-finding software (ha), and finally reach a state where I can hit a big green button marked "GO RUN THE EXPERIMENT" and watch the computer do something absolutely frickin' amazing.

My current project is an extension of Landslide that automatically searches for new preemption points (PPs) during the course of a systematic test, adds new state spaces to explore using those PPs, and figures out with state space estimation which state spaces are most likely to finish testing in a given CPU budget. I'm calling it "iterative deepening" by analogy with the chess AI technique, and you can find my latest talk slides here for more details.

But mostly the purpose of this post is for me to share some eye-candy. Here's what Landslide looks like when it's feeling victorious.

The key thing to note here is that bugs are only found in state spaces with data-race preemption points, which only Landslide's iterative deepening framework is capable of identifying and using. IOW, these bugs would be missed by any other systematic testing tool that interposed only on thread library calls.

I've finally got a conference deadline in my sights where getting accepted seems realistic. It's been a looong build-up to this point. Keep your eyes peeled.

2012-10-20

What is a "data race" and when is a race not a data race?

I've been meaning to write about this since my post series about Rust (in particular here, where I wrote "while data races are no longer possible, race conditions in general still are" about the RWARC library). In general, Rust statically guarantees freedom from data races, though not freedom from all races. But what does that mean?

A data race is when multiple threads concurrently access the same memory location where at least one access is a write. "Concurrently" here could mean either literally at the same time (threads run on different CPUs) or abstractly at the same time (threads interleave with each other on the same CPU); i.e., no synchronisation primitive enforces that one thread's access completes before the other begins.

Thread 1                    Thread 2
if (p != NULL)
                            p = NULL;
output(p->data);

Data race detectors, such as Eraser and Helgrind, analyse threads' mutual-exclusion and happens-before relationships to identify unsafe concurrent accesses like these. But it's possible to stop accesses from being concurrent without enforcing correct behaviour:

Thread 1                    Thread 2
mutex_lock(m);
bool ok = p != NULL;
mutex_unlock(m);
                            mutex_lock(m);
                            p = NULL;
                            mutex_unlock(m);
mutex_lock(m);
if (ok) output(p->data);
mutex_unlock(m);

Now the data race is gone, but the bug has simply become a higher-level race condition. Most literature calls this an "atomicity violation" (and some literature even uses "race" to mean exclusively data races).
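
The same check-then-act shape can be reproduced in present-day Rust, where the compiler rules out the data race but not the atomicity violation. This is only a sketch: `check_then_use` and `bad_interleaving` are hypothetical names, and the `between` closure stands in for the scheduler so that the bad interleaving happens every time instead of once in a blue moon.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Checks `p` in one critical section and uses it in another.
/// `between` runs in the unlocked window between the two.
/// Returns (what the check saw, what the use saw).
fn check_then_use(p: &Mutex<Option<i32>>, between: impl FnOnce()) -> (bool, Option<i32>) {
    let ok = p.lock().unwrap().is_some(); // lock is released at the end of this statement
    between();
    let seen = *p.lock().unwrap(); // a second, separate critical section
    (ok, seen)
}

/// Forces the interleaving from the table above: "Thread 2" runs
/// to completion inside the unlocked window.
fn bad_interleaving() -> (bool, Option<i32>) {
    let p = Arc::new(Mutex::new(Some(42)));
    let p2 = Arc::clone(&p);
    check_then_use(&p, move || {
        thread::spawn(move || {
            *p2.lock().unwrap() = None;
        })
        .join()
        .unwrap();
    })
}

fn main() {
    let (ok, seen) = bad_interleaving();
    println!("the check said non-null: {ok}, but the use saw: {seen:?}");
}
```

Every access is protected by the mutex, so a data race detector has nothing to report, yet the value checked is not the value used. In C the second step would dereference NULL; here it merely observes `None` where the check promised `Some`.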

You might think this code looks silly, but if you're working in a project with many layers of abstraction and function/module boundaries, this kind of mistake can be all too easy to make, and data race detectors are powerless to find them.

Consider this real-world example. When I started at Mozilla this summer, Rust 0.2 had recently shipped, and its release notes mentioned that it was "Helgrind-clean" (meaning no data races existed). Yet the Rust runtime contained this code:

bool rust_task::blocked() {
    mutex_lock(this->lifecycle_lock);
    bool is_blocked = this->state == task_state_blocked;
    mutex_unlock(this->lifecycle_lock);
    return is_blocked;
}

Sure, accessing the field was safely protected by the mutex, but once it dropped the lock and returned, all bets were off as to whether the returned value was still accurate or not. (I fixed several bugs related to this, and removed this function entirely.)
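
To make the hazard concrete, here is a sketch in present-day Rust that contrasts a getter of that shape with an operation that checks and acts inside a single critical section. The `Task` type, `TaskState` enum and `wake_if_blocked` are invented for illustration; this is one common way to avoid a stale answer, not a description of how the Rust runtime was actually changed.

```rust
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, PartialEq)]
enum TaskState {
    Running,
    Blocked,
}

struct Task {
    state: Mutex<TaskState>,
}

impl Task {
    /// Same shape as the function above: correct at the instant it reads,
    /// possibly stale by the time the caller looks at the result.
    fn blocked(&self) -> bool {
        *self.state.lock().unwrap() == TaskState::Blocked
    }

    /// Check and act under one lock acquisition.
    /// Returns true if this call performed the wakeup.
    fn wake_if_blocked(&self) -> bool {
        let mut state = self.state.lock().unwrap();
        if *state == TaskState::Blocked {
            *state = TaskState::Running;
            true
        } else {
            false
        }
    }
}

fn main() {
    let task = Task { state: Mutex::new(TaskState::Blocked) };
    let stale = task.blocked(); // true at the moment of the read
    let woke = task.wake_if_blocked(); // the state changes underneath the caller
    println!("stale answer: {stale}, woke: {woke}, current answer: {}", task.blocked());
}
```

The point of the second method is that the caller never holds a boolean that describes a state the task may have already left.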

In a similar vein, Rust's type system guarantees that concurrent tasks cannot share state but instead must use message-passing to communicate, which precludes the possibility of data races completely by enforcing happens-before relationships on all data accesses (or in the case of this post, by enforcing mutual-exclusion relationships). Yet it's still possible to write nondeterministic programs in Rust (using select2, failure propagation, etc), and so race conditions are still possible.

The moral of this story is that data races are only one of many types of races, and though many tools exist for finding them, just because one guarantees absence of data races does not mean your code is completely concurrency-safe. Not to say these tools aren't useful, but they often fail where more sophisticated race-finding techniques could succeed, and even still, no automated race-finding tool can substitute for a careful human brain when reasoning about concurrency.

What exactly is a "race condition", anyway?

A friend of mine is taking the operating systems class at UMD, in which the second project is to implement inter-process signals. He noted a peculiarity in the specification: that processes are not woken up immediately if they receive a signal while blocked (e.g. on child processes, on keyboard/disk input). As a result, it could be completely random whether or not a process receiving a signal gets killed immediately or sleeps forever.

He discussed this with the professor, and they disagreed over whether this nondeterminism constituted a "race condition" or not. After all, the specification allows for signals to fail to wake up processes under certain circumstances, so there's nothing wrong about implementing it that way. On the other hand, a kernel whose signalling mechanism always wakes up processes in bounded time (i.e., finitely long -- whereas waiting for keyboard input could take forever) could provide stronger guarantees about inter-process communication.

In my interpretation, neither argument tells the entire story. For starters, race conditions don't necessarily entail wrong behaviour; I've seen plenty of "benign" race conditions with comments along the lines of "if X and Y race, Z will happen, and this is OK". Benign races aside, though, "race condition" to me means "unexpected behaviour occurs nondeterministically". So, if you want to be precise, it's important to talk about race conditions with respect to certain expectations.

Someone writing a userspace program for this kernel who didn't realise that signals might never get taken (and hence produced code that sometimes accidentally sleeps forever) could say they were bitten by a race in the kernel. But if they'd read the spec carefully, they might've written code that handled the nondeterminism more robustly. They could say the spec's nondeterminism made it less useful than other possible specs, but it wouldn't be fair to blame the particular implementation of this spec for being buggy.

In short, I would say the specification itself has a race condition in it, but implementations thereof don't. What's important is who holds the expectations and who nondeterministically breaks them.

2012-09-26

Rust (0): Index and Conclusion

This four-post series on Rust is intended to introduce you to the language, to teach you about Rust's cool language features, and to give a debriefing of what I contributed to it this summer.

These posts are targeted at an audience with some knowledge of programming language design principles. You should be lightly familiar with both systems programming languages such as C++ and with functional languages such as Haskell or ML, and preferably strongly skilled in at least one or the other domain.

Do feel free to skip ahead, if you're already familiar with parts of the language, or to bail out early, if you're not interested in an involved tour of concurrency primitives. All the same, I hope you get something out of some or all of these posts.

Primer - an introduction to the language's syntax, memory model, and concurrency model
Linked Task Failure - advanced parallel programming and error handling with tasks (my first project)
Typesafe Shared State - an overview of the region system and a parallelism library that makes heavy use of it
Typesafe Shared Mutable State - using trickery with Rust's type system to achieve a completely safe interface for common concurrency idioms (my second project)

I'd like to close with an argument for why I think Rust is the "language of the future" for systems programming.

Rust's strong static type system relieves programmers from worrying about many types of errors they should never have to. NULL pointer crashes, memory management errors, surprising implicit type coercions, and dynamic cast exceptions don't exist anymore. Meanwhile, features like closures and higher-order functions (missing in C++ (until very recent versions)), algebraic datatypes and parametric polymorphism (both missing in Go), and traits (existential types; a combination of Haskell-style typeclasses and OO-style interfaces) allow you to concisely express ideas that would otherwise involve a lot of legwork in certain "conventional" languages.
Unlike other functional languages, however, Rust has heavy focus on performance as well. Stack-allocated data lets you often avoid dynamic allocation overhead and garbage collection (even closures can sometimes be entirely on the stack). The region system and borrow checker allow for type-and-memory-safe aliasing of arbitrary data with no runtime overhead. Explicit copyability as part of the type system lets you be aware of when expensive copies might occur.
Finally (and this is the big one, for me), Rust's type system includes a concurrency-aware memory model. Forbidding unprotected shared state and using message-passing over pipes as the main communication mechanism means programmers no longer have to worry about data races, and is also friendly to massively-parallel applications where cache-line contention is a serious worry. The use of noncopyable types means the message-passing library can safely assume all communication will be one-to-one, which allows for a blazing fast implementation under the hood. Noncopyable types also give other strong guarantees, such as the safety of ARCs and the fact that two tasks cannot deadlock when communicating over a single pipe.

Hopefully I've gotten you excited about using Rust for safe + performant parallel programming (or maybe several months from now, when its features and syntax are more stable). And to the Rust community: Thanks, it's been a blast.

2012-09-25

Rust (4): Typesafe Shared Mutable State

This post is a continuation of shared immutable state. Before I introduce how we do safe shared mutable state, I'll take a moment to show why unprotected shared mutable state is dangerous.

Dangers of Shared State

If you're a functional programmer, you're probably used to a language in which nested data structures are allocated in several heap cells, each of which is garbage-collected, so multiple users can freely alias into the same data, implicitly copy to make changes, and so on.

Rust's approach is somewhat different: it focuses on stack-allocation, avoiding expensive implicit copies, and predictable performance. In fact, heap-allocation only occurs when you write the @ or ~ sigil; and, absent @-pointers, Rust's representation semantics don't involve garbage collection at all. Instead:

1. Data types are represented with interior types, meaning data types are embedded directly within one another rather than using pointer indirection. You can, of course, create borrowed pointers to such types and pass them between functions.
2. Stack-allocated and ~-allocated values are owned data, which get eagerly freed/deinitialised immediately upon going out of scope or being overwritten.
3. Rustic data structures can have in-place mutability, indicated with the mut keyword. While also supported by many other functional languages, in Rust it presents new difficulties with aliasing pointers because of point #2 above.

With such a C/C++-like representation model, the prospect of sharing mutable state among multiple actors is a lot more dangerous. To show why, let's say we added a data-race-enabling function to ARC's interface:

    fn get_mut<T: Const Send>(arc: &a/ARC<T>) -> &a/mut T

Then we can commit badness like:

    let arc: ARC<Option<~int>> = ARC(Some(~31337));
    let arc2 = clone(&arc);
    do task::spawn |move arc2| {
        // Might print "Some(~31337)". Might print "None". Might segfault.
        io::println(fmt!("%?", *get(&arc2)));
    }
    // Frees and deinitialises the owned pointer inside the ARC.
    *get_mut(&arc) = None;
    // (But what if this runs after the other task determines the data
    //  is Some, but before it dereferences the contained pointer??)

With sufficient cleverness, this can even be harnessed to implement arbitrary type coercion. (See my solution here.)

Reader-Writer ARCs

The ARC already existed when I arrived at Mozilla, but there was no similar (and safe) solution for the state being mutable. I created the RWARC, with a reader-writer lock inside, to fill this gap.

You create them just like you create ARCs:

    fn RWARC<T: Const Send>(data: T) -> RWARC<T>
    fn clone<T: Const Send>(arc: &RWARC<T>) -> RWARC<T>

But when using them, instead of getting an unlimited-use reference to the data inside, you give the interface a closure to run on the data, and it runs the closure for you with the rwlock held in the correct mode.

    fn read <T: Const Send>(arc: &RWARC<T>, blk: fn(&T))
    fn write<T: Const Send>(arc: &RWARC<T>, blk: fn(&mut T))

The key difference is that the region associated with the data pointer is the region of the closure, rather than some arbitrary region defined by the caller. This allows read() and write() to enforce that the contained reader-writer lock is always held in the correct mode when references to the data exist.

Now we can fix the example from before.

    let arc = RWARC(Some(~31337));
    for 5.times {
        let arc2 = clone(&arc);
        do task::spawn |move arc2| {
            do read(&arc2) |state: &Option<~int>| {
                // Long-running reads on state still happen in parallel.
                io::println(fmt!("%?", *state));
            }
        }
    }
    do write(&arc) |state: &mut Option<~int>| {
        // Exclusive write access. No other aliases to state can exist concurrently.
        *state = None;
    }

Note that while data races are no longer possible, race conditions in general still are. (I mentioned earlier that shared mutable state introduces nondeterminism.) Here, anywhere between zero and five "None"s will be printed.
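
For readers following along today, roughly the same experiment can be written with the standard library's `Arc` and `RwLock`. This is a sketch in present-day Rust, not the RWARC interface from this post, and `count_nones` is a hypothetical helper that counts the observations instead of printing them.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Spawns five readers and then writes `None` from the parent.
/// Returns how many of the readers observed `None`.
fn count_nones() -> usize {
    let arc = Arc::new(RwLock::new(Some(Box::new(31337))));
    let readers: Vec<_> = (0..5)
        .map(|_| {
            let arc2 = Arc::clone(&arc);
            thread::spawn(move || {
                // Shared read access; several readers may hold the lock at once.
                let state = arc2.read().unwrap();
                state.is_none()
            })
        })
        .collect();

    // Exclusive write access. The Box is freed here, but only while
    // no reader holds a reference into it.
    *arc.write().unwrap() = None;

    readers
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .filter(|&saw_none| saw_none)
        .count()
}

fn main() {
    println!("{} of 5 readers saw None", count_nones());
}
```

The result depends on how the scheduler orders the readers against the writer, so any count from zero to five is a legitimate outcome. That is the race condition; what cannot happen is a reader dereferencing freed memory.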

The compiler will, of course, reject code that tries to cheat the interface:

    let escaped_state;
    do write(&arc) |state| {
        escaped_state = state; // ERROR: reference not valid outside of its lifetime
    }

A brief informal justification of safety:

The Const restriction still enforces that readers only see deeply immutable state. Also, even with mutable state, it still prevents cycles from being created, because the RWARC itself does not have the Const kind.
References to the shared state cannot escape the closure called by read() or write(). In effect, the region system statically enforces that the lock must be held in order to access the state.
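
The closure-scoped design carries over to present-day Rust quite directly. The sketch below wraps the standard `Arc<RwLock<T>>` in a hypothetical `RwArc` type whose only accessors take closures. Unlike the signatures above, the closures here may return a value of type `R`; that is an addition for convenience, and it stays safe because `R` is fixed before the reference's lifetime is chosen, so the reference itself cannot be returned.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical wrapper: the data is reachable only through a closure.
struct RwArc<T>(Arc<RwLock<T>>);

impl<T> RwArc<T> {
    fn new(data: T) -> Self {
        RwArc(Arc::new(RwLock::new(data)))
    }

    /// Creates another handle to the same data.
    fn clone_handle(&self) -> Self {
        RwArc(Arc::clone(&self.0))
    }

    /// Runs `blk` with the lock held in read mode.
    fn read<R>(&self, blk: impl FnOnce(&T) -> R) -> R {
        let guard = self.0.read().unwrap();
        blk(&*guard)
    }

    /// Runs `blk` with the lock held in write mode.
    fn write<R>(&self, blk: impl FnOnce(&mut T) -> R) -> R {
        let mut guard = self.0.write().unwrap();
        blk(&mut *guard)
    }
}

fn demo() -> (Option<i32>, Option<i32>) {
    let arc = RwArc::new(Some(31337));
    let arc2 = arc.clone_handle();
    let before = arc2.read(|state| *state); // copies the value out, not the reference
    arc.write(|state| *state = None);
    let after = arc2.read(|state| *state);
    (before, after)
}

fn main() {
    println!("{:?}", demo());
}
```

Trying to smuggle `state` itself out of either closure is rejected by the borrow checker, for the same reason as the escaped_state example above.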

The Concurrency Primitives You Know and Love

Condition Variables

The RWARC also comes with some other features to remind you of home (if "home" to you means old C-style concurrency primitives you fought off race conditions with back in the day). We have condition variables:

    fn write_cond<T: Const Send>(arc: &RWARC<T>, blk: fn(&mut T, &Condvar))
    fn wait(cond: &Condvar)
    fn signal(cond: &Condvar) -> bool
    fn broadcast(cond: &Condvar) -> uint

These work as you might expect. Like the &mut T reference, the Condvar reference can only be used inside the closure (i.e., while the lock is held).

    let arc = RWARC(~[]);
    let arc2 = clone(&arc);
    do task::spawn |move arc2| {
        do write_cond(&arc2) |state,cond| {
            // Poor man's message-passing. Of course, pipes are much
            // faster; rwarcs and condvars are built on top of pipes.
            vec::push(state, ~"hello there!");
            signal(cond);
        }
    }
    do write_cond(&arc) |state,cond| {
        while state.len() == 0 {
            wait(cond);
        }
        io::println(vec::pop(state));
    }

(The more seasoned concurrency hackers among you might now be wondering: what if you wanted to associate multiple conditions with the same state? That can be done too -- gritty details are in the docs.)
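
Here is a sketch of the same producer/consumer exchange in present-day Rust. It is an approximation: the standard library's `Condvar` pairs with a `Mutex` rather than with a reader-writer lock, so a `Mutex` stands in for the RWARC, and `poor_mans_message_passing` is a hypothetical name.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// The child pushes one message and signals; the parent waits for it.
fn poor_mans_message_passing() -> String {
    let shared = Arc::new((Mutex::new(Vec::<String>::new()), Condvar::new()));
    let shared2 = Arc::clone(&shared);

    let child = thread::spawn(move || {
        let (lock, cond) = &*shared2;
        lock.lock().unwrap().push("hello there!".to_string());
        cond.notify_one(); // plays the role of signal(cond)
    });

    let (lock, cond) = &*shared;
    let mut state = lock.lock().unwrap();
    // Loop on the condition: the push may already have happened,
    // and wait() is allowed to wake up spuriously.
    while state.is_empty() {
        state = cond.wait(state).unwrap(); // releases the lock while asleep
    }
    let message = state.pop().unwrap();
    drop(state);

    child.join().unwrap();
    message
}

fn main() {
    println!("{}", poor_mans_message_passing());
}
```

As in the version above, the condition variable is only usable while the lock is held: `wait` takes the guard by value and hands it back on wakeup.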

Downgrade (or, Now You're Just Showing Off with the Region System)

(Do feel free to zone out for this section.)

If you're used to being able to atomically "downgrade" write access into read access without letting other writers through in the meantime, you can do that here too. (I'm presenting this feature mostly just to show off more stuff you can do by combining the region system with noncopyable types.)

    // Calls a closure which will write, then downgrade, then read.
    fn write_downgrade<T: Const Send>(arc: &RWARC<T>, blk: fn(RWWriteMode/&a<T>))
    // Converts a "write permission" token to a "read permission" token.
    fn downgrade<T: Const Send>(token: RWWriteMode/&a<T>) -> RWReadMode/&a<T>
    fn write<T: Const Send>(token: &RWWriteMode<T>, blk: fn(&mut T))
    fn read <T: Const Send>(token: &RWReadMode <T>, blk: fn(&T))

Here, the RWWriteMode and RWReadMode are noncopyable "permission tokens" that allow the user to write or read, and downgrade() is a function that consumes the write token and wakes up any readers waiting on the rwlock. Since the tokens are noncopyable, the caller cannot still have write permissions after calling downgrade() (which would, of course, result in data races).

The "RWWriteMode/&a" syntax indicates an opaque data structure with region pointers inside. While the write mode token is passed by ownership (so that it can in turn be surrendered to downgrade()), its scope is still constrained by the associated region, which means it can't escape from the closure passed to write_downgrade(). And downgrade() converts a write mode token to a read mode token with the same region, so the latter can't escape either.

Complex as the above functions may seem, using the interface simply looks like this:

    do write_downgrade(&arc) |token| {
        do write(&token) |mutable_state| {
            ...
        }
        let token = downgrade(move token);
        do read(&token) |immutable_state| {
            ...
        }
    }
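
The token trick itself is easy to imitate in present-day Rust, since move semantics make every non-`Copy` type "noncopyable" by default. The sketch below models only the type-level part: `WriteMode`, `ReadMode` and `write_downgrade` are hypothetical, and as a simplifying assumption the write lock stays held for the whole closure, so no waiting readers are actually let in at the downgrade point.

```rust
use std::sync::RwLock;

/// Permission tokens. Neither is Clone or Copy, so each can be used up.
struct WriteMode<'a, T>(&'a mut T);
struct ReadMode<'a, T>(&'a T);

impl<'a, T> WriteMode<'a, T> {
    fn write(&mut self, blk: impl FnOnce(&mut T)) {
        blk(&mut *self.0)
    }

    /// Consumes the write token; the caller keeps only read permission.
    fn downgrade(self) -> ReadMode<'a, T> {
        let WriteMode(inner) = self;
        ReadMode(inner)
    }
}

impl<'a, T> ReadMode<'a, T> {
    fn read<R>(&self, blk: impl FnOnce(&T) -> R) -> R {
        blk(self.0)
    }
}

/// Hands the closure a write token whose lifetime is confined to the call.
/// Simplification: the lock is held in write mode throughout.
fn write_downgrade<T, R>(lock: &RwLock<T>, blk: impl for<'a> FnOnce(WriteMode<'a, T>) -> R) -> R {
    let mut guard = lock.write().unwrap();
    let result = blk(WriteMode(&mut *guard));
    result
}

fn demo() -> (Vec<i32>, usize) {
    let lock = RwLock::new(vec![1, 2, 3]);
    let len = write_downgrade(&lock, |mut token| {
        token.write(|v| v.push(4));
        let token = token.downgrade();
        // From here on, token.write(...) no longer type-checks.
        token.read(|v| v.len())
    });
    (lock.into_inner().unwrap(), len)
}

fn main() {
    println!("{:?}", demo());
}
```

Because `R` cannot mention the token's lifetime, neither token can escape the closure, which mirrors the region argument made above.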

Unwrap

Finally, RWARCs (ARCs too) also now have a mechanism to get your data back out again.

    fn unwrap<T: Const Send>(arc: RWARC<T>) -> T

Of course, it wouldn't be valid to reclaim ownership of the data while other tasks might still have aliases to it. Instead, unwrap() blocks the calling task until its reference is the only reference alive, and then takes ownership of the data instead of freeing it. (To avoid deadlock, subsequent callers to unwrap() on the same ARC immediately fail.)

This adds expressivity in two ways: it relieves you from having to deeply-copy the shared data if you need to own it (which would be extra problematic if it had noncopyables inside), and it automatically synchronises with the ARC's other users. You could use this to implement a fork-join pattern, like so:

    let arc = RWARC(some_data);
    for num_cpus().times {
        let arc2 = clone(&arc);
        do task::spawn |move arc2| {
            process_data(arc2); // might read, write, whatever
        }
    }
    let modified_data = unwrap(move arc); // blocks on all child tasks at once
    // do more of the algorithm, etc.

All this without ever once copying the data.
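
A comparable fork-join can be sketched in present-day Rust, with one difference worth flagging: the standard library's `Arc::try_unwrap` does not block, so this version joins the worker threads explicitly before reclaiming the data. `fork_join` is a hypothetical function, and pushing an index stands in for process_data.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Each worker appends its index to shared data; the parent then takes
/// the data back by value, without copying it.
fn fork_join(num_workers: usize) -> Vec<usize> {
    let arc = Arc::new(RwLock::new(Vec::<usize>::new()));

    let workers: Vec<_> = (0..num_workers)
        .map(|i| {
            let arc2 = Arc::clone(&arc);
            thread::spawn(move || {
                arc2.write().unwrap().push(i); // stand-in for process_data
            })
        })
        .collect();

    // Join explicitly: each worker drops its handle before it finishes.
    for worker in workers {
        worker.join().unwrap();
    }

    // Only one handle is left, so ownership of the contents can be reclaimed.
    let lock = Arc::try_unwrap(arc).expect("all worker handles have been dropped");
    let mut data = lock.into_inner().unwrap();
    data.sort(); // push order depends on scheduling; sort for a stable result
    data
}

fn main() {
    println!("{:?}", fork_join(4));
}
```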

This about wraps up the contributions I made this summer at Mozilla. In my next post I'll conclude the series with a summary of why I like Rust so much.

2012-09-22

Rust (3): Typesafe Shared State

Previously I introduced Rust, talking about syntax, pointer types, and light-weight parallelism and message-passing. I also wrote about my own summer project, flexible failure propagation between tasks, talking about some more advanced programming techniques with Rustic tasks.

Through it all you might have been wondering, "No shared state?! I see the value in eliminating data races, but isn't it sometimes what you want?" Yes! That's what this post is for.

Consider: When spawning a bunch of tasks to process a large data structure in parallel, it would be a shame to have to deeply copy the whole thing and send one copy over a pipe to each task (expensive in both space and time). You'd want each task to be able to alias the same data instead.

Shared Immutable State

Rust's standard library includes the ARC, which stands for Atomically Reference-Counted object. The ARC serves as a wrapper-handle to some data you wish to share; rather than copying the data itself, you instead copy just the handle, which just involves atomically incrementing a reference count for the contained data.

To create an ARC:

    // Given ownership of some data, wraps it in an ARC.
    fn ARC<T: Const Send>(data: T) -> ARC<T>

The polymorphic type T is constrained by the Send kind (which I mentioned in my primer post), so it can only be used with data of types that you could also otherwise send over pipes, and also by the Const kind, which means the data can have no mutable interior fields (the type has to be deeply immutable to guarantee no data races).

Like pipe endpoints, the ARC is a noncopyable type. New handles to the same ARC cannot be freely created (for that would bypass the reference counting mechanism); they must be made using the rest of the interface. (ARC also uses destructors internally, so the moment an ARC handle leaves scope, the reference count gets dropped. When the count hits zero, the data will be freed as well.)

And to use an ARC:

    // Creates a new handle to the ARC.
    fn clone<T: Const Send>(arc: &ARC<T>) -> ARC<T>
    // Get an immutable pointer to the underlying data.
    fn get<T: Const Send>(arc: &a/ARC<T>) -> &a/T

You'll notice the use of &-pointers (borrowed pointers) in this interface. In clone(), this means the argument ARC is passed by-reference rather than by-ownership to create the new handle. The interface of get() introduces some new syntax, &a/T, which to explain I'll need to introduce regions.

As I hinted at in my primer post, borrowed pointers are statically analysed to ensure they don't outlive the data they were borrowed from. This is done by associating a region with the borrowed pointer to denote its lifetime (which is tied to some lexical scope or inherited from other data's lifetime).

Mostly, regions exist behind-the-scenes, since the compiler can infer them when needed. Sometimes it is useful, though, to explicitly write that two regions will be the same -- the &a/T syntax denotes a borrowed pointer to a T with some lifetime a. Because the same region variable is used to borrow the ARC itself ("&a/ARC<T>"), the compiler knows to enforce in get()'s caller that the returned pointer cannot outlive the associated ARC handle. get() is said to be region-parametric; that is, the region variable a can be instantiated with whatever region is appropriate at each call-site.

Examples

Here's a code snippet that demonstrates basic ARC usage. I create an ARC with a BigDataStructure inside, clone a second handle, and then in two parallel tasks get references into them.

    fn init() -> BigDataStructure   { ... }
    fn access(x: &BigDataStructure) { ... }

    fn main() {
        let arc1 = ARC(init());   // refcount == 1
        let arc2 = clone(&arc1);  // refcount == 2
        do task::spawn |move arc2| {  // gives child ownership of 2nd handle
            let x2: &BigDataStructure = get(&arc2);
            access(x2);  // in parallel with the below
            // arc2 gets dropped. BigDataStructure might get freed here.....
            // (note: x2 can no longer be accessed)
        }
        let x1: &BigDataStructure = get(&arc1);
        access(x1);  // in parallel with the above
        // arc1 gets dropped. .....or it might get freed here.
        // (note: x1 can no longer be accessed)
    }
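
The snippet above uses 2012 syntax that no longer compiles, so here is an approximate present-day rendering built on `std::sync::Arc`. The contents of `BigDataStructure`, the summing `access` function and the `run` wrapper are assumptions made up for the sketch; `get` is written out only to show the region-parametric signature, since `Arc` already dereferences to its contents.

```rust
use std::sync::Arc;
use std::thread;

struct BigDataStructure {
    values: Vec<u64>,
}

fn init() -> BigDataStructure {
    BigDataStructure { values: (1..=1000).collect() }
}

fn access(x: &BigDataStructure) -> u64 {
    x.values.iter().sum()
}

/// Region-parametric, like get() above: the returned reference
/// cannot outlive the handle it was borrowed from.
fn get<'a, T>(arc: &'a Arc<T>) -> &'a T {
    &**arc
}

fn run() -> (u64, u64) {
    let arc1 = Arc::new(init()); // refcount == 1
    let arc2 = Arc::clone(&arc1); // refcount == 2
    let child = thread::spawn(move || {
        // The child owns the second handle.
        let x2 = get(&arc2);
        access(x2) // in parallel with the parent's access
        // arc2 is dropped here; the data may be freed here...
    });
    let x1 = get(&arc1);
    let parent_sum = access(x1); // in parallel with the child's access
    let child_sum = child.join().unwrap();
    (parent_sum, child_sum)
    // ...or here, when arc1 is dropped.
}

fn main() {
    println!("{:?}", run());
}
```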

Here are some examples of ways the type system prevents unsafe usage.

First, the compiler won't let me bypass the reference-counting mechanism:

    let arc1 = ARC(init());  // refcount == 1
    let arc2 = arc1; // ERROR: copying a noncopyable value
    // double free :(

If ARC handles were copyable, two destructors would run here and the reference count would get decremented too many times.

The compiler will also stop me from using the reference from get() after the associated ARC handle went out of scope (which is legal in a language like C++, and would result in a use-after-free):

    fn broken_get(arc: ARC<BigDataStructure>) -> &a/BigDataStructure {
        // note the unconstrained region variable ^
        let x = get(&arc);
        return x;  // ERROR: reference not valid outside of its lifetime
        // note: the arc handle would get dropped here(??)
    }
    access(broken_get(ARC(init())));  // use after free :(

Finally, I will try to surrender ownership of my ARC handle by sending it over a pipe (perhaps to another task), while still holding on to a pointer I borrowed from it with get().

    let (sender,receiver) = pipes::stream();
    let arc = ARC(init());
    let x = get(&arc);      // NOTE: loan of local variable granted here
    sender.send(move arc);  // ERROR: moving out of local variable
                            //        prohibited due to outstanding loan
    access(x);  // unknown whether arc is still alive(??)

But the compiler's borrow checker stopped me, because the "loan" I had created earlier was still in scope.

Safety

Because Rust intentionally has no language features to support shared state, the ARC library provides it by using unsafe code internally. Given that unsafe code "shifts the burden of proof from the compiler to the programmer", how can we know the interface is right?

While we are working on a proof of the region system's correctness in general, we don't have a proof for this interface in particular (though I'd be curious how one would look!). Nevertheless, we can be quite confident in the ARC's safety because of the guarantees that Rust's language features provide:

1. The Const kind restriction and the immutable pointer returned by get() ensure that once inside an ARC, data can never be modified. This makes data races impossible, and also precludes the possibility of constructing a cyclic reference among ARCs. (Reference counting is a safe memory management strategy only in absence of cycles.)
2. The use of noncopyable ("linear") types for the ARC handles ensures that the reference count exactly matches the number of handles, and therefore the associated data will only be freed when all handles have left scope.
3. The regioned type signature of get() ensures that a reference to the contained data must be outlived by its associated handle (and hence, by #2, outlived also by the contained data itself).

Stay tuned for a follow-up post explaining a still more advanced interface I created for safely sharing mutable state between tasks.