2012-10-20

What exactly is a "race condition", anyway?

A friend of mine is taking the operating systems class at UMD, in which the second project is to implement inter-process signals. He noted a peculiarity in the specification: that processes are not woken up immediately if they receive a signal while blocked (e.g. on child processes, on keyboard/disk input). As a result, it could be completely random whether or not a process receiving a signal gets killed immediately or sleeps forever.

He discussed this with the professor, and they disagreed over whether this nondeterminism constituted a "race condition" or not. After all, the specification allows for signals to fail to wake up processes under certain circumstances, so there's nothing wrong about implementing it that way. On the other hand, a kernel whose signalling mechanism always wakes up processes in bounded time (i.e., finitely long -- whereas waiting for keyboard input could take forever) could provide stronger guarantees about inter-process communication.

In my interpretation, both arguments don't tell the entire story. For starters, race conditions don't necessarily entail wrong behaviour; I've seen plenty of "benign" race conditions with comments along the lines of "if X and Y race, Z will happen, and this is OK". Benign races aside, though, "race condition" to me means "unexpected behaviour occurs nondeterministically". So, if you want to be precise, it's important to talk about race conditions with respect to certain expectations.

Someone writing a userspace program for this kernel who didn't realise that signals might never get taken (and hence produced code that sometimes accidentally sleeps forever) could say they were bitten by a race in the kernel. But if they'd read the spec carefully, they might've written code that handled the nondeterminism more robustly. They could say the spec's nondeterminism made it less useful than other possible specs, but it wouldn't be fair to blame the particular implementation of this spec for being buggy.

In short, I would say the specification itself has a race condition in it, but implementations thereof don't. What's important is who holds the expectations and who nondeterministically breaks them.

No comments:

Post a Comment