You fix the bug locally, ship, and it comes back at 2 a.m. with logs that don’t match what you saw. You try to reproduce and it… won’t. A test goes red one in fifty runs. Threads “randomly” interleave. A network hiccup lands at just the wrong moment. This isn’t just bad luck—it’s nondeterminism leaking into your system.
The antidote is deterministic reproduction: make the system behave the same way from the same inputs—every time—so you can reason, test, and debug without gambling on interleavings or wall-time quirks. In practice, that means four disciplines:
- Make randomness deterministic and logged.
- Make time a dependency you control.
- Capture schedules/inputs at the boundaries.
- Replay those captured facts exactly.
This post is a field guide to building that muscle in plain C. We’ll start with the two easiest wins—deterministic randomness and time—then build toward schedules and record/replay.
Where nondeterminism sneaks in
If you can name it, you can tame it. Common sources to inventory on day one:
- Randomness: unseeded PRNGs, `rand()`/`random()` with implicit global state, OS entropy.
- Time: wall-clock reads, timers, timeouts, backoff jitter.
- Threads and scheduling: interleavings, races, lock acquisition order, I/O completion order.
- I/O and the environment: network timing, disk latency variance, signals, process IDs.
- Uninitialized data and UB: stack garbage, unsynchronized access, data races.
- Filesystem and directory iteration order, locale/encoding, environment variables.
Your initial goal is to route these through explicit dependencies that you can fix for a run and log for a later replay.
Deterministic randomness you can trust
“Random” must be reproducible on demand. Two rules make that true:
- Use your own explicit PRNG state; never rely on implicit global state.
- Make the seed an input you pick, print, and store.
A small, fast PRNG with explicit state
The exact algorithm matters less than owning the state. The following small generator (xoshiro128++, seeded via SplitMix64) is simple, fast, and good enough for tests and simulations. It uses 128 bits of state and produces 32-bit values.
#include <stdint.h>
typedef struct {
uint32_t s[4];
} prng32_t;
static inline uint32_t rotl32(uint32_t x, int k) {
return (x << k) | (x >> (32 - k));
}
// SplitMix64 to expand a 64-bit seed into 128 bits of state
static uint64_t splitmix64(uint64_t *x) {
uint64_t z = (*x += 0x9E3779B97F4A7C15ull);
z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ull;
z = (z ^ (z >> 27)) * 0x94D049BB133111EBull;
return z ^ (z >> 31);
}
void prng32_seed(prng32_t *g, uint64_t seed) {
uint64_t x = seed;
uint64_t a = splitmix64(&x);
uint64_t b = splitmix64(&x);
g->s[0] = (uint32_t)a;
g->s[1] = (uint32_t)(a >> 32);
g->s[2] = (uint32_t)b;
g->s[3] = (uint32_t)(b >> 32);
}
uint32_t prng32_next(prng32_t *g) {
const uint32_t result = rotl32(g->s[0] + g->s[3], 7) + g->s[0];
const uint32_t t = g->s[1] << 9;
g->s[2] ^= g->s[0];
g->s[3] ^= g->s[1];
g->s[1] ^= g->s[2];
g->s[0] ^= g->s[3];
g->s[2] ^= t;
g->s[3] = rotl32(g->s[3], 11);
return result;
}
// Uniform double in [0,1)
double prng32_next_unit(prng32_t *g) {
// 24-bit mantissa for simplicity
return (prng32_next(g) >> 8) * (1.0 / 16777216.0);
}
Key properties:
- No globals. You can have many independent streams.
- Seeding via a single 64-bit value; derive full state via SplitMix.
- Reproducible: same seed ⇒ same sequence on any platform, since the exact-width types and value-level arithmetic don’t depend on endianness; endianness only matters if you serialize the raw state bytes, so serialize seed and state explicitly when exchanging them.
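A cheap way to keep this honest is a determinism self-check in your test suite; a minimal sketch (the function name is illustrative):
#include <assert.h>
// Same seed must yield the same stream; run in CI to catch accidental
// reintroduction of hidden state.
void prng_determinism_check(void) {
prng32_t a, b;
prng32_seed(&a, 7);
prng32_seed(&b, 7);
for (int i = 0; i < 1000; ++i) {
assert(prng32_next(&a) == prng32_next(&b));
}
}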
Seeding as an explicit, logged input
Pick a seed deterministically from the command line or environment, print it on startup, and include it in any failure report. That is your “repro key.”
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static uint64_t parse_seed_or_default(int argc, char **argv, uint64_t def) {
for (int i = 1; i < argc; ++i) {
if (strncmp(argv[i], "--seed=", 8) == 0) {
return (uint64_t)strtoull(argv[i] + 8, NULL, 10);
}
}
const char *env = getenv("SEED");
if (env && *env) return (uint64_t)strtoull(env, NULL, 10);
return def;
}
int main(int argc, char **argv) {
uint64_t seed = parse_seed_or_default(argc, argv, 123456789ull);
printf("seed=%" PRIu64 "\n", seed);
prng32_t rng; prng32_seed(&rng, seed);
for (int i = 0; i < 5; ++i) {
printf("%08x\n", prng32_next(&rng));
}
return 0;
}
Guidelines that save hours later:
- Print the seed once at the start of every test/program run.
- When a test fails, include `seed=...` in its error message. Your CI can copy-paste it.
- Never use `rand()`/`srand()` or hidden global state if you care about reproduction.
Time you control (and can freeze)
Wall time is hostile to deterministic repro: it jumps under NTP adjustments, shifts with time zones, and drifts with the machine’s clock. Even monotonic time varies with CPU/OS load and hardware. The fix is to inject a clock abstraction and make it swappable for tests and replay.
A minimal clock interface
Separate “how we read time” from “what we do with it.” In production, use a monotonic clock; in tests/replay, use a fake clock you advance explicitly.
#include <stdint.h>
#include <time.h>
typedef struct clock_iface clock_iface;
struct clock_iface {
// Returns nanoseconds since an arbitrary epoch (monotonic in prod; controlled in tests)
uint64_t (*now_ns)(clock_iface *self);
// Optional: sleep until a deadline (prod); tests can simulate or fast-forward
void (*sleep_until_ns)(clock_iface *self, uint64_t deadline_ns);
};
// Production monotonic clock
typedef struct { clock_iface vtbl; } mono_clock;
static uint64_t mono_now_ns(clock_iface *self) {
(void)self;
struct timespec ts; clock_gettime(CLOCK_MONOTONIC, &ts);
return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
static void mono_sleep_until_ns(clock_iface *self, uint64_t deadline_ns) {
(void)self;
for (;;) {
struct timespec now; clock_gettime(CLOCK_MONOTONIC, &now);
uint64_t now_ns = (uint64_t)now.tv_sec * 1000000000ull + (uint64_t)now.tv_nsec;
if (now_ns >= deadline_ns) break;
uint64_t delta = deadline_ns - now_ns;
struct timespec ts = { .tv_sec = (time_t)(delta / 1000000000ull),
.tv_nsec = (long)(delta % 1000000000ull) };
nanosleep(&ts, NULL);
}
}
static inline void mono_clock_init(mono_clock *c) {
c->vtbl.now_ns = mono_now_ns;
c->vtbl.sleep_until_ns = mono_sleep_until_ns;
}
// Test/replay clock
typedef struct {
clock_iface vtbl;
uint64_t cur_ns;
} fake_clock;
static uint64_t fake_now_ns(clock_iface *self) {
return ((fake_clock *)self)->cur_ns;
}
static void fake_sleep_until_ns(clock_iface *self, uint64_t deadline_ns) {
fake_clock *fc = (fake_clock *)self;
if (fc->cur_ns < deadline_ns) fc->cur_ns = deadline_ns; // jump instantly
}
static inline void fake_clock_init(fake_clock *c, uint64_t start_ns) {
c->cur_ns = start_ns;
c->vtbl.now_ns = fake_now_ns;
c->vtbl.sleep_until_ns = fake_sleep_until_ns;
}
With this interface, production code accepts a `clock_iface*` and never calls `clock_gettime` directly. Tests pass a `fake_clock` and advance it deterministically. For replay, you can feed the same sequence of time reads you recorded.
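A recorded clock for replay can be as small as an array of captured `now_ns()` readings. A minimal sketch, assuming you logged the samples during the original run (the struct and field names are illustrative):
// Replays a pre-captured sequence of time readings; once the recording is
// exhausted, clamp to the last sample rather than inventing new time.
typedef struct {
clock_iface vtbl;
const uint64_t *samples; // captured now_ns() values, in call order
size_t n, idx;
} replay_clock;
static uint64_t replay_now_ns(clock_iface *self) {
replay_clock *rc = (replay_clock *)self;
if (rc->n == 0) return 0;
uint64_t v = rc->samples[rc->idx < rc->n ? rc->idx : rc->n - 1];
if (rc->idx < rc->n) rc->idx++;
return v;
}
static void replay_sleep_until_ns(clock_iface *self, uint64_t deadline_ns) {
(void)self; (void)deadline_ns; // time advances only via recorded reads
}
static inline void replay_clock_init(replay_clock *c, const uint64_t *samples, size_t n) {
c->samples = samples; c->n = n; c->idx = 0;
c->vtbl.now_ns = replay_now_ns;
c->vtbl.sleep_until_ns = replay_sleep_until_ns;
}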
Wall vs monotonic vs logical time
Use monotonic time for scheduling and deadlines. Treat wall time as an external I/O input you snapshot and carry separately (e.g., for logging). For deterministic replay of concurrency, a logical clock (e.g., a counter advanced at specific events) can be even better—your code observes exactly the same progression regardless of CPU speed.
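Here is a minimal logical-clock sketch behind the same interface; the one-tick-per-event granularity is an arbitrary choice for illustration:
// Time advances only when the program says so (e.g., one tick per event),
// so replay observes the identical progression on any machine.
typedef struct {
clock_iface vtbl;
uint64_t ticks;
} logical_clock;
static uint64_t logical_now_ns(clock_iface *self) {
return ((logical_clock *)self)->ticks;
}
static void logical_sleep_until_ns(clock_iface *self, uint64_t deadline_ns) {
logical_clock *lc = (logical_clock *)self;
if (lc->ticks < deadline_ns) lc->ticks = deadline_ns; // jump, never wait
}
static inline void logical_clock_tick(logical_clock *c) { c->ticks++; } // call at chosen events
static inline void logical_clock_init(logical_clock *c) {
c->ticks = 0;
c->vtbl.now_ns = logical_now_ns;
c->vtbl.sleep_until_ns = logical_sleep_until_ns;
}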
A reproducible backoff and timeout policy
Once randomness and time are under control, policies that previously relied on ambient entropy become deterministic too. Example: exponential backoff with jitter for retries.
typedef struct {
prng32_t *rng;
clock_iface *clk;
uint64_t base_ns; // minimum backoff
uint64_t max_ns; // cap
} backoff_policy;
// Returns a deadline_ns for the next attempt, deterministically based on seed and call count.
uint64_t next_retry_deadline(backoff_policy *p, unsigned attempt) {
// Exponential growth with cap
uint64_t exp = p->base_ns << (attempt > 30 ? 30 : attempt);
if (exp > p->max_ns) exp = p->max_ns;
// Jitter factor in [0.5, 1.0)
double r = 0.5 + 0.5 * prng32_next_unit(p->rng);
uint64_t delay = (uint64_t)(r * (double)exp);
return p->clk->now_ns(p->clk) + delay;
}
Because both the RNG and the clock are explicit, passing the same seed and advancing the same way yields identical deadlines and retry behavior—locally, in CI, and in replay.
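To see it concretely, here is a small demo under the fake clock (a sketch; the seed and durations are arbitrary):
void backoff_demo(void) {
fake_clock fc; fake_clock_init(&fc, 0);
prng32_t rng; prng32_seed(&rng, 42);
backoff_policy p = { &rng, &fc.vtbl, 1000000ull /* 1 ms */, 1000000000ull /* 1 s */ };
for (unsigned attempt = 0; attempt < 5; ++attempt) {
uint64_t d = next_retry_deadline(&p, attempt);
fc.vtbl.sleep_until_ns(&fc.vtbl, d); // the fake clock jumps straight to the deadline
printf("attempt %u fires at %llu ns\n", attempt, (unsigned long long)d);
}
// Rerunning with seed 42 prints exactly the same schedule.
}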
A quick checklist to implement today
- Introduce a small PRNG with explicit state; ban `rand()` in your codebase.
- Make the PRNG seed a required, logged input for tests and reproducible runs.
- Wrap time behind an interface; use monotonic for scheduling, fake/recorded clocks for tests/replay.
- Thread every component that needs time/randomness through these explicit dependencies.
That covers the easy wins. Next: how to capture thread schedules and I/O completions, and how to build a record/replay harness that turns a flaky failure into a single, explanatory run.
Capturing schedules that actually matter
Once randomness and time are under control, the remaining non-repeatability often comes from thread interleavings and I/O completion order. You don’t need to capture every single instruction—only the decision points that change outcomes: lock acquisition order, queue push/pop boundaries, wakeups, and I/O completions.
Two complementary tools:
- Lightweight schedule points your code calls at critical boundaries.
- A recorder that logs those points with ordering, and a replayer that enforces the same order.
A minimal event model
Use a compact schema that is easy to write and read back. JSON Lines (one JSON object per line) is pragmatic for early iterations.
#include <inttypes.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
typedef enum {
EV_POINT = 1, // named scheduling point
EV_MUTEX_REQ = 2, // optional: before trying to lock
EV_MUTEX_ACQ = 3, // after lock acquired
EV_MUTEX_REL = 4, // after unlock
EV_THREAD_START = 5,
EV_THREAD_END = 6,
EV_IO_COMPLETE = 7 // optional: an I/O completion boundary
} ev_kind;
typedef struct {
FILE *out;
pthread_mutex_t mu;
atomic_uint_fast64_t seq;
} rr_recorder;
static void rr_recorder_init(rr_recorder *r, const char *path) {
r->out = fopen(path, "w");
pthread_mutex_init(&r->mu, NULL);
atomic_store(&r->seq, 0);
}
static void rr_recorder_close(rr_recorder *r) {
if (r->out) fclose(r->out);
pthread_mutex_destroy(&r->mu);
}
static inline uint32_t rr_tid(void) {
// Portable enough for correlation; not a kernel TID
return (uint32_t)(uintptr_t)pthread_self();
}
static void rr_log_point(rr_recorder *r, const char *name, ev_kind k, uint64_t t_ns) {
pthread_mutex_lock(&r->mu);
// Claim the index under the lock so file order always matches i;
// the replayer consumes events in file order.
uint64_t i = atomic_fetch_add(&r->seq, 1);
fprintf(r->out,
"{\"i\":%" PRIu64 ",\"ns\":%" PRIu64 ",\"tid\":%u,\"k\":%u,\"name\":\"%s\"}\n",
i, t_ns, rr_tid(), (unsigned)k, name ? name : "");
fflush(r->out);
pthread_mutex_unlock(&r->mu);
}
Call `rr_log_point` at boundaries you care about. In record mode, this produces an ordered trace with index `i` and a timestamp so you can correlate with logs.
A replay gate that enforces order
The replayer loads the trace, then each time code reaches a schedule point it waits until the next expected event in the trace matches the thread and name.
#include <ctype.h>
#include <stdlib.h>
typedef struct { uint64_t i; uint32_t tid; ev_kind k; char name[32]; } trace_ev;
typedef struct {
trace_ev *evs; size_t n, cap; size_t cur;
pthread_mutex_t mu; pthread_cond_t cv;
} rr_replayer;
static bool parse_line(const char *s, trace_ev *out) {
// Minimal parser for fields: i, tid, k, name
const char *pi = strstr(s, "\"i\":");
const char *pt = strstr(s, "\"tid\":");
const char *pk = strstr(s, "\"k\":");
const char *pn = strstr(s, "\"name\":\"");
if (!pi || !pt || !pk || !pn) return false;
out->i = strtoull(pi + 5, NULL, 10);
out->tid = (uint32_t)strtoul(pt + 7, NULL, 10);
out->k = (ev_kind)strtoul(pk + 5, NULL, 10);
pn += 8; size_t j = 0; while (pn[j] && pn[j] != '"' && j + 1 < sizeof out->name) { out->name[j] = pn[j]; j++; }
out->name[j] = '\0';
return true;
}
static bool rr_replayer_load(rr_replayer *rp, const char *path) {
FILE *f = fopen(path, "r"); if (!f) return false;
rp->evs = NULL; rp->n = rp->cap = rp->cur = 0;
pthread_mutex_init(&rp->mu, NULL); pthread_cond_init(&rp->cv, NULL);
char *line = NULL; size_t len = 0; ssize_t nread;
while ((nread = getline(&line, &len, f)) != -1) {
trace_ev ev; if (!parse_line(line, &ev)) continue;
if (rp->n == rp->cap) { size_t nc = rp->cap ? rp->cap * 2 : 256; rp->evs = (trace_ev *)realloc(rp->evs, nc * sizeof *rp->evs); rp->cap = nc; }
rp->evs[rp->n++] = ev;
}
free(line); fclose(f);
return rp->n > 0;
}
static bool rr_gate(rr_replayer *rp, ev_kind k, const char *name) {
pthread_mutex_lock(&rp->mu);
for (;;) {
if (rp->cur >= rp->n) { pthread_mutex_unlock(&rp->mu); return false; }
trace_ev *e = &rp->evs[rp->cur];
if (e->k == k && e->tid == rr_tid() && strncmp(e->name, name ? name : "", sizeof e->name) == 0) {
rp->cur++;
pthread_cond_broadcast(&rp->cv);
pthread_mutex_unlock(&rp->mu);
return true;
}
pthread_cond_wait(&rp->cv, &rp->mu);
}
}
This is intentionally small: threads pause at `rr_gate` until the next matching event is theirs. With schedule points placed around races, your recorded order is reproduced exactly.
A unified API: record or replay via one entry point
Hide mode selection behind a small wrapper so production code calls a single function.
typedef struct {
bool replay;
rr_recorder rec;
rr_replayer rep;
clock_iface *clk; // reuse your clock abstraction
} rr_ctx;
static inline void rr_init_record(rr_ctx *ctx, const char *path, clock_iface *clk) {
ctx->replay = false; ctx->clk = clk; rr_recorder_init(&ctx->rec, path);
}
static inline bool rr_init_replay(rr_ctx *ctx, const char *path, clock_iface *clk) {
ctx->replay = true; ctx->clk = clk; return rr_replayer_load(&ctx->rep, path);
}
static inline void rr_point(rr_ctx *ctx, const char *name) {
if (!ctx->replay) {
uint64_t t = ctx->clk ? ctx->clk->now_ns(ctx->clk) : 0;
rr_log_point(&ctx->rec, name, EV_POINT, t);
} else {
(void)rr_gate(&ctx->rep, EV_POINT, name);
}
}
Usage pattern:
- In record mode, run your flaky test with schedule points around the suspected race; collect `trace.jsonl`.
- Flip to replay mode; the same program blocks at `rr_point` calls until the next event matches the recorded order, yielding a deterministic reproduction.
Where to place schedule points
You don’t need many—just enough to disambiguate outcomes. Start with:
- Before/after acquiring contended mutexes.
- Right after enqueue/dequeue in shared queues.
- Right before changing shared state that a different thread observes.
- Immediately when an I/O completion handler fires.
If a failure persists under replay, you’ve captured enough. If it diverges, add a point earlier in the race window.
A tiny end-to-end example
Two threads contend on a flag and a value. We place schedule points to capture their relative order.
#include <pthread.h>
#include <stdio.h>
static int value = 0; static int ready = 0; static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
typedef struct { rr_ctx *rr; } args_t;
static void *producer(void *arg) {
rr_ctx *rr = ((args_t *)arg)->rr;
pthread_mutex_lock(&m);
value = 42;
rr_point(rr, "after_value");
ready = 1;
pthread_mutex_unlock(&m);
rr_point(rr, "after_ready");
return NULL;
}
static void *consumer(void *arg) {
rr_ctx *rr = ((args_t *)arg)->rr;
rr_point(rr, "before_check");
pthread_mutex_lock(&m);
if (ready) printf("%d\n", value);
pthread_mutex_unlock(&m);
rr_point(rr, "after_check");
return NULL;
}
In record mode, you get the exact interleaving via points like `before_check`, `after_value`, and `after_ready`. In replay mode, the same relative order is enforced so you can iterate on fixes against a stable failure.
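For completeness, a minimal driver for this example in record mode (a sketch; swap in `rr_init_replay` to re-enforce the captured order):
int main(void) {
mono_clock mc; mono_clock_init(&mc);
rr_ctx rr = {0};
rr_init_record(&rr, "trace.jsonl", &mc.vtbl);
args_t a = { &rr };
pthread_t prod, cons;
pthread_create(&prod, NULL, producer, &a);
pthread_create(&cons, NULL, consumer, &a);
pthread_join(prod, NULL);
pthread_join(cons, NULL);
rr_recorder_close(&rr.rec);
return 0;
}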
Trace format and hygiene
- Use one line per event; keep it append-only.
- Include an index `i` that monotonically increases to check ordering.
- Include thread identity and a small name string.
- Keep names short and consistent; they are your keys in replay.
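For concreteness, a recorded trace might look like this (values invented for illustration):
{"i":0,"ns":1200,"tid":1,"k":5,"name":"producer"}
{"i":1,"ns":1850,"tid":2,"k":1,"name":"before_check"}
{"i":2,"ns":2400,"tid":1,"k":1,"name":"after_value"}
{"i":3,"ns":3100,"tid":1,"k":1,"name":"after_ready"}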
If you need speed later, keep the schema but switch to a binary encoding; the API above barely changes.
Guard rails and failure modes
- Drift detection: if replay blocks too long at a point, print the expected `(tid,name)` and the actual caller; abort with a clear message.
- Timeouts: use your clock abstraction to enforce per-point deadlines in CI so mis-instrumented tests fail fast rather than hang.
- Granularity: don’t wrap every line; start coarse and zoom in only where needed.
I/O capture at the right boundaries
Deterministic reproduction often hinges on I/O that arrives in different sizes or order across runs (network, pipes, disk). Capture at API boundaries you own, not deep in libraries. The goal is to record exactly what your program observed and produced, then substitute those facts during replay.
Design choices:
- Record bytes for inputs (what you `read`/`recv`), and optionally for outputs (what you `write`/`send`) if you want round-trip determinism.
- To control size/sensitivity, record length + checksum by default, and enable full payload capture with a flag in failing tests.
- Keep a consistent mapping from file descriptors/sockets to human-readable channel names in the trace.
Minimal wrappers for read/write with record/replay
#include <errno.h>
#include <sys/uio.h>
#include <unistd.h>
static const char hex_alphabet[] = "0123456789abcdef";
static void hex_of(const unsigned char *in, size_t n, char *out) {
for (size_t i = 0; i < n; ++i) { out[2*i] = hex_alphabet[in[i] >> 4]; out[2*i+1] = hex_alphabet[in[i] & 0xF]; }
out[2*n] = '\0';
}
static uint32_t fast_checksum32(const unsigned char *p, size_t n) {
uint32_t h = 2166136261u;
for (size_t i = 0; i < n; ++i) { h ^= p[i]; h *= 16777619u; }
return h ? h : 1u; // avoid 0 for visibility
}
ssize_t rr_read(rr_ctx *ctx, const char *ch, int fd, void *buf, size_t cap) {
if (!ctx->replay) {
ssize_t r = read(fd, buf, cap);
if (r >= 0) {
uint32_t sum = fast_checksum32((const unsigned char *)buf, (size_t)r);
char hex[129]; // room for 64 sample bytes as hex + NUL (2*64+1)
size_t show = (size_t)r < 64 ? (size_t)r : 64;
hex_of((const unsigned char *)buf, show, hex);
uint64_t t = ctx->clk ? ctx->clk->now_ns(ctx->clk) : 0;
pthread_mutex_lock(&ctx->rec.mu);
fprintf(ctx->rec.out,
"{\"i\":%" PRIu64 ",\"ns\":%" PRIu64 ",\"tid\":%u,\"k\":%u,\"chan\":\"%s\",\"op\":\"read\",\"len\":%zd,\"sum\":%u,\"sample\":\"%s\"}\n",
(uint64_t)atomic_fetch_add(&ctx->rec.seq, 1), t, rr_tid(), (unsigned)EV_IO_COMPLETE, ch, r, sum, hex);
fflush(ctx->rec.out);
pthread_mutex_unlock(&ctx->rec.mu);
}
return r;
} else {
// On replay, wait for the next I/O event and honor its length; fill buffer with recorded sample pattern if available
if (!rr_gate(&ctx->rep, EV_IO_COMPLETE, "read")) return -1;
trace_ev *e = &ctx->rep.evs[ctx->rep.cur - 1];
(void)e; // assuming parse extended to carry len/sum if needed
// For brevity, simulate by returning 0 (EOF) here; in a real system you’d load payloads.
return 0;
}
}
ssize_t rr_write(rr_ctx *ctx, const char *ch, int fd, const void *buf, size_t len) {
if (!ctx->replay) {
ssize_t w = write(fd, buf, len);
if (w >= 0) {
uint32_t sum = fast_checksum32((const unsigned char *)buf, (size_t)w);
char hex[129]; // room for 64 sample bytes as hex + NUL (2*64+1)
size_t show = (size_t)w < 64 ? (size_t)w : 64;
hex_of((const unsigned char *)buf, show, hex);
uint64_t t = ctx->clk ? ctx->clk->now_ns(ctx->clk) : 0;
pthread_mutex_lock(&ctx->rec.mu);
fprintf(ctx->rec.out,
"{\"i\":%" PRIu64 ",\"ns\":%" PRIu64 ",\"tid\":%u,\"k\":%u,\"chan\":\"%s\",\"op\":\"write\",\"len\":%zd,\"sum\":%u,\"sample\":\"%s\"}\n",
(uint64_t)atomic_fetch_add(&ctx->rec.seq, 1), t, rr_tid(), (unsigned)EV_IO_COMPLETE, ch, w, sum, hex);
fflush(ctx->rec.out);
pthread_mutex_unlock(&ctx->rec.mu);
}
return w;
} else {
if (!rr_gate(&ctx->rep, EV_IO_COMPLETE, "write")) return -1;
// In strict replay, you may also enforce the same lengths as recorded.
return (ssize_t)len;
}
}
Notes:
- Channels (`ch`) let you distinguish `client:42:read` vs `disk:wal:write` and filter during analysis.
- Default to logging checksum + small hex sample for PII-safety; flip a flag to store full payloads only for failing tests.
- In production-grade systems, you’ll parse/store `len`/`sum`/`payload` in a structured way alongside schedule events, not hand-parse JSON.
Mapping fds to channels
Assign a name when you create an fd and keep it in a small map.
#include <stdlib.h>
#include <string.h>
typedef struct { int fd; char name[64]; } fd_name;
typedef struct { fd_name *a; size_t n, cap; } fd_registry;
static void fdreg_init(fd_registry *r) { r->a = NULL; r->n = r->cap = 0; }
static void fdreg_put(fd_registry *r, int fd, const char *name) {
if (r->n == r->cap) { size_t nc = r->cap ? r->cap*2 : 32; r->a = (fd_name *)realloc(r->a, nc * sizeof *r->a); r->cap = nc; }
r->a[r->n].fd = fd; strncpy(r->a[r->n].name, name, sizeof r->a[r->n].name - 1); r->a[r->n].name[sizeof r->a[r->n].name - 1] = '\0'; r->n++;
}
static const char *fdreg_get(fd_registry *r, int fd) {
for (size_t i = 0; i < r->n; ++i) if (r->a[i].fd == fd) return r->a[i].name; return "fd";
}
Use fdreg_get(&R, fd)
to pass a stable channel name into rr_read/rr_write
.
Deterministic fault injection (and recording it)
Flaky failures often need a rare error to align with a racy window. Make that error happen deterministically by injecting it via your PRNG, and record the decision so replay does not depend on the RNG.
typedef struct { prng32_t *rng; rr_ctx *rr; double p; } injector;
// Returns true when we should fail this operation
static bool inj_decide(injector *inj, const char *site) {
if (!inj->rr->replay) {
bool fail = (inj->p > 0.0) && (prng32_next_unit(inj->rng) < inj->p);
rr_log_point(&inj->rr->rec, fail ? site : site, EV_POINT, inj->rr->clk ? inj->rr->clk->now_ns(inj->rr->clk) : 0);
return fail;
} else {
// In replay, the trace encodes outcomes; gate enforces order and we read outcome if encoded.
(void)rr_gate(&inj->rr->rep, EV_POINT, site);
return true; // assume recorded as failure for demonstration
}
}
Apply this to syscalls such as `read` and `write`, or to logical operations like “simulate network drop,” by returning `EIO`/`ETIMEDOUT` at deterministic points.
Checkpoints for fast iteration
Replaying from the beginning for every experiment is slow. Introduce a lightweight checkpoint that captures:
- PRNG states for all streams.
- Clock state (fake clock current `ns`).
- rr trace cursor (current `i`).
- Application state necessary to rerun the hot region (in-memory snapshot or a serialized state specific to your subsystem).
Then add controls to jump to `i = K` and restore state to the corresponding snapshot. With a small amount of plumbing, you can “bisect” interleavings quickly—e.g., binary search the smallest prefix that still triggers the bug.
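A minimal sketch of such a checkpoint, built from the pieces above (application state is left as a subsystem-specific hook):
// Snapshots only harness state; application state is yours to capture.
typedef struct {
prng32_t rng;      // PRNG stream state (copy one per stream in practice)
uint64_t clock_ns; // fake clock position
size_t trace_cur;  // rr trace cursor (next event index)
} rr_checkpoint;
static void checkpoint_take(rr_checkpoint *cp, const prng32_t *rng,
const fake_clock *fc, const rr_replayer *rp) {
cp->rng = *rng;
cp->clock_ns = fc->cur_ns;
cp->trace_cur = rp->cur;
}
static void checkpoint_restore(const rr_checkpoint *cp, prng32_t *rng,
fake_clock *fc, rr_replayer *rp) {
*rng = cp->rng;
fc->cur_ns = cp->clock_ns;
rp->cur = cp->trace_cur;
}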
Stepping and breakpoints in replay
Add two environment-driven knobs to your replayer:
- `RR_BREAK_AT=i`: stop when `cur == i`, print context, and wait for a keypress.
- `RR_STEP=1`: require manual continue between events.
This turns your harness into a crude but effective time-travel debugger for concurrency.
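A sketch of how those knobs might hook into `rr_gate`, assuming they are read once at startup and checked right after an event is matched (names are illustrative):
static uint64_t rr_break_at = UINT64_MAX; // RR_BREAK_AT; default: never
static bool rr_step = false;              // RR_STEP=1: pause at every event
static void rr_debug_init(void) {
const char *b = getenv("RR_BREAK_AT");
if (b && *b) rr_break_at = strtoull(b, NULL, 10);
const char *s = getenv("RR_STEP");
rr_step = (s && *s == '1');
}
// Call from rr_gate just after rp->cur++ succeeds
static void rr_maybe_pause(const trace_ev *e) {
if (rr_step || e->i == rr_break_at) {
fprintf(stderr, "[rr] paused at i=%" PRIu64 " tid=%u name=%s (enter to continue)\n",
e->i, e->tid, e->name);
getchar();
}
}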
Storage and performance considerations
- Rotate traces per test and compress them (JSONL.gz or a binary framing) to keep CI artifacts small.
- Separate “schedule” and “payload” channels so most runs record only schedule; enable payload capture on failure.
- Make logging non-blocking: pre-allocate buffers; batch writes; optionally use a lock-free ring with a draining thread in record mode.
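As a starting point, even a mutex-guarded ring with a draining thread keeps `fprintf` off the hot path. A simplified sketch (sizes and names illustrative; not lock-free):
#define RING_CAP 4096
typedef struct {
char lines[RING_CAP][160]; // preformatted event lines
size_t head, tail;         // head: next write slot; tail: next read slot
bool done;                 // set under the lock, then signal, to stop the drainer
pthread_mutex_t mu;
pthread_cond_t nonempty;
FILE *out;
} trace_ring;
static void ring_push(trace_ring *r, const char *line) {
pthread_mutex_lock(&r->mu);
size_t next = (r->head + 1) % RING_CAP;
if (next != r->tail) { // on overflow, drop; a real recorder should count drops
snprintf(r->lines[r->head], sizeof r->lines[0], "%s", line);
r->head = next;
pthread_cond_signal(&r->nonempty);
}
pthread_mutex_unlock(&r->mu);
}
static void *ring_drain(void *arg) {
trace_ring *r = (trace_ring *)arg;
char local[160];
for (;;) {
pthread_mutex_lock(&r->mu);
while (r->tail == r->head && !r->done)
pthread_cond_wait(&r->nonempty, &r->mu);
if (r->tail == r->head && r->done) { pthread_mutex_unlock(&r->mu); break; }
snprintf(local, sizeof local, "%s", r->lines[r->tail]);
r->tail = (r->tail + 1) % RING_CAP;
pthread_mutex_unlock(&r->mu);
fputs(local, r->out); // file I/O happens off the recording path
}
fflush(r->out);
return NULL;
}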
Test integration and CI that actually reproduces
Deterministic repro is only useful if it lives where bugs do: in your tests and CI. Wrap seeds, clocks, record/replay, and failure artifacts behind a tiny harness so every test can opt in without bespoke plumbing.
A minimal test harness
#include <stdio.h>
#include <stdlib.h>
typedef struct {
uint64_t seed;
const char *mode; // "record" or "replay" or "off"
const char *trace; // path to trace.jsonl
int iterations; // number of seeds to try in CI sweep
} rr_config;
static uint64_t read_u64_env(const char *k, uint64_t def) {
const char *s = getenv(k); return (s && *s) ? (uint64_t)strtoull(s, NULL, 10) : def;
}
static const char *read_str_env(const char *k, const char *def) {
const char *s = getenv(k); return (s && *s) ? s : def;
}
static rr_config rr_config_from_env(void) {
rr_config c = {0};
c.seed = read_u64_env("SEED", 123456789ull);
c.mode = read_str_env("RR_MODE", "off");
c.trace = read_str_env("RR_TRACE", "trace.jsonl");
c.iterations = (int)read_u64_env("RR_ITERS", 1);
return c;
}
static int run_test_once(const rr_config *cfg) {
// Init building blocks
mono_clock mc; mono_clock_init(&mc);
fake_clock fc; fake_clock_init(&fc, 0);
clock_iface *clk = (strcmp(cfg->mode, "replay") == 0) ? (clock_iface *)&fc : (clock_iface *)&mc;
prng32_t rng; prng32_seed(&rng, cfg->seed);
rr_ctx rr = {0};
if (strcmp(cfg->mode, "record") == 0) rr_init_record(&rr, cfg->trace, clk);
else if (strcmp(cfg->mode, "replay") == 0) {
if (!rr_init_replay(&rr, cfg->trace, clk)) { fprintf(stderr, "failed to load trace\n"); return 2; }
}
// Your test body here: pass &rr and &rng and use rr_point/rr_read/rr_write
// return 0 on success, non-zero on failure
return 0;
}
int main(void) {
rr_config cfg = rr_config_from_env();
printf("seed=%" PRIu64 " mode=%s trace=%s iters=%d\n", cfg.seed, cfg.mode, cfg.trace, cfg.iterations);
for (int i = 0; i < cfg.iterations; ++i) {
uint64_t seed_i = cfg.seed + (uint64_t)i;
rr_config it = cfg; it.seed = seed_i;
int rc = run_test_once(&it);
if (rc != 0) { fprintf(stderr, "FAIL seed=%" PRIu64 "\n", seed_i); return rc; }
}
return 0;
}
With this skeleton:
- Local repro: `SEED=987 RR_MODE=record RR_TRACE=out.jsonl ./test && SEED=987 RR_MODE=replay RR_TRACE=out.jsonl ./test` (replay needs the same seed and trace path).
- CI sweep: `RR_ITERS=100` tries 100 consecutive seeds. When one fails, CI stores `out.jsonl` and the failing seed.
- Fast iteration: rerun locally with `RR_MODE=replay SEED=<failing>` to get the same outcome.
Stabilizing flaky tests (without hiding bugs)
- Make every test seedable; fail with the exact `seed` and `trace` location.
- Quarantine + fix: when a test flakes, pin its seed (temporarily) to the failing one and add replay to CI until the fix lands.
- Never “retry until pass” without capturing the failure; that destroys repro value.
Artifacts and observability
- Store trace files, a text log with seed/mode, and optionally a compressed payload bundle.
- Summarize counts: events per thread, I/O ops, longest wait, last event before divergence.
- Index traces by test name and commit hash to search regressions.
Limitations and practical edges
- Kernel and system calls you don’t wrap remain nondeterministic; wrap what matters at your boundaries.
- Thread identities are not stable across runs; use your own logical IDs where possible.
- Timing-dependent kernels (e.g., epoll readiness coalescing) still require schedule gates around handlers for stable replay.
- Payload privacy: default to checksums; add a per-test allowlist for full payload capture to avoid leaking secrets in CI artifacts.
- Overhead: small but non-zero. Keep schedule points minimal; batch recorder I/O; disable in performance-critical prod builds.
A compact API surface you can standardize
- `prng32_t`: explicit-state PRNG; `prng32_seed`, `prng32_next`, `prng32_next_unit`.
- `clock_iface`: `now_ns`, `sleep_until_ns`; `mono_clock`, `fake_clock`.
- `rr_ctx`: record or replay; `rr_init_record`, `rr_init_replay`, `rr_point`.
- I/O wrappers: `rr_read`, `rr_write` (and `rr_recv`, `rr_send` variants if needed).
- Trace: JSONL by default; binary framing optional later.
Document a small policy:
- Every test prints `seed=...` and `mode=...` on start.
- Failures always attach a trace; replays always verify exact event order.
- Schedule points only at contended locks, queues, and I/O handlers.
Validation workflow that catches regressions
- Build with sanitizers (`-fsanitize=thread,undefined`) and run under replay to keep the schedule fixed while you hunt logic bugs.
- Differential runs: record once; replay under `-O0`, `-O3`, and with different compilers—determinism lets you isolate optimizer-sensitive issues.
- Stress within determinism: increase concurrency, but keep order fixed to test invariants consistently.
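For example (illustrative commands, assuming the harness above and a single-file build):
cc -O0 -g -pthread test.c -o test_O0
cc -O3 -g -pthread test.c -o test_O3
SEED=987 RR_MODE=record RR_TRACE=t.jsonl ./test_O0
SEED=987 RR_MODE=replay RR_TRACE=t.jsonl ./test_O0
SEED=987 RR_MODE=replay RR_TRACE=t.jsonl ./test_O3   # same schedule, different codegen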
Closing checklist (printable)
- All tests accept `SEED`, `RR_MODE`, `RR_TRACE`; print them once.
- Replace `rand()` with an explicit-state PRNG; log the seed.
- Wrap time; use monotonic for scheduling, fake clock for tests.
- Add schedule points only where contended; keep them named and stable.
- Record I/O at your boundaries with channels and checksums; payloads on failure.
- Store artifacts in CI; make repro one command.
- Add drift/timeouts in replay; fail fast on divergence.
Final thoughts
Determinism isn’t about making your system boring—it’s about making failures boring. When the same seed, time, and schedule always lead to the same outcome, you get to debug logic instead of dice rolls. Start small: seed your PRNG, wrap time, add two schedule points around a flaky region, and record one trace. The first 2 a.m. incident you solve in a single, calm replay will pay for the whole exercise.