Architecture
This document records Marionette's foundational correctness contract. If a future feature weakens this contract, it needs an explicit design discussion. For scheduler, network, invariants, or liveness work, read TigerBeetle Lessons first. For network API work, read Network API Direction. For disk work, read Disk Fault Model before writing code.
Determinism Contract
Given the same Marionette version, Zig version, target platform, user code, simulation options, and seed, a Marionette simulation must produce the same declared result and byte-identical Marionette trace across repeated runs. The guarantee applies only to behavior routed through Marionette-controlled authorities: simulated time, seeded randomness, disk, network, future scheduling, and explicit trace events. Marionette does not guarantee stability for host wall-clock time, OS thread scheduling, stack or heap addresses, pointer identity, unordered map iteration, external syscalls, data read from real devices, or behavior from dependencies that bypass the simulator. A nondeterminism leak is a correctness bug, not a flaky test.
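The contract can be illustrated without any Marionette API. The following is a minimal sketch, assuming a hypothetical fixed-capacity Trace type and a toy scenario function (neither is Marionette's real trace or runner): every decision is drawn from a seeded PRNG and recorded, and the run is repeated with the same seed so the trace bytes can be compared.

```zig
const std = @import("std");

/// Hypothetical fixed-capacity trace buffer; not Marionette's real trace type.
const Trace = struct {
    buf: [256]u8 = undefined,
    len: usize = 0,

    /// Appends one formatted line to the trace.
    fn record(self: *Trace, comptime fmt: []const u8, args: anytype) !void {
        const written = try std.fmt.bufPrint(self.buf[self.len..], fmt ++ "\n", args);
        self.len += written.len;
    }

    fn bytes(self: *const Trace) []const u8 {
        return self.buf[0..self.len];
    }
};

/// Toy scenario: every choice comes from the seeded PRNG and is recorded.
fn scenario(seed: u64, trace: *Trace) !void {
    var prng = std.Random.DefaultPrng.init(seed);
    const random = prng.random();
    var i: u32 = 0;
    while (i < 4) : (i += 1) {
        const latency_ns = random.uintLessThan(u64, 1_000_000);
        try trace.record("event={d} latency_ns={d}", .{ i, latency_ns });
    }
}

/// Runs the scenario twice with the same seed and compares the traces byte for byte.
fn replaysIdentically(seed: u64) !bool {
    var first = Trace{};
    var second = Trace{};
    try scenario(seed, &first);
    try scenario(seed, &second);
    return std.mem.eql(u8, first.bytes(), second.bytes());
}
```

If the scenario read host time or an unseeded random source, the comparison would be expected to fail, which is how a nondeterminism leak shows up as a bug rather than a flaky test.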
Current State
Phase 0 has:
- World, which owns one simulated clock, one seeded PRNG, and one trace log as harness-owned simulation engine state.
- Clock(.production) and Clock(.simulation).
- Env, one concrete harness-facing bundle that supplies std.Io, recorders, disk, clock, random, and tracer capabilities.
- Control, the harness-facing counterpart for simulator-only controls.
- A seeded Random wrapper.
- A text trace format with a version header and global event indexes.
- mar.runCase, the primary stateful scenario runner, and lower-level mar.run for world-only scenarios. Both execute a scenario twice and compare traces.
- Run names, tags, and RunAttribute, which make expanded run facts replay-visible in traces and failure summaries without losing scalar value types. runAttribute is the preferred direct constructor; runAttributesFrom remains available for scalar-only config structs whose field names are stable exported metadata keys.
- mar.Check, a named post-scenario check hook for Phase 0 invariants.
- mar.runCase and mar.StateCheck, which let checks inspect structured scenario state initialized fresh for each replay attempt.
- mar.UnstableEventQueue, a fixed-capacity deterministic event queue sketch for stable (ready_at, event_id)-style ordering.
- mar.UnstableNetwork, a fixed-topology deterministic network sketch with per-link queues, seeded packet loss, tick-aligned latency, process up/down state, directed link filters, simple partitions, and stable (deliver_at, packet_id) delivery order.
- mar.Endpoint(Message), an app-facing typed process endpoint returned by simulation and production setup.
- mar.Disk, a lower-level disk capability for sector-oriented read/write/sync plus path-level metadata and lifecycle operations.
- mar.SimDisk, a deterministic disk simulator with logical paths, sector-aligned reads/writes, sparse in-memory sectors, deterministic latency, operation ids, trace events, read/write IO errors, corrupt reads, scripted sector corruption, and crash/restart simulation for pending writes.
- mar.DiskControl, the harness-facing fault, corruption, crash, and restart authority produced by SimDisk.control().
- parseSeed, which accepts decimal seeds and 40-character Git hashes.
- Fixed-seed trace comparison tests.
- Many-seed deterministic fuzz-style tests.
- An AST-based tidy linter for obvious nondeterministic calls, including simple const aliases such as const time = std.time;.
Phase 0 does not yet have:
- A scheduler.
- Event-by-event invariant checking.
- Liveness checking.
- Seed shrinking.
- Syscall interception.
IO Strategy
Marionette is a library-first simulator. User code should pass explicit
authorities at the top of the program instead of reaching for host globals.
The intended storage application shape is to accept std.Io, a root
std.Io.Dir, and optionally mar.Recorder. Code should accept all of
mar.Env only when it genuinely needs Marionette-specific capabilities such as
simulated random hooks or clock access.
Marionette should not auto-detect the environment from globals, environment
variables, thread-locals, or build flags.
For Phase 0, Marionette still owns small interfaces for time and randomness
because they are needed now and they are not fully solved by std.Io. For disk
files, the public teaching surface is now std.Io: Env.io() supplies a
deterministic backend in simulation and the host backend in production. For
networking, app code receives typed Endpoint(Message) handles while the
std.Io.net subset continues to mature. See
std.Io Direction for the destination architecture. The
migration plan is:
- Keep Marionette's public effect surface narrow while std.Io is unstable.
- Model disk and network behind adapters that can wrap std.Io when its shape settles.
- Avoid promising compatibility with arbitrary direct std.fs, std.net, or OS calls.
- Track Zig master and expect API churn before Zig 1.0.
If std.Io changes, Marionette should absorb that churn inside adapters, not
make every user rewrite their simulation tests.
Time Model
Phase 0 time is an integer nanosecond virtual clock. There is exactly one
clock authority per World, and simulated code should receive that authority
instead of calling std.time.
Current behavior:
- now() reads the world's current simulated timestamp.
- tick() advances by the world's configured tick duration.
- runFor(duration) advances by whole ticks.
- sleep(duration) on SimClock is currently an immediate deterministic advance because there is no scheduler yet.
Future scheduler behavior must preserve the same authority: sleeps, deadlines, timers, retries, network latency, and disk latency all route through the world's clock. A scheduler may advance time to the next event, but it must not introduce a second clock.
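The current behavior listed above can be sketched as a small virtual clock. This is an illustration only: VirtualClock and its fields are hypothetical names, not Marionette's SimClock, and the choice to drop a partial trailing tick in runFor is an assumption about what "whole ticks" means.

```zig
const std = @import("std");

/// Hypothetical integer-nanosecond virtual clock; not Marionette's SimClock.
const VirtualClock = struct {
    now_ns: u64 = 0,
    tick_ns: u64,

    /// Reads the current simulated timestamp.
    fn now(self: *const VirtualClock) u64 {
        return self.now_ns;
    }

    /// Advances by exactly one configured tick.
    fn tick(self: *VirtualClock) void {
        self.now_ns += self.tick_ns;
    }

    /// Advances by whole ticks only; a partial trailing tick is dropped (an assumption).
    fn runFor(self: *VirtualClock, duration_ns: u64) void {
        std.debug.assert(self.tick_ns > 0);
        var remaining = duration_ns / self.tick_ns;
        while (remaining > 0) : (remaining -= 1) self.tick();
    }

    /// With no scheduler, sleeping is an immediate deterministic advance.
    fn sleep(self: *VirtualClock, duration_ns: u64) void {
        self.now_ns += duration_ns;
    }
};
```

Because every advance goes through one struct, a later scheduler could jump now_ns to the next event time without introducing a second source of time.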
Randomness Model
There is exactly one seeded PRNG per World. Every simulator choice must draw
from it: packet latency, disk latency, BUGGIFY, crash timing, workload
generation, scheduling choices, and future shrink decisions.
Current World.unsafeUntracedRandom() exposes a raw deterministic
std.Random view for rare cases that need the full standard API. Draws
through that view are deterministic, but not automatically traced. The unsafe
name is intentional. Simulator decisions should use traced helpers such as
randomU64(), randomBool(), and randomIntLessThan().
Direct std.crypto.random, unseeded randomness, /dev/urandom, wall-clock
seeding, and host entropy are banned inside simulated code. The tidy linter is
the first guard. Twice-and-compare trace replay is the backstop. A future
paranoid mode should make simulator-incompatible effects fail loudly.
Smallest User Program
The target shape is deliberately close to ordinary Zig dependency passing:
const std = @import("std");
const mar = @import("marionette");

fn client(env: anytype) !u64 {
    const latency_ns = try env.random.intLessThan(u64, 1_000_000);
    try env.clock.sleep(latency_ns);
    return latency_ns;
}

test "single request is replayable" {
    var world = try mar.World.init(std.testing.allocator, .{ .seed = 0x1234 });
    defer world.deinit();
    const sim = try world.simulate(.{});
    const latency_ns = try client(sim.env);
    try sim.env.record("client.request latency_ns={}", .{latency_ns});
}
The same composition-root pattern is now used for networked examples: build a
simulation, pass sim.env plus node-scoped Endpoint(Message) handles into
the production-shaped code, and keep sim.control in the harness.
Production Cost And BUGGIFY
Fault hooks must not pollute production hot paths. The Zig shape is an environment method:
if (try env.buggify(.drop_packet, .percent(20))) {
    return error.PacketDropped;
}
In simulation, env.buggify draws from the world's PRNG and records the
decision and rate. In production, env.buggify returns false and the branch
should fold away in optimized builds. Users call buggify because application
code knows domain-specific fault points that a generic simulator cannot infer.
Marionette decides whether the hook fires; user code owns the effect, such as
dropping a packet, delaying an operation, or returning a simulated storage
error. This is the Zig replacement for FoundationDB-style BUGGIFY macros.
The BUGGIFY document contains the current API shape and the remaining
production-codegen questions.
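One way to get a zero-cost production path is to select the mode at comptime. The sketch below assumes a hypothetical Env(mode) type and a simplified buggify(percent) that returns a plain bool; the real env.buggify takes a hook name and a rate, returns an error union, and records the decision in the trace, none of which is modeled here.

```zig
const std = @import("std");

const Mode = enum { production, simulation };

/// Hypothetical environment type selected at comptime; not Marionette's Env.
fn Env(comptime mode: Mode) type {
    return struct {
        prng: std.Random.DefaultPrng,
        fired: u32 = 0,

        const Self = @This();

        fn init(seed: u64) Self {
            return .{ .prng = std.Random.DefaultPrng.init(seed) };
        }

        /// Returns true when the fault hook should fire.
        fn buggify(self: *Self, percent: u8) bool {
            // In production the result is comptime-known false, so an
            // optimizing build can remove the caller's fault branch.
            if (mode == .production) return false;
            std.debug.assert(percent <= 100);
            const roll = self.prng.random().uintLessThan(u8, 100);
            const fire = roll < percent;
            if (fire) self.fired += 1;
            return fire;
        }
    };
}

/// Application code owns the effect; the environment only decides whether it fires.
fn sendPacket(env: anytype) error{PacketDropped}!void {
    if (env.buggify(20)) return error.PacketDropped;
}
```

Whether the branch actually folds away depends on the optimizer and build mode, which is one of the open production-codegen questions.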
Failure Surface
When Marionette finds a bug, the long-term minimum useful failure report is:
- Failing seed.
- Simulation options.
- Failure kind.
- Trace bytes or trace path.
- Last event index.
- Reproduction command.
Better reports will add shrinking and a reduced trace. A report that only says
seed 0x1234 failed is insufficient.
If a scenario returns an error, mar.run preserves the partial trace through
the last completed event in RunReport.failed. If a scenario panics, Zig's
default panic path may abort without giving Marionette a chance to flush
anything. Marionette documents that limitation plainly. Simulated failures
should be reported through error-returning checks rather than panics; a future
custom panic hook can improve crash traces.
Current RunFailure captures seed, options, failure kind, event counts, owned
traces, run name, tags, typed attributes, error name when available, and check
name when a named check failed. RunFailure.writeSummary is testable and backs
RunFailure.print. A future CLI wrapper should add an exact reproduction
command once the command-line surface exists.
Exploration Strategy
Marionette will not claim to solve state-space exploration. Phase 0 and early Phase 1 use uniform seeded random exploration. That is good enough to prove the replay contract, not good enough to claim deep distributed-systems coverage.
Planned strategy layers:
- Uniform random choices first.
- Replay-visible run names, tags, and typed attributes before adding many knobs.
- Weighted fault profiles after examples reveal real needs.
- Coverage or state feedback only after there is a stable trace/event model.
- Shrinking only after failures are represented as replayable event streams.
Branch coverage alone is a weak signal for distributed simulation quality.
Event Ordering
World event indexes are global and deterministic. In Phase 0, events are
emitted directly by the single-threaded scenario. In the future multi-node
scheduler, the simulator must pick one runnable event at a time from a stable
ordering, likely (simulated_time, priority, deterministic_tiebreaker).
The tiebreaker must not depend on pointer addresses, hash map iteration, or OS scheduling. A scheduler that cannot explain its next-event choice in the trace is not deterministic enough.
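A stable ordering of this kind can be sketched as a plain comparison over the tuple. The Event fields below are hypothetical, and treating a lower priority value as "runs first" is an assumption; the point is that the tiebreaker is a unique id assigned by the simulator, not an address or an iteration order.

```zig
const std = @import("std");

/// Hypothetical event record; fields follow the ordering tuple named above.
const Event = struct {
    time_ns: u64,
    priority: u8,
    id: u64, // unique, assigned in creation order; the deterministic tiebreaker
};

/// Strict ordering by (time_ns, priority, id). Lower values run first (an assumption).
fn lessThan(a: Event, b: Event) bool {
    if (a.time_ns != b.time_ns) return a.time_ns < b.time_ns;
    if (a.priority != b.priority) return a.priority < b.priority;
    return a.id < b.id;
}

/// Returns the index of the next event to run, or null if nothing is pending.
fn nextEventIndex(pending: []const Event) ?usize {
    if (pending.len == 0) return null;
    var best: usize = 0;
    for (pending[1..], 1..) |event, index| {
        if (lessThan(event, pending[best])) best = index;
    }
    return best;
}
```

Because ids are unique, no two events compare equal, so the chosen event does not depend on the order in which pending events happen to be stored.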
Multi-Node Authority Shape
The current Phase 2 shape is a per-node endpoint handle:
fn nodeMain(env: mar.Env, endpoint: mar.Endpoint(Message)) !void {
    try env.record("node.send to={}", .{2});
    try endpoint.send(2, .{ .ping = {} });
}
Each Endpoint(Message) is bound to one NodeId. Application code can send
only as that node and receives only messages addressed to that node. The shared
World remains the owner of global simulation state, but application code
should receive Env plus typed node-scoped endpoints, not World plus a loose
node id.
Rejected alternatives for now:
- Passing *World plus node_id everywhere. This is easy internally but leaks too much simulator authority into application code.
- Giving each node an independent world. This weakens global ordering and makes network partitions harder to represent correctly.
Under a partition, two endpoints behave differently because each node-scoped
authority consults the world's shared partition state through its bound NodeId.
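The following sketch shows the idea with hypothetical types: a shared Partitions table stands in for the world's partition state, and each Endpoint carries only a pointer to it plus its own node id. The group-number representation and the canReach method are assumptions made for illustration, not Marionette's network model.

```zig
const std = @import("std");

const NodeId = u8;
const max_nodes = 4;

/// Hypothetical shared partition state: nodes in the same group can communicate.
const Partitions = struct {
    group: [max_nodes]u8 = [_]u8{0} ** max_nodes,

    fn connected(self: *const Partitions, from: NodeId, to: NodeId) bool {
        std.debug.assert(from < max_nodes and to < max_nodes);
        return self.group[from] == self.group[to];
    }
};

/// Hypothetical node-scoped handle: it can only ask questions as its own node.
const Endpoint = struct {
    partitions: *const Partitions,
    node: NodeId,

    fn canReach(self: Endpoint, to: NodeId) bool {
        return self.partitions.connected(self.node, to);
    }
};
```

Two endpoints built from the same Partitions value give different answers for the same destination once their nodes land in different groups, while the application code never sees the table itself.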
Invariants And Liveness
Safety invariants are required for real DST. Users need to express properties like "no two replicas disagree about committed entries" and have the simulator check them regularly.
Planned API direction:
- Register invariants with the run, world, or scenario.
- Check cheap invariants after every event.
- Allow expensive invariants every N events and on quiescence.
- Include invariant name and event index in failure reports.
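As a rough illustration of the planned direction, the sketch below checks registered invariants after an event and reports the invariant name and event index on the first violation. All names here (State, Invariant, Violation, checkAfterEvent) are hypothetical and do not describe an existing Marionette API; quiescence checks are not modeled.

```zig
const std = @import("std");

/// Hypothetical scenario state: the committed value seen by two replicas.
const State = struct { a: u64 = 0, b: u64 = 0 };

const Invariant = struct {
    name: []const u8,
    every: u64, // 1 = after every event; N = only when event_index is a multiple of N
    check: *const fn (state: *const State) bool,
};

const Violation = struct { name: []const u8, event_index: u64 };

/// Runs the invariants that are due at this event index and returns the first violation.
fn checkAfterEvent(invariants: []const Invariant, state: *const State, event_index: u64) ?Violation {
    for (invariants) |inv| {
        std.debug.assert(inv.every > 0);
        if (event_index % inv.every != 0) continue;
        if (!inv.check(state)) {
            return Violation{ .name = inv.name, .event_index = event_index };
        }
    }
    return null;
}

/// Example safety property: the two replicas agree on the committed value.
fn replicasAgree(state: *const State) bool {
    return state.a == state.b;
}
```

A cheap invariant would register with every = 1, while an expensive one could use a larger interval and accept that a violation is detected a few events late.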
Current Phase 0 support is deliberately smaller: RunOptions.checks accepts
named mar.Check functions that run after the scenario body, and mar.runCase
accepts named mar.StateCheck(State) functions that inspect structured
scenario state. This proves the failure-report shape, but it is not enough for
serious multi-event DST yet.
Liveness is harder. Marionette should eventually detect stuck systems, unmet deadlines, and lack of progress under fair scheduling assumptions. This is not in v0.1, but it is a real requirement for a serious multi-node simulator.
Testing Marionette
Marionette itself must be tested as if determinism is the product.
Required test classes:
- Same seed, same scenario, byte-identical trace.
- Different seeds eventually explore different traces.
- Tidy catches banned calls and ignores comments/string literals.
- Tidy catches simple aliases to banned call paths.
- Debug and ReleaseSafe builds both pass.
- CI should run twice-and-compare on every example.
Showcase Example
The first showcase is examples/replicated_register.zig, a tiny
VOPR-inspired cluster model with deterministic message drops, latency,
delivery ordering, same-version conflict rejection, and stateful committed
state checkers. It is useful because it makes the future scheduler, network,
and checker APIs concrete, but it is not a proof that Marionette can test real
distributed systems.
The stronger proof example should be a small replicated protocol, not only the current register showcase. A 500-line Raft, VSR, or primary-backup KV store that Marionette can break and replay would prove the library much better than toy examples.
Until that exists, Marionette is promising infrastructure, not proven DST.
Non-Goals
Marionette is not:
- A replacement for unit tests.
- A Jepsen alternative that runs real distributed binaries.
- A syscall interception platform.
- A general OS thread scheduler.
- A guarantee that arbitrary Zig dependencies are deterministic.
- A commitment to support every concurrency primitive in v0.1.
Scope control is part of correctness. It is better to be narrow and true than wide and almost deterministic.
Thread-Safety
World is not thread-safe. A single World must be driven by one OS thread at
a time. Running two independent simulations concurrently in the same process is
fine if each thread owns a different World and they do not share simulated
state. Cross-world coordination is outside Marionette's determinism contract.
runCase Walkthrough
mar.runCase(.{ .seed = 0x1234, .init = init, .scenario = scenario, ... })
chronology:
- Freeze the seed, start time, tick size, checks, and trace settings.
- Construct one World.
- Create exactly one clock authority and one PRNG authority inside the world.
- Invoke the user's initializer with that world to build fresh scenario state.
- Invoke the user's scenario with that state.
- On every event, pick simulator decisions from the world's PRNG.
- Route all time movement through the world's clock and simulator controls.
- Record stable event data into the trace.
- If the scenario succeeds, run configured checks in order.
- Deinitialize state if a .deinit hook was provided.
- Stop on success, scenario error, check error, or deinit error.
- Preserve a partial trace if the scenario, a check, or deinit returned an error.
- If the first run passed, rerun the same scenario with the same seed.
- Compare byte-identical traces.
- Return RunReport.passed with one owned trace, or RunReport.failed with seed, options, event counts, failure kind, traces, error name when available, and check name when a check failed.
mar.run follows the same replay/report discipline for world-only scenarios.
The dangerous spots are scheduler choice, time advancement, raw randomness, unordered state dumps, and host APIs. Those must stay under simulator control.