One verify block. Four tools. Two verdicts.
A 30-line example: aver verify passes, aver verify --hostile fails, and both proof backends reject it on identical source — tracing the path from sampled effects through adversarial profiles to universal proof.
The block doesn't change between runs. Only the question being asked changes: "does this hold for the stub I wrote?" versus "does this hold under adversarial profiles of the effects involved?" versus "does this hold for every implementation of this signature?". The first is sample testing. The second is hostile-mode boundary checking. The third is universal proof. Aver lets you ask all three from one source — and tells you exactly which one disagrees.
Three roles, one frame: given is the world you chose — the stub or value list the law was demonstrated in. Hostile substitutes other worlds — frozen clocks, empty disks, rolls stuck at the bound — and asks "what if your world was wrong?". when is the filter that declares which worlds the law assumes (e.g. when clock(root, 1) > clock(root, 0) reads "monotonic clocks only"); hostile honours it and skips profiles that violate the assumption.
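The frame can be sketched in a few lines of Python. Everything here is invented for illustration (a clock law rather than the randomness law below, and made-up world names); it models the three roles, not Aver's implementation:

```python
# Toy model of given / hostile / when. A "world" is a function
# (path, counter) -> value, standing in for a classified effect.

def elapsed_is_nonneg(clock):
    # The law under test: time measured between two clock reads
    # is never negative.
    return clock("root", 1) - clock("root", 0) >= 0

def monotonic(clock):
    # The `when` filter: this law only claims anything for clocks
    # that actually advance between the two reads.
    return clock("root", 1) > clock("root", 0)

# `given`: the world the author chose to demonstrate the law in.
given_world = lambda path, n: float(n)          # an advancing stub

# `hostile`: other worlds substituted behind the same law.
hostile_worlds = {
    "frozen":    lambda path, n: 42.0,          # clock never moves
    "backwards": lambda path, n: float(-n),     # clock runs in reverse
}

assert elapsed_is_nonneg(given_world)           # the sample passes
for name, world in hostile_worlds.items():
    if not monotonic(world):
        continue                                # `when` skips this profile
    assert elapsed_is_nonneg(world)             # nothing left to fail
```

Without the `when` filter, the backwards world falsifies the law; with it, both hostile worlds are skipped because neither satisfies the stated assumption.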
The code
A function samples two random floats and returns whether they were different. The verify block declares that, under a stub distinctStub returning n + 1, the function holds => true.
```
fn distinctStub(path: BranchPath, n: Int) -> Float
  ? "Stub returning n+1 as a Float — different value per call counter."
  Float.fromInt(n) + 1.0

fn twoFloatsDistinct() -> Bool
  ? "Sample two random floats; return whether they were different."
  ! [Random.float]
  a = Random.float()
  b = Random.float()
  a != b

verify twoFloatsDistinct
  law alwaysDistinct
    given rnd: Random.float = [distinctStub]
    twoFloatsDistinct() => true
```
Run it four ways
Same file. Four commands. The first runs the verify block as a sample under the stub. The second multiplies it through adversarial profiles of Random.float — midrange (always 0.5), always_zero, always_one — and asks "does the law still hold?". The other two export the same block as a universal theorem and ask a real proof backend to discharge it.
```
Verify: randomness_paradox.av
  ✓ verify distinctStub       2/2
  ✓ verify twoFloatsDistinct  1/1

Summary: 3/3 cases passed
```
```
Verify: randomness_paradox.av
  ✓ verify distinctStub       2/2
  ✗ verify twoFloatsDistinct  1/4 (1/1 declared, 0/3 hostile)

    fail[verify-hostile-mismatch]:
      origin:   hostile effect profile: Random.float/midrange
      expected: true
      actual:   false

Summary: 3/6 passed | 3 failed
```
```
✖ Building RandomnessParadox
error: unsolved goals
  rnd : RandomFloatInUnit
  ⊢ (rnd.val BranchPath.Root 0 != rnd.val BranchPath.Root 1) = true
error: build failed
```
```
randomness_paradox.dfy(262,0): Error: a postcondition could not be proved on this return path
  ensures twoFloatsDistinct(BranchPath_Root, rnd) == true

finished with 11 verified, 1 error
```
How the three readings work
aver verify evaluates the law as a finite sample. The cartesian product of given domains is enumerated and run against the law's RHS. With distinctStub returning n + 1, the two calls produce 1.0 and 2.0; 1.0 != 2.0 holds; the sample case passes. This is the familiar testing path: assert against a chosen implementation, run, check equality.
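The sample reading can be modeled in a few lines of Python, assuming the invented signature rnd(path, counter) for the lifted effect (a sketch of the semantics, not Aver's code):

```python
from itertools import product

def distinct_stub(path, n):
    # Mirrors distinctStub: n + 1 as a float, so a different value
    # per call counter.
    return float(n) + 1.0

def two_floats_distinct(rnd):
    # Mirrors twoFloatsDistinct: call counters 0 and 1 on the root branch.
    a = rnd("root", 0)
    b = rnd("root", 1)
    return a != b

# `given` domains; here one domain with one stub, so the cartesian
# product is a single case.
given_domains = {"rnd": [distinct_stub]}

for case in product(*given_domains.values()):
    (rnd,) = case
    assert two_floats_distinct(rnd) is True   # 1.0 != 2.0, passes
```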
aver verify --hostile keeps the same machinery but multiplies the law through every adversarial profile Aver ships for the classified effects involved. Random.float gets midrange (always 0.5), always_zero, and always_one. Each profile breaks the implicit assumption that the two calls return distinct values: with midrange, both calls return 0.5; 0.5 != 0.5 is false; the law fails. Hostile mode catches the bug your stub hid before you ever reach formal proof. Add a when clause to declare the assumption explicitly (when rnd(root, 0) != rnd(root, 1)) and hostile skips the profiles that violate it — the law is exercised only where its assumption holds.
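The hostile reading, in the same hypothetical Python model (the three profile names come from the article; the evaluation loop is illustrative):

```python
def two_floats_distinct(rnd):
    return rnd("root", 0) != rnd("root", 1)

# The adversarial profiles Aver ships for Random.float.
profiles = {
    "midrange":    lambda path, n: 0.5,
    "always_zero": lambda path, n: 0.0,
    "always_one":  lambda path, n: 1.0,
}

# Without a `when` clause: every constant profile falsifies the law.
failures = [name for name, rnd in profiles.items()
            if not two_floats_distinct(rnd)]
print(failures)   # all three constant profiles break the law

# With the `when` clause rnd(root, 0) != rnd(root, 1): the filter
# rejects every constant profile, so none is exercised.
def when(rnd):
    return rnd("root", 0) != rnd("root", 1)

exercised = [name for name, rnd in profiles.items() if when(rnd)]
print(exercised)  # empty: every profile violates the assumption
```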
aver proof takes the same block and lifts it. Every classified effect call (here Random.float) becomes an explicit function parameter, typed as the bounded subtype RandomFloatInUnit — the runtime invariant 0.0 ≤ x ≤ 1.0 is structurally encoded, no axiom required. The exported theorem says the law's RHS holds for every possible such function in the unit interval — not just the stub you supplied, not just the adversarial profiles, but every function. That's a strictly stronger claim. Lean and Dafny independently reject the export for this reason.
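Why no proof can exist is visible in the same toy model: the exported theorem quantifies over every function into [0, 1], and any constant function is a counterexample. This sketch only exhibits the counterexample; the backends reject the goal structurally, not by search:

```python
def two_floats_distinct(rnd):
    return rnd("root", 0) != rnd("root", 1)

# A constant function is a perfectly valid inhabitant of the bounded
# subtype (every value lies in the unit interval)...
const_half = lambda path, n: 0.5
assert 0.0 <= const_half("root", 0) <= 1.0

# ...and it refutes the universal claim, so the theorem is unprovable.
assert two_floats_distinct(const_half) is False
```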
What conventional Z3 verification doesn't reach
Most contract-verification systems built on Z3 prove pre/post conditions on pure functions, then stop at the effect boundary. Random results are unconstrained for the solver — there are no useful axioms about them beyond range. Contracts that depend on specific random values fall to runtime checking, not static rejection. Same shape for HTTP responses, disk reads, time, network: the SMT model has no symbolic handle on them, so the proof obligation evaporates.
Aver's Oracle export does the extra step. Classified effects — Random, Http, Disk read-side, Tcp.send, Time.now, Console.readLine, non-modal Terminal, Args, Env read-side — are lifted into the proof artifact as explicit function parameters with structured call addressing (BranchPath + per-branch counter). The exported theorem quantifies over them. Universal claims about effectful code become provable, and rejectable when they're false. The randomness paradox is what that gap looks like, in 30 lines.
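The structured call addressing can be modeled in Python (EffectTrace and its fields are invented names; the (path, counter) scheme mirrors the description above):

```python
from collections import defaultdict

class EffectTrace:
    """Toy model: route every classified effect call through the lifted
    function parameter, addressed by (branch path, per-branch counter)."""

    def __init__(self, oracle):
        self.oracle = oracle                 # the lifted effect parameter
        self.counters = defaultdict(int)     # per-branch call counter

    def call(self, path):
        n = self.counters[path]
        self.counters[path] += 1
        return self.oracle(path, n)

# The same site called twice gets two distinct addresses, so a theorem
# can speak about "the first call" and "the second call" separately.
trace = EffectTrace(lambda path, n: float(n) + 1.0)
a = trace.call("root")   # address (root, 0) -> 1.0
b = trace.call("root")   # address (root, 1) -> 2.0
print(a, b)              # 1.0 2.0
```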
Run it yourself
```
cargo install aver-lang
git clone https://github.com/jasisz/aver
cd aver

# 1. Sample passes
aver verify examples/formal/randomness_paradox.av
# ✓ 3/3 cases passed

# 2. Hostile fails — adversarial profiles break the implicit assumption
aver verify --hostile examples/formal/randomness_paradox.av
# ✖ 3/6 passed | 3 failed (Random.float midrange / always_zero / always_one)

# 3. Lean rejects
aver proof examples/formal/randomness_paradox.av --backend lean --output out
cd out && lake build
# ✖ unsolved goals
cd ..

# 4. Dafny rejects
aver proof examples/formal/randomness_paradox.av --backend dafny --output out
dafny verify out/*.dfy
# ✖ postcondition could not be proved
```