CaliperForge

Blog · 2026-06-08 · Michael Moffett

Exploit→Invariant Atlas v0.1: six real hacks, four VMs, one CI gate.

The Exploit→Invariant Atlas is a cross-VM, defender-side CI benchmark. Take a historical DeFi exploit. Reconstruct it as a same-source twin — one copy with the defensive fix, one with the bug planted. Write an invariant property that fires on the planted twin and holds on the clean one. Wire both legs into CI. That structure, repeated across six real incidents and four toolchains, is what v0.1 ships.

The six: zkLend (~$10M, Cairo/Starknet), Cetus (~$223M, Move/Sui), Cashio (~$50M, Solana), Mango Markets (~$117M, Solana), Loopscale (~$5.8M, Solana), and a representative EVM access-control case (~$8.9M, Solidity/Foundry). None of these exploits were caught by CaliperForge live. The claim is narrower: for each, a runnable invariant property exists that would have failed in pre-deploy CI against the pre-exploit code.

Related work

Trace2Inv covers EVM at runtime. Nothing covers Cairo, Solana, or Move at the defender side.

DeFiHackLabs (SunWeb3Sec) is the attacker-side reference: 572 historical EVM exploits as Foundry PoCs, invaluable for reproduction and post-mortem research. Trace2Inv (Chen et al., FSE 2024, arXiv:2404.14580) is the peer-reviewed defender-side benchmark: 27 historical EVM exploits totaling over $2B, measured against runtime invariant guards — 18 of 27 blocked by a single invariant, 23 of 27 by a combination, 0.28% false-positive rate. FLAMES and PropertyGPT extend the LLM-synthesized-invariant line for Solidity.

That is the EVM canon. It is solid. It does not cover Cairo, Solana, or Move at the defender side. Formal-verification tooling for Solana is still early-stage — a prover exists, but the practice is nascent — and no public defender-side invariant benchmark exists for Cairo, Solana, or Move today. The Atlas is differentiated on three axes: (a) cross-VM first, with Cairo, Move, and Solana as the spine and EVM included as a bridge to the Trace2Inv canon; (b) pre-deploy CI gate rather than runtime guard — the threat model is “what could a team have wired in before mainnet,” not “what would have blocked the transaction at execution time”; (c) tied to runnable harnesses CaliperForge already maintains, so the benchmark and the reference implementation co-evolve rather than drifting.

In short: Trace2Inv’s defender-side rigor, applied pre-deploy in CI, across the VMs that line of work has not yet covered.

Structure

Same source. One change. Both legs in CI.

Each Atlas case ships two subdirectories: clean/ with the defensive fix in place, planted/ with the bug class reconstructed. The test file — the invariant harness — is byte-identical between the two. A diff -r clean/ planted/ shows the planted hunk as a single localized change to one source file. The property test and the build manifests do not move.

CI asserts both legs on every push:

A nightly cron re-runs the matrix daily so a toolchain update (scarb, snforge, anchor, foundry, move) surfaces as a red badge the morning it lands. Both legs have to be green to call the case done.

Case set

Six exploits. Four VMs. One invariant class each.

The case order is non-EVM first. Cairo, Move, and Solana are the spine; the EVM case is the bridge to the Trace2Inv canon, not the lead.

Case 1 — zkLend · Cairo/Starknet · Feb 2025 · ~$10M

Bug class: the lending accumulator is inflated on an empty market via a flash-loan “donation.” Subsequent deposits round-truncate to zero shares while still crediting the underlying; the attacker withdraws on a single share to drain the pool.

Two invariant properties cover the class. Property A — accumulator monotonicity / empty-market deposit guard — asserts that after every transition, total shares are either zero or above the minimum-init threshold. Property B — round-trip non-amplification — asserts that for every user, withdrawn never exceeds deposited.

The planted twin’s deposit(1) on an empty market produces total_shares = 1, which is below the minimum threshold. Property A fires; the attack sequence is caught before the drain step. The clean twin’s empty-market guard rejects deposit(1) at call time. Both properties hold on the clean leg. Scorecard: 4 passed, 0 failed (clean); 2 passed, 2 failed (planted), both with INVARIANT VIOLATED accumulator_empty_market_guard.

Toolchain: scarb 2.18.0, snforge 0.60.0.

Case 2 — Cetus · Move/Sui · May 2025 · ~$223M

Bug class: a flawed checked_shlw overflow check. The mask constant was 0xFFFFFFFFFFFFFFFF << 192 — roughly 2256 − 2192 — instead of the correct 1 << 192 (= 2192). n << 64 overflows u256 when n ≥ 2192, so the correct gate is n ≥ 1 << 192. The planted mask is off by roughly three binary orders of magnitude, letting any value in the wide range (2192, 0xFFFFFFFFFFFFFFFF << 192] pass. The subsequent shift truncates silently, the AMM prices liquidity orders of magnitude below its real cost, and the attacker mints huge liquidity for approximately one token.

Two properties: Property A — overflow-safety on checked_shlw — asserts the overflow flag fires exactly when n ≥ (1 << 192). Property B — liquidity ↔ token conservation — asserts the pool’s token balance never falls below its outstanding liquidity at the documented ratio.

The planted twin’s checked_shlw(2192 + 1) returns (0, true) on the clean version but passes through on the planted version, after which total_tokens = 1001 against total_liquidity ≈ 2192 + 1001. Property B fires immediately. Scorecard: 5 passed, 0 failed (clean); 3 passed, 2 failed (planted).

Toolchain: sui 1.73.x (Move 2024 edition).

Case 3 — Cashio · Solana/Anchor · Mar 2022 · ~$50M

Bug class: missing collateral-account validation. The mint_print path accepted a TokenAccount as collateral without asserting its mint field matched the protocol-configured LP-token mint. An attacker supplied a fake-mint token account and minted unlimited CASH.

Two properties: Property A — collateral authenticity — asserts that every successful mint_cash call passes a collateral account whose mint equals bank.collateral_mint. Property B — mint-backing conservation — asserts that total CASH minted never exceeds total validated collateral deposited.

The planted twin accepts mint_cash(fake_collateral, 1_000_000_000), incrementing total_cash_minted to 1e9 while total_collateral_validated remains 0. Property A fires (INVARIANT VIOLATED collateral_mint_authority); Property B would also have fired on the same post-step state, but Property A’s assertion runs first in test order. The clean twin’s single constraint annotation — collateral.mint == bank.collateral_mint — rejects the instruction at the Anchor account-validation layer before the supply logic runs.

Toolchain: Agave solana-cli 4.0.1, stable Rust, Anchor 1.0.1; LiteSVM 0.12.0 in-process.

Case 4 — Mango Markets · Solana/Anchor · Oct 2022 · ~$117M

Bug class: oracle price manipulation without a safety gate. The protocol’s collateral valuation read the spot oracle price directly — no freshness check, no deviation bound against a TWAP floor, no min(spot, twap) clamp. An attacker pumped MNGO on a low-liquidity venue, inflated collateral value, and borrowed against it to drain the protocol’s other books.

Two properties: Property A — borrow-power validation against the safe price — asserts that after every transition, the borrowed field never exceeds collateral × min(oracle_price, twap_price) × LTV / price_scale. Property B — oracle freshness bound — asserts that when borrowed > 0, the oracle used at authorization was within the freshness window.

The planted twin’s borrow(700) against a 20× oracle pump produces borrowed = 700 while the safe-priced maximum is 50. Property A fires. The clean twin gates the borrow at the deviation check (delta > allowed), returning OracleDeviationOutOfBound before the borrow settles.

Toolchain: stable Rust 1.75+. No SBF toolchain required — the Anchor program crate compiles on the host for property testing.

Case 5 — Loopscale · Solana/Anchor · Apr 2025 · ~$5.8M

Bug class: single-source oracle trust on RateX Principal Token collateral. The protocol priced PT collateral against a spot oracle that the attacker could move within a transaction bundle, inflating PT’s perceived value and enabling undercollateralized loans.

Two properties: Property A — collateral-valuation / oracle bound — asserts that the effective price used to value collateral on any borrow does not exceed the conservative TWAP floor. Property B — no undercollateralized borrow can settle — asserts that for every position, borrowed_value is backed by the conservative-priced collateral at the LTV cap.

The planted twin’s borrow(370) against a 5× oracle spike writes last_borrow_effective_price = 5.00 against twap_floor = 1.00. Property A fires (INVARIANT VIOLATED collateral_valuation_bound); Property B would also fire on the same state, but Property A’s check runs first and the test stops at the first violation. The clean twin takes effective_price = min(spot, twap_floor) = 1.00, caps max_borrow at 67.5, and rejects the 370-unit borrow with Undercollateralized.

Toolchain: Rust 1.79+ (stable); pure-Rust library test rail, no cargo-build-sbf required.

Case 6 — EVM access-control bridge · Solidity/Foundry · Aug 2021 representative · ~$8.9M

This case is the Atlas’s bridge to the Trace2Inv EVM canon. The bug class: a missing onlyOwner modifier on a privileged migrate(_) function. An attacker calls the unguarded setter to redirect the yield strategy to an attacker-controlled contract, then drains user deposits through the strategy’s outflow path. The named real-world representative is Punk Protocol (PeckShield + Halborn post-mortems, Aug 2021; DeFiHackLabs PoC). Trace2Inv catalogs many more access-control failures in this class.

Two properties, expressed as Foundry stateful properties: Property A — admin_only_migrate — asserts that any successful migrate(_) call was made by the owner. Property B — strategy_in_approved_set — asserts that the current yieldStrategy() is always in the test-maintained approved-strategies ledger.

The planted twin’s attacker migrate (non-owner caller) produces yieldStrategy()approvedStrategies. Both properties fire. The clean twin’s onlyOwner modifier blocks the call at the gate. The fix is a one-word diff on the function declaration.

Toolchain: Foundry stable.

Calibration

Not “we caught them.” Not a complete benchmark. Not formal verification.

Three non-claims the Atlas holds strictly.

Would have caught, not caught. None of the six protocols ran these property tests before their exploits. CaliperForge did not find any of these vulnerabilities live. The Atlas demonstrates that runnable invariant properties exist that would have failed in pre-deploy CI against the pre-exploit code. That is a narrower claim and the honest one.

Not a complete benchmark. Trace2Inv’s 27 EVM cases took a peer-reviewed academic group. v0.1 ships six cases across four VMs — this is the first defender-side cross-VM benchmark in this space, not an exhaustive one. The Atlas is positioned as an extension of the existing canon, not a replacement.

Not formal verification. Pre-deploy property tests are coverage-guided, not exhaustive. The claim is “this property catches this specific historical exploit’s class of failure under CI run,” not “this property proves the absence of the bug class.” Trace2Inv’s runtime guards and the Atlas’s pre-deploy properties are complementary threat models.

Process

AI-augmented, human-reviewed. An encodable org, one evening.

Each case follows the same process: read the public post-mortem, identify the minimal bug class, reconstruct a same-source twin, have the AI specialist propose candidate invariant predicates, review and edit the invariants, wire up the CI assertion, verify both legs pass. The specialist agents (Cairo Specialist, Move Specialist, Rust/Anchor Specialist, Solidity Specialist) draft the invariant surface and the reconstructed code against the published post-mortems; the case author accepts, rejects, or edits. The CI verdict — pass or fail — is a function of the test runner, not the model.

AI involvement is forward-disclosed in each case’s README and scorecard, and in the project-level AI_DISCLOSURE.md. The operator of record is Michael Moffett. The invariant properties and same-source twins are hand-reviewed fixtures, not autonomous AI discovery.

Repo

Apache-2.0. Run it locally. CI re-asserts on every push.

https://github.com/caliperforge/invariant-atlas

The repo structure mirrors the case set. Each cases/<n>-<vm>-<protocol>/ directory ships a clean/ and a planted/ subdirectory (full build projects — Scarb, Cargo workspace, Sui Move package, or Foundry project depending on VM), a README.md with property prose and post-mortem citations, and a pair of scorecards captured from verified local runs on 2026-06-08.

To run case 1 locally (requires scarb 2.18.0 and snforge 0.60.x or 0.61.x):

# clean leg — properties hold
cd cases/01-cairo-zklend/clean && scarb build && snforge test
# Expected: 4 passed, 0 failed.

# planted leg — properties fire
cd ../planted && scarb build && snforge test
# Expected: 2 passed, 2 failed, both with INVARIANT VIOLATED marker.

See each case’s README for the VM-specific reproduce instructions.

Apache-2.0. Attributions for the upstream canon, the affected protocols’ published post-mortems, and the toolchain authors are in NOTICE.

Operator of record: Michael Moffett — michael@caliperforge.comteam@caliperforge.com. AI assistance was used in drafting the invariant properties, same-source twin reconstructions, and this post; all content reviewed by the operator. See AI_DISCLOSURE.md in the repo and caliperforge.com/ai-disclosure for the full disclosure register.