CaliperForge / Engagements

What we build for you.

One accountable operator and a specialized, self-verifying AI org. We take on bounded, CI-verified engagements in security and invariants. Every engagement ships with the same discipline as the open-source work: a clean reference where the property holds, a planted-bug twin where the check fires, CI assertions on every push, AI augmentation disclosed, and every deliverable reproduced in a cold environment before handoff.

Operator of record: Michael Moffett · AI policy

What we build

Security and invariants

Each item links to the open-source proof artifact that backs the claim.

AI-safety evals

Evaluation harnesses for AI systems

We build evaluation harnesses for AI systems using the same planted-twin discipline as the smart-contract work. A typical deliverable is a language-conditioned detection-rate eval, a regression harness, or a ground-truth benchmark built from planted-bug twins. The harness measures where an AI system's accuracy degrades and surfaces the failure mode in a reproducible, CI-runnable form.
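As a rough illustration of what a detection-rate measurement over planted-bug twins could look like, here is a minimal Python sketch. It is not the CaliperForge harness: the TwinResult record, the rates_by_condition helper, and the sample data are all hypothetical, made up to show the shape of the computation. Each record pairs a planted-bug twin (which the system under test should flag) with its clean reference (which it should not), grouped by a condition such as prompt language.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TwinResult:
    """One evaluated twin pair (hypothetical record shape)."""
    condition: str         # e.g. prompt language: "EN", "ES", "PT", "CS"
    flagged_planted: bool  # system flagged the planted-bug twin (desired)
    flagged_clean: bool    # system flagged the clean reference (false positive)


def rates_by_condition(results):
    """Return {condition: (detection_rate, false_positive_rate)}."""
    buckets = defaultdict(list)
    for r in results:
        buckets[r.condition].append(r)

    rates = {}
    for condition, rows in buckets.items():
        n = len(rows)
        detected = sum(r.flagged_planted for r in rows)
        false_pos = sum(r.flagged_clean for r in rows)
        rates[condition] = (detected / n, false_pos / n)
    return rates


# Illustrative, made-up data; not results from any real eval.
sample = [
    TwinResult("EN", True, False),
    TwinResult("EN", True, False),
    TwinResult("CS", True, False),
    TwinResult("CS", False, False),
]

rates = rates_by_condition(sample)

# A CI-style assertion: fail the build if any condition's detection rate
# drops below an agreed floor (the 0.5 floor is an arbitrary assumption).
for condition, (detection_rate, _) in rates.items():
    assert detection_rate >= 0.5, f"detection regressed for {condition}"
```

Measuring the clean reference alongside the planted twin matters in a sketch like this: a system that flags everything would score a perfect detection rate, and only the false-positive rate on the clean side exposes that.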

Scope: eval harness design and build; planted-bug ground-truth generation; detection-rate measurement across language, model, or domain conditions. We deliver a harness and a proof register in the same format as the contract work. This is not a standalone AI audit service.

Proof: apart-global-south-lost-in-translation: language-conditioned detection-rate eval, EN / ES / PT / CS, Atlas planted-bug twins as ground truth. Apart Global South track submission. Research artifact and eval harness.

How we engage

Bounded scope. Fixed deliverable. CI-verified.

Every engagement is scoped up front with a fixed deliverable agreed before build starts. We do not take open-ended retainers. The deliverable is a CI-verified artifact with a written proof register.

Not a substitute for a formal audit. The invariant harness is one layer in a defense-in-depth pre-deploy pipeline. Where a client needs a formal audit, we say so explicitly and refer the work out.

Contact

Reach out to scope.

All engagements are scoped per project. Contact us to discuss which deliverable fits your pipeline or to start scoping.