
Caesura

A Bittensor subnet that makes it financially rewarding to find where frontier AI models structurally fail. Proof of Blind Spot: the first open, adversarially-generated capability gap observatory.

Description

Caesura is a Bittensor subnet for adversarial capability cartography — a living, open map of what frontier AI models cannot do.

Every major AI benchmark has the same lifecycle: it gets published, labs optimize against it, scores climb, and the benchmark becomes useless. MNIST, ImageNet, GLUE, MMLU, and HumanEval each had a limited useful life, roughly 18-36 months for the more recent ones, before becoming performance theater. The organizations with the resources to build better benchmarks have a direct conflict of interest. OpenAI will not publish a benchmark that exposes where GPT-5 fails.

Caesura inverts the incentive direction. Where every other subnet optimizes AI to pass tests, Caesura builds a financially incentivized market for producing tests that AI cannot pass — specifically, challenges that reveal transferable architectural blind spots across model families.

MECHANISM: Proof of Blind Spot (PoBS)

Miners submit adversarial challenge-response pairs. Every submission passes through four sequential validation gates:

G1 — Failure: challenge defeats ≥3 of 5 frontier evaluation models

G2 — Novelty: semantic similarity <0.82 against the full corpus index

G3 — Non-Triviality: classifier score >0.75 (filters surface exploits, prompt injections, unicode attacks)

G4 — Transferability: tagged by capability dimension and transferability class

Gate 3 is the Proof of Intelligence gate. Passing all four requires genuine meta-intelligence about how frontier models structurally work.
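The sequential gate check described above can be sketched in a few lines of Python. This is a minimal illustration, not the subnet's implementation: the thresholds come from the gate list, but the `Submission` record, its field names, and the rule that G4 passes once both tags are present are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds quoted from the gate list above.
MIN_MODELS_DEFEATED = 3       # G1: at least 3 of 5 evaluation models must fail
MAX_SIMILARITY = 0.82         # G2: similarity must be strictly below this
MIN_NONTRIVIAL_SCORE = 0.75   # G3: classifier score must be strictly above this


@dataclass
class Submission:
    """Hypothetical record of the measurements a validator would collect."""
    model_failures: List[bool]       # one flag per evaluation model, True = model failed
    max_corpus_similarity: float     # highest similarity to any existing corpus entry
    nontriviality_score: float       # output of the non-triviality classifier
    capability_dimension: Optional[str] = None
    transferability_class: Optional[str] = None


def first_failed_gate(sub: Submission) -> Optional[str]:
    """Run the gates in order; return the first gate that rejects, or None if all pass."""
    if sum(sub.model_failures) < MIN_MODELS_DEFEATED:
        return "G1"
    if sub.max_corpus_similarity >= MAX_SIMILARITY:
        return "G2"
    if sub.nontriviality_score <= MIN_NONTRIVIAL_SCORE:
        return "G3"
    # Assumption: G4 counts as passed once both tags have been assigned.
    if not (sub.capability_dimension and sub.transferability_class):
        return "G4"
    return None
```

Because the gates run in order and stop at the first rejection, the cheaper checks can shield the more expensive ones from submissions that would fail anyway.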

EMISSION DESIGN

Power-law distribution: top 1% of epoch submissions receive 25% of emissions. One exceptional discovery earns more than a hundred mediocre ones. The network selects for researchers, not factories.
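To make the emission target concrete, the sketch below assumes a rank-based power law in which the submission ranked `r` receives a weight proportional to `r ** -alpha`. The functional form, the function names, and the epoch size are assumptions for illustration; the description only states that the top 1% should receive 25% of emissions. Under that assumption, the exponent can be found by bisection.

```python
from typing import List


def rank_weights(n: int, alpha: float) -> List[float]:
    """Assumed power-law weights: rank 1 is the best submission of the epoch."""
    return [rank ** -alpha for rank in range(1, n + 1)]


def top_share(n: int, alpha: float, top_fraction: float = 0.01) -> float:
    """Fraction of total emissions earned by the top `top_fraction` of submissions."""
    weights = rank_weights(n, alpha)
    k = max(1, round(n * top_fraction))
    return sum(weights[:k]) / sum(weights)


def solve_alpha(n: int, target: float = 0.25, top_fraction: float = 0.01) -> float:
    """Bisect for the exponent that gives the top slice the target share.

    The top share grows as alpha grows, so bisection on [0, 3] converges.
    """
    lo, hi = 0.0, 3.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if top_share(n, mid, top_fraction) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def allocate(total_emission: float, n: int, alpha: float) -> List[float]:
    """Split an epoch's emission across n ranked submissions."""
    weights = rank_weights(n, alpha)
    scale = total_emission / sum(weights)
    return [w * scale for w in weights]


# Example: a hypothetical epoch with 1,000 valid submissions.
alpha = solve_alpha(1000)
payouts = allocate(1.0, 1000, alpha)
```

The required exponent depends on how many submissions an epoch contains, so a real design would need to fix either the exponent or the target share and let the other vary.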

THE COMPOUNDING MOAT

Valid submissions build an immutable public corpus, the Caesura Capability Gap Observatory. The corpus compounds non-linearly: a larger corpus raises the novelty bar, which improves submission quality, which makes the corpus more valuable. A competitor can copy the mechanism, but the copy starts with an empty corpus. The value is not in the mechanism; it is in what the mechanism builds over time.
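The claim that a larger corpus raises the novelty bar follows from how a nearest-neighbour novelty check behaves. The sketch below assumes that "semantic similarity" means cosine similarity between embedding vectors and that a submission is compared against its closest corpus entry; both are assumptions, as are the function names. Only the 0.82 threshold comes from the G2 gate.

```python
import math
from typing import List, Sequence

NOVELTY_THRESHOLD = 0.82  # from gate G2


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_similarity(candidate: Sequence[float], corpus: List[Sequence[float]]) -> float:
    """Similarity to the closest corpus entry; 0.0 for an empty corpus."""
    return max((cosine(candidate, entry) for entry in corpus), default=0.0)


def is_novel(candidate: Sequence[float], corpus: List[Sequence[float]]) -> bool:
    return max_similarity(candidate, corpus) < NOVELTY_THRESHOLD


# Toy two-dimensional "embeddings" for illustration only.
candidate = [1.0, 0.1]
small_corpus = [[0.0, 1.0]]
larger_corpus = small_corpus + [[1.0, 0.0]]

novel_before = is_novel(candidate, small_corpus)
novel_after = is_novel(candidate, larger_corpus)
```

Adding entries can only raise or preserve the maximum similarity for a given candidate, never lower it, so every accepted submission makes the check at least as strict for everything that follows.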

Progress During Hackathon

Completed full mechanism design for the Proof of Blind Spot (PoBS) validation pipeline. Specified the four-gate evaluation architecture including failure verification, semantic novelty checking, non-triviality classification, and transferability tagging. Designed the power-law TAO emission structure and validator staking/slashing model. Produced a 17-page subnet design proposal, 10-slide pitch deck, and explanation video. The Caesura Capability Taxonomy (ontology of AI capability dimensions) is documented and ready for implementation.

Tech Stack

Python
Node
Rust

Fundraising Status

Not yet fundraising. Seeking subnet slot and early validator/miner community.

Team Leader: Lydia Solomon
Sector: AI, Infra, DAO
