cascade-annote
Description
## The Problem
Standard AI annotation pipelines send text to a single model and
accept its output unchecked. This approach is fragile: it provides no
evidence, no calibration, and no recovery path when the model is
uncertain. Labels produced this way can be neither verified nor traced.
## The Solution — CascadeAnnote
CascadeAnnote treats every label as a falsifiable claim backed by
evidence. It runs every annotation through a 4-layer cascade:
L1 — Dynamic ICL Retrieval
TF-IDF + cosine similarity retrieves the most relevant labeled
exemplars from the corpus in real time.
L2 — Chain-of-Thought Reasoning
A structured 5-step CoT prompt builds an evidence-backed reasoning
trace before committing to a label.
L3 — Self-Consistency Vote
5 independent inference runs at varied temperatures produce a
majority-voted label with a calibrated confidence score.
L4 — Adaptive Fallback
If confidence falls below the threshold, the system automatically
widens the evidence window and re-votes at cooler (lower) temperatures.
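
As a rough illustration of how L3 and L4 could fit together, here is a minimal TypeScript sketch. It is not the project's actual code: the type and function names, the exemplar counts (4, then 8), the fallback temperatures, and the 0.7 threshold are all assumptions made for the example. The five first-pass temperatures are assumed to be evenly spaced across the 0.3 to 0.7 range mentioned in the progress notes, and the confidence here is a plain vote share rather than a calibrated score.

```typescript
// Hypothetical types and names for illustration; not taken from the CascadeAnnote codebase.
type Exemplar = { text: string; label: string };
// Synchronous for brevity; a remote provider would need an async variant.
type Classify = (text: string, exemplars: Exemplar[], temperature: number) => string;
type Retrieve = (text: string, k: number) => Exemplar[];

interface VoteResult {
  label: string;
  confidence: number; // raw vote share of the winning label, not a calibrated probability
  votes: Record<string, number>;
}

// L3: one inference run per temperature, then a majority vote.
function selfConsistencyVote(
  text: string,
  exemplars: Exemplar[],
  temperatures: number[],
  classify: Classify,
): VoteResult {
  if (temperatures.length === 0) throw new Error("need at least one inference run");
  const votes: Record<string, number> = {};
  for (const t of temperatures) {
    const label = classify(text, exemplars, t);
    votes[label] = (votes[label] ?? 0) + 1;
  }
  // Sort labels by vote count, highest first; the sort is stable, so on a tie
  // the label that appeared first wins.
  const [label, count] = Object.entries(votes).sort((a, b) => b[1] - a[1])[0];
  return { label, confidence: count / temperatures.length, votes };
}

// L4: if the first vote is not confident enough, widen the evidence window
// (retrieve more exemplars) and re-vote at cooler temperatures.
function annotate(
  text: string,
  retrieve: Retrieve,
  classify: Classify,
  threshold = 0.7, // assumed value
): VoteResult & { usedFallback: boolean } {
  const first = selfConsistencyVote(
    text, retrieve(text, 4), [0.3, 0.4, 0.5, 0.6, 0.7], classify,
  );
  if (first.confidence >= threshold) return { ...first, usedFallback: false };

  const second = selfConsistencyVote(
    text, retrieve(text, 8), [0.1, 0.15, 0.2, 0.25, 0.3], classify,
  );
  return { ...second, usedFallback: true };
}
```

Passing `retrieve` and `classify` in as functions keeps the voting logic independent of which retrieval engine or inference provider sits behind them.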
## 0G Infrastructure Integration
CascadeAnnote is natively built on 0G's modular stack:
- 0G Storage — every annotation is SHA-256 hashed and uploaded
to the 0G indexer, producing a verifiable rootHash + txHash receipt
- 0G Compute — Layer 3 inference can be routed through 0G Compute
for fully on-network verifiable inference
- 0G Chain — a stable did:0g agent identity is derived per
deployment; receipts are anchored under this DID
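
The hashing step described for 0G Storage could look roughly like the sketch below, which uses Node's built-in `crypto` module. The `AnnotationRecord` shape and the function names are hypothetical, and sorting keys before serializing is an assumption made so that the same record always produces the same digest. The upload to the 0G indexer and the resulting rootHash + txHash receipt are not shown.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; the real annotation payload may carry more fields.
interface AnnotationRecord {
  text: string;
  label: string;
  confidence: number;
}

// Serialize with keys in sorted order so that key order in the input object
// does not change the bytes being hashed (sufficient for a flat record).
function canonicalize(record: AnnotationRecord): string {
  return JSON.stringify(record, Object.keys(record).sort());
}

// SHA-256 digest of the canonical JSON, as a 64-character hex string.
function hashAnnotation(record: AnnotationRecord): string {
  return createHash("sha256").update(canonicalize(record), "utf8").digest("hex");
}
```

Anyone holding the original record can recompute the digest and compare it with the stored one, which is what makes the stored hash useful as evidence.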
## Tech Stack
- Next.js 15 + TypeScript + Tailwind CSS
- Pure TypeScript engine — no GPU, no heavy ML dependencies
- Provider-agnostic: Local / OpenAI / 0G Compute
- Single Vercel deployment — fully serverless
## Links
- Live Demo: https://cascade-annote.vercel.app
Progress During Hackathon
<p>## What We Built During the Hackathon</p>
<p><strong>Week 1: Architecture & Core Engine</strong></p>
<p>Designed the 4-layer cascade pipeline architecture. Implemented the L1 retrieval engine using TF-IDF with unigram + bigram indexing and cosine similarity scoring. Built the seed corpus with 60 labeled examples across 4 label families (sentiment, topic, intent, toxicity).</p>
<p><strong>Week 2: Inference & Voting</strong></p>
<p>Built the chain-of-thought prompt builder (L2) with a 7-strategy label extractor. Implemented the self-consistency voter (L3) with 5 inference runs at temperatures [0.3, 0.7]. Integrated the local ICL classifier, OpenAI, and 0G Compute as provider-agnostic backends.</p>
<p><strong>Week 3: 0G Integration & Verifiability</strong></p>
<p>Integrated the 0G Storage adapter: every annotation is SHA-256 hashed and uploaded with a verifiable rootHash + txHash receipt. Implemented 0G Chain agent identity (did:0g DID) and 0G Compute inference routing. Built the adaptive fallback layer (L4) with confidence thresholding.</p>
<p><strong>Week 4: Frontend & Deployment</strong></p>
<p>Built 9 frontend pages, including the annotate studio, pipeline explorer, dataset uploader, storage receipt explorer, agent identity, and results dashboard. Deployed as a single Next.js 15 serverless app on Vercel.</p>
<p>## Current Status</p>
<p>✅ Fully deployed and live at <a href="https://cascade-annote.vercel.app">https://cascade-annote.vercel.app</a></p>
<p>✅ All 4 pipeline layers operational</p>
<p>✅ 0G Storage receipts working</p>
<p>✅ Batch annotation API (up to 50 texts per call)</p>
<p>✅ CSV corpus upload and activation</p>
Fundraising Status
<p>No funds raised yet. CascadeAnnote is currently bootstrapped and self-funded as an independent open-source project. We are open to grants, ecosystem funding, and strategic partnerships, particularly within the 0G ecosystem, to accelerate development of the active learning loop, multi-agent voting, and sealed inference features on the roadmap.</p>