Hardened · validated white paper · the proof

Turning AI reasoning into auditable code, and getting more out of AI than the cloud.

Cosmo is a local-first AI companion wrapped in a self-hardening build system. AI workers coordinate through plain handoff files, write receipts to local cited memory, and stay behind approval gates. On your machine. With your keys. Built to be inspected.

claim 1: reasoning → hard code
The model proposes; code decides.A deterministic, default-deny router. No model picks the route, nothing self-promotes. Every claim leaves a cited receipt.
claim 2: more from AI, locally
A cloud-grade workflow you own.Multi-agent orchestration plus persistent cited memory, private and offline-capable, for the cost of the hardware you already have.
the system, liveread-only observer · 30s heartbeat
agentsAI workers
CNSfile-handoff board
memoryLOOM + RAG
isolationOracle VM
productsCosmo + tools
0cited observations in local memory
everyanswer carries a source citation
$0cloud cost · your machine · your keys
zeroauto-actions; deterministic gate, human-reviewed
30slive watcher · 10 capture lanes
12/12integrity locks pass · SHA-verified
query "single-writer per tree CLAIM HANDBACK ledger board" → hit #4 (RRF 0.0133)
→ a real handoff-board row, cited to its exact session log line. the loop is real, end to end.

abstract

A durable spine you own, and a brain you can turn off.

The default "AI companion" is a thin client over someone else's servers. Your prompts, files and context leave your machine, and the "assistant" lives in an account you don't control. Cosmo's answer is omission and locality: a stable, deterministic spine that captures, stores, searches and renders your working life and runs with zero AI present, plus a swappable local model that can be absent, down, or replaced without the spine changing. The result is a companion you can see and keep, plus a build system that proves its own releases.

thesis 1

Turning AI reasoning into auditable hard code

The router is deterministic. No model decides.

Every reflection the model produces re-enters a pure, default-deny gate whose only routes are review-queue or deny. There is no "promote," "apply," or "execute" route at all, and the auto-promote flag is hard-coded off and re-asserted as defense-in-depth. The intelligence is advisory; the spine is authoritative.

Coordination is a file protocol, not model whim.

AI workers coordinate through a file-handoff board, a CLAIM/HANDBACK ledger that enforces single-writer-per-tree as a hard rule. The team's nervous system is plain, inspectable, version-controlled text.

Outputs check themselves.

A release-truth gate binds source → installer by SHA-256; a lock validator recomputes and compares integrity hashes (12/12 pass, live); a test suite runs green. Verification is executed, not asserted, and the work self-reports its own mistakes when it finds them.

thesis 2

More out of AI than the cloud, privately and affordably

Affordable + private by omission

$0 cloud cost; runs on your machine with your keys. The free edition ships with no account, no AI key, no login, no payment in setup. The local model is loopback-only and optional; the whole spine works with it unplugged.

More capability, not less

Multiple AI tools coordinated through the CNS plus persistent, cited memory (RAG). A workflow a single cloud chat tab can't give you, all local, isolated in a VM, packaged as a simple installer with real practical value.

architecture

Stable spine vs swappable intelligence

The spine is the product; the model is a plug. Capture, storage, cited retrieval, and the deterministic gate all run without any AI in the loop.

Stable spine vs swappable intelligence: a 5-step spine (observation contract, append-only store, hybrid retrieval, default-deny gate, read-only face) that runs with zero AI, plus a detachable LLM reasoner.
The spine runs with the model unplugged; the LLM is a loopback-only, read-only plug that can only feed a human-reviewed queue.

the memory · RAG

Local, cited memory. Every answer has a receipt.

Sources are captured read-only, redacted, and appended to a local log that is never rewritten; an index fuses dense vector search with keyword/BM25 by reciprocal-rank fusion. Every result carries a citation back to the exact source line.

RAG pipeline: source to collector to redact to append-only store to index (LanceDB + embeddings) to query (vector + BM25 fused by RRF) to a cited result.
Capture → redact → append-only store → index → hybrid retrieval → cited result. Citation is structurally mandatory, not optional.

the handoff · CNS

The handoff mechanic, the team's memory, cited

AI workers pass a CLAIM/HANDBACK board (single-writer per tree); the board is observed by the local memory and recalled later with a citation. It works with no Cosmo in the loop; the spine is independent of the product it powers.

The handoff mechanic: AI workers pass a CLAIM/HANDBACK board, observed by LOOM, then recalled later with a citation, no Cosmo required.
CLAIM → work → HANDBACK → observed → retrievable (cited). Verified: a real board row recalled at an exact session-log line.

what's proven · what isn't

Honest by construction

An independent pass re-ran every load-bearing claim live against disk. Here is the result, including what is not yet proven, because a system that hides its gaps isn't hardened.

Proven (reproduced live)

  • The citation chain: a board row recalled at its exact source line, byte-for-byte.
  • The spine is independent of Cosmo, verified in code (one-way coupling).
  • Integrity locks 12/12 pass; the validator truly recomputes SHA-256.
  • The test suite runs green now; the gate is default-deny with auto-promote off.
  • 26,152 cited observations; watcher live on a 30-second heartbeat.

Not yet proven (next)

  • Runtime network silence: needs a packet/netstat capture, not a self-reported flag.
  • Retrieval precision: relevance is qualitative; no labeled precision@k yet.
  • The reflection gate's runtime safety is verified statically, not re-run live.
  • Today's board rows aren't indexed yet; the loop is proven on prior-day data.