Loom

A PostgreSQL-native memory compiler for AI workflows. Evidence-grounded memory, strict scoping, shallow graph, inspectable context assembly.

"Weaving threads of knowledge into fabric."

Every LLM starts with amnesia

Claude Code doesn't know what you discussed in ChatGPT. Copilot doesn't know the architecture decisions you made in Claude. You re-explain the same context dozens of times per week. Simple top-k vector retrieval doesn't fix this. It returns fragments without structure, ranking, or provenance.

A memory layer that follows you across tools

Loom watches your work across LLM tools, builds a knowledge graph from it, and compiles the right context into any AI tool at query time. It replaces "paste your context" with an always-on memory layer spanning Claude Code, Codex CLI, ChatGPT, GitHub Copilot, and anything that speaks MCP.

You work normally → Loom learns → You ask any LLM → Loom compiles + injects

Two pipelines, one database

PostgreSQL 16 is the single system of record. No external vector store. No graph database. pgvector handles embeddings, recursive CTEs handle graph traversal, pgAudit handles compliance. Two strictly separated pipelines share the database but never share runtime.
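The "shallow graph" claim means traversal is bounded to a couple of hops, which is exactly what a depth-limited recursive CTE expresses. A minimal in-memory sketch of that traversal, with a hypothetical edge list standing in for the entity-relationship table:

```python
from collections import deque

def traverse(edges, start, max_depth=2):
    """Depth-limited BFS over entity edges -- the in-memory analogue of a
    WITH RECURSIVE query walking a shallow graph. Schema is hypothetical."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen = {start}
    frontier = deque([(start, 0)])
    reachable = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # depth bound keeps traversal shallow and cheap
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                reachable.append((nxt, depth + 1))
                frontier.append((nxt, depth + 1))
    return reachable

edges = [("loom", "postgres"), ("postgres", "pgvector"), ("pgvector", "hnsw")]
print(traverse(edges, "loom"))  # → [('postgres', 1), ('pgvector', 2)]
```

The depth cap is the design point: bounding recursion keeps query cost predictable, which is what makes a relational database viable as the only graph store.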

Online pipeline

  1. Intent classification (primary + secondary class)
  2. Namespace resolution
  3. Parallel retrieval profiles (1-3, merged)
  4. Memory weight modifiers per task class
  5. Rank on 4 dimensions: relevance, recency, stability, provenance
  6. Compile context package
  7. Full audit trace
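Steps 4 and 5 above can be sketched as a weighted sum over the four ranking dimensions, with per-task-class weight modifiers. The weights and candidate scores below are illustrative, not Loom's actual values:

```python
DIMENSIONS = ("relevance", "recency", "stability", "provenance")

def rank(candidates, weights):
    """Score each candidate on the four ranking dimensions and sort
    descending. Weights model the per-task-class memory weight modifiers."""
    def score(c):
        return sum(weights[d] * c[d] for d in DIMENSIONS)
    return sorted(candidates, key=score, reverse=True)

# Hypothetical modifier set for an architecture-type query:
# stability and provenance matter more, recency less.
arch_weights = {"relevance": 0.5, "recency": 0.1, "stability": 0.25, "provenance": 0.15}
candidates = [
    {"id": "fact-1", "relevance": 0.9, "recency": 0.2, "stability": 0.8, "provenance": 0.7},
    {"id": "fact-2", "relevance": 0.6, "recency": 0.9, "stability": 0.3, "provenance": 0.4},
]
print([c["id"] for c in rank(candidates, arch_weights)])  # → ['fact-1', 'fact-2']
```

A debug-class query would swap in a modifier set that boosts recency and provenance, which is how the same candidate pool ranks differently per intent.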

Offline pipeline

  1. Ingest episode + SHA-256 dedup
  2. Extract entities (structured prompt)
  3. Three-pass entity resolution
  4. Extract facts against predicate registry
  5. Link facts to source episodes
  6. Resolve supersession
  7. Compute derived ranking state
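Step 1's content-addressed dedup can be sketched in a few lines; the dict stands in for the episodes table, and the returned statuses mirror the ones loom_learn reports. Normalization (a bare strip here) is an assumption:

```python
import hashlib

def ingest(store, episode_text):
    """SHA-256 dedup on ingest: hash the normalized episode body and skip
    exact re-ingestion. 'store' is a stand-in for the episodes table."""
    digest = hashlib.sha256(episode_text.strip().encode("utf-8")).hexdigest()
    if digest in store:
        return "duplicate"
    store[digest] = episode_text
    return "accepted"

store = {}
print(ingest(store, "We chose Postgres 16."))    # → accepted
print(ingest(store, "We chose Postgres 16.\n"))  # → duplicate (same hash after normalization)
```

Hashing before extraction matters because extraction is the expensive step: a duplicate episode is rejected before it ever reaches the LLM.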

Three types, strict authority

Facts are never more authoritative than their source episodes. Procedures are candidate patterns until promoted. The MVP has two tiers: Hot (always injected, with a configurable budget per namespace) and Warm (retrieved per query). New facts always start Warm. Namespace isolation is absolute.
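The two-tier rule can be sketched as a budget-constrained assembly: Hot memories go in unconditionally, then the best-scoring Warm candidates fill whatever budget remains. Token counts and scores are illustrative:

```python
def assemble(hot, warm, budget_tokens):
    """Hot tier is always injected; Warm candidates fill the remaining
    namespace budget in score order. Fields are hypothetical."""
    package, used = [], 0
    for m in hot:                       # Hot: unconditional
        package.append(m["id"])
        used += m["tokens"]
    for m in sorted(warm, key=lambda m: m["score"], reverse=True):
        if used + m["tokens"] <= budget_tokens:   # Warm: budget-gated
            package.append(m["id"])
            used += m["tokens"]
    return package

hot = [{"id": "h1", "tokens": 120}]
warm = [{"id": "w1", "tokens": 200, "score": 0.9},
        {"id": "w2", "tokens": 700, "score": 0.8},
        {"id": "w3", "tokens": 150, "score": 0.6}]
print(assemble(hot, warm, budget_tokens=500))  # → ['h1', 'w1', 'w3']
```

Note w2 is skipped despite outscoring w3: it would blow the budget, so the compiler takes the smaller candidate instead.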

Episodic

Raw interaction records. Immutable evidence. Strongest audit anchor. Primary for debug and compliance tasks.

Semantic

Extracted facts and entity relationships. Derived from episodes. Revisable. Primary for architecture tasks.

Procedural

Inferred behavioral patterns. Most provisional. Requires 3+ episodes across 7+ days and 0.8+ confidence to promote.
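The promotion gate above is mechanical enough to state as code. Thresholds come straight from the text; the observation-date representation is an assumption:

```python
from datetime import date

def promotable(observations, confidence):
    """Promotion gate for a candidate procedure: 3+ supporting episodes
    spanning 7+ days, with confidence >= 0.8 (thresholds from the spec)."""
    if len(observations) < 3 or confidence < 0.8:
        return False
    span_days = (max(observations) - min(observations)).days
    return span_days >= 7

obs = [date(2024, 5, 1), date(2024, 5, 4), date(2024, 5, 9)]
print(promotable(obs, 0.85))  # → True  (3 episodes over 8 days)
print(promotable(obs, 0.70))  # → False (confidence below 0.8)
```

The time-span requirement is the interesting part: it prevents three observations in one burst of activity from promoting a one-off habit into a standing pattern.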

Authority: Episodes > Facts > Procedures

Rust engine, local inference, zero cloud dependency

Engine Rust

tokio, axum, sqlx. Compile-time SQL checking. ~20MB Docker image.

Dashboard React + Vite

TypeScript. Static files served by Caddy.

Database PostgreSQL 16

pgvector, pgAudit. Single system of record.

LLM inference Ollama (local)

Gemma 4 26B MoE extraction. Gemma 4 E4B classification. nomic-embed-text embeddings.

Bootstrap Python

Run-once scripts that parse Claude.ai, ChatGPT, and Codex CLI exports.

Deployment Docker Compose

Caddy reverse proxy + TLS. Five containers total.

Three tools. Not five.

loom_think

Compiles a context package for a query. Fires automatically before complex tasks. The primary integration point.

loom_learn

Ingests a new episode from any source. Returns accepted, duplicate, or queued status.

loom_recall

Returns raw search results without compilation. For when you want to browse, not compile.

Primary integration: Claude Code via native MCP. Secondary: manual REST ingestion. Codex CLI follows after validation.

Fail either gate: simplify or kill

Week 4: Extraction

50-episode sample, human-annotated. Blocks all further work if thresholds aren't met.

Entity precision ≥ 0.80
Fact precision ≥ 0.75
Predicate consistency ≥ 0.85

Week 8: Compilation

Compiled context (C) must beat raw retrieval (B) across 10 benchmark tasks.

Precision improvement ≥ 15%
Token reduction ≥ 30%
Zero task success regression
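The Week-8 gate reduces to three comparisons between the baseline (B) and compiled (C) runs. A sketch with illustrative metric dicts; the keys mirror the three thresholds above:

```python
def compilation_gate(baseline, compiled):
    """Week-8 go/no-go: compiled context (C) must beat raw retrieval (B)
    on precision (+15%), tokens (-30%), with no task success regression."""
    precision_gain = (compiled["precision"] - baseline["precision"]) / baseline["precision"]
    token_cut = 1 - compiled["tokens"] / baseline["tokens"]
    no_regression = compiled["task_success"] >= baseline["task_success"]
    return precision_gain >= 0.15 and token_cut >= 0.30 and no_regression

b = {"precision": 0.60, "tokens": 8000, "task_success": 9}
c = {"precision": 0.72, "tokens": 5200, "task_success": 9}
print(compilation_gate(b, c))  # → True (+20% precision, 35% fewer tokens, no regression)
```

Making the gate a single boolean keeps the "simplify or kill" decision honest: either all three thresholds clear or the milestone fails.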

Full observability into the memory system

Pipeline health monitoring
Compilation trace viewer with per-candidate score breakdowns
Knowledge graph explorer
Entity conflict review queue
Predicate candidate review with pack browsing
Retrieval quality metrics (precision, latency percentiles, classification confidence)
Extraction quality comparison across model versions
A/B/C benchmark comparison views

Core thesis: a memory compiler with explicit ranking, graph traversal, and audit logging produces better context packages than simple top-k vector retrieval. The MVP exists to prove or disprove that thesis.

12-week build · PostgreSQL-native · Local inference · MCP-first