// THREE-TIER MEMORY
How memory is structured.
Inspired by human memory consolidation — fast writes at runtime, intelligent promotion offline. Three tiers, one coherent brain.
Episodic
Short-term · Append-only
Every message, upload, and thought is instantly written as an immutable event. Zero blocking — the agent never waits for acknowledgement.
Semantic
Long-term · Graph + Vectors
Facts, entities, and relationships extracted from episodic events. Graph edges carry valid_until timestamps so contradictions are never silently deleted.
Identity
Core self · Compiled profile
Persistent user profile compiled from semantic memory. Injected into every system prompt as ground truth — no retrieval step, zero latency.
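The three tiers can be sketched as data shapes. This is an illustrative model, not Mnemonic's actual schema — the class and field names are assumptions:

```python
# Hypothetical sketch of the three memory tiers. Names are illustrative,
# not the real Mnemonic data model.
from dataclasses import dataclass, field
from typing import Optional
import time

@dataclass(frozen=True)
class EpisodicEvent:
    """Tier 1: immutable, append-only event (message, upload, thought)."""
    kind: str
    payload: str
    ts: float = field(default_factory=time.time)

@dataclass
class SemanticEdge:
    """Tier 2: graph edge; valid_until=None means the fact is still held true."""
    subject: str
    relation: str
    obj: str
    valid_until: Optional[float] = None

@dataclass
class IdentityProfile:
    """Tier 3: compiled markdown profile, injected verbatim into prompts."""
    compiled_markdown: str
```

Note the asymmetry: episodic events are frozen (append-only), while semantic edges are mutable only in their valid_until field — facts get closed, never deleted.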
// FIVE LAYERS
The full stack.
Each layer has a single responsibility. Applications talk to the API gateway. The gateway talks to the memory engine. Storage is pluggable.
// MCP PROTOCOL
7 core operations.
A minimal, well-defined interface for agents to interact with memory. Each operation maps to exactly one memory tier.
* "live" = implemented in current MCP server · "planned" = Phase 2
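The "one operation, one tier" rule can be expressed as a static mapping. The seven operation names below are placeholders for illustration — the actual MCP server's interface may differ:

```python
# Hypothetical operation-to-tier mapping. Operation names are illustrative
# placeholders, not the server's actual MCP interface.
OPERATION_TIER = {
    "append_event":     "episodic",
    "search_events":    "episodic",
    "upsert_fact":      "semantic",
    "query_graph":      "semantic",
    "resolve_fact":     "semantic",
    "get_identity":     "identity",
    "refresh_identity": "identity",
}

def tier_for(operation: str) -> str:
    """Each MCP operation maps to exactly one memory tier."""
    return OPERATION_TIER[operation]
```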
// PIPELINE
The Full Pipeline
From chat bubble to updated ground truth — step by step.
Chat Input
User message arrives
A thought, question, or file upload enters the system.
System 1 — Fast Path
<10ms, no LLM call
Event is appended to the immutable stream instantly. No blocking, no extraction.
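The fast path is fast because it does nothing but append — no parsing, no reads, no LLM. A minimal sketch (an in-memory list standing in for the durable log):

```python
# Minimal sketch of the System 1 fast path: append-only, no extraction.
import time

class EventStream:
    def __init__(self):
        self._log = []  # stands in for a durable append-only log

    def append(self, kind: str, payload: str) -> dict:
        event = {"ts": time.time(), "type": kind, "payload": payload}
        self._log.append(event)  # O(1): nothing is read, updated, or awaited
        return event

    def unconsolidated(self):
        """Events System 2 has not yet processed (sketch: everything)."""
        return list(self._log)
```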
Events Accumulate
Append-only log
Events pile up in the stream, each with a timestamp and type marker.
System 2 Awakens
Background, async
The sleep consolidator activates — running offline, never blocking the agent.
Entity Extraction
LLM-powered parsing
Entities and relationships are extracted from unconsolidated events into graph edges.
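One plausible shape for this step, assuming the extraction prompt asks the LLM for a JSON list of subject/relation/object triples (the format here is an assumption, not the actual extraction contract):

```python
# Sketch: turn an extraction LLM's (assumed) JSON triple output into
# semantic graph edges. The JSON shape is a guess for illustration.
import json

def parse_extraction(llm_output: str) -> list[dict]:
    triples = json.loads(llm_output)
    return [
        {"subject": t["subject"], "relation": t["relation"],
         "obj": t["object"], "valid_until": None}  # new facts start open-ended
        for t in triples
    ]
```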
Contradiction Detection
Temporal resolution
New facts are compared against existing edges. Contradictions are resolved with valid_until timestamps.
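The core of temporal resolution is that a contradicted edge is closed, never deleted. A minimal sketch, using dicts with assumed field names:

```python
# Sketch of temporal contradiction resolution: when a new fact contradicts
# an open edge (same subject+relation, different object), the old edge is
# closed with a valid_until timestamp rather than deleted.
import time

def upsert_fact(edges: list[dict], subject: str, relation: str, obj: str) -> None:
    now = time.time()
    for edge in edges:
        if (edge["subject"] == subject and edge["relation"] == relation
                and edge["obj"] != obj and edge["valid_until"] is None):
            edge["valid_until"] = now  # supersede, don't delete
    edges.append({"subject": subject, "relation": relation,
                  "obj": obj, "valid_from": now, "valid_until": None})
```

For example, upserting "user lives_in Berlin" and then "user lives_in Lisbon" leaves both edges in the graph, with only the Lisbon edge still open.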
Memory Compilation
compiled_memory.md
The graph compiles into a concise markdown summary — active facts and superseded history.
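A sketch of the compilation step: partition edges by whether valid_until is set, then render active facts and superseded history as markdown. Field names and the output layout are illustrative:

```python
# Sketch of compiling graph edges into compiled_memory.md: active facts
# first, superseded facts preserved as history. Layout is illustrative.
def compile_memory(edges: list[dict]) -> str:
    active = [e for e in edges if e["valid_until"] is None]
    superseded = [e for e in edges if e["valid_until"] is not None]
    lines = ["# Compiled Memory", "", "## Active facts"]
    lines += [f"- {e['subject']} {e['relation']} {e['obj']}" for e in active]
    if superseded:
        lines += ["", "## Superseded"]
        lines += [f"- {e['subject']} {e['relation']} {e['obj']} (no longer valid)"
                  for e in superseded]
    return "\n".join(lines)
```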
Loop Complete
System prompt updated
Compiled memory is injected into the agent's system prompt. Ground truth is always current.
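Injection itself is just string composition at prompt-build time, which is why there is no retrieval latency. A sketch, with the wrapper tag as an assumed convention:

```python
# Sketch of identity injection: the compiled profile is concatenated into
# the system prompt at request time. No retrieval call, no index lookup.
# The <memory> wrapper tag is an assumed convention, not Mnemonic's format.
def build_system_prompt(base_prompt: str, compiled_memory: str) -> str:
    return f"{base_prompt}\n\n<memory>\n{compiled_memory}\n</memory>"
```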
// BENCHMARKS
Performance comparison.
Measured across ingestion latency, contradiction resolution, and retrieval precision.
* Simulated benchmark data — real evaluation in progress
Ingestion Latency
Lower is better (ms)
Contradiction Resolution
Out of 3 contradictions
Retrieval Precision
Over 50 sessions
// ROADMAP
What's being built.
Four phases from working prototype to production memory platform. Phase 0 is complete — phases 1-3 are sequenced for minimum viable wiring first, intelligence second.
Foundation
Current sprint
- Clean repo structure (archive old files)
- Technical architecture document
- Next.js frontend with component library
- 3-panel brain dashboard
- Sharp Editorial design system
- mnemonic-autoimprove RAG optimizer
- Unit test framework setup
Wire + Auth
Next sprint
- PostgreSQL schema + Alembic migrations
- JWT authentication middleware
- Connect frontend to backend (/chat/enhanced, /upsert)
- CORS lockdown + rate limiting via Redis
- Error handling + retry logic in frontend
- CI/CD pipeline (GitHub Actions)
- Docker Compose for local dev
Memory Engine
Three-tier implementation
- Three-tier memory model (Episodic → Semantic → Identity)
- Async consolidation pipeline
- Contradiction detection & temporal resolution
- Neo4j knowledge graph integration
- MCP protocol v2 (7 operations)
- WebSocket support for real-time updates
Platform
Multi-user & analytics
- Multi-user + team memory
- Analytics dashboard (ClickHouse)
- Python + TypeScript SDK
- Plugin system
- OAuth / SSO
- Billing integration (open core model)
// GET INVOLVED
Open source memory infrastructure.
Mnemonic is built in the open. Star the repo, open an issue, or contribute a consolidation strategy.
pip install personal-brain-mcp