Mnemonic

// THREE-TIER MEMORY

How memory is structured.

Inspired by human memory consolidation — fast writes at runtime, intelligent promotion offline. Three tiers, one coherent brain.

01 ● ACTIVE

Episodic

Short-term · Append-only

Every message, upload, and thought is instantly written as an immutable event. Zero blocking — the agent never waits for acknowledgement.

write latency: <10ms
write path: Real-time, no LLM call
read path: Recent N events
storage: Append-only event log
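
A minimal sketch of what an append-only episodic log could look like, assuming an in-memory list as the backing store. The `Event` and `EpisodicLog` names and their fields are illustrative, not taken from the Mnemonic codebase.

```python
import time
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Event:
    """One immutable episodic event (field names are hypothetical)."""
    seq: int
    kind: str      # e.g. "message", "upload", "thought"
    payload: Any
    ts: float


class EpisodicLog:
    """In-memory append-only log: events are added and read, never edited."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, kind: str, payload: Any) -> Event:
        # Write path: no LLM call and no extraction, just record the event.
        event = Event(seq=len(self._events), kind=kind, payload=payload, ts=time.time())
        self._events.append(event)
        return event

    def recent(self, n: int) -> list[Event]:
        # Read path: the most recent n events, oldest first.
        return self._events[-n:] if n > 0 else []
```

Freezing the dataclass makes accidental mutation of a stored event raise an error, which is one simple way to approximate the "immutable event" guarantee in process.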

02 ○ PLANNED

Semantic

Long-term · Graph + Vectors

Facts, entities, and relationships extracted from episodic events. Graph edges carry valid_until timestamps so contradictions are never silently deleted.

write latency: async
write path: Consolidation pipeline
read path: Semantic search
storage: Knowledge graph + vectors
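
To illustrate the semantic-search read path, here is a toy sketch that ranks stored facts by cosine similarity to a query vector. The vectors are hand-written stand-ins for real embeddings, and `search` is a hypothetical helper, not the project's retrieval code.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query: list[float], facts: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the k fact names whose vectors are closest to the query."""
    ranked = sorted(facts, key=lambda name: cosine(query, facts[name]), reverse=True)
    return ranked[:k]
```

A production setup would delegate this ranking to a vector database rather than sorting in Python; the sketch only shows the ordering principle.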

03 ○ PLANNED

Identity

Core self · Compiled profile

Persistent user profile compiled from semantic memory. Injected into every system prompt as ground truth, so reads need no retrieval step and add no lookup latency.

write latency: async
write path: Consolidation → merge
read path: Direct lookup
storage: Compiled profile JSON
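
As a rough illustration of the "direct lookup" read path, the sketch below serializes a compiled profile and prepends it to a system prompt. The `build_system_prompt` function, the section heading, and the profile keys are assumptions made for the example.

```python
import json


def build_system_prompt(base_prompt: str, profile: dict) -> str:
    """Append the compiled profile to the base prompt as a JSON block."""
    # sort_keys keeps the prompt stable between runs when the profile is unchanged.
    profile_json = json.dumps(profile, indent=2, sort_keys=True)
    return f"{base_prompt}\n\n# User profile (ground truth)\n{profile_json}\n"
```

Because the profile is already compiled, building the prompt is a string operation with no search or model call involved.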

// FIVE LAYERS

The full stack.

Each layer has a single responsibility. Applications talk to the API gateway. The gateway talks to the memory engine. Storage is pluggable.

L5
Applications
Web Dashboard · Claude Desktop MCP · SDK / API Clients
partial
L4
API Gateway
FastAPI · Auth · Rate Limiting
partial
L3
Memory Engine
Episodic Memory · Semantic Memory · Identity Memory
partial
L2
Processing
Document Parser · Embedding Pipeline · Consolidation Engine · Contradiction Resolver
planned
L1
Storage
Pinecone / Qdrant · PostgreSQL · Neo4j · Redis
partial
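
One way to read "storage is pluggable" is that the memory engine depends on a small interface rather than on a specific database. The sketch below shows that idea with `typing.Protocol`; the `Store`, `InMemoryStore`, and `MemoryEngine` names and methods are hypothetical.

```python
from typing import Optional, Protocol


class Store(Protocol):
    """Hypothetical storage interface; any backend with these methods fits."""

    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...


class InMemoryStore:
    """Stand-in backend for local experiments."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class MemoryEngine:
    """Depends only on the Store interface, never on a concrete backend."""

    def __init__(self, store: Store) -> None:
        self._store = store

    def remember(self, key: str, value: str) -> None:
        self._store.put(key, value)

    def lookup(self, key: str) -> Optional[str]:
        return self._store.get(key)
```

Swapping in a PostgreSQL- or Redis-backed class would then mean implementing `put` and `get`, with no change to the engine.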

// MCP PROTOCOL

7 core operations.

A minimal, well-defined interface for agents to interact with memory. Most operations target a single memory tier; memory.search spans all tiers and memory.status reports on the system as a whole.

Operation · Description · Latency · Status
memory.append · Write event to episodic memory · <10ms · live
memory.search · Semantic search across all tiers · ~50ms · live
memory.consolidate · Trigger offline consolidation · async · live
memory.recall · Retrieve specific memory by ID · <5ms · planned
memory.forget · Mark memory entry for removal · <10ms · planned
memory.compile · Rebuild identity summary from graph · ~200ms · planned
memory.status · Health check and usage stats · <5ms · live

* "live" = implemented in current MCP server · "planned" = Phase 2
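
The operations table can be expressed as data, which makes it easy to reject calls to operations that are still planned. This is a plain-Python sketch, not the MCP server's actual routing code; `OPERATIONS`, `dispatch`, and the handler signature are assumptions.

```python
from typing import Any, Callable

# Status values copied from the operations table.
OPERATIONS: dict[str, str] = {
    "memory.append": "live",
    "memory.search": "live",
    "memory.consolidate": "live",
    "memory.recall": "planned",
    "memory.forget": "planned",
    "memory.compile": "planned",
    "memory.status": "live",
}

Handler = Callable[[dict[str, Any]], Any]


def dispatch(handlers: dict[str, Handler], op: str, params: dict[str, Any]) -> Any:
    """Route an operation to its handler, refusing unknown or planned ones."""
    if op not in OPERATIONS:
        raise KeyError(f"unknown operation: {op}")
    if OPERATIONS[op] != "live":
        raise NotImplementedError(f"{op} is planned, not yet available")
    return handlers[op](params)
```

Keeping status next to the operation name gives clients a clear error for planned operations instead of a silent no-op.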

// PIPELINE

The full pipeline.

From chat bubble to updated ground truth — step by step.

Chat Input

User message arrives

A thought, question, or file upload enters the system.

System 1 — Fast Path

<10ms, no LLM call

Event is appended to the immutable stream instantly. No blocking, no extraction.

Events Accumulate

Append-only log

Events pile up in the stream, each with a timestamp and type marker.

System 2 Awakens

Background, async

The sleep consolidator activates — running offline, never blocking the agent.
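
The split between a fast write path and a background consolidator can be sketched with `asyncio`. Everything here is an assumption for illustration: the `Memory` class, the polling interval, and the `asyncio.sleep(0)` that stands in for slow LLM work.

```python
import asyncio


class Memory:
    def __init__(self) -> None:
        self.events: list[str] = []
        self.consolidated_upto = 0  # index of the first unconsolidated event

    def append(self, text: str) -> None:
        # Fast path: a synchronous list append, nothing is awaited.
        self.events.append(text)

    async def consolidate_once(self) -> int:
        # Slow path placeholder: a real pipeline would call an LLM here.
        batch = self.events[self.consolidated_upto:]
        await asyncio.sleep(0)  # yield control, standing in for slow I/O
        self.consolidated_upto += len(batch)
        return len(batch)


async def consolidator(memory: Memory, interval: float, stop: asyncio.Event) -> None:
    """Consolidate periodically in the background until asked to stop."""
    while not stop.is_set():
        await memory.consolidate_once()
        try:
            # Sleep for the interval, but wake immediately if stop is set.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass
```

The batch is captured before the await, so events appended while consolidation is running are left for the next pass rather than being skipped.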

Entity Extraction

LLM-powered parsing

Entities and relationships are extracted from unconsolidated events into graph edges.

Contradiction Detection

Temporal resolution

New facts are compared against existing edges. Contradictions are resolved with valid_until timestamps.
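
A minimal sketch of temporal resolution, assuming each (subject, relation) pair holds at most one active value at a time. The `Edge` fields and the `record_fact` helper are hypothetical; the real resolver may use different rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Edge:
    subject: str
    relation: str
    obj: str
    valid_from: float
    valid_until: Optional[float] = None  # None means the fact is still active


def record_fact(edges: list[Edge], subject: str, relation: str, obj: str, now: float) -> Edge:
    """Add a fact; close any active edge it contradicts instead of deleting it."""
    for edge in edges:
        same_slot = edge.subject == subject and edge.relation == relation
        if same_slot and edge.valid_until is None:
            if edge.obj == obj:
                return edge          # already known, nothing to do
            edge.valid_until = now   # superseded, kept as history
    new_edge = Edge(subject, relation, obj, valid_from=now)
    edges.append(new_edge)
    return new_edge
```

For example, recording "user lives_in Berlin" and later "user lives_in Lisbon" leaves both edges in the graph, with the Berlin edge closed at the time the Lisbon fact arrived.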

Memory Compilation

compiled_memory.md

The graph compiles into a concise markdown summary — active facts and superseded history.
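
The compilation step could look roughly like the following, which splits edges into active and superseded and renders both as markdown. The `Edge` tuple, the section headings, and the line format are assumptions, not the actual layout of compiled_memory.md.

```python
from typing import NamedTuple, Optional


class Edge(NamedTuple):
    subject: str
    relation: str
    obj: str
    valid_until: Optional[float]  # None means still active


def compile_memory(edges: list[Edge]) -> str:
    """Render active facts first, then superseded history, as markdown."""
    active = [e for e in edges if e.valid_until is None]
    superseded = [e for e in edges if e.valid_until is not None]
    lines = ["## Active facts"]
    lines += [f"- {e.subject} {e.relation} {e.obj}" for e in active]
    lines += ["", "## Superseded"]
    lines += [f"- {e.subject} {e.relation} {e.obj} (until {e.valid_until})" for e in superseded]
    return "\n".join(lines) + "\n"
```

Listing superseded facts separately keeps the history visible to the agent without letting it compete with current facts.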

Loop Complete

System prompt updated

Compiled memory is injected into the agent's system prompt. Ground truth is always current.

// BENCHMARKS

Performance comparison.

Measured across ingestion latency, contradiction resolution, and retrieval precision.

* Simulated benchmark data — real evaluation in progress

[Chart: Ingestion Latency (ms, lower is better)]

[Chart: Contradiction Resolution (out of 3 contradictions)]

[Chart: Retrieval Precision (over 50 sessions)]

// ROADMAP

What's being built.

Four phases from working prototype to production memory platform. Phase 0 is nearly complete (6 of 7 items); phases 1-3 are sequenced for minimum viable wiring first, intelligence second.

Phase 0 ● IN PROGRESS

Foundation

Current sprint

6/7
  • Clean repo structure (archive old files)
  • Technical architecture document
  • Next.js frontend with component library
  • 3-panel brain dashboard
  • Sharp Editorial design system
  • mnemonic-autoimprove RAG optimizer
  • Unit test framework setup

Phase 1 ○ NEXT

Wire + Auth

Next sprint

0/7
  • PostgreSQL schema + Alembic migrations
  • JWT authentication middleware
  • Connect frontend to backend (/chat/enhanced, /upsert)
  • CORS lockdown + rate limiting via Redis
  • Error handling + retry logic in frontend
  • CI/CD pipeline (GitHub Actions)
  • Docker Compose for local dev

Phase 2 ○ FUTURE

Memory Engine

Three-tier implementation

0/6
  • Three-tier memory model (Episodic → Semantic → Identity)
  • Async consolidation pipeline
  • Contradiction detection & temporal resolution
  • Neo4j knowledge graph integration
  • MCP protocol v2 (7 operations)
  • WebSocket support for real-time updates

Phase 3 ○ FUTURE

Platform

Multi-user & analytics

0/6
  • Multi-user + team memory
  • Analytics dashboard (ClickHouse)
  • Python + TypeScript SDK
  • Plugin system
  • OAuth / SSO
  • Billing integration (open core model)

// GET INVOLVED

Open source memory infrastructure.

Mnemonic is built in the open. Star the repo, open an issue, or contribute a consolidation strategy.

pip install personal-brain-mcp