Mnemonic
Open Source · MCP Server · Claude Desktop

AI Agents Forget.
Mnemonic Remembers.

Persistent memory for Claude Desktop — semantic search over your documents and conversations, grounded answers with citations.

<10ms ingestion latency
5 MCP resources
10 MCP tools
MIT license

// QUICK START

Memory in 60 seconds.

01 Install
python 3.8+ · pip
pip install personal-brain-mcp

# Installs:
# - FastMCP server
# - LangChain + Pinecone client
# - Google Generative AI SDK
# - PyPDF2, SpeechRecognition
02 Configure
claude desktop · json
// claude_desktop_config.json
// ~/Library/Application Support/Claude/
{
  "mcpServers": {
    "personal-brain": {
      "command": "personal-brain-mcp",
      "args": []
    }
  }
}
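
If a config file already exists, the entry above has to be merged into it rather than pasted over it. Standard JSON has no comment syntax, so the two `//` lines in the listing are labels and should not be copied into the file. The helper below is a minimal sketch of that merge; the function name `add_personal_brain` is hypothetical and not part of the package.

```python
import json
from pathlib import Path


def add_personal_brain(config_path):
    """Merge the personal-brain entry into a Claude Desktop config file.

    Hypothetical helper for illustration; not shipped with the package.
    """
    path = Path(config_path).expanduser()
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    # Keep any servers that are already configured.
    servers = config.setdefault("mcpServers", {})
    servers["personal-brain"] = {"command": "personal-brain-mcp", "args": []}

    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example (macOS path from the listing above):
# add_personal_brain("~/Library/Application Support/Claude/claude_desktop_config.json")
```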
03 Use
in any conversation
You: "What did we decide about
      the DB schema last month?"

Claude: [searches 1,247 documents]
"On Feb 12, you decided to use
 PostgreSQL. Relevant context:
 [1] schema-notes.md · line 34
 [2] meeting-2024-02-12.txt"

// PROBLEM

Your AI has amnesia.

Current memory systems share three fundamental failure modes.

01 STALE RECALL

The Goldfish

Vector databases return stale facts alongside current ones with no temporal ordering. Every fact looks equally valid.

query: "where do I live?"
→ Austin (score: 0.87, t=2024-01)
→ NYC    (score: 0.86, t=2024-06)
→ Austin (score: 0.85, t=2024-09)
# no way to know which is current
02 LATENCY SPIKE

The Bottleneck

Synchronous graph extraction blocks the critical path. Every message waits for entity resolution before the agent responds.

entity_extraction:  847ms ← blocking
llm_call:           1200ms
total_wait:         2047ms / message
p99_overhead:       +2.3s per turn
03 SILENT FAILURE

The Forgetter

Agents cannot be relied on to manage their own memory. Without external enforcement, saving and recalling simply don't happen.

memory.save() calls / session: 0
agent_initiated_saves (30d):  0/183
auto_recall_attempts:         0/183
effective_memory_rate:        0.0%

// ARCHITECTURE

How it works.

Dual-process architecture inspired by human memory consolidation. Fast append at runtime, intelligent consolidation offline.

01 no LLM call

Event Stream

<10ms · Append-Only

Every message, upload, and thought is instantly written as an immutable event. Zero blocking — the agent never waits.
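
The append-only idea can be sketched in a few lines. This is an illustration under stated assumptions, not Mnemonic's actual storage layer: the `Event` schema, the `EventStream` class, and the JSONL file format are all hypothetical.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class Event:
    """One immutable record in the stream (hypothetical schema)."""
    kind: str        # e.g. "message", "upload", "thought"
    payload: str
    event_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    ts: float = field(default_factory=time.time)


class EventStream:
    """Append-only JSONL log: a write never touches earlier lines."""

    def __init__(self, path):
        self.path = Path(path)

    def append(self, kind, payload):
        event = Event(kind=kind, payload=payload)
        # One file append; no LLM call and no entity extraction on this path.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(asdict(event)) + "\n")
        return event

    def read_all(self):
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [Event(**json.loads(line)) for line in f if line.strip()]
```

Because the write path does nothing but serialize and append, its cost is independent of how much consolidation work the event will later require.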

02 runs offline

Sleep Consolidator

Async · Background

Like human sleep, consolidation runs offline — extracting entities, resolving contradictions, updating the temporal graph.
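
A rough sketch of the non-blocking hand-off, assuming a single background worker thread. The class name, its methods, and the `extract` callback are hypothetical; a real consolidator would presumably call an LLM for extraction and write to the temporal graph rather than a list.

```python
import queue
import threading


class SleepConsolidator:
    """Drains queued events on a background thread (illustrative only)."""

    def __init__(self, extract):
        self._extract = extract      # hypothetical callback: text -> list of facts
        self._pending = queue.Queue()
        self.facts = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, text):
        # Returns immediately; the caller never waits for extraction.
        self._pending.put(text)

    def _run(self):
        while True:
            text = self._pending.get()
            try:
                self.facts.extend(self._extract(text))
            except Exception:
                pass                 # a real system would log and retry
            finally:
                self._pending.task_done()

    def wait_idle(self):
        # Blocks until every submitted event has been consolidated.
        self._pending.join()
```

`submit` is the only call on the agent's critical path; `wait_idle` exists mainly so tests and shutdown code can wait for the backlog to drain.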

03 valid_until edges

Temporal Graph

Bitemporal · Episodic

Every graph edge carries time bounds. Old facts aren't deleted — they're marked with valid_until, preserving full history.
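
The `valid_until` mechanism can be illustrated with the "where do I live?" example from earlier on the page. This sketch makes several simplifying assumptions that are not from the source: facts arrive in chronological order, each relation holds one value at a time, and only validity time is tracked (a fully bitemporal store would also record when each fact was learned). The `Edge` and `TemporalGraph` names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Edge:
    subject: str
    relation: str
    obj: str
    valid_from: str                    # ISO-style dates compare correctly as text
    valid_until: Optional[str] = None  # None means "still current"


class TemporalGraph:
    def __init__(self):
        self.edges: List[Edge] = []

    def assert_fact(self, subject, relation, obj, at):
        # Close the open edge for this subject/relation instead of deleting it.
        for e in self.edges:
            if (e.subject, e.relation) == (subject, relation) and e.valid_until is None:
                if e.obj == obj:
                    return e           # unchanged fact; nothing to do
                e.valid_until = at
        edge = Edge(subject, relation, obj, valid_from=at)
        self.edges.append(edge)
        return edge

    def current(self, subject, relation):
        return [e.obj for e in self.edges
                if (e.subject, e.relation) == (subject, relation)
                and e.valid_until is None]

    def as_of(self, subject, relation, at):
        return [e.obj for e in self.edges
                if (e.subject, e.relation) == (subject, relation)
                and e.valid_from <= at
                and (e.valid_until is None or at < e.valid_until)]


g = TemporalGraph()
g.assert_fact("user", "lives_in", "Austin", "2024-01")
g.assert_fact("user", "lives_in", "NYC", "2024-06")
g.assert_fact("user", "lives_in", "Austin", "2024-09")
print(g.current("user", "lives_in"))           # ['Austin']
print(g.as_of("user", "lives_in", "2024-07"))  # ['NYC']
print(len(g.edges))                            # 3, history is preserved
```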

04 0ms retrieval

Compiled Memory

System Prompt · Live

The graph compiles to a concise markdown summary injected into the system prompt — always current, no retrieval step.
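
One way to picture the compile step: filter the graph down to edges that are still open and render them as a markdown list that is prepended to every conversation. The function names and the tuple layout below are assumptions made for illustration, not the project's API.

```python
def compile_memory(edges):
    """Render current facts as markdown.

    `edges` is an iterable of (subject, relation, obj, valid_until) tuples;
    this layout is a hypothetical stand-in for the temporal graph.
    """
    lines = ["## Known facts"]
    for subject, relation, obj, valid_until in edges:
        if valid_until is None:        # skip superseded facts
            lines.append(f"- {subject} {relation.replace('_', ' ')} {obj}")
    return "\n".join(lines)


def build_system_prompt(base_prompt, edges):
    # The summary travels with the prompt, so answering needs no lookup.
    return f"{base_prompt}\n\n{compile_memory(edges)}"


edges = [
    ("user", "lives_in", "Austin", "2024-06"),
    ("user", "lives_in", "NYC", "2024-09"),
    ("user", "lives_in", "Austin", None),
]
print(build_system_prompt("You are a helpful assistant.", edges))
```

The trade-off in this design is that the summary must stay small enough to fit in the system prompt, which is why only current facts are rendered.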

// COMPARISON

How we compare.

The only memory system with async consolidation and temporal contradiction resolution.

System · Ingestion Latency · Contradiction Resolution · Async Consolidation · Local-First · Open Source
Mnemonic (ours) · <10ms
mem0 · <10ms
SuperMemory · ~200ms
Zep / Graphiti · 800ms+
MemGPT / Letta · ~500ms

* latency at p50 · contradiction resolution: 5-fact test set · async = non-blocking ingestion

// GET STARTED

Memory that improves while you sleep.

Watch contradictions resolve in real time, or build your own knowledge brain.

pip install personal-brain-mcp