# memory-mcp
A self-organizing, persistent semantic memory layer for AI agents, exposed as an MCP (Model Context Protocol) server over HTTP. Memories are stored in PostgreSQL with pgvector, embedded via OpenAI, and retrieved through hybrid vector + keyword search with Reciprocal Rank Fusion (RRF). The system autonomously chunks, deduplicates, and categorizes content—no manual schema management required.
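Reciprocal Rank Fusion merges the vector and keyword result lists by scoring each document as the sum of `1/(k + rank)` across the lists it appears in. A minimal sketch of the idea (the constant `k = 60` is the common default from the RRF literature, not necessarily what this server uses):

```python
def rrf_fuse(vector_ids: list[str], keyword_ids: list[str], k: int = 60) -> list[str]:
    """Fuse two ranked lists of memory IDs via Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in (vector_ids, keyword_ids):
        for rank, mem_id in enumerate(ranking, start=1):
            # Documents found by both searches accumulate score from each list
            scores[mem_id] = scores.get(mem_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document that ranks well in both lists (here `"b"`) beats one that ranks first in only one list: `rrf_fuse(["a", "b", "c"], ["b", "c", "d"])` returns `["b", "c", "a", "d"]`.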
## Architecture

```
AI Agent / Claude / Cursor
        │ HTTP (MCP)
        ▼
server.py ──── FastMCP (tools.py)
    │
    ├── config.py   Environment & OpenAI client
    ├── llm.py      Embedding + LLM calls
    ├── db.py       PostgreSQL pool, schema, dedup logic
    └── utils.py    Chunking, retries, ID generation
        │
        ▼
PostgreSQL + pgvector
    ├── memories                  Chunks, embeddings, category path (ltree), metadata
    ├── memory_edges              Graph edges (sequence_next, relates_to)
    └── reference.system.primer   Auto-synthesized global context
```
The server starts a background timer that regenerates the System Primer automatically once enough new memories have accumulated (threshold: 10 changes) or the existing primer is older than 1 hour.
## MCP Tools

| Tool | Description |
|---|---|
| `memorize_context` | Ingest raw text. Chunks, embeds, categorizes, and deduplicates automatically. Supports an optional `ttl_days`. |
| `search_memory` | Hybrid vector + BM25 search with RRF. Filter by `category_path` (e.g. `projects.hardpoint`). |
| `initialize_context` | Fast bootstrap: returns all `reference.system.*` memories. Call this at the start of every session. |
| `traverse_sequence` | Walk `sequence_next` edges forward or backward from a memory ID. |
| `list_categories` | Return all occupied taxonomy paths with item counts. |
| `delete_memory` | Hard-delete a memory by ID (cascades to edges). |
| `prune_history` | Batch-delete superseded (merged) memories older than N days. |
| `recategorize_memory` | Move a single memory to a new taxonomy path. |
| `bulk_move_category` | Move an entire taxonomy branch (e.g. old prefix → new prefix). |
| `synthesize_system_primer` | Force a full scan → LLM summary stored at `reference.system.primer`. Use sparingly. |
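On the wire, MCP tool invocations are JSON-RPC 2.0 requests using the `tools/call` method. A sketch of the envelope for one of the tools above (the streamable-http transport additionally requires an `initialize` handshake and session headers before tool calls are accepted):

```python
import json

def tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build the JSON-RPC 2.0 envelope MCP uses for a tool invocation."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

payload = tool_call(
    1,
    "search_memory",
    {"query": "deployment stack", "category_path": "projects.hardpoint"},
)
```

In practice an MCP-aware client (Claude, Cursor, or a FastMCP client) builds these envelopes for you; the sketch only shows what travels over HTTP.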
## Taxonomy

Memories are organized into a hierarchical dot-path taxonomy using PostgreSQL `ltree`:

```
user.profile.personal.focus
user.health_profile.medical
projects.<name>.architecture.stack
organizations.<name>.business.subscription
concepts.ai_interaction.behavior.code_execution
reference.system.primer
```
The system automatically assigns a path during ingestion. You can override with recategorize_memory or bulk_move_category.
## Environment Variables

Copy `.env.example` to `.env` and fill in your values.

| Variable | Required | Default | Description |
|---|---|---|---|
| `DATABASE_URL` | ✅ | — | PostgreSQL connection string |
| `OPENAI_API_KEY` | ✅ | — | OpenAI API key (for embeddings + LLM) |
| `EMBEDDING_MODEL` | | `text-embedding-3-small` | OpenAI embedding model |
| `CHAT_MODEL` | | `gpt-4o-mini` | LLM for categorization + primer synthesis |
| `EMBED_DIM` | | `1536` | Embedding vector dimension |
| `DEFAULT_SEARCH_LIMIT` | | `10` | Default result count for `search_memory` |
| `DEFAULT_LIST_LIMIT` | | `50` | Default result count for `list_categories` |
| `OPENAI_TIMEOUT_S` | | `60` | Per-request OpenAI timeout (seconds) |
| `OPENAI_MAX_RETRIES` | | `5` | Exponential-backoff retry limit |
| `MAX_CONCURRENT_API_CALLS` | | `5` | Semaphore for parallel OpenAI calls |
| `PG_POOL_MIN` | | `1` | asyncpg min pool connections |
| `PG_POOL_MAX` | | `10` | asyncpg max pool connections |
| `LOG_LEVEL` | | `INFO` | `DEBUG` / `INFO` / `WARNING` |
| `MCP_TRANSPORT` | | `streamable-http` | FastMCP transport mode |
| `PRIMER_UPDATE_MAX_AGE_S` | | `3600` | Max age in seconds before automatic primer regeneration |
| `FASTMCP_JSON_RESPONSE` | | — | Set to `1` to force JSON responses |
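The retry and concurrency knobs combine roughly like this `utils.py`-style sketch (illustrative, not the actual implementation): every OpenAI call runs under a semaphore of size `MAX_CONCURRENT_API_CALLS` and retries with exponential backoff up to `OPENAI_MAX_RETRIES` attempts.

```python
import asyncio
import os

MAX_RETRIES = int(os.getenv("OPENAI_MAX_RETRIES", "5"))
MAX_CONCURRENT = int(os.getenv("MAX_CONCURRENT_API_CALLS", "5"))

_semaphore = asyncio.Semaphore(MAX_CONCURRENT)

async def with_retries(call, *args, base_delay: float = 1.0):
    """Run `call` under the concurrency semaphore, retrying with exponential backoff."""
    async with _semaphore:
        for attempt in range(MAX_RETRIES):
            try:
                return await call(*args)
            except Exception:
                if attempt == MAX_RETRIES - 1:
                    raise  # out of retries: surface the last error
                await asyncio.sleep(base_delay * 2 ** attempt)
```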
## Running with Docker (Recommended)

Prerequisites: Docker + Docker Compose.

```bash
# 1. Copy and fill in your environment variables
cp .env.example .env
$EDITOR .env

# 2. Start the stack (PostgreSQL + API + optional backup service)
docker compose up -d

# 3. The MCP server is now available at:
#    http://localhost:8766/mcp
```

To rebuild after code changes:

```bash
docker compose up -d --build memory-api
```
## Running Locally (Development)

Requirements: Python 3.11+ and a running PostgreSQL instance with the pgvector extension.

```bash
# 1. Create and activate a virtual environment
python3.11 -m venv .venv
source .venv/bin/activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Set environment variables
cp .env.example .env
$EDITOR .env
source .env  # or use direnv

# 4. Start the server
python -m server
# Server running on http://0.0.0.0:8766
```
## Backup Service

The `backup/` directory contains a containerized PostgreSQL backup job that:

- Runs on a configurable interval (default: every 6 hours)
- `pg_dump`s the `memory` database to `memory.sql`
- Commits and pushes the dump to a private GitHub repository
Additional backup environment variables:
| Variable | Description |
|---|---|
| `GITHUB_PAT` | GitHub Personal Access Token with `repo` scope |
| `GITHUB_BACKUP_REPO` | Target repo in `owner/repo` format (e.g. `isaacriehm/memory-backup`) |
| `DB_PASSWORD` | PostgreSQL password (same as in `DATABASE_URL`) |
| `BACKUP_INTERVAL_SECONDS` | Seconds between backups (default: `21600` = 6 hours) |
The backup service is included in docker-compose.yml and starts automatically with the stack.
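One backup cycle reduces to a dump followed by a commit and push. A rough sketch under stated assumptions (the exact `pg_dump` flags, commit message, and branch name are illustrative, not taken from the actual job):

```python
import os
import subprocess

def backup_cycle(dry_run: bool = False) -> list[list[str]]:
    """One cycle: dump the database, then commit and push the dump file."""
    steps = [
        ["pg_dump", "--dbname", os.getenv("DATABASE_URL", ""), "--file", "memory.sql"],
        ["git", "add", "memory.sql"],
        # Note: `git commit` exits non-zero when the dump is unchanged,
        # so a real job would tolerate that case.
        ["git", "commit", "-m", "automated backup"],
        ["git", "push", "origin", "main"],
    ]
    if not dry_run:
        for cmd in steps:
            subprocess.run(cmd, check=True)
    return steps
```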
## Memory Visualization

`visualize_memories.py` generates an interactive graph of all memories as `memory_map.html`. Run it locally (requires `DATABASE_URL` in the environment):

```bash
python visualize_memories.py
open memory_map.html
```