# memory-mcp
A self-organizing, persistent semantic memory layer for AI agents, exposed as an MCP (Model Context Protocol) server over HTTP. Memories are stored in PostgreSQL with pgvector, embedded via OpenAI, and retrieved through hybrid vector + keyword search with Reciprocal Rank Fusion (RRF). The system autonomously chunks, deduplicates, and categorizes content—no manual schema management required.
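Reciprocal Rank Fusion merges the vector and keyword result lists by scoring each document as the sum of `1/(k + rank)` across the lists it appears in. A minimal sketch of the idea (the constant `k = 60` is the common default from the RRF literature, not necessarily what this server uses):

```python
def rrf_fuse(vector_ids: list[str], keyword_ids: list[str], k: int = 60) -> list[str]:
    """Fuse two ranked lists of memory IDs via Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in (vector_ids, keyword_ids):
        for rank, mem_id in enumerate(ranking, start=1):
            # Documents found by both searches accumulate score from each list
            scores[mem_id] = scores.get(mem_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document that ranks well in both lists (here `"b"`) beats one that ranks first in only one list: `rrf_fuse(["a", "b", "c"], ["b", "c", "d"])` returns `["b", "c", "a", "d"]`.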
## Architecture

```
AI Agent / Claude / Cursor
        │ HTTP (MCP)
        ▼
server.py ──── FastMCP (tools.py)
    │
    ├── config.py   Environment & OpenAI client
    ├── llm.py      Embedding + LLM calls
    ├── db.py       PostgreSQL pool, schema, dedup logic
    └── utils.py    Chunking, retries, ID generation
        │
        ▼
PostgreSQL + pgvector
    ├── memories                  Chunks, embeddings, category path (ltree), metadata
    ├── memory_edges              Graph edges (sequence_next, relates_to)
    └── reference.system.primer   Auto-synthesized global context
```
The server starts a background timer that regenerates the System Primer automatically once enough new memories have accumulated (threshold: 10 changes) or the existing primer is older than 1 hour.
## MCP Tools

| Tool | Description |
|---|---|
| `memorize_context` | Ingest raw text. Chunks, embeds, categorizes, and deduplicates automatically. Supports an optional `ttl_days`. |
| `search_memory` | Hybrid vector + BM25 search with RRF. Filter by `category_path` (e.g. `projects.hardpoint`). |
| `initialize_context` | Fast bootstrap: returns all `reference.system.*` memories. Call this at the start of every session. |
| `traverse_sequence` | Walk `sequence_next` edges forward or backward from a memory ID. |
| `list_categories` | Return all occupied taxonomy paths with item counts. |
| `delete_memory` | Hard-delete a memory by ID (cascades to edges). |
| `prune_history` | Batch-delete superseded (merged) memories older than N days. |
| `recategorize_memory` | Move a single memory to a new taxonomy path. |
| `bulk_move_category` | Move an entire taxonomy branch (e.g. old prefix → new prefix). |
| `synthesize_system_primer` | Force a full scan → LLM summary stored at `reference.system.primer`. Use sparingly. |
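On the wire, MCP tool invocations are JSON-RPC 2.0 requests using the `tools/call` method. A sketch of the envelope for one of the tools above (the streamable-http transport additionally requires an `initialize` handshake and session headers before tool calls are accepted):

```python
import json

def tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build the JSON-RPC 2.0 envelope MCP uses for a tool invocation."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

payload = tool_call(
    1,
    "search_memory",
    {"query": "deployment stack", "category_path": "projects.hardpoint"},
)
```

In practice an MCP-aware client (Claude, Cursor, or a FastMCP client) builds these envelopes for you; the sketch only shows what travels over HTTP.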
## Taxonomy

Memories are organized into a hierarchical dot-path taxonomy using PostgreSQL `ltree`:

```
user.profile.personal.focus
user.health_profile.medical
projects.<name>.architecture.stack
organizations.<name>.business.subscription
concepts.ai_interaction.behavior.code_execution
reference.system.primer
```
The system automatically assigns a path during ingestion. You can override with recategorize_memory or bulk_move_category.
## Environment Variables

Copy `.env.example` to `.env` and fill in your values.

| Variable | Required | Default | Description |
|---|---|---|---|
| `DATABASE_URL` | ✅ | — | PostgreSQL connection string |
| `OPENAI_API_KEY` | ✅ | — | OpenAI API key (for embeddings + LLM) |
| `EMBEDDING_MODEL` | | `text-embedding-3-small` | OpenAI embedding model |
| `CHAT_MODEL` | | `gpt-4o-mini` | LLM for categorization + primer synthesis |
| `EMBED_DIM` | | `1536` | Embedding vector dimension |
| `DEFAULT_SEARCH_LIMIT` | | `10` | Default result count for `search_memory` |
| `DEFAULT_LIST_LIMIT` | | `50` | Default result count for `list_categories` |
| `OPENAI_TIMEOUT_S` | | `60` | Per-request OpenAI timeout (seconds) |
| `OPENAI_MAX_RETRIES` | | `5` | Exponential-backoff retry limit |
| `MAX_CONCURRENT_API_CALLS` | | `5` | Semaphore for parallel OpenAI calls |
| `PG_POOL_MIN` | | `1` | asyncpg min pool connections |
| `PG_POOL_MAX` | | `10` | asyncpg max pool connections |
| `LOG_LEVEL` | | `INFO` | `DEBUG` / `INFO` / `WARNING` |
| `MCP_TRANSPORT` | | `streamable-http` | FastMCP transport mode |
| `PRIMER_UPDATE_MAX_AGE_S` | | `3600` | Max age in seconds before automatic primer regeneration |
| `FASTMCP_JSON_RESPONSE` | | — | Set to `1` to force JSON responses |
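The retry and concurrency knobs combine roughly like this `utils.py`-style sketch (illustrative, not the actual implementation): every OpenAI call runs under a semaphore of size `MAX_CONCURRENT_API_CALLS` and retries with exponential backoff up to `OPENAI_MAX_RETRIES` attempts.

```python
import asyncio
import os

MAX_RETRIES = int(os.getenv("OPENAI_MAX_RETRIES", "5"))
MAX_CONCURRENT = int(os.getenv("MAX_CONCURRENT_API_CALLS", "5"))

_semaphore = asyncio.Semaphore(MAX_CONCURRENT)

async def with_retries(call, *args, base_delay: float = 1.0):
    """Run `call` under the concurrency semaphore, retrying with exponential backoff."""
    async with _semaphore:
        for attempt in range(MAX_RETRIES):
            try:
                return await call(*args)
            except Exception:
                if attempt == MAX_RETRIES - 1:
                    raise  # out of retries: surface the last error
                await asyncio.sleep(base_delay * 2 ** attempt)
```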
## Running with Docker (Recommended)

Prerequisites: Docker + Docker Compose.

```bash
# 1. Copy and fill in your environment variables
cp .env.example .env
$EDITOR .env

# 2. Start the stack (PostgreSQL + API + optional backup service)
docker compose up -d

# 3. The MCP server is now available at:
#    http://localhost:8766/mcp
```

To rebuild after code changes:

```bash
docker compose up -d --build memory-api
```
## Running Locally (Development)

Requirements: Python 3.11+ and a running PostgreSQL instance with the pgvector extension.

```bash
# 1. Create and activate a virtual environment
python3.11 -m venv .venv
source .venv/bin/activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Set environment variables
cp .env.example .env
$EDITOR .env
source .env  # or use direnv

# 4. Start the server
python -m server
# Server running on http://0.0.0.0:8766
```
## Backup Service

The `backup/` directory contains a containerized PostgreSQL backup job that:

- Runs on a configurable interval (default: every 6 hours)
- `pg_dump`s the `memory` database to `memory.sql`
- Commits and pushes the dump to a private GitHub repository
Additional backup environment variables:
| Variable | Description |
|---|---|
| `GITHUB_PAT` | GitHub Personal Access Token with `repo` scope |
| `GITHUB_BACKUP_REPO` | Target repo in `owner/repo` format (e.g. `isaacriehm/memory-backup`) |
| `DB_PASSWORD` | PostgreSQL password (same as in `DATABASE_URL`) |
| `BACKUP_INTERVAL_SECONDS` | Seconds between backups (default: `21600` = 6 hours) |
The backup service is included in docker-compose.yml and starts automatically with the stack.
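One backup cycle reduces to a dump followed by a commit and push. A rough sketch under stated assumptions (the exact `pg_dump` flags, commit message, and branch name are illustrative, not taken from the actual job):

```python
import os
import subprocess

def backup_cycle(dry_run: bool = False) -> list[list[str]]:
    """One cycle: dump the database, then commit and push the dump file."""
    steps = [
        ["pg_dump", "--dbname", os.getenv("DATABASE_URL", ""), "--file", "memory.sql"],
        ["git", "add", "memory.sql"],
        # Note: `git commit` exits non-zero when the dump is unchanged,
        # so a real job would tolerate that case.
        ["git", "commit", "-m", "automated backup"],
        ["git", "push", "origin", "main"],
    ]
    if not dry_run:
        for cmd in steps:
            subprocess.run(cmd, check=True)
    return steps
```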
## Memory Visualization

`visualize_memories.py` generates an interactive graph of all memories as `memory_map.html`. Run it locally (requires `DATABASE_URL` in the environment):

```bash
python visualize_memories.py
open memory_map.html
```