
Memory MCP Server

A sophisticated persistent memory system for Claude featuring a tiered storage architecture with semantic vector search, high-frequency hot caching, and an automated pattern mining pipeline for project-specific knowledge.

Tools: 17 · Updated: Jan 19, 2026

An Engram-inspired MCP server that gives Claude a "second brain" with:

  • Hot Cache: Zero-latency access to frequently-used patterns (auto-injected via MCP resource)
  • Cold Storage: Semantic search with confidence gating
  • Pattern Mining: Automatic extraction from output logs with frequency-based promotion

Security Note: This server is designed for local use only. It runs unauthenticated over STDIO transport and should not be exposed to networks or untrusted clients.

Architecture

graph TD
    subgraph "Hot Cache (Zero Latency)"
        H[System Prompt Resource] --> A[High-freq project facts]
        H --> B[Mined code patterns]
    end

    subgraph "Cold Storage (Tool Call)"
        C[Vector store] --> D[Semantic search]
    end

    subgraph "Mining Pipeline"
        F[Output logger] --> G[7-day rolling window]
        G --> I[Pattern extractor]
        I --> J[Frequency counter]
        J -->|Threshold reached| A
    end

    Claude --> H
    Claude --> D
    Claude --> F

Requirements

  • Python 3.10+
  • uv package manager

Dependencies

Core dependencies (installed automatically):

  • fastmcp>=2.0,<3 - MCP server framework
  • sqlite-vec>=0.1 - Vector similarity search extension
  • sentence-transformers>=3.0 - Embedding model
  • pydantic>=2.0 / pydantic-settings>=2.0 - Configuration
  • loguru>=0.7 - Logging

First Run

On first run, the embedding model (~90MB) downloads automatically from Hugging Face. This may add 30-60 seconds to initial startup depending on your connection.

Installation

# Clone the repository
git clone https://github.com/michael-denyer/memory-mcp.git
cd memory-mcp

# Install dependencies
uv sync

# Run tests to verify installation
uv run pytest

# Optional: Pre-download the embedding model
uv run python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"

Claude Code Integration

Add to your Claude Code settings (~/.claude.json or project .claude/settings.json):

{
  "mcpServers": {
    "memory": {
      "command": "uv",
      "args": ["run", "--directory", "<path-to-memory-mcp>", "memory-mcp"]
    }
  }
}

Replace <path-to-memory-mcp> with the absolute path where you cloned the repository.

Restart Claude Code, then verify with /mcp - you should see the memory server's tools.

Tools

Cold Storage (Manual)

  • remember(content, memory_type, tags) - Store a memory
  • recall(query, limit, threshold) - Semantic search with confidence gating
  • recall_by_tag(tag) - Filter by tag
  • forget(memory_id) - Delete a memory
  • list_memories(limit, offset, memory_type) - Browse memories
  • memory_stats() - Get statistics

Hot Cache

  • hot_cache_status() - Show hot cache contents
  • promote(memory_id) - Manually promote to hot cache
  • demote(memory_id) - Remove from hot cache

Mining

  • log_output(content) - Log output for mining
  • run_mining(hours) - Extract patterns from logs
  • mining_status() - Show mining statistics
  • review_candidates() - Review patterns ready for promotion
  • approve_candidate(pattern_id) - Approve and promote pattern
  • reject_candidate(pattern_id) - Reject pattern
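The mining flow described above (log, extract, count, promote at a threshold) can be sketched as a frequency counter over recent output logs. This is a minimal illustration only: `extract_patterns` and `mine_candidates` are hypothetical names, and the real extractor is far more selective than this line-prefix heuristic.

```python
from collections import Counter

PROMOTION_THRESHOLD = 3  # matches the default in Configuration below

def extract_patterns(log_lines: list[str]) -> list[str]:
    """Toy extractor: treat code-looking lines as candidate patterns.
    (Illustrative only; the server's extractor is more sophisticated.)"""
    return [line.strip() for line in log_lines
            if line.strip().startswith(("def ", "class ", "import "))]

def mine_candidates(log_lines: list[str]) -> list[str]:
    """Return patterns seen at least PROMOTION_THRESHOLD times --
    the kind of result review_candidates() would surface."""
    counts = Counter(extract_patterns(log_lines))
    return [pattern for pattern, n in counts.items()
            if n >= PROMOTION_THRESHOLD]
```

A pattern that recurs across the retention window accumulates counts until it crosses the threshold and becomes a promotion candidate.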

Maintenance

  • db_info() - Get database info (path, size, schema version)
  • db_maintenance() - Run vacuum and analyze, reclaim space
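Roughly, `db_maintenance()` corresponds to SQLite's standard housekeeping commands: `VACUUM` rebuilds the file to reclaim space left by deleted rows, and `ANALYZE` refreshes the query planner's statistics. A minimal sketch (the server's own implementation may differ):

```python
import sqlite3

def db_maintenance(db_path: str) -> None:
    """Reclaim space and refresh planner statistics.
    VACUUM must run outside any open transaction."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("VACUUM")
        conn.execute("ANALYZE")
    finally:
        conn.close()
```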

Memory Types

  • project - Project-specific facts (architecture, conventions)
  • pattern - Reusable code patterns
  • reference - External docs, API notes
  • conversation - Facts from discussions

Confidence Gating

The recall tool returns results with confidence levels:

  • high (similarity > 0.85) - Use directly
  • medium (similarity 0.7 - 0.85) - Verify context
  • low (similarity < 0.7) - Reason from scratch

Default threshold is 0.7 (configurable via DEFAULT_CONFIDENCE_THRESHOLD).
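The gating logic above amounts to bucketing a similarity score and filtering out anything below the minimum threshold. A sketch with illustrative names (not the server's actual internals):

```python
HIGH_CONFIDENCE_THRESHOLD = 0.85     # defaults from Configuration below
DEFAULT_CONFIDENCE_THRESHOLD = 0.7

def confidence_level(similarity: float) -> str:
    """Map a similarity score to a confidence bucket."""
    if similarity > HIGH_CONFIDENCE_THRESHOLD:
        return "high"    # use the memory directly
    if similarity >= DEFAULT_CONFIDENCE_THRESHOLD:
        return "medium"  # verify against current context first
    return "low"         # below the gate: reason from scratch

def gate(results: list[tuple[str, float]],
         threshold: float = DEFAULT_CONFIDENCE_THRESHOLD):
    """Drop results below the minimum similarity, label the rest."""
    return [(content, score, confidence_level(score))
            for content, score in results if score >= threshold]
```

With the default threshold of 0.7, "low" results never reach the client at all; lowering the threshold per call trades precision for recall.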

Hot Cache Resource

The server exposes memory://hot-cache as an MCP resource. Configure Claude Code to auto-include this resource for zero-latency access to frequently-used knowledge.

Configuration

Environment variables (prefix MEMORY_MCP_):

Database & Storage

  • DB_PATH (default: ~/.memory-mcp/memory.db) - SQLite database location

Embeddings

  • EMBEDDING_MODEL (default: sentence-transformers/all-MiniLM-L6-v2) - Sentence transformer model
  • EMBEDDING_DIM (default: 384) - Embedding vector dimension (must match model)

Warning: Changing EMBEDDING_DIM after creating memories will cause retrieval failures. Delete the database or migrate if changing models.

Hot Cache

  • HOT_CACHE_MAX_ITEMS (default: 20) - Maximum items in hot cache
  • PROMOTION_THRESHOLD (default: 3) - Access count for auto-promotion
  • DEMOTION_DAYS (default: 14) - Days without access before demotion
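These three knobs interact in a simple way: frequently recalled memories are promoted once the cache has room, and stale entries are demoted to make room again. A sketch of that policy, assuming the server tracks per-memory access counts and last-access timestamps (illustrative logic, not the actual implementation):

```python
from datetime import datetime, timedelta

HOT_CACHE_MAX_ITEMS = 20
PROMOTION_THRESHOLD = 3
DEMOTION_DAYS = 14

def should_promote(access_count: int, hot_cache_size: int) -> bool:
    """Auto-promote once a memory is recalled often enough,
    provided the hot cache has room."""
    return (access_count >= PROMOTION_THRESHOLD
            and hot_cache_size < HOT_CACHE_MAX_ITEMS)

def should_demote(last_accessed: datetime, now: datetime) -> bool:
    """Demote entries untouched for longer than DEMOTION_DAYS."""
    return now - last_accessed > timedelta(days=DEMOTION_DAYS)
```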

Mining

  • MINING_ENABLED (default: true) - Enable pattern mining
  • LOG_RETENTION_DAYS (default: 7) - Days to retain output logs

Retrieval

  • DEFAULT_RECALL_LIMIT (default: 5) - Default results per recall
  • DEFAULT_CONFIDENCE_THRESHOLD (default: 0.7) - Minimum similarity for results
  • HIGH_CONFIDENCE_THRESHOLD (default: 0.85) - Threshold for high confidence

Input Limits

  • MAX_CONTENT_LENGTH (default: 100000) - Max characters per memory/log
  • MAX_RECALL_LIMIT (default: 100) - Max results per recall query
  • MAX_TAGS (default: 20) - Max tags per memory
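Limits like these are typically enforced before input reaches storage. A minimal sketch of such a check (hypothetical helper; the server may validate differently, e.g. via Pydantic models):

```python
MAX_CONTENT_LENGTH = 100_000
MAX_TAGS = 20

def validate_memory(content: str, tags: list[str]) -> None:
    """Reject oversized input before it is embedded and stored."""
    if len(content) > MAX_CONTENT_LENGTH:
        raise ValueError(f"content exceeds {MAX_CONTENT_LENGTH} characters")
    if len(tags) > MAX_TAGS:
        raise ValueError(f"more than {MAX_TAGS} tags")
```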

Data Persistence

Database Location

By default, the SQLite database is stored at ~/.memory-mcp/memory.db. The directory is created automatically on first run.

Backups

# Backup the database
cp ~/.memory-mcp/memory.db ~/.memory-mcp/memory.db.backup

# Restore from backup
cp ~/.memory-mcp/memory.db.backup ~/.memory-mcp/memory.db
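Note that a plain cp can capture an inconsistent snapshot if the server is writing at that moment. Python's standard library exposes SQLite's online backup API, which copies the database safely even while it is in use; a small sketch (hypothetical helper name):

```python
import sqlite3

def backup_db(src_path: str, dest_path: str) -> None:
    """Consistent online backup via SQLite's backup API --
    safe even if the server holds the source database open."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)
    finally:
        dest.close()
        src.close()

# Example:
# backup_db("~/.memory-mcp/memory.db", "~/.memory-mcp/memory.db.backup")
# (expand ~ first, e.g. with os.path.expanduser)
```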

Changing Embedding Models

If you change EMBEDDING_MODEL or EMBEDDING_DIM, existing embeddings become incompatible. Options:

  1. Delete and rebuild (recommended for small datasets):

    rm ~/.memory-mcp/memory.db
    # Re-add memories after restart
    
  2. Use a separate database:

    export MEMORY_MCP_DB_PATH=~/.memory-mcp/memory-new-model.db
    

Development

Running Tests

uv run pytest -v

Running with Debug Logging

The server logs to stderr (required for STDIO MCP transport):

# Run directly with visible logs
uv run memory-mcp 2>&1 | head -50

Resource Usage

  • Disk: ~1-10 MB typical (depends on memory count)
  • Memory: ~200-400 MB (embedding model loaded in memory)
  • Startup: 2-5 seconds (after model is cached)

Example Usage

You: "Remember that this project uses PostgreSQL with pgvector"
Claude: [calls remember(..., memory_type="project")]
→ Stored as memory #1

You: "What database do we use?"
Claude: [calls recall("database configuration")]
→ {confidence: "high", memories: [{content: "This project uses PostgreSQL..."}]}

You: "Promote that to hot cache"
Claude: [calls promote(1)]
→ Memory #1 now in hot cache - zero latency access

License

MIT
