Recall

A long-term memory system featuring semantic search, relationship tracking, and namespace isolation to provide persistent context for AI assistants.

Tools: 6 | Updated: Jan 14, 2026

Long-term memory system for MCP-compatible AI assistants with semantic search and relationship tracking.

Features

  • Persistent Memory Storage: Store preferences, decisions, patterns, and session context
  • Semantic Search: Find relevant memories using natural language queries via ChromaDB vectors
  • Memory Relationships: Create edges between memories (supersedes, relates_to, caused_by, contradicts)
  • Namespace Isolation: Keep global memories separate from project-scoped memories
  • Context Generation: Auto-format memories for session context injection
  • Deduplication: Content-hash based duplicate detection
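
Content-hash deduplication can be sketched as follows. This is a minimal illustration, not Recall's actual implementation: the whitespace normalization, the use of SHA-256, the 16-character truncation, and the names content_hash and DedupIndex are all assumptions made for the example.

```python
import hashlib


def content_hash(content: str) -> str:
    """Return a short, stable hash for a memory's content.

    Assumption: whitespace is collapsed and SHA-256 is truncated to 16 hex
    characters. Recall's real hashing scheme may differ.
    """
    normalized = " ".join(content.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


class DedupIndex:
    """Hypothetical in-memory index mapping content hashes to memory IDs."""

    def __init__(self) -> None:
        self._by_hash: dict[str, str] = {}

    def add(self, memory_id: str, content: str) -> tuple[str, bool]:
        """Return (memory_id, is_new). Duplicates resolve to the first ID stored."""
        key = content_hash(content)
        if key in self._by_hash:
            return self._by_hash[key], False
        self._by_hash[key] = memory_id
        return memory_id, True
```

With this sketch, storing the same text twice (even with different spacing) returns the ID of the first copy instead of creating a second entry.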

Installation

# Clone the repository
git clone https://github.com/yourorg/recall.git
cd recall

# Install with uv
uv sync

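# Start the Ollama server if it is not already running
# (`ollama serve` blocks, so run it in a separate terminal)
ollama serve

# Pull the models (requires the server to be running)
ollama pull mxbai-embed-large  # Required: embeddings for semantic search
ollama pull llama3.2           # Optional: session summarization for auto-capture hook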

Usage

Run as MCP Server

uv run python -m recall

CLI Options

uv run python -m recall --help

Options:
  --sqlite-path PATH      SQLite database path (default: ~/.recall/recall.db)
  --chroma-path PATH      ChromaDB storage path (default: ~/.recall/chroma_db)
  --collection NAME       ChromaDB collection name (default: memories)
  --ollama-host HOST      Ollama server URL (default: http://localhost:11434)
  --ollama-model MODEL    Embedding model (default: mxbai-embed-large)
  --ollama-timeout SECS   Request timeout (default: 30)
  --log-level LEVEL       DEBUG, INFO, WARNING, ERROR, CRITICAL (default: INFO)

meta-mcp Configuration

Add Recall to your meta-mcp servers.json:

{
  "recall": {
    "command": "uv",
    "args": [
      "run",
      "--directory",
      "/path/to/recall",
      "python",
      "-m",
      "recall"
    ],
    "env": {
      "RECALL_LOG_LEVEL": "INFO",
      "RECALL_OLLAMA_HOST": "http://localhost:11434",
      "RECALL_OLLAMA_MODEL": "mxbai-embed-large"
    },
    "description": "Long-term memory system with semantic search",
    "tags": ["memory", "context", "semantic-search"]
  }
}

Or for Claude Code / other MCP clients (claude.json):

{
  "mcpServers": {
    "recall": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/recall",
        "python",
        "-m",
        "recall"
      ],
      "env": {
        "RECALL_LOG_LEVEL": "INFO"
      }
    }
  }
}

Environment Variables

Variable                      Default                  Description
RECALL_SQLITE_PATH            ~/.recall/recall.db      SQLite database file path
RECALL_CHROMA_PATH            ~/.recall/chroma_db      ChromaDB persistent storage directory
RECALL_COLLECTION_NAME        memories                 ChromaDB collection name
RECALL_OLLAMA_HOST            http://localhost:11434   Ollama server URL
RECALL_OLLAMA_MODEL           mxbai-embed-large        Embedding model name
RECALL_OLLAMA_TIMEOUT         30                       Ollama request timeout in seconds
RECALL_LOG_LEVEL              INFO                     Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
RECALL_DEFAULT_NAMESPACE      global                   Default namespace for memories
RECALL_DEFAULT_IMPORTANCE     0.5                      Default importance score (0.0-1.0)
RECALL_DEFAULT_TOKEN_BUDGET   4000                     Default token budget for context
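
As an illustration of how these variables and defaults fit together, the sketch below reads them into a typed settings object. Only the variable names and default values come from the list above. RecallSettings and load_settings are hypothetical names, and Recall's own configuration loader may work differently.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class RecallSettings:
    """Hypothetical container for the RECALL_* settings."""

    sqlite_path: Path
    chroma_path: Path
    collection_name: str
    ollama_host: str
    ollama_model: str
    ollama_timeout: float
    log_level: str
    default_namespace: str
    default_importance: float
    default_token_budget: int


def load_settings(env: Mapping[str, str] | None = None) -> RecallSettings:
    """Build settings from an environment mapping, falling back to the documented defaults."""
    source = os.environ if env is None else env

    def get(name: str, default: str) -> str:
        return source.get(f"RECALL_{name}", default)

    def get_path(name: str, default: str) -> Path:
        # Expand "~" so the defaults resolve inside the user's home directory.
        return Path(os.path.expanduser(get(name, default)))

    importance = float(get("DEFAULT_IMPORTANCE", "0.5"))
    if not 0.0 <= importance <= 1.0:
        raise ValueError("RECALL_DEFAULT_IMPORTANCE must be between 0.0 and 1.0")

    return RecallSettings(
        sqlite_path=get_path("SQLITE_PATH", "~/.recall/recall.db"),
        chroma_path=get_path("CHROMA_PATH", "~/.recall/chroma_db"),
        collection_name=get("COLLECTION_NAME", "memories"),
        ollama_host=get("OLLAMA_HOST", "http://localhost:11434"),
        ollama_model=get("OLLAMA_MODEL", "mxbai-embed-large"),
        ollama_timeout=float(get("OLLAMA_TIMEOUT", "30")),
        log_level=get("LOG_LEVEL", "INFO").upper(),
        default_namespace=get("DEFAULT_NAMESPACE", "global"),
        default_importance=importance,
        default_token_budget=int(get("DEFAULT_TOKEN_BUDGET", "4000")),
    )
```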

MCP Tool Examples

memory_store_tool

Store a new memory with semantic indexing. Uses the fast daemon path when available (<10ms) and falls back to synchronous embedding otherwise.

{
  "content": "User prefers dark mode in all applications",
  "memory_type": "preference",
  "namespace": "global",
  "importance": 0.8,
  "metadata": {"source": "explicit_request"}
}

Response (fast path via daemon):

{
  "success": true,
  "queued": true,
  "queue_id": 42,
  "namespace": "global"
}

Response (sync path fallback):

{
  "success": true,
  "queued": false,
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "content_hash": "a1b2c3d4e5f67890"
}
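
A caller can tell the two response shapes apart by the queued field. The sketch below is one way to handle both. The field names are taken from the examples above, while StoreResult and parse_store_response are hypothetical helpers and not part of Recall.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class StoreResult:
    """Hypothetical normalized view of a memory_store_tool response."""

    queued: bool
    reference: str  # queue ID (fast path) or memory ID (sync path), as text


def parse_store_response(response: dict[str, Any]) -> StoreResult:
    """Normalize either response shape, raising if the store did not succeed."""
    if not response.get("success"):
        raise RuntimeError(f"memory_store_tool failed: {response}")
    if response.get("queued"):
        # Fast path: the example response carries a queue_id, not a memory id.
        return StoreResult(queued=True, reference=str(response["queue_id"]))
    # Sync path: the memory id is available immediately.
    return StoreResult(queued=False, reference=response["id"])
```

Because the fast-path example returns a queue_id rather than a memory id, code that needs the id right away (for example, to pass to memory_relate_tool) cannot take it from that response.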

daemon_status_tool

Check if the recall daemon is running:

{}

Response:

{
  "running": true,
  "status": {
    "pid": 12345,
    "store_queue": {"pending_count": 5},
    "embed_worker_running": true
  }
}

memory_recall_tool

Search memories by semantic similarity:

{
  "query": "user interface preferences",
  "n_results": 5,
  "namespace": "global",
  "memory_type": "preference",
  "min_importance": 0.5,
  "include_related": true
}

Response:

{
  "success": true,
  "memories": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "content": "User prefers dark mode in all applications",
      "type": "preference",
      "namespace": "global",
      "importance": 0.8,
      "created_at": "2024-01-15T10:30:00",
      "accessed_at": "2024-01-15T14:22:00",
      "access_count": 3
    }
  ],
  "total": 1,
  "score": 0.92
}

memory_relate_tool

Create a relationship between memories:

{
  "source_id": "mem_new_123",
  "target_id": "mem_old_456",
  "relation": "supersedes",
  "weight": 1.0
}

Response:

{
  "success": true,
  "edge_id": 42
}
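
Before sending a request, a client may want to check the relation name against the four types listed under Features. The helper below is a hypothetical sketch: build_relate_request is not part of Recall, and rejecting self-referencing edges is an assumption, not documented behavior.

```python
from __future__ import annotations

from typing import Any

# Relation types named in the Features list.
RELATION_TYPES = frozenset({"supersedes", "relates_to", "caused_by", "contradicts"})


def build_relate_request(
    source_id: str, target_id: str, relation: str, weight: float = 1.0
) -> dict[str, Any]:
    """Build the argument payload for memory_relate_tool after basic validation."""
    if relation not in RELATION_TYPES:
        allowed = ", ".join(sorted(RELATION_TYPES))
        raise ValueError(f"unknown relation {relation!r}; expected one of: {allowed}")
    if source_id == target_id:
        # Assumption: an edge from a memory to itself is not meaningful.
        raise ValueError("source_id and target_id must differ")
    return {
        "source_id": source_id,
        "target_id": target_id,
        "relation": relation,
        "weight": weight,
    }
```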

memory_context_tool

Generate formatted context for session injection:

{
  "query": "coding style preferences",
  "project": "myproject",
  "token_budget": 4000
}

Response:

{
  "success": true,
  "context": "# Memory Context\n\n## Preferences\n\n- User prefers dark mode [global]\n- Use 2-space indentation [project:myproject]\n\n## Recent Decisions\n\n- Decided to use FastAPI for the backend [project:myproject]\n",
  "token_estimate": 125
}
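
To illustrate how a token budget can bound the generated context, the sketch below adds memories in order of importance until the next one would push the estimate over budget. It is not Recall's algorithm: the 4-characters-per-token estimate, the importance ordering, the section headings, and the names estimate_tokens, render, and build_context are all assumptions.

```python
from __future__ import annotations

from typing import Any


def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming about 4 characters per token."""
    return max(1, len(text) // 4)


def render(memories: list[dict[str, Any]]) -> str:
    """Group memories by type and format them as a Markdown-style context block."""
    sections: dict[str, list[str]] = {}
    for mem in memories:
        line = f"- {mem['content']} [{mem['namespace']}]"
        sections.setdefault(mem["type"], []).append(line)
    lines = ["# Memory Context", ""]
    for mem_type, items in sections.items():
        lines += [f"## {mem_type.title()}s", "", *items, ""]
    return "\n".join(lines)


def build_context(
    memories: list[dict[str, Any]], token_budget: int = 4000
) -> tuple[str, int]:
    """Return (context, token_estimate), keeping the estimate within the budget."""
    selected: list[dict[str, Any]] = []
    for mem in sorted(memories, key=lambda m: m["importance"], reverse=True):
        # Stop at the first memory that would push the rendered text over budget.
        if estimate_tokens(render(selected + [mem])) > token_budget:
            break
        selected.append(mem)
    context = render(selected)
    return context, estimate_tokens(context)
```

Re-rendering on every step is quadratic in the number of memories, which is acceptable for a sketch but would need a running total for large memory sets.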

memory_forget_tool

Delete memories by ID or semantic search:

{
  "memory_id": "550e8400-e29b-41d4-a716-446655440000",
  "confirm": true
}

Or delete by search:

{
  "query": "outdated preferences",
  "namespace": "project:oldproject",
  "n_results": 10,
  "confirm": true
}

Response:

{
  "success": true,
  "deleted_ids": ["550e8400-e29b-41d4-a716-446655440000"],
  "deleted_count": 1
}
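
The two deletion modes can be wrapped in a small client-side helper that refuses to build a request without explicit confirmation. This is a hypothetical sketch: build_forget_request is not part of Recall, and treating memory_id and query as mutually exclusive is an assumption based on the two examples above.

```python
from __future__ import annotations

from typing import Any


def build_forget_request(
    *,
    memory_id: str | None = None,
    query: str | None = None,
    namespace: str | None = None,
    n_results: int = 10,
    confirm: bool = False,
) -> dict[str, Any]:
    """Build the argument payload for memory_forget_tool."""
    if (memory_id is None) == (query is None):
        raise ValueError("provide exactly one of memory_id or query")
    if not confirm:
        # Guard against accidental deletion on the client side.
        raise ValueError("deletion requires confirm=True")
    if memory_id is not None:
        return {"memory_id": memory_id, "confirm": True}
    request: dict[str, Any] = {"query": query, "n_results": n_results, "confirm": True}
    if namespace is not None:
        request["namespace"] = namespace
    return request
```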

Architecture

┌───────────────────────────────────────────────────────────────┐
│                      MCP Server (FastMCP)                      │
│  memory_store │ memory_recall │ memory_relate │ memory_forget │
└───────────────────────────┬───────────────────────────────────┘
                            │
              ┌─────────────┴─────────────┐
              │                           │
    ┌─────────▼─────────┐       ┌─────────▼─────────┐
    │   FAST PATH       │       │   SYNC PATH       │
    │   <10ms           │       │   10-60s          │
    └─────────┬─────────┘       └─────────┬─────────┘
              │                           │
    ┌─────────▼─────────┐       ┌─────────▼─────────┐
    │  recall-daemon    │       │   HybridStore     │
    │  (Unix socket)    │       │ (Direct Ollama)   │
    │                   │       └─────────┬─────────┘
    │  ┌─────────────┐  │                 │
    │  │ StoreQueue  │  │     ┌───────────┼───────────┐
    │  │ EmbedWorker │  │     │           │           │
    │  └─────────────┘  │     │           │           │
    └─────────┬─────────┘   ┌─▼─────┐ ┌───▼───┐ ┌─────▼─────┐
              │             │SQLite │ │Chroma │ │  Ollama   │
              └─────────────►Store  │ │ Store │ │  Client   │
                            └───────┘ └───────┘ └───────────┘

The daemon provides fast (<10ms) memory storage by queueing operations and processing embeddings asynchronously. When the daemon is unavailable, the MCP server falls back to synchronous embedding (10-60s).
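
The fast-path and fallback behavior described above can be sketched as a try-then-fall-back wrapper. This is an illustration under stated assumptions, not the server's code: store_with_fallback is a hypothetical name, the two callables stand in for the daemon client and the synchronous store, and the set of exceptions treated as "daemon unavailable" is a guess.

```python
from __future__ import annotations

from typing import Any, Callable

StoreFn = Callable[[dict[str, Any]], dict[str, Any]]


def store_with_fallback(
    memory: dict[str, Any], enqueue: StoreFn, store_sync: StoreFn
) -> dict[str, Any]:
    """Try the daemon queue first, then fall back to synchronous storage."""
    try:
        # Fast path: hand the memory to the daemon and return immediately.
        result = enqueue(memory)
        return {"success": True, "queued": True, **result}
    except (ConnectionError, FileNotFoundError, TimeoutError):
        # Assumed failure modes: refused connection, missing socket file, timeout.
        # Sync path: embed and store in-process, which blocks until done.
        result = store_sync(memory)
        return {"success": True, "queued": False, **result}
```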

Daemon Setup (macOS)

On macOS the recall daemon runs as a launchd agent. Without the daemon, each store operation blocks for 10-60 seconds while waiting for Ollama embeddings.

Quick Install

# From the recall directory
./hooks/install-daemon.sh

This will:

  1. Copy hook scripts to ~/.claude/hooks/
  2. Install the launchd plist to ~/Library/LaunchAgents/
  3. Start the daemon automatically

Manual Install

# 1. Copy hook scripts
cp hooks/recall*.py ~/.claude/hooks/
chmod +x ~/.claude/hooks/recall*.py

# 2. Create logs directory
mkdir -p ~/.claude/hooks/logs

# 3. Install plist with path substitution
sed "s|{{HOME}}|$HOME|g; s|{{RECALL_DIR}}|$(pwd)|g" \
  hooks/com.recall.daemon.plist.template > ~/Library/LaunchAgents/com.recall.daemon.plist

# 4. Load the daemon
launchctl load ~/Library/LaunchAgents/com.recall.daemon.plist

Daemon Commands

# Check status
echo '{"cmd": "status"}' | nc -U /tmp/recall-daemon.sock | jq

# Stop daemon
launchctl unload ~/Library/LaunchAgents/com.recall.daemon.plist

# Start daemon
launchctl load ~/Library/LaunchAgents/com.recall.daemon.plist

# View logs
tail -f ~/.claude/hooks/logs/recall-daemon.log
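
The status check can also be made from Python. The sketch below mirrors the nc command above. The socket path and the {"cmd": "status"} request come from that command. The function name daemon_status is hypothetical, and the framing is an assumption: the request is sent newline-terminated, and the daemon is expected to reply with a single JSON document and then close the connection.

```python
from __future__ import annotations

import json
import socket
from typing import Any

SOCKET_PATH = "/tmp/recall-daemon.sock"


def daemon_status(path: str = SOCKET_PATH, timeout: float = 2.0) -> dict[str, Any] | None:
    """Return the daemon's status reply, or None if the daemon is unreachable."""
    chunks: list[bytes] = []
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(path)
            # Same payload as: echo '{"cmd": "status"}' | nc -U <path>
            sock.sendall(json.dumps({"cmd": "status"}).encode("utf-8") + b"\n")
            try:
                while chunk := sock.recv(4096):
                    chunks.append(chunk)
            except socket.timeout:
                pass  # Daemon kept the connection open; use what arrived so far.
    except OSError:
        # Missing socket file, refused connection, or similar.
        return None
    return json.loads(b"".join(chunks)) if chunks else None
```

A None result means the daemon could not be reached, in which case store operations take the synchronous path.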

Hooks Configuration

Add recall hooks to your Claude Code settings (~/.claude/settings.json). See hooks/settings.example.json for the full configuration.

Development

# Install dev dependencies
uv sync --dev

# Run tests
uv run pytest tests/

# Run tests with coverage
uv run pytest tests/ --cov=recall --cov-report=html

# Type checking
uv run mypy src/recall

# Run specific integration tests
uv run pytest tests/integration/test_mcp_server.py -v

Requirements

  • Python 3.13+
  • Ollama with:
    • mxbai-embed-large model (required for semantic search)
    • llama3.2 model (optional, for session auto-capture hook)
  • ~500MB disk space for ChromaDB indices

License

MIT
