
Nexus-MCP

The only MCP server with hybrid search + code graph + semantic memory — fully local.

Nexus-MCP is a unified, local-first code intelligence server built for the Model Context Protocol. It combines vector search, BM25 keyword search, and structural graph analysis into a single process — giving AI agents precise, token-efficient code understanding without cloud dependencies.


Why Nexus-MCP?

AI coding agents waste tokens. A lot of them. Every time an agent reads full files to find a function, grep-searches for keywords that miss semantic intent, or makes multiple tool calls across disconnected servers — tokens burn. Nexus-MCP fixes this.

Token Efficiency: The Numbers

| Scenario | Without Nexus | With Nexus | Savings |
|---|---|---|---|
| Find relevant code (agent reads 5–10 files manually) | 5,000–15,000 tokens | 500–2,000 tokens (summary mode) | 70–90% |
| Understand a symbol (grep + read file + read callers) | 3,000–8,000 tokens across 3–5 tool calls | 800–2,000 tokens in 1 explain call | 60–75% |
| Assess change impact (manual trace through codebase) | 10,000–20,000 tokens | 1,000–3,000 tokens via impact tool | 80–85% |
| Tool descriptions in context (2 separate MCP servers) | ~1,700 tokens (17 tools) | ~1,000 tokens (15 consolidated) | 40% |
| Search precision (keyword-only misses, needs retries) | 2–3 searches × 2,000 tokens | 1 hybrid search × 1,500 tokens | 60–75% |

Estimated savings per coding session: 15,000–40,000 tokens (30–60% reduction) compared to standalone agentic file browsing.

Three Verbosity Levels

Every tool respects a token budget — agents request only the detail they need:

| Level | Budget | What's Returned | Use Case |
|---|---|---|---|
| summary | ~500 tokens | Counts, scores, file:line pointers | Quick lookups, triage |
| detailed | ~2,000 tokens | Signatures, types, line ranges, docstrings | Normal development |
| full | ~8,000 tokens | Full code snippets, relationships, metadata | Deep analysis |
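The budget mechanism can be pictured as a truncation pass over ranked results. The sketch below is illustrative only — `estimate_tokens`, `truncate_to_budget`, and the 4-characters-per-token heuristic are assumptions for demonstration, not Nexus-MCP's actual code:

```python
# Hypothetical sketch of token-budgeted result truncation.
# Budgets mirror the table above; function names are assumptions.

BUDGETS = {"summary": 500, "detailed": 2000, "full": 8000}

def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token."""
    return max(1, len(text) // 4)

def truncate_to_budget(results: list[str], verbosity: str) -> list[str]:
    """Keep ranked results in order until the verbosity budget is exhausted."""
    budget = BUDGETS[verbosity]
    kept, used = [], 0
    for snippet in results:
        cost = estimate_tokens(snippet)
        if used + cost > budget:
            break
        kept.append(snippet)
        used += cost
    return kept

results = ["x" * 800, "y" * 800, "z" * 800]  # ~200 tokens each
print(len(truncate_to_budget(results, "summary")))   # 2 fit under 500
print(len(truncate_to_budget(results, "detailed")))  # all 3 fit
```

The key property is that higher-ranked results always survive truncation, so the summary level degrades gracefully rather than randomly.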

vs. Standalone Agentic Development (No Code MCP)

Without a code intelligence server, AI agents must:

  • Read entire files to find one function (~500–2,000 tokens/file, often 5–10 files per query)
  • Grep for keywords that miss semantic intent ("auth" won't find "verify_credentials")
  • Manually trace call chains by reading file after file
  • Lose all context between sessions — no persistent memory

Nexus-MCP replaces this with targeted retrieval: semantic search returns the exact chunks needed, graph queries trace relationships instantly, and memory persists across sessions.

vs. Competitor MCP Servers

| Feature | Nexus-MCP | Sourcegraph MCP | Greptile MCP | GitHub MCP | tree-sitter MCP |
|---|---|---|---|---|---|
| Local / private | Yes | No (infra required) | No (cloud) | No (cloud) | Yes |
| Semantic search | Yes (embeddings) | No (keyword) | Yes (LLM-based) | No (keyword) | No |
| Keyword search | Yes (BM25) | Yes | N/A | Yes | No |
| Hybrid fusion | Yes (RRF) | No | No | No | No |
| Code graph | Yes (rustworkx) | Yes (SCIP) | No | No | No |
| Re-ranking | Yes (FlashRank) | No | N/A | No | No |
| Semantic memory | Yes (6 types) | No | No | No | No |
| Change impact | Yes | Partial | No | No | No |
| Token budgeting | Yes (3 levels) | No | No | No | No |
| Languages | 25+ | 30+ | Many | Many | Many |
| Cost | Free | $$$ | $40/mo | $10–39/mo | Free |
| API keys needed | No | Yes | Yes | Yes | No |

vs. AI Code Tools (Cursor, Copilot, Cody, etc.)

| Capability | Nexus-MCP | Cursor | Copilot @workspace | Sourcegraph Cody | Continue.dev | Aider |
|---|---|---|---|---|---|---|
| IDE-agnostic | Yes | No | No | No | No | Yes |
| MCP-native | Yes | Partial | No | No | Yes (client) | No |
| Fully local | Yes | Partial | No | Partial | Yes | Yes |
| Hybrid search | Yes | Unknown | Unknown | Keyword | Yes | No |
| Code graph | Yes | Unknown | Unknown | Yes (SCIP) | Basic | No |
| Semantic memory | Yes (persistent) | No | No | No | No | No |
| Token-budgeted responses | Yes | N/A | N/A | N/A | N/A | N/A |
| Open source | Yes (MIT) | No | No | Partial | Yes | Yes |
| Cost | Free | $20–40/mo | $10–39/mo | $0–49/mo | Free | Free |

Nexus-MCP's unique combination: No other tool delivers hybrid search + code graph + semantic memory + token budgeting + full privacy in a single MCP server.


Key Features

  • Hybrid search — Vector (semantic) + BM25 (keyword) + graph (structural) fused via Reciprocal Rank Fusion, then re-ranked with FlashRank
  • Code graph — Structural analysis via rustworkx: callers, callees, imports, inheritance, change impact
  • Dual parsing — tree-sitter (symbol extraction) + ast-grep (structural relationships), 25+ languages
  • Semantic memory — Persistent knowledge store with TTL expiration, 6 memory types, semantic recall
  • Explain & Impact — "What does this do?" and "What breaks if I change it?" in single tool calls
  • Token-budgeted responses — Three verbosity levels (summary/detailed/full) keep context windows lean
  • Multi-folder indexing — Index multiple directories in one call, processed folder-by-folder with shared engines
  • Incremental indexing — Only re-processes changed files; file watcher support
  • Multi-model embeddings — 2 models (jina-code default, bge-small-en), GPU/MPS auto-detection
  • Low memory — <350MB RAM target (ONNX Runtime ~50MB, mmap vectors, lazy model loading)
  • Fully local — Zero cloud dependencies, no API keys, all processing on your machine
  • 15 tools, one server — Consolidates what previously required 2 MCP servers (17 tools) into one
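The incremental-indexing idea above can be sketched as content-hash change detection: only files whose digest differs from the last run are re-processed. The manifest format and function names below are assumptions for illustration, not Nexus-MCP's actual storage layout:

```python
# Minimal sketch of incremental re-indexing via content hashes.
# Manifest shape and helper names are illustrative assumptions.
import hashlib

def file_digest(content: bytes) -> str:
    """Stable fingerprint of a file's content."""
    return hashlib.sha256(content).hexdigest()

def changed_files(manifest: dict[str, str], files: dict[str, bytes]) -> list[str]:
    """Return paths that are new or whose digest differs from the manifest."""
    return [
        path for path, content in files.items()
        if manifest.get(path) != file_digest(content)
    ]

manifest = {"a.py": file_digest(b"def f(): pass")}
files = {"a.py": b"def f(): pass", "b.py": b"def g(): pass"}
print(changed_files(manifest, files))  # only b.py needs re-indexing
```

A file watcher then only needs to feed modified paths into the same check, so steady-state re-indexing cost scales with the size of the change, not the repository.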

Prerequisites

  • Python 3.10+ (tested on 3.10, 3.11, 3.12)
  • pip (comes with Python)

Install

Option 1: pip install from PyPI (recommended)

pip install nexus-mcp-ci

With optional extras:

# With GPU (CUDA) support
pip install nexus-mcp-ci[gpu]

# With FlashRank reranker for better search quality
pip install nexus-mcp-ci[reranker]

# Both
pip install nexus-mcp-ci[gpu,reranker]

Option 2: From source (for development)

git clone https://github.com/jaggernaut007/Nexus-MCP.git
cd Nexus-MCP

# Setup script (creates venv, installs, verifies)
./setup.sh

# Or manual install with dev deps
pip install -e ".[dev]"

Note: The default embedding model (jina-code) requires ONNX Runtime. This is included automatically. If you see errors about missing ONNX/Optimum, run:

pip install "sentence-transformers[onnx]" "optimum[onnxruntime]>=1.19.0"

To use a lighter model that doesn't need trust_remote_code, set NEXUS_EMBEDDING_MODEL=bge-small-en.

See the full Installation Guide for all options, MCP client integration, and troubleshooting.

Run

nexus-mcp

The server starts on stdio (the default MCP transport). Point your MCP client at the nexus-mcp command.

Add to Your MCP Client

Claude Code

# Basic setup
claude mcp add nexus-mcp-ci -- nexus-mcp-ci

# With a specific embedding model
claude mcp add nexus-mcp-ci -e NEXUS_EMBEDDING_MODEL=bge-small-en -- nexus-mcp-ci

Tip: If you installed in a virtual environment, use the full path so the MCP client finds the right Python:

claude mcp add nexus-mcp-ci -- /path/to/Nexus-MCP/.venv/bin/nexus-mcp-ci

Claude Desktop

Add to your config file (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "nexus-mcp-ci": {
      "command": "nexus-mcp-ci",
      "args": []
    }
  }
}

Cursor / Windsurf / Cline / Other MCP Clients

Add to your MCP client's server config:

{
  "nexus-mcp-ci": {
    "command": "nexus-mcp-ci",
    "transport": "stdio"
  }
}

See the full Installation Guide for client-specific instructions.

MCP Tools (15)

Core

| Tool | Description |
|---|---|
| status | Server status, indexing stats, memory usage, next-tool hints |
| health | Readiness/liveness probe (uptime, engine availability) |
| index | Index a codebase (full, incremental, or multi-folder) |
| search | Preferred over Grep/Glob. Semantic search returning code snippets, absolute paths, and scores |

Graph Analysis

| Tool | Description |
|---|---|
| find_symbol | Preferred over Grep for definitions — returns location, types, and call relationships |
| find_callers | Find all direct callers via call graph (more accurate than text search) |
| find_callees | Trace execution flow — all functions called by a given function |
| analyze | Code complexity, dependencies, smells, and quality metrics |
| impact | Use before refactoring. Transitive change impact analysis |
| explain | Preferred over Read for understanding symbols — graph + vector + analysis |
| overview | Preferred over Glob/ls. Project overview: files, languages, symbols, quality |
| architecture | Preferred over manual browsing. Layers, dependencies, entry points, hubs |

Memory

| Tool | Description |
|---|---|
| remember | Store a semantic memory with tags and TTL |
| recall | Search memories by semantic similarity |
| forget | Delete memories by ID, tags, or type |

Configuration

All settings can be overridden via NEXUS_ environment variables:

| Variable | Default | Description |
|---|---|---|
| NEXUS_STORAGE_DIR | .nexus | Storage directory for indexes |
| NEXUS_EMBEDDING_MODEL | jina-code | Embedding model (jina-code, bge-small-en) |
| NEXUS_EMBEDDING_DEVICE | auto | Device for embeddings: auto (CUDA > MPS > CPU), cuda, mps, cpu |
| NEXUS_MAX_FILE_SIZE_MB | 10 | Skip files larger than this |
| NEXUS_CHUNK_MAX_CHARS | 4000 | Max code snippet size per chunk |
| NEXUS_MAX_MEMORY_MB | 350 | Memory budget |
| NEXUS_SEARCH_MODE | hybrid | Search mode: hybrid, vector, or bm25 |
| NEXUS_FUSION_WEIGHT_VECTOR | 0.5 | Vector engine weight in RRF |
| NEXUS_FUSION_WEIGHT_BM25 | 0.3 | BM25 engine weight in RRF |
| NEXUS_FUSION_WEIGHT_GRAPH | 0.2 | Graph engine weight in RRF |
| NEXUS_LOG_LEVEL | INFO | Logging level |
| NEXUS_LOG_FORMAT | text | Log format: text or json |
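A settings object that reads these overrides might look like the sketch below, using the defaults from the table. The `Settings` dataclass and `from_env` helper are illustrative assumptions; Nexus-MCP's real configuration code may differ:

```python
# Sketch of NEXUS_-prefixed environment overrides (subset of the table).
# The Settings class is an assumption for illustration only.
import os
from dataclasses import dataclass

@dataclass
class Settings:
    storage_dir: str = ".nexus"
    embedding_model: str = "jina-code"
    search_mode: str = "hybrid"
    fusion_weight_vector: float = 0.5

    @classmethod
    def from_env(cls) -> "Settings":
        env = os.environ
        return cls(
            storage_dir=env.get("NEXUS_STORAGE_DIR", cls.storage_dir),
            embedding_model=env.get("NEXUS_EMBEDDING_MODEL", cls.embedding_model),
            search_mode=env.get("NEXUS_SEARCH_MODE", cls.search_mode),
            fusion_weight_vector=float(
                env.get("NEXUS_FUSION_WEIGHT_VECTOR", cls.fusion_weight_vector)
            ),
        )

os.environ["NEXUS_SEARCH_MODE"] = "vector"
print(Settings.from_env().search_mode)  # override applied: "vector"
print(Settings.from_env().storage_dir)  # unset, falls back to ".nexus"
```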

Self-Test Demo

Verify your installation by running the end-to-end demo that exercises all 15 tools:

python self_test/demo_mcp.py                  # Uses built-in sample project
python self_test/demo_mcp.py /path/to/project  # Or test against your own codebase

See self_test/README.md for details.

Development

pip install -e ".[dev]"     # Install with dev deps
pytest -v                   # Run tests (441 tests)
pytest -m "not slow"        # Skip performance benchmarks
ruff check .                # Lint
nexus-mcp-ci                # Run server

How It Works

search("how does auth work")
  |
  |-- vector_engine.search(query, n=30)    -- semantic similarity (embeddings)
  |-- bm25_engine.search(query, n=30)      -- keyword matching (exact terms)
  |-- graph_engine.boost(query, n=30)      -- structural relevance (callers/callees)
  |                                            |
  |              Reciprocal Rank Fusion (weights: 0.5 / 0.3 / 0.2)
  |                                            |
  |                        FlashRank re-ranking (top 20)
  |                                            |
  |                      Token budget truncation (summary/detailed/full)
  |                                            |
  v
  Top-N results, formatted to verbosity level
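The fusion step in the diagram can be sketched as weighted Reciprocal Rank Fusion over the three engines' ranked lists, with the default 0.5/0.3/0.2 weights. The engine outputs below are made up for illustration, and k=60 is the conventional RRF constant (an assumption, not a documented Nexus-MCP value):

```python
# Sketch of weighted Reciprocal Rank Fusion (RRF) over three engines.
# score(doc) = sum over engines of weight / (k + rank); k=60 is the
# conventional constant and an assumption here.

def rrf_fuse(ranked_lists: dict[str, list[str]],
             weights: dict[str, float], k: int = 60) -> list[str]:
    """Fuse per-engine rankings into one list, highest fused score first."""
    scores: dict[str, float] = {}
    for engine, docs in ranked_lists.items():
        w = weights[engine]
        for rank, doc in enumerate(docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

ranked = {
    "vector": ["auth.py:login", "auth.py:verify_credentials", "db.py:connect"],
    "bm25":   ["auth.py:verify_credentials", "auth.py:login"],
    "graph":  ["auth.py:login"],
}
weights = {"vector": 0.5, "bm25": 0.3, "graph": 0.2}
print(rrf_fuse(ranked, weights)[0])  # "auth.py:login" ranks first
```

Because RRF only uses ranks, not raw scores, the three engines' incomparable scoring scales never need normalization — a document rewarded by all three engines reliably rises to the top.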

Architecture

| Component | Technology | Why |
|---|---|---|
| Vector store | LanceDB | Disk-backed, mmap, ~20–50MB overhead, native FTS |
| Embeddings | ONNX Runtime + jina-code (default) | ~50MB vs PyTorch ~500MB, GPU/MPS auto-detection, 3 models supported |
| Graph engine | rustworkx | Rust-backed, O(1) node/edge lookup, PageRank, centrality |
| Symbol parser | tree-sitter | 25+ languages, AST-level symbol extraction |
| Graph parser | ast-grep | Structural pattern matching for calls/imports/inheritance |
| Chunking | Symbol-based | One chunk per function/class, deterministic IDs |
| Re-ranker | FlashRank (optional) | 4MB ONNX model, <10ms for top-20 |
| Persistence | SQLite + LanceDB | Graph in SQLite, vectors in Lance, zero-config |
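The "deterministic IDs" property of symbol-based chunking can be sketched as hashing a chunk's identity (file, kind, symbol name) so re-indexing an unchanged symbol always yields the same ID. The ID scheme below is an assumption for illustration, not Nexus-MCP's actual format:

```python
# Sketch of deterministic chunk IDs for symbol-based chunking.
# The key format and truncation length are illustrative assumptions.
import hashlib

def chunk_id(file_path: str, symbol: str, kind: str) -> str:
    """Stable ID derived only from location and symbol identity."""
    key = f"{file_path}::{kind}::{symbol}"
    return hashlib.sha1(key.encode()).hexdigest()[:16]

a = chunk_id("src/auth.py", "login", "function")
b = chunk_id("src/auth.py", "login", "function")
print(a == b)  # deterministic: same inputs, same ID
print(a != chunk_id("src/auth.py", "logout", "function"))  # distinct symbols differ
```

Stable IDs let incremental indexing upsert changed chunks in place instead of rebuilding the whole vector store.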

Documentation

  • Installation Guide — Prerequisites, install steps, MCP client integration, troubleshooting
  • Architecture — System design, data flow, components, memory budget
  • Usage Guide — Tool reference, configuration, best practices
  • Developer Guide — Setup, testing, contributing, adding tools/engines
  • ADRs — 11 Architecture Decision Records
  • Research Notes — Deep dives on libraries and technology choices

Acknowledgments

Nexus-MCP consolidates and extends two earlier projects:

  • CodeGrok MCP by rdondeti (Ravitez Dondeti) — Semantic code search with tree-sitter parsing, embedding service, parallel indexing, and memory retrieval. Core models, symbol extraction, and the embedding pipeline were ported from CodeGrok. Originally licensed under MIT.
  • code-graph-mcp by entrepeneur4lyf — Code graph analysis with ast-grep structural parsing, rustworkx graph engine, and complexity analysis. Graph models, relationship extraction, and code analysis were ported from code-graph-mcp.

Individual source files retain "Ported from" attribution in their module docstrings. See ADR-001 for the rationale behind the consolidation.

License

MIT — see LICENSE for details.
