mcp-codebase-index

Structural codebase indexer with 17 query tools. 87% token reduction. Zero dependencies.

Stars: 12 | Forks: 1 | Updated: Feb 17, 2026 | Validated: Feb 19, 2026

Quick Install

uvx mcp-codebase-index

A structural codebase indexer with an MCP server for AI-assisted development. Zero runtime dependencies — uses Python's ast module for Python analysis and regex for TypeScript/JS. Requires Python 3.11+.

What It Does

Indexes codebases by parsing source files into structural metadata -- functions, classes, imports, dependency graphs, and cross-file call chains -- then exposes 17 query tools via the Model Context Protocol, enabling Claude Code and other MCP clients to navigate codebases efficiently without reading entire files.
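To make the idea of "structural metadata" concrete, here is a minimal sketch of AST-based extraction using only the standard library. It is an illustration, not the project's actual indexer: the function name extract_structure, the SAMPLE source, and the shape of the returned dict are all assumptions made for this example.

```python
import ast

# Hypothetical sample source used only for this illustration.
SAMPLE = '''
import os
from pathlib import Path

class Greeter:
    def greet(self, name):
        return f"hello {name}"

def main():
    Greeter().greet("world")
'''

def extract_structure(source: str) -> dict:
    """Collect top-level functions, classes (with method names), and imports."""
    tree = ast.parse(source)
    result = {"functions": [], "classes": [], "imports": []}
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    for node in tree.body:
        if isinstance(node, func_types):
            result["functions"].append({
                "name": node.name,
                "lines": (node.lineno, node.end_lineno),
                "params": [arg.arg for arg in node.args.args],
            })
        elif isinstance(node, ast.ClassDef):
            result["classes"].append({
                "name": node.name,
                "lines": (node.lineno, node.end_lineno),
                "methods": [n.name for n in node.body if isinstance(n, func_types)],
            })
        elif isinstance(node, ast.Import):
            result["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            result["imports"].append(node.module or "")
    return result

structure = extract_structure(SAMPLE)
print(structure)
```

Because only names, line ranges, and parameters are kept, a client can ask for a single function's location without loading the whole file.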

Language Support

| Language | Method | Extracts |
|---|---|---|
| Python (.py) | AST parsing | Functions, classes, methods, imports, dependency graph |
| TypeScript/JS (.ts, .tsx, .js, .jsx) | Regex-based | Functions, arrow functions, classes, interfaces, type aliases, imports |
| Markdown/Text (.md, .txt, .rst) | Heading detection | Sections (# headings, underlines, numbered, ALL-CAPS) |
| Other | Generic | Line counts only |
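The regex-based approach for TypeScript/JS can be sketched roughly as follows. The patterns, the extract_ts helper, and the sample source are hypothetical and written for this example; the project's real expressions are not shown in this README and are likely more thorough.

```python
import re

# Hypothetical patterns for illustration; they only handle single-line declarations.
FUNCTION_RE = re.compile(
    r"^\s*(?:export\s+)?(?:async\s+)?function\s+(\w+)\s*\(", re.MULTILINE)
ARROW_RE = re.compile(
    r"^\s*(?:export\s+)?(?:const|let)\s+(\w+)\s*=\s*(?:async\s*)?\([^)]*\)\s*=>",
    re.MULTILINE)
CLASS_RE = re.compile(r"^\s*(?:export\s+)?class\s+(\w+)", re.MULTILINE)
INTERFACE_RE = re.compile(r"^\s*(?:export\s+)?interface\s+(\w+)", re.MULTILINE)

TS_SAMPLE = """
export interface User { id: number }
export class UserStore {}
export async function loadUser(id: number) {}
const formatUser = (u: User) => u.id.toString();
"""

def extract_ts(source: str) -> dict:
    """Return the declared names found by each pattern."""
    return {
        "functions": FUNCTION_RE.findall(source),
        "arrow_functions": ARROW_RE.findall(source),
        "classes": CLASS_RE.findall(source),
        "interfaces": INTERFACE_RE.findall(source),
    }

print(extract_ts(TS_SAMPLE))
```

A regex pass like this needs no parser dependency, at the cost of missing declarations that span several lines or use unusual syntax.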

Installation

pip install "mcp-codebase-index[mcp]"

The [mcp] extra includes the MCP server dependency. Omit it if you only need the programmatic API.

For development (from a local clone):

pip install -e ".[dev,mcp]"

MCP Server

Running

# As a console script
PROJECT_ROOT=/path/to/project mcp-codebase-index

# As a Python module
PROJECT_ROOT=/path/to/project python -m mcp_codebase_index.server

PROJECT_ROOT specifies which directory to index. Defaults to the current working directory.
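The fallback described above can be pictured with a short sketch. The helper name resolve_project_root is hypothetical, and treating an empty PROJECT_ROOT as unset is an assumption of this example, not documented behavior of the server.

```python
import os
from pathlib import Path

def resolve_project_root(environ=None) -> Path:
    """Pick PROJECT_ROOT if set, otherwise the current working directory."""
    env = os.environ if environ is None else environ
    # Assumption: an empty value is treated the same as a missing one.
    return Path(env.get("PROJECT_ROOT") or os.getcwd()).resolve()

print(resolve_project_root())
```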

Configuring with OpenClaw

Install the package on the machine where OpenClaw is running:

# Local install
pip install "mcp-codebase-index[mcp]"

# Or inside a Docker container / remote VPS
docker exec -it openclaw bash
pip install "mcp-codebase-index[mcp]"

Add the MCP server to your OpenClaw agent config (openclaw.json):

{
  "agents": {
    "list": [{
      "id": "main",
      "mcp": {
        "servers": [
          {
            "name": "codebase-index",
            "command": "mcp-codebase-index",
            "env": {
              "PROJECT_ROOT": "/path/to/project"
            }
          }
        ]
      }
    }]
  }
}

Restart OpenClaw and verify the connection:

openclaw mcp list

All 17 tools will be available to your agent.

Performance note: OpenClaw's default MCP integration via mcporter spawns a fresh server process per tool call, which means the index is rebuilt each time (~1-2s for small projects, longer for large ones). For persistent connections, use the openclaw-mcp-adapter plugin, which connects once at startup and keeps the server running:

pip install openclaw-mcp-adapter

Configuring with Claude Code

Add to your project's .mcp.json:

{
  "mcpServers": {
    "codebase-index": {
      "command": "mcp-codebase-index",
      "env": {
        "PROJECT_ROOT": "/path/to/project"
      }
    }
  }
}

Or using the Python module directly (useful if installed in a virtualenv):

{
  "mcpServers": {
    "codebase-index": {
      "command": "/path/to/.venv/bin/python3",
      "args": ["-m", "mcp_codebase_index.server"],
      "env": {
        "PROJECT_ROOT": "/path/to/project"
      }
    }
  }
}
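If you prefer to generate the virtualenv variant rather than edit it by hand, a small script along these lines may help. It is a sketch: build_mcp_config and write_mcp_config are names invented for this example, the script assumes it is run with the interpreter that has mcp-codebase-index installed, and it overwrites any existing .mcp.json.

```python
import json
import sys
from pathlib import Path

def build_mcp_config(project_root: str, python_path: str = sys.executable) -> dict:
    """Build the same structure as the .mcp.json example, using an absolute project path."""
    return {
        "mcpServers": {
            "codebase-index": {
                "command": python_path,
                "args": ["-m", "mcp_codebase_index.server"],
                "env": {"PROJECT_ROOT": str(Path(project_root).resolve())},
            }
        }
    }

def write_mcp_config(project_root: str) -> Path:
    """Write .mcp.json into the project root (overwrites an existing file)."""
    target = Path(project_root) / ".mcp.json"
    target.write_text(json.dumps(build_mcp_config(project_root), indent=2) + "\n")
    return target

if __name__ == "__main__":
    # Print the config for the current directory instead of writing it.
    print(json.dumps(build_mcp_config("."), indent=2))
```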

Tip: Encourage the AI to Use Indexed Tools

By default, AI assistants may still read entire files instead of using the indexed tools. Add this to your project's CLAUDE.md (or equivalent instructions file) to nudge it:

Prefer using codebase-index MCP tools (get_project_summary, find_symbol, get_function_source,
get_class_source, get_dependencies, get_dependents, get_change_impact, get_call_chain, etc.)
over reading entire files when navigating the codebase.

This nudges the AI to try targeted indexed queries first, which saves tokens and context-window space.

Available Tools (17)

| Tool | Description |
|---|---|
| get_project_summary | File count, packages, top classes/functions |
| list_files | List indexed files with optional glob filter |
| get_structure_summary | Structure of a file or the whole project |
| get_functions | List functions with name, lines, params |
| get_classes | List classes with name, lines, methods, bases |
| get_imports | List imports with module, names, line |
| get_function_source | Full source of a function/method |
| get_class_source | Full source of a class |
| find_symbol | Find where a symbol is defined (file, line, type) |
| get_dependencies | What a symbol calls/uses |
| get_dependents | What calls/uses a symbol |
| get_change_impact | Direct + transitive dependents |
| get_call_chain | Shortest dependency path (BFS) |
| get_file_dependencies | Files imported by a given file |
| get_file_dependents | Files that import from a given file |
| search_codebase | Regex search across all files (max 100 results) |
| reindex | Re-index the project after file changes (MCP server only) |
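The "shortest dependency path (BFS)" behind get_call_chain can be illustrated with a generic breadth-first search over a dependency graph. This is a sketch of the technique, not the server's code: the DEPENDENCIES graph and the shortest_call_chain function are hypothetical.

```python
from collections import deque

# Hypothetical graph: each symbol maps to the symbols it calls or uses.
DEPENDENCIES = {
    "handle_request": ["parse_body", "authorize"],
    "parse_body": ["decode_json"],
    "authorize": ["load_user"],
    "load_user": ["query_db"],
    "decode_json": [],
    "query_db": [],
}

def shortest_call_chain(graph, start, target):
    """Return the shortest path from start to target, or None if unreachable."""
    if start == target:
        return [start]
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for callee in graph.get(current, []):
            if callee in parents:
                continue
            parents[callee] = current
            if callee == target:
                # Walk parent links back to the start, then reverse.
                path = [callee]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(callee)
    return None

print(shortest_call_chain(DEPENDENCIES, "handle_request", "query_db"))
# ['handle_request', 'authorize', 'load_user', 'query_db']
```

BFS visits symbols in order of distance from the start, so the first time the target is reached the path is guaranteed to be a shortest one.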

How Is This Different from LSP?

LSP answers "where is this function?"; mcp-codebase-index answers "what happens if I change it?" LSP works in point queries: one symbol, one file, one position. It can tell you where LLMClient is defined and who references it, but it has no single query for "what breaks transitively if I refactor LLMClient?" For that LLMClient question, this tool returns 11 direct dependents and 31 transitive impacts in a single call, in a 204-character response. To get the same answer from LSP, the AI would need to chain dozens of find-references calls recursively, reading files at every step and spending thousands of tokens to reconstruct what the dependency graph already holds.
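A transitive impact query of this kind amounts to a traversal of the reversed dependency graph. The sketch below shows the general technique on a tiny hypothetical graph; the names USES, invert, and change_impact are invented here, and whether the real get_change_impact reports transitive dependents separately from direct ones in exactly this way is an assumption.

```python
from collections import defaultdict, deque

# Hypothetical graph: each symbol maps to the symbols it uses.
USES = {
    "ChatService": ["LLMClient"],
    "Summarizer": ["LLMClient"],
    "ApiHandler": ["ChatService"],
    "Cli": ["ApiHandler", "Summarizer"],
}

def invert(graph):
    """Build the reverse mapping: symbol -> set of symbols that use it."""
    dependents = defaultdict(set)
    for user, used in graph.items():
        for name in used:
            dependents[name].add(user)
    return dependents

def change_impact(graph, symbol):
    """Return direct dependents and the further symbols reachable through them."""
    dependents = invert(graph)
    direct = set(dependents.get(symbol, ()))
    seen = set(direct)
    queue = deque(direct)
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen and parent != symbol:
                seen.add(parent)
                queue.append(parent)
    return {"direct": sorted(direct), "transitive": sorted(seen - direct)}

print(change_impact(USES, "LLMClient"))
# {'direct': ['ChatService', 'Summarizer'], 'transitive': ['ApiHandler', 'Cli']}
```

Because the graph is built once at index time, the traversal runs in memory and the response only needs to carry symbol names.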

LSP also requires a separate language server for every language in your project: pyright for Python, vtsls for TypeScript, gopls for Go. Each one is a separate install with its own dependencies and configuration. mcp-codebase-index has zero runtime dependencies, handles Python, TypeScript/JS, and Markdown out of the box, and gives every response built-in token budget controls (max_results, max_lines). LSP was built for IDEs. This was built for AI.
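As a rough picture of what budget controls such as max_results and max_lines can look like, consider the sketch below. Only the two parameter names come from this README; the helper functions, the "truncated" flag, and the omission marker are assumptions made for illustration.

```python
def limit_results(results, max_results=None):
    """Return at most max_results items and report whether any were dropped."""
    if max_results is None or len(results) <= max_results:
        return {"results": list(results), "truncated": False}
    return {"results": list(results[:max_results]), "truncated": True}

def limit_lines(source, max_lines=None):
    """Keep the first max_lines lines and note how many were left out."""
    lines = source.splitlines()
    if max_lines is None or len(lines) <= max_lines:
        return source
    omitted = len(lines) - max_lines
    return "\n".join(lines[:max_lines] + [f"... ({omitted} more lines omitted)"])

print(limit_results(["a.py", "b.py", "c.py"], max_results=2))
print(limit_lines("line 1\nline 2\nline 3", max_lines=2))
```

Signalling truncation explicitly lets the client decide whether to ask for more rather than assuming it saw everything.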

Programmatic Usage

from mcp_codebase_index.project_indexer import ProjectIndexer
from mcp_codebase_index.query_api import create_project_query_functions

# Build the index for the project, restricted here to Python files
indexer = ProjectIndexer("/path/to/project", include_patterns=["**/*.py"])
index = indexer.index()

# query_funcs is keyed by tool name; each entry is a callable
query_funcs = create_project_query_functions(index)

# Use query functions
print(query_funcs["get_project_summary"]())
print(query_funcs["find_symbol"]("MyClass"))
print(query_funcs["get_change_impact"]("some_function"))

Development

pip install -e ".[dev,mcp]"
pytest tests/ -v
ruff check src/ tests/

References

The structural indexer was originally developed as part of the RMLPlus project, an implementation of the Recursive Language Models framework.

License

This project is dual-licensed under AGPL-3.0 and a commercial license.

If you're using mcp-codebase-index as a standalone MCP server for development, the AGPL-3.0 license applies at no cost. If you're embedding it in a proprietary product or offering it as part of a hosted service, you'll need a commercial license. See COMMERCIAL-LICENSE.md for details.
