semantic-code-mcp

Local MCP server that provides semantic code search for Claude Code. Instead of iterating over grep/glob calls, it indexes your codebase with embeddings and returns results ranked by meaning.

Only Python is supported for now; multi-language support (JS/TS, Rust, Go) is planned.

How It Works

Claude Code ──(MCP/STDIO)──▶ semantic-code-mcp server
                                    │
                    ┌───────────────┼───────────────┐
                    ▼               ▼               ▼
              AST Chunker      Embedder        LanceDB
             (tree-sitter)  (sentence-trans)  (vectors)

Chunking — tree-sitter parses Python into functions, classes, and methods
Embedding — sentence-transformers encodes each chunk (all-MiniLM-L6-v2, 384d)
Storage — vectors stored in LanceDB (embedded, like SQLite)
Search — hybrid semantic + keyword search with recency boosting
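
To make the chunking step concrete, here is a minimal sketch of splitting Python source into function, class, and method chunks. It uses the standard-library `ast` module as a stand-in for tree-sitter so that it runs without extra dependencies; the `Chunk` dataclass and `chunk_python` function are hypothetical names, not the project's actual API.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical chunk record; field names mirror the search result fields."""
    name: str
    chunk_type: str  # "function", "class", or "method"
    line_start: int
    line_end: int
    content: str


def chunk_python(source: str) -> list[Chunk]:
    """Split source into top-level functions, classes, and class methods.

    Stand-in for the tree-sitter chunker: definitions nested inside
    functions or control-flow blocks are deliberately ignored here.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: list[Chunk] = []

    def visit(node: ast.AST, in_class: bool) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "method" if in_class else "function"
            elif isinstance(child, ast.ClassDef):
                kind = "class"
            else:
                continue
            # lineno/end_lineno are 1-based and inclusive.
            text = "\n".join(lines[child.lineno - 1 : child.end_lineno])
            chunks.append(Chunk(child.name, kind, child.lineno, child.end_lineno, text))
            if kind == "class":
                visit(child, in_class=True)

    visit(tree, in_class=False)
    return chunks
```

Each chunk would then be passed to the embedder as one unit, which is why splitting on definition boundaries rather than fixed-size windows keeps related code together.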

Indexing is incremental (mtime-based) and uses git ls-files for fast file discovery. The embedding model loads lazily on first query.
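
The incremental behaviour can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: `list_python_files`, `find_stale_files`, and the `indexed_mtimes` mapping (path string to the mtime recorded at last indexing) are hypothetical, and the fallback to a directory walk when `git` is unavailable is an assumption as well.

```python
import subprocess
from pathlib import Path


def list_python_files(project_root: Path) -> list[Path]:
    """List tracked .py files via `git ls-files`, else walk the directory."""
    try:
        out = subprocess.run(
            ["git", "ls-files", "*.py"],
            cwd=project_root,
            capture_output=True,
            text=True,
            check=True,
        ).stdout
        return [project_root / line for line in out.splitlines() if line]
    except (OSError, subprocess.CalledProcessError):
        # Not a git repository, or git is not installed (assumed fallback).
        return sorted(project_root.rglob("*.py"))


def find_stale_files(files: list[Path], indexed_mtimes: dict[str, float]) -> list[Path]:
    """Return files that are new or whose mtime differs from the recorded one."""
    stale = []
    for path in files:
        if not path.exists():
            continue  # tracked by git but deleted from the working tree
        if indexed_mtimes.get(str(path)) != path.stat().st_mtime:
            stale.append(path)
    return stale
```

Only the files returned by `find_stale_files` would need to be re-chunked and re-embedded; passing an empty mapping has the same effect as a forced full re-index.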

Installation

# Via uvx (recommended)
uvx semantic-code-mcp

# Or install globally
uv tool install semantic-code-mcp

Claude Code Integration

claude mcp add --scope user semantic-code -- uvx semantic-code-mcp

MCP Tools

`search_code`

Search code by meaning, not just text matching. Auto-indexes on first search.

Parameter	Type	Default	Description
`query`	`str`	required	Natural language description of what you're looking for
`project_path`	`str`	required	Absolute path to the project root
`limit`	`int`	`10`	Maximum number of results

Returns ranked results with file_path, line_start, line_end, name, chunk_type, content, and score.
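
As an illustration of how a hybrid score with recency boosting might be combined into the `score` field, here is a self-contained sketch. The weights, the 30-day half-life, the substring-based keyword match, and every function name are assumptions made for the example; the README does not specify the actual ranking formula.

```python
import math
from dataclasses import dataclass


@dataclass
class SearchResult:
    """Fields as listed for search_code results."""
    file_path: str
    line_start: int
    line_end: int
    name: str
    chunk_type: str
    content: str
    score: float


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, content: str) -> float:
    """Fraction of query terms that occur as substrings of the chunk text."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    text = content.lower()
    return sum(term in text for term in terms) / len(terms)


def recency_boost(mtime: float, now: float, half_life_days: float = 30.0) -> float:
    """1.0 for a file modified just now, halving every half_life_days (assumed)."""
    age_days = max(0.0, now - mtime) / 86400
    return 0.5 ** (age_days / half_life_days)


def hybrid_score(semantic: float, keyword: float, recency: float,
                 w_sem: float = 0.7, w_kw: float = 0.2, w_rec: float = 0.1) -> float:
    """Weighted sum; the weights are illustrative, not the project's values."""
    return w_sem * semantic + w_kw * keyword + w_rec * recency


def rank(query: str, query_vec: list[float], chunks: list[dict],
         now: float, limit: int = 10) -> list[SearchResult]:
    """Score every candidate chunk and return the top `limit` results."""
    results = []
    for c in chunks:
        score = hybrid_score(
            cosine(query_vec, c["vector"]),
            keyword_overlap(query, c["content"]),
            recency_boost(c["mtime"], now),
        )
        results.append(SearchResult(c["file_path"], c["line_start"], c["line_end"],
                                    c["name"], c["chunk_type"], c["content"], score))
    results.sort(key=lambda r: r.score, reverse=True)
    return results[:limit]
```

In a real deployment the vector comparison would be delegated to the vector store rather than computed in a Python loop; the loop here only shows how the three signals could be blended.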

`index_codebase`

Index a codebase for semantic search. Only processes new and changed files unless force=True.

Parameter	Type	Default	Description
`project_path`	`str`	required	Absolute path to the project root
`force`	`bool`	`False`	Re-index all files regardless of changes

`index_status`

Check indexing status for a project.

Parameter	Type	Default	Description
`project_path`	`str`	required	Absolute path to the project root

Returns is_indexed, files_count, and chunks_count.

Configuration

All settings are environment variables with the SEMANTIC_CODE_MCP_ prefix (via pydantic-settings):

Variable	Default	Description
`SEMANTIC_CODE_MCP_CACHE_DIR`	`~/.cache/semantic-code-mcp`	Where indexes are stored
`SEMANTIC_CODE_MCP_LOCAL_INDEX`	`false`	Store index in `.semantic-code/` within each project
`SEMANTIC_CODE_MCP_EMBEDDING_MODEL`	`all-MiniLM-L6-v2`	Sentence-transformers model
`SEMANTIC_CODE_MCP_DEBUG`	`false`	Enable debug logging
`SEMANTIC_CODE_MCP_PROFILE`	`false`	Enable pyinstrument profiling

Tech Stack

Component	Choice	Rationale
MCP Framework	FastMCP	Python decorators, STDIO transport
Embeddings	sentence-transformers	Local, no API costs, good quality
Vector Store	LanceDB	Embedded (like SQLite), no server needed
Chunking	tree-sitter	AST-based, respects code structure

Development

uv sync                            # Install dependencies
uv run python -m semantic_code_mcp # Run server
uv run pytest                      # Run tests
uv run ruff check src/             # Lint
uv run ruff format src/            # Format

Architecture decisions are documented in docs/decisions/. Project planning lives in TODO.md.

License

MIT
