semantic-code-mcp
Local MCP server that provides semantic code search for Claude Code. Instead of iterating with grep/glob, it indexes your codebase with embeddings and returns results ranked by meaning.
Python only for now — multi-language support (JS/TS, Rust, Go) is planned.
How It Works
Claude Code ──(MCP/STDIO)──▶ semantic-code-mcp server
                                        │
                        ┌───────────────┼───────────────┐
                        ▼               ▼               ▼
                   AST Chunker      Embedder         LanceDB
                  (tree-sitter) (sentence-trans)    (vectors)
- Chunking — tree-sitter parses Python into functions, classes, and methods
- Embedding — sentence-transformers encodes each chunk (all-MiniLM-L6-v2, 384d)
- Storage — vectors stored in LanceDB (embedded, like SQLite)
- Search — hybrid semantic + keyword search with recency boosting
Indexing is incremental (mtime-based) and uses git ls-files for fast file discovery. The embedding model loads lazily on first query.
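The incremental step can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the server's actual code: `list_tracked_python_files`, `files_to_reindex`, and the `known_mtimes` mapping are hypothetical names, and the real index may record change information differently.

```python
import subprocess
from pathlib import Path


def list_tracked_python_files(project_root: Path) -> list[Path]:
    """List tracked *.py files via `git ls-files` (hypothetical helper)."""
    result = subprocess.run(
        ["git", "ls-files", "-z", "--", "*.py"],
        cwd=project_root,
        capture_output=True,
        text=True,
        check=True,
    )
    # -z separates paths with NUL bytes, which is safe for unusual filenames.
    return [project_root / rel for rel in result.stdout.split("\0") if rel]


def files_to_reindex(files: list[Path], known_mtimes: dict[str, int]) -> list[Path]:
    """Return files that are new or whose mtime differs from the recorded one."""
    stale = []
    for path in files:
        current = path.stat().st_mtime_ns
        if known_mtimes.get(str(path)) != current:
            stale.append(path)
    return stale
```

Comparing mtimes avoids reading or hashing file contents, so an unchanged project costs only one `stat` call per file.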
Installation
# Via uvx (recommended)
uvx semantic-code-mcp
# Or install globally
uv tool install semantic-code-mcp
Claude Code Integration
claude mcp add --scope user semantic-code -- uvx semantic-code-mcp
MCP Tools
search_code
Search code by meaning, not just text matching. Auto-indexes on first search.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | required | Natural language description of what you're looking for |
| project_path | str | required | Absolute path to the project root |
| limit | int | 10 | Maximum number of results |
Returns ranked results with file_path, line_start, line_end, name, chunk_type, content, and score.
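One way to picture how a hybrid `score` could be formed is a weighted blend of semantic similarity, keyword overlap, and a recency boost. The sketch below is an assumption for illustration only: the weights, the 30-day half-life, the whitespace tokenizer, and the `vector`/`mtime` fields on the chunk dict are all hypothetical, and the server's real ranking formula may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, content: str) -> float:
    """Fraction of whitespace-separated query terms that appear in the content."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    words = set(content.lower().split())
    return len(terms & words) / len(terms)


def recency_boost(mtime: float, now: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a file modified now, 0.5 after one half-life."""
    age_days = max(0.0, (now - mtime) / 86400)
    return 0.5 ** (age_days / half_life_days)


def hybrid_score(query, query_vec, chunk, now, w_sem=0.7, w_kw=0.2, w_recent=0.1):
    """Blend the three signals; the weights are illustrative assumptions."""
    return (
        w_sem * cosine_similarity(query_vec, chunk["vector"])
        + w_kw * keyword_overlap(query, chunk["content"])
        + w_recent * recency_boost(chunk["mtime"], now)
    )
```

Sorting candidate chunks by `hybrid_score` in descending order and keeping the first `limit` entries would yield a ranked list of the shape described above.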
index_codebase
Index a codebase for semantic search. Only processes new and changed files unless force=True.
| Parameter | Type | Default | Description |
|---|---|---|---|
| project_path | str | required | Absolute path to the project root |
| force | bool | False | Re-index all files regardless of changes |
index_status
Check indexing status for a project.
| Parameter | Type | Default | Description |
|---|---|---|---|
| project_path | str | required | Absolute path to the project root |
Returns is_indexed, files_count, and chunks_count.
Configuration
All settings are environment variables with the SEMANTIC_CODE_MCP_ prefix (via pydantic-settings):
| Variable | Default | Description |
|---|---|---|
| SEMANTIC_CODE_MCP_CACHE_DIR | ~/.cache/semantic-code-mcp | Where indexes are stored |
| SEMANTIC_CODE_MCP_LOCAL_INDEX | false | Store index in .semantic-code/ within each project |
| SEMANTIC_CODE_MCP_EMBEDDING_MODEL | all-MiniLM-L6-v2 | Sentence-transformers model |
| SEMANTIC_CODE_MCP_DEBUG | false | Enable debug logging |
| SEMANTIC_CODE_MCP_PROFILE | false | Enable pyinstrument profiling |
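To show how the prefixed variables map onto settings, here is a standard-library stand-in for what pydantic-settings does. It is a sketch, not the project's settings class: the `Settings` dataclass, `load_settings`, and the accepted boolean spellings are assumptions; only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from pathlib import Path

PREFIX = "SEMANTIC_CODE_MCP_"


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings (an assumption, not the real parser)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    cache_dir: Path
    local_index: bool
    embedding_model: str
    debug: bool
    profile: bool


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables (default: os.environ)."""
    env = os.environ if env is None else env

    def get(name: str, default: str) -> str:
        return env.get(PREFIX + name, default)

    return Settings(
        cache_dir=Path(get("CACHE_DIR", "~/.cache/semantic-code-mcp")).expanduser(),
        local_index=_as_bool(get("LOCAL_INDEX", "false")),
        embedding_model=get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        debug=_as_bool(get("DEBUG", "false")),
        profile=_as_bool(get("PROFILE", "false")),
    )
```

For example, `load_settings({"SEMANTIC_CODE_MCP_DEBUG": "true"})` yields a `Settings` object with `debug` set to `True` and every other field at its default.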
Tech Stack
| Component | Choice | Rationale |
|---|---|---|
| MCP Framework | FastMCP | Python decorators, STDIO transport |
| Embeddings | sentence-transformers | Local, no API costs, good quality |
| Vector Store | LanceDB | Embedded (like SQLite), no server needed |
| Chunking | tree-sitter | AST-based, respects code structure |
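To illustrate what "AST-based" chunking means, the sketch below splits Python source into function, class, and method chunks. It swaps in the standard-library `ast` module for tree-sitter so that it runs without extra dependencies; the `Chunk` dataclass and `chunk_python_source` are hypothetical names, with fields borrowed from the `search_code` result description.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    name: str
    chunk_type: str  # "function", "class", or "method"
    line_start: int
    line_end: int
    content: str


def chunk_python_source(source: str) -> list[Chunk]:
    """Split source into top-level functions, classes, and their methods."""
    lines = source.splitlines()
    chunks: list[Chunk] = []

    def make(node, chunk_type: str) -> Chunk:
        # lineno/end_lineno are 1-indexed and inclusive (Python 3.8+).
        text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
        return Chunk(node.name, chunk_type, node.lineno, node.end_lineno, text)

    def visit(body, inside_class: bool) -> None:
        for node in body:
            if isinstance(node, ast.ClassDef):
                chunks.append(make(node, "class"))
                visit(node.body, True)
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                chunks.append(make(node, "method" if inside_class else "function"))

    visit(ast.parse(source).body, False)
    return chunks
```

Because chunk boundaries follow syntax nodes rather than fixed line windows, each embedded chunk is a complete definition, which is the property the rationale column refers to.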
Development
uv sync # Install dependencies
uv run python -m semantic_code_mcp # Run server
uv run pytest # Run tests
uv run ruff check src/ # Lint
uv run ruff format src/ # Format
Architecture decisions are documented in docs/decisions/. Project planning lives in TODO.md.
License
MIT