# MCP RAG Server
An MCP (Model Context Protocol) server that exposes RAG capabilities to Claude Code and other MCP clients.
This is a standalone extraction from my production portfolio site. See it in action at danmonteiro.com.
## The Problem
You're using Claude Code but:
- No access to your documents — Claude can't search your knowledge base
- Context is manual — you're copy-pasting relevant docs into prompts
- RAG is disconnected — your vector database isn't accessible to AI tools
- Integration is custom — every project builds its own RAG bridge
## The Solution
MCP RAG Server provides:
- Standard MCP interface — works with Claude Code, Claude Desktop, and any MCP client
- Full RAG pipeline — hybrid search, query expansion, semantic chunking built-in
- Simple tools — `rag_query`, `rag_search`, `index_document`, `get_stats`
- Zero config — point at ChromaDB and go
```
# In Claude Code, after configuring the server:
"Search my knowledge base for articles about RAG architecture"

# Claude automatically uses the rag_query tool and gets relevant context
```
## Results
From production usage:
| Without MCP RAG | With MCP RAG |
|---|---|
| Manual context copy-paste | Automatic retrieval |
| No document search | Hybrid search built-in |
| Static knowledge | Live vector database |
| Custom integration per project | Standard MCP protocol |
## Design Philosophy
### Why MCP?
MCP (Model Context Protocol) standardizes how AI applications connect to external tools:
```
┌──────────────┐      MCP Protocol      ┌──────────────┐
│  MCP Client  │◀──────────────────────▶│  MCP Server  │
│ (Claude Code)│                        │  (This repo) │
└──────────────┘                        └──────┬───────┘
                                               │
                                        ┌──────▼───────┐
                                        │ RAG Pipeline │
                                        │  (ChromaDB)  │
                                        └──────────────┘
```
Instead of building custom integrations, MCP provides a universal interface that any MCP-compatible client can use.
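Concretely, every tool invocation is a JSON-RPC 2.0 exchange over the transport (stdio here). A rough sketch of what a `tools/call` round trip looks like, with illustrative payloads rather than this server's exact output:

```typescript
// What the client sends: a JSON-RPC 2.0 request naming the tool and its arguments.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "rag_query",
    arguments: { question: "What is hybrid search?", topK: 5 },
  },
};

// What comes back: the tool result wrapped in MCP content blocks.
const response = {
  jsonrpc: "2.0",
  id: 1,
  result: {
    content: [{ type: "text", text: "...formatted context..." }],
  },
};
```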
## Tools Exposed
| Tool | Description |
|---|---|
| `rag_query` | Query with hybrid search, returns formatted context |
| `rag_search` | Raw similarity search, returns chunks with scores |
| `index_document` | Add a single document |
| `index_documents_batch` | Batch index multiple documents |
| `delete_by_source` | Delete all docs from a source |
| `get_stats` | Collection statistics |
| `clear_collection` | Clear all data (requires confirmation) |
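As a quick sanity check, you can enumerate these tools from any MCP client. A minimal sketch using the official `@modelcontextprotocol/sdk` for TypeScript (the server path is a placeholder):

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the server over stdio; supply the env vars from the Quick Start
// section below if they aren't already exported in your shell.
const transport = new StdioClientTransport({
  command: "node",
  args: ["/path/to/mcp-rag-server/dist/server.js"],
});

const client = new Client({ name: "tool-lister", version: "1.0.0" });
await client.connect(transport);

// List every tool the server advertises, with its description.
const { tools } = await client.listTools();
for (const tool of tools) {
  console.log(`${tool.name}: ${tool.description ?? ""}`);
}
```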
## Quick Start

### 1. Prerequisites
```bash
# Start ChromaDB
docker run -p 8000:8000 chromadb/chroma

# Set OpenAI API key (for embeddings)
export OPENAI_API_KEY="sk-..."
```
### 2. Install & Build
```bash
git clone https://github.com/0xrdan/mcp-rag-server.git
cd mcp-rag-server
npm install
npm run build
```
### 3. Configure Claude Code

Add to your Claude Code MCP configuration (`~/.claude/mcp.json` or a project-level `.mcp.json`):
```json
{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/server.js"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "CHROMA_URL": "http://localhost:8000",
        "CHROMA_COLLECTION": "my_knowledge_base"
      }
    }
  }
}
```
### 4. Use in Claude Code
```
# Restart Claude Code to load the server
claude

# Now Claude has access to RAG tools:
"Index this document into my knowledge base: [paste content]"
"Search for information about transformer architectures"
"What do my docs say about error handling?"
```
## API Reference

### `rag_query`
Query the knowledge base with hybrid search. Returns formatted context suitable for LLM prompts.
```typescript
// Input
{
  question: string;    // Required: the question to search for
  topK?: number;       // Optional: number of results (default: 5)
  threshold?: number;  // Optional: min similarity 0-1 (default: 0.5)
  filters?: object;    // Optional: metadata filters
}

// Output
{
  context: string;     // Formatted context for LLM
  chunks: [{
    content: string;
    score: number;
    metadata: object;
  }];
  stats: {
    totalChunks: number;
    avgSimilarity: number;
  };
}
```
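For a programmatic caller, invoking `rag_query` looks like this — a sketch reusing the connected `client` from the tool-listing example above, with an illustrative question and parameters:

```typescript
// Ask the knowledge base a question with tuned retrieval parameters.
const result = await client.callTool({
  name: "rag_query",
  arguments: {
    question: "How does semantic chunking work?",
    topK: 3,
    threshold: 0.6,
  },
});

// MCP tool results arrive as content blocks (typically a single text block).
console.log(result.content);
```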
### `rag_search`
Raw similarity search without context formatting.
```typescript
// Input
{
  query: string;    // Required: search query
  topK?: number;    // Optional: number of results (default: 10)
  filters?: object; // Optional: metadata filters
}

// Output: array of chunks with scores
```
### `index_document`
Add a document to the knowledge base.
```typescript
// Input
{
  id: string;        // Required: unique identifier
  title: string;     // Required: document title
  content: string;   // Required: document content
  source: string;    // Required: source identifier
  category?: string; // Optional: category
  tags?: string[];   // Optional: tags array
}

// Output
{
  success: boolean;
  documentId: string;
  chunksIndexed: number;
}
```
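In the same vein, a sketch of indexing a document and confirming it landed — the ids, content, and connected `client` are all illustrative, following the earlier examples:

```typescript
// Index one document; the server chunks and embeds it.
await client.callTool({
  name: "index_document",
  arguments: {
    id: "doc-001",
    title: "Error Handling Guide",
    content: "Retry with exponential backoff when a request times out...",
    source: "internal-wiki",
    tags: ["reliability"],
  },
});

// Verify the chunk count went up.
const stats = await client.callTool({ name: "get_stats", arguments: {} });
console.log(stats.content);
```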
### `get_stats`
Get collection statistics.
```typescript
// Output
{
  totalChunks: number;
  totalDocuments: number;
  // ... other stats from RAG pipeline
}
```
## Configuration

### Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `OPENAI_API_KEY` | Yes | - | OpenAI API key for embeddings |
| `CHROMA_URL` | No | `http://localhost:8000` | ChromaDB URL |
| `CHROMA_COLLECTION` | No | `mcp_knowledge_base` | Collection name |
| `EMBEDDING_MODEL` | No | `text-embedding-3-large` | Embedding model |
| `EMBEDDING_DIMENSIONS` | No | Model native | Optional reduced output dimensionality |
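For reference, a minimal sketch of resolving these variables with their documented defaults; the `requireEnv` helper is hypothetical and the actual server code may differ:

```typescript
// Hypothetical config resolution mirroring the table above.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

const config = {
  openaiApiKey: requireEnv("OPENAI_API_KEY"),
  chromaUrl: process.env.CHROMA_URL ?? "http://localhost:8000",
  chromaCollection: process.env.CHROMA_COLLECTION ?? "mcp_knowledge_base",
  embeddingModel: process.env.EMBEDDING_MODEL ?? "text-embedding-3-large",
  // undefined means "use the model's native dimensionality"
  embeddingDimensions: process.env.EMBEDDING_DIMENSIONS
    ? Number(process.env.EMBEDDING_DIMENSIONS)
    : undefined,
};
```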
## Project Structure
```
mcp-rag-server/
├── src/
│   ├── server.ts             # Main MCP server implementation
│   └── index.ts              # Exports
├── mcp-config.example.json   # Example Claude Code configuration
├── package.json
└── README.md
```
## Advanced Usage

### Programmatic Server Creation
```typescript
import { createServer } from 'mcp-rag-server';

const server = await createServer({
  vectorDB: {
    host: 'http://custom-chroma:8000',
    collectionName: 'my_collection',
  },
  rag: {
    topK: 10,
    enableHybridSearch: true,
  },
});
```
### Using with Claude Desktop

The same configuration works with Claude Desktop's MCP support. On macOS, add it to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/server.js"]
    }
  }
}
```
## Part of the Context Continuity Stack
This repo exposes context continuity as a protocol-level capability — giving any MCP client access to persistent semantic memory.
| Layer | Role | This Repo |
|---|---|---|
| Intra-session | Short-term memory | — |
| Document-scoped | Injected content | — |
| Retrieved | Long-term semantic memory via MCP | `mcp-rag-server` |
| Progressive | Staged responses | — |
MCP RAG Server bridges the gap between vector databases and AI assistants. Instead of building custom integrations, any MCP-compatible tool (Claude Code, Claude Desktop, custom clients) gets instant access to your knowledge base.
Related repos:
- `rag-pipeline` — The underlying RAG implementation
- `mcp-client-example` — Reference client for connecting to this server
- `chatbot-widget` — Session cache, Research Mode, conversation export
- `ai-orchestrator` — Multi-model LLM routing
## Contributing
Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch (`git checkout -b feat/add-new-tool`)
3. Make changes with semantic commits
4. Open a PR with a clear description
## License
MIT License - see LICENSE for details.
## Acknowledgments
Built with Claude Code.
Co-Authored-By: Claude <noreply@anthropic.com>