
MCP RAG Server

An MCP (Model Context Protocol) server that exposes RAG (Retrieval-Augmented Generation) capabilities to Claude Code and other MCP clients, letting them index, search, and query documents stored in ChromaDB.

This is a standalone extraction from my production portfolio site. See it in action at danmonteiro.com.


The Problem

You're using Claude Code but:

  • No access to your documents — Claude can't search your knowledge base
  • Context is manual — you're copy-pasting relevant docs into prompts
  • RAG is disconnected — your vector database isn't accessible to AI tools
  • Integration is custom — every project builds its own RAG bridge

The Solution

MCP RAG Server provides:

  • Standard MCP interface — works with Claude Code, Claude Desktop, and any MCP client
  • Full RAG pipeline — hybrid search, query expansion, semantic chunking built-in
  • Simple tools — rag_query, rag_search, index_document, get_stats
  • Zero config — point at ChromaDB and go

# In Claude Code, after configuring the server:
"Search my knowledge base for articles about RAG architecture"
# Claude automatically uses rag_query tool and gets relevant context

Results

From production usage:

Without MCP RAG                  With MCP RAG
Manual context copy-paste        Automatic retrieval
No document search               Hybrid search built-in
Static knowledge                 Live vector database
Custom integration per project   Standard MCP protocol

Design Philosophy

Why MCP?

MCP (Model Context Protocol) standardizes how AI applications connect to external tools:

┌──────────────┐     MCP Protocol     ┌──────────────┐
│  MCP Client  │◀────────────────────▶│  MCP Server  │
│ (Claude Code)│                      │ (This repo)  │
└──────────────┘                      └──────────────┘
                                             │
                                      ┌──────▼───────┐
                                      │ RAG Pipeline │
                                      │  (ChromaDB)  │
                                      └──────────────┘

Instead of building custom integrations, MCP provides a universal interface that any MCP-compatible client can use.
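
As a rough illustration, here is a minimal sketch of how a server like this registers a RAG tool over MCP, using the low-level API of the official @modelcontextprotocol/sdk package. The ragPipeline handle is hypothetical, and the repo's actual implementation (src/server.ts) may be structured differently:

// Minimal sketch, not this repo's actual code: registering a rag_query-style
// tool with the @modelcontextprotocol/sdk low-level server API.
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";

// Hypothetical handle to the RAG pipeline (ChromaDB + embeddings) behind the tools.
declare const ragPipeline: {
  query(question: string, opts: { topK: number }): Promise<{ context: string }>;
};

const server = new Server(
  { name: "mcp-rag-server", version: "0.1.0" },
  { capabilities: { tools: {} } }
);

// Advertise the tool so MCP clients can discover it via tools/list.
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: "rag_query",
      description: "Query the knowledge base with hybrid search",
      inputSchema: {
        type: "object",
        properties: {
          question: { type: "string" },
          topK: { type: "number" },
        },
        required: ["question"],
      },
    },
  ],
}));

// Route tools/call requests to the RAG pipeline and return text content.
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name !== "rag_query") {
    throw new Error(`Unknown tool: ${request.params.name}`);
  }
  const args = request.params.arguments as { question: string; topK?: number };
  const result = await ragPipeline.query(args.question, { topK: args.topK ?? 5 });
  return { content: [{ type: "text", text: result.context }] };
});

// Serve over stdio, the transport Claude Code uses for local servers.
await server.connect(new StdioServerTransport());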

Tools Exposed

Tool                     Description
rag_query                Query with hybrid search, returns formatted context
rag_search               Raw similarity search, returns chunks with scores
index_document           Add a single document
index_documents_batch    Batch index multiple documents
delete_by_source         Delete all docs from a source
get_stats                Collection statistics
clear_collection         Clear all data (requires confirmation)

Quick Start

1. Prerequisites

# Start ChromaDB
docker run -p 8000:8000 chromadb/chroma

# Set OpenAI API key (for embeddings)
export OPENAI_API_KEY="sk-..."

2. Install & Build

git clone https://github.com/0xrdan/mcp-rag-server.git
cd mcp-rag-server
npm install
npm run build

3. Configure Claude Code

Add to your Claude Code MCP configuration (~/.claude/mcp.json or project .mcp.json):

{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/server.js"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "CHROMA_URL": "http://localhost:8000",
        "CHROMA_COLLECTION": "my_knowledge_base"
      }
    }
  }
}

4. Use in Claude Code

# Restart Claude Code to load the server
claude

# Now Claude has access to RAG tools:
"Index this document into my knowledge base: [paste content]"
"Search for information about transformer architectures"
"What do my docs say about error handling?"

API Reference

rag_query

Query the knowledge base with hybrid search. Returns formatted context suitable for LLM prompts.

// Input
{
  question: string;      // Required: the question to search for
  topK?: number;         // Optional: number of results (default: 5)
  threshold?: number;    // Optional: min similarity 0-1 (default: 0.5)
  filters?: object;      // Optional: metadata filters
}

// Output
{
  context: string;       // Formatted context for LLM
  chunks: [{
    content: string;
    score: number;
    metadata: object;
  }];
  stats: {
    totalChunks: number;
    avgSimilarity: number;
  };
}
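
Beyond Claude Code, any MCP client can call this tool programmatically. A sketch using the official TypeScript SDK client, with an illustrative server path and question:

// Sketch only: calling rag_query from a custom MCP client over stdio.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  // Spawn the built server; adjust the path and environment for your setup.
  const transport = new StdioClientTransport({
    command: "node",
    args: ["/path/to/mcp-rag-server/dist/server.js"],
    env: { OPENAI_API_KEY: process.env.OPENAI_API_KEY ?? "" },
  });

  const client = new Client(
    { name: "rag-demo-client", version: "0.1.0" },
    { capabilities: {} }
  );
  await client.connect(transport);

  // Invoke the rag_query tool exactly as Claude Code would.
  const result = await client.callTool({
    name: "rag_query",
    arguments: { question: "What is hybrid search?", topK: 3 },
  });

  console.log(result.content); // formatted context returned by the tool
  await client.close();
}

main().catch(console.error);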

rag_search

Raw similarity search without context formatting.

// Input
{
  query: string;         // Required: search query
  topK?: number;         // Optional: number of results (default: 10)
  filters?: object;      // Optional: metadata filters
}

// Output: Array of chunks with scores
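
The chunk shape matches the chunks field in rag_query's output above; as a TypeScript type it would look roughly like:

// Assumed from the rag_query output shape documented above.
interface SearchChunk {
  content: string;   // the chunk text
  score: number;     // similarity score (0-1)
  metadata: object;  // e.g. title, source, category, tags
}
// rag_search returns SearchChunk[]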

index_document

Add a document to the knowledge base.

// Input
{
  id: string;            // Required: unique identifier
  title: string;         // Required: document title
  content: string;       // Required: document content
  source: string;        // Required: source identifier
  category?: string;     // Optional: category
  tags?: string[];       // Optional: tags array
}

// Output
{
  success: boolean;
  documentId: string;
  chunksIndexed: number;
}

get_stats

Get collection statistics.

// Output
{
  totalChunks: number;
  totalDocuments: number;
  // ... other stats from RAG pipeline
}

Configuration

Environment Variables

Variable               Required   Default                   Description
OPENAI_API_KEY         Yes        -                         OpenAI API key for embeddings
CHROMA_URL             No         http://localhost:8000     ChromaDB URL
CHROMA_COLLECTION      No         mcp_knowledge_base        Collection name
EMBEDDING_MODEL        No         text-embedding-3-large    Embedding model
EMBEDDING_DIMENSIONS   No         Native                    Reduced dimensions
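
For reference, a sketch of how these variables map to runtime configuration with the defaults above (the actual parsing in src/server.ts may differ):

// Illustrative only; mirrors the defaults documented in the table above.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

const config = {
  openaiApiKey: requireEnv("OPENAI_API_KEY"),
  chromaUrl: process.env.CHROMA_URL ?? "http://localhost:8000",
  collection: process.env.CHROMA_COLLECTION ?? "mcp_knowledge_base",
  embeddingModel: process.env.EMBEDDING_MODEL ?? "text-embedding-3-large",
  // Leave undefined to use the embedding model's native dimensionality.
  embeddingDimensions: process.env.EMBEDDING_DIMENSIONS
    ? Number(process.env.EMBEDDING_DIMENSIONS)
    : undefined,
};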

Project Structure

mcp-rag-server/
├── src/
│   ├── server.ts        # Main MCP server implementation
│   └── index.ts         # Exports
├── mcp-config.example.json  # Example Claude Code configuration
├── package.json
└── README.md

Advanced Usage

Programmatic Server Creation

import { createServer } from 'mcp-rag-server';

const server = await createServer({
  vectorDB: {
    host: 'http://custom-chroma:8000',
    collectionName: 'my_collection',
  },
  rag: {
    topK: 10,
    enableHybridSearch: true,
  },
});

Using with Claude Desktop

Same configuration works with Claude Desktop's MCP support:

// ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/server.js"]
    }
  }
}

Part of the Context Continuity Stack

This repo exposes context continuity as a protocol-level capability — giving any MCP client access to persistent semantic memory.

Layer             Role                                 This Repo
Intra-session     Short-term memory
Document-scoped   Injected content
Retrieved         Long-term semantic memory via MCP    mcp-rag-server
Progressive       Staged responses

MCP RAG Server bridges the gap between vector databases and AI assistants. Instead of building custom integrations, any MCP-compatible tool (Claude Code, Claude Desktop, custom clients) gets instant access to your knowledge base.

Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feat/add-new-tool)
  3. Make changes with semantic commits
  4. Open a PR with clear description

License

MIT License - see LICENSE for details.


Acknowledgments

Built with Claude Code.

Co-Authored-By: Claude <noreply@anthropic.com>
