EyeLevel RAG MCP Server

A local Retrieval-Augmented Generation (RAG) system implemented as an MCP (Model Context Protocol) server. This server allows you to ingest markdown files into a local knowledge base and perform semantic search to retrieve relevant context for LLM queries.

Features

  • Local RAG Implementation: No external dependencies or paid services required
  • Markdown File Support: Ingest and search through .md files
  • Semantic Search: Uses sentence transformers for embedding-based similarity search
  • Persistent Storage: Automatically saves and loads the vector index using FAISS
  • Chunk Management: Splits documents into searchable chunks at paragraph and sentence boundaries
  • Multiple Documents: Support for ingesting and searching across multiple markdown files

Installation

  1. Clone this repository
  2. Install dependencies using uv:
    uv sync
    

Dependencies

  • sentence-transformers: For creating text embeddings
  • faiss-cpu: For efficient vector similarity search
  • numpy: For numerical operations
  • mcp[cli]: For the MCP server framework
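After uv sync, a quick import check confirms the environment resolved; note that the PyPI names differ from the import names (faiss-cpu imports as faiss, and mcp[cli] provides the mcp package):

    # Throwaway environment check, not part of the server itself.
    import faiss
    import numpy
    import sentence_transformers
    import mcp

    print("all dependencies import cleanly")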

Available Tools

1. search_doc_for_rag_context(query: str)

Searches the knowledge base for relevant context based on a user query.

Parameters:

  • query (str): The search query

Returns:

  • Relevant text chunks with relevance scores
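With the FastMCP helper from the MCP Python SDK (part of the mcp[cli] dependency), a tool with this signature is plausibly registered along the following lines. This is a sketch, not the actual code in main.py; knowledge_base is a hypothetical stand-in for the server's index wrapper, and the server name and top_k value are illustrative:

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("eyelevel-rag")  # server name is illustrative

    @mcp.tool()
    def search_doc_for_rag_context(query: str) -> str:
        """Return the most relevant stored chunks for a query."""
        # knowledge_base is a hypothetical helper wrapping the FAISS index.
        results = knowledge_base.search(query, top_k=5)
        return "\n\n".join(f"[{score:.3f}] {chunk}" for chunk, score in results)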

2. ingest_markdown_file(local_file_path: str)

Ingests a markdown file into the knowledge base.

Parameters:

  • local_file_path (str): Path to the markdown file to ingest

Returns:

  • Status message indicating success or failure
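The ingest tool presumably follows the same pattern (again a sketch; chunk_text and knowledge_base are hypothetical helpers standing in for the server's internals):

    from pathlib import Path

    @mcp.tool()
    def ingest_markdown_file(local_file_path: str) -> str:
        """Add a markdown file's chunks to the knowledge base."""
        path = Path(local_file_path)
        if not path.is_file() or path.suffix != ".md":
            return f"Failed: {local_file_path} is not an existing .md file"
        chunks = chunk_text(path.read_text(encoding="utf-8"))  # hypothetical helper
        knowledge_base.add(chunks, source=path.name)           # hypothetical helper
        return f"Ingested {len(chunks)} chunks from {path.name}"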

3. list_indexed_documents()

Lists all documents currently in the knowledge base.

Returns:

  • Summary of indexed files and chunk counts

4. clear_knowledge_base()

Clears all documents from the knowledge base.

Returns:

  • Confirmation message

Usage

  1. Start the server:

    python main.py
    
  2. Ingest markdown files: Use the ingest_markdown_file tool to add your .md files to the knowledge base.

  3. Search for context: Use the search_doc_for_rag_context tool to find relevant information for your queries.
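When driven programmatically, an MCP client can launch the server itself over stdio rather than connecting to an already-running process. A minimal client-side sketch using the MCP Python SDK (notes.md and the query string are placeholders):

    import asyncio
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    async def main() -> None:
        # The client spawns the server process and talks to it over stdio.
        params = StdioServerParameters(command="python", args=["main.py"])
        async with stdio_client(params) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()
                await session.call_tool(
                    "ingest_markdown_file", {"local_file_path": "notes.md"}
                )
                result = await session.call_tool(
                    "search_doc_for_rag_context", {"query": "What does the setup need?"}
                )
                print(result.content)

    asyncio.run(main())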

How It Works

  1. Document Processing: Markdown files are split into chunks based on paragraphs and sentence boundaries
  2. Embedding Creation: Text chunks are converted to embeddings using the all-MiniLM-L6-v2 model
  3. Vector Storage: Embeddings are stored in a FAISS index for fast similarity search
  4. Retrieval: User queries are embedded and matched against the stored vectors to find relevant content (all four steps are sketched below)
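The whole loop can be sketched end to end with the two core libraries. This is illustrative only: example.md is a placeholder, and main.py's real chunker is more careful about sentence boundaries than the simple paragraph split shown here.

    from pathlib import Path
    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")

    def chunk_text(text: str) -> list[str]:
        # Step 1: a simple paragraph-based splitter (stand-in for the real one).
        return [p.strip() for p in text.split("\n\n") if p.strip()]

    chunks = chunk_text(Path("example.md").read_text(encoding="utf-8"))

    # Step 2: all-MiniLM-L6-v2 maps each chunk to a 384-dimensional vector.
    embeddings = model.encode(chunks, convert_to_numpy=True).astype("float32")

    # Step 3: store the vectors in a flat (exact-search) FAISS index.
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(embeddings)

    # Step 4: embed the query and return the closest chunks.
    query = model.encode(["how is the index stored?"], convert_to_numpy=True).astype("float32")
    distances, ids = index.search(query, k=min(3, index.ntotal))
    for dist, i in zip(distances[0], ids[0]):
        print(f"{dist:.3f}  {chunks[i][:80]}")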

File Structure

  • main.py: Main server implementation with RAG functionality
  • pyproject.toml: Project dependencies and configuration
  • rag_index.faiss: FAISS vector index (created automatically)
  • rag_documents.pkl: Serialized documents and metadata (created automatically)
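Persistence for these two artifacts plausibly reduces to a save/load pair like the following (a sketch; the exact structure pickled into rag_documents.pkl is internal to main.py):

    import pickle
    import faiss

    def save_state(index, documents) -> None:
        # Write the two files listed above.
        faiss.write_index(index, "rag_index.faiss")
        with open("rag_documents.pkl", "wb") as f:
            pickle.dump(documents, f)

    def load_state():
        # Reload them on startup so the knowledge base survives restarts.
        index = faiss.read_index("rag_index.faiss")
        with open("rag_documents.pkl", "rb") as f:
            documents = pickle.load(f)
        return index, documents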

Configuration

The RAG system uses the all-MiniLM-L6-v2 sentence transformer model by default. This model provides a good balance between speed and quality for semantic search tasks.
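You can verify the model and its output size directly with sentence-transformers:

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    print(model.get_sentence_embedding_dimension())  # 384

If you swap in a different model, the embedding dimension generally changes, so the existing FAISS index becomes incompatible; clear the knowledge base and re-ingest your files after switching models.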

Example Workflow

  1. Prepare your markdown files with the content you want to search
  2. Use ingest_markdown_file to add each file to the knowledge base
  3. Use search_doc_for_rag_context to find relevant context for your questions
  4. The retrieved context can be used by an LLM to provide informed answers

Notes

  • The first time you run the server, it will download the sentence transformer model
  • The vector index is automatically saved and loaded between sessions
  • Long documents are automatically chunked to optimize search performance
  • The system supports multiple markdown files and maintains source file metadata
