MCP Server

🧩 Main modules

Main MCP server - the core of the system, performing the following functions:

Registry of connected servers
Routing requests between servers
Monitoring server status
Aggregation of information about available tools

Specialized servers (connect to the main one):

Embedding server - working with vector representations of text
PDF extract server - conversion and extraction from PDF to Markdown
Reranker server - ranking text data
Qdrant server - managing vector collections
PostgreSQL server - executing SQL queries and schema inspection
LLM server - generating and streaming LLM responses, listing available models
MarkUp server - text/file markup using markup service methods
Transcribe server - uploading audio, checking transcription status, and retrieving the transcription result

⚙️ Available tools

| Server | Methods |
| --- | --- |
| Embedding server | `embedding_generate`, `embedding_batch_generate`, `embedding_get_models`, `health_check` |
| PDF extract server | `document_convert_to_markdown`, `document_get_supported_formats`, `health_check` |
| Reranker server | `rerank_documents`, `health_check` |
| Qdrant server | `vector_create_collection`, `vector_get_collection_info`, `vector_upsert_points`, `vector_search`, `vector_delete_points`, `health_check` |
| PostgreSQL server | `postgres_execute_query`, `postgres_get_schema`, `postgres_create_table`, `postgres_insert_data`, `health_check` |
| LLM server | `llm_chat_completion`, `llm_get_models`, `llm_stream_completion`, `health_check` |
| MarkUp server | `markup_get_methods`, `markup_process_text`, `markup_process_file`, `health_check` |
| Transcribe server | `transcribe_audio`, `transcribe_get_status`, `transcribe_get_result`, `health_check` |

Note:

To add a new FastMCP server, import it in main_server.py and add it to the MCP_SERVERS array. The main methods of main_server.py will then have access to it.

Every FastMCP server must also expose a health_check method, which is used to check the server's state.
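The health_check requirement can be verified before a new server is registered. The snippet below is a minimal sketch, not the project's code: it models each server as a plain mapping from a name to its tool names. The `SERVER_TOOLS` dictionary, its keys, the `my_new_server` entry, and the `servers_missing_health_check` helper are all hypothetical and exist only for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for the tools exposed by the servers in MCP_SERVERS.
# Tool names come from the table above; "my_new_server" is invented and
# deliberately lacks health_check.
SERVER_TOOLS: Dict[str, List[str]] = {
    "reranker": ["rerank_documents", "health_check"],
    "llm": [
        "llm_chat_completion",
        "llm_get_models",
        "llm_stream_completion",
        "health_check",
    ],
    "my_new_server": ["do_something"],
}

REQUIRED_TOOL = "health_check"


def servers_missing_health_check(server_tools: Dict[str, List[str]]) -> List[str]:
    """Return the sorted names of servers that do not expose health_check."""
    return sorted(
        name for name, tools in server_tools.items() if REQUIRED_TOOL not in tools
    )


if __name__ == "__main__":
    print(servers_missing_health_check(SERVER_TOOLS))  # prints ['my_new_server']
```

A check like this could run at startup so that a server without health_check is rejected early instead of failing during monitoring.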

📡 Main server methods

get_server_and_tools() # Get a list of all servers and tools
router(server_name, tool_name, params) # Routing requests
health_check_servers() # Checking the health of all servers
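To illustrate what registry-based routing means here, the following sketch models the lookup step only. It is an assumption-laden simplification: the real main server forwards calls to FastMCP servers, whereas this version uses a hypothetical in-memory `REGISTRY` of plain Python callables, and the `"reranker"` key is an invented server name.

```python
from typing import Any, Callable, Dict, List

ToolFn = Callable[..., Any]

# Hypothetical in-memory registry: server name -> tool name -> callable.
REGISTRY: Dict[str, Dict[str, ToolFn]] = {
    "reranker": {
        "health_check": lambda: {"status": "ok"},
    },
}


def get_server_and_tools() -> Dict[str, List[str]]:
    """List every registered server together with the names of its tools."""
    return {server: sorted(tools) for server, tools in REGISTRY.items()}


def router(server_name: str, tool_name: str, params: Dict[str, Any]) -> Any:
    """Look up a tool on the named server and call it with keyword params."""
    try:
        tool = REGISTRY[server_name][tool_name]
    except KeyError:
        raise ValueError(
            f"unknown server or tool: {server_name}/{tool_name}"
        ) from None
    # The call happens outside the try block so that errors raised by the
    # tool itself are not mistaken for a failed lookup.
    return tool(**params)
```

With this model, `router("reranker", "health_check", {})` returns the dictionary produced by the stub tool, and an unknown server or tool name raises `ValueError`.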

Setting up the environment

Create a .env file in the root of the project and put the following environment variables in it (the list corresponds to the use in the code):

# Main server
MAIN_SERVER_API_KEY=...

# Embedding server
EMBEDDING_API_KEY=...
EMBEDDING_URL=...
EMBEDDING_MODEL_NAME=...
EMBEDDING_URL_MODELS=...
EMBEDDING_HEALTH_URL=...

# PDF extract server
PDF_EXTRACTOR_URL=...
PDF_HEALTH_URL=...

# Reranker server
RERANK_URL=...
RERANK_MODEL=...
RERANK_HEALTH_URL=...

# Qdrant server
QDRANT_URL=...
QDRANT_API_KEY=...
QDRANT_HEALTH_CHECK_URL=...

# PostgreSQL server
POSTGRES_USER=...
POSTGRES_PASSWORD=...
POSTGRES_HOST=...
POSTGRES_DB=...

# LLM server
LLM_SERVICE_API_KEY=...
LLM_SERVICE_MODEL=...
LLM_SERVICE_CHAT_COMPLETIONS_URL=...
LLM_SERVICE_MODELS_URL=...
LLM_SERVICE_COMPLETIONS_URL=...
LLM_SERVICE_HEALTH_URL=...

# MarkUp server
MARKUP_API_KEY=...
MARKUP_GET_METHODS_URL=...
MARKUP_PROCESS_TEXT_URL=...
MARKUP_PROCESS_FILE_URL=...
MARKUP_HEALTH_CHECK_URL=...

# Transcribe server
TRANSCRIBE_API_KEY=...
TRANSCRIBE_UPLOAD_AUDIO=...
TRANSCRIBE_HEALTH_URL=...
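
Because a missing variable usually surfaces only when the affected server is first called, it can help to check the environment up front. The sketch below is one possible approach, not part of the project: the `missing_env_vars` helper and the `REQUIRED_VARS` subset are hypothetical, and the script assumes the variables have already been loaded into the process environment (for example via `--env-file` in Docker).

```python
import os
from typing import Iterable, List, Mapping

# A subset of the variables listed above, chosen for illustration.
REQUIRED_VARS = ("MAIN_SERVER_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")


def missing_env_vars(
    required: Iterable[str], environ: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED_VARS)
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required variables are set.")
```

Passing the environment as a parameter keeps the helper easy to test with a plain dictionary.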

Start the main server

Creating a virtual environment and installing dependencies

Before starting the server, it is recommended to create a virtual environment and install all dependencies from requirements.txt. Run the following commands in the terminal:

Creating a virtual environment

python -m venv venv

Activating the environment (Windows path shown; on Linux/macOS, run source venv/bin/activate instead)

./venv/Scripts/activate

Installing dependencies

pip install -r requirements.txt

After setting up the environment, start the server with the command:

fastmcp run ./main_server.py:main_mcp_server --transport http

Running in Docker

Build the image:

docker build -t mcp-main-server .

Run the container, passing .env as environment variables:

docker run --rm -p 8000:8000 --env-file .env mcp-main-server

Configuring the server connection in Cursor

Run the server using the command above
Open the settings
Add the MCP server configuration:

{
  "mcpServers": {
    "main-registry": {
      "url": "http://localhost:8000/mcp/"
    }
  }
}

Local MCP server: proxy_mcp_server

What it is: a proxy MCP server that connects to the main registry (main_server), forwards its methods, and provides a high-level pipeline for preparing data for RAG.
Location: proxy_mcp_server/proxy_mcp_server.py
Available tools:
get_server_and_tools() — get a list of servers and their tools from the registry
router(server_name: str, tool_name: str, params: dict) — universal call router
preprocessing_data_for_rag(file_paths: List[str]) -> str — prepare PDF/texts and create a collection in Qdrant; returns the collection name
health_check_servers() — check if all services are available

Requirements:

main_server is running and reachable via its URL (e.g. http://localhost:8000/mcp/).
A valid MAIN_SERVER_API_KEY is available (it must match the Authorization header in proxy_mcp_server.py).
If necessary, update url and headers.Authorization in the config object inside proxy_mcp_server/proxy_mcp_server.py.
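
For orientation, a config object carrying a url and a headers.Authorization value might look like the sketch below. This is an assumption, not the project's actual code: the `build_proxy_config` helper, the `mcpServers` nesting, the `"main-registry"` key, and the Bearer scheme are all hypothetical, so check proxy_mcp_server.py for the real layout and header format.

```python
from typing import Any, Dict


def build_proxy_config(url: str, api_key: str) -> Dict[str, Any]:
    """Build a config dict with the registry URL and an Authorization header.

    Assumption: the key is sent as a Bearer token. The project may use a
    different scheme; proxy_mcp_server.py is the source of truth.
    """
    return {
        "mcpServers": {
            "main-registry": {
                "url": url,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


if __name__ == "__main__":
    config = build_proxy_config("http://localhost:8000/mcp/", "YOUR_API_KEY")
    print(config["mcpServers"]["main-registry"]["url"])
```

Building the object from parameters rather than hard-coding it makes it easier to keep the key in MAIN_SERVER_API_KEY instead of in the source file.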

Connection in Cursor (example):

{
  "mcpServers": {
    "proxy-server": {
      "command": "uv",
      "args": [
        "run",
        "fastmcp",
        "run",
        "YOUR_PATH_TO/proxy_mcp_server/proxy_mcp_server.py:proxy_mcp_server"
      ]
    }
  }
}

Launch from terminal:

fastmcp run ./proxy_mcp_server/proxy_mcp_server.py:proxy_mcp_server

RAG inference: interactive launch

What it is: a console assistant that answers questions over a collection of documents in Qdrant, reranking the retrieved results before generating an LLM response.
Location: rag_inference/RAG workflow.py
Required environment variables: QDRANT_URL, QDRANT_API_KEY, RERANK_URL, RERANK_MODEL, LLM_SERVICE_CHAT_COMPLETIONS_URL, LLM_SERVICE_API_KEY, LLM_SERVICE_MODEL, EMBEDDING_URL, EMBEDDING_MODEL (described above in the environment setup section).
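
The retrieve, rerank, and generate flow described above can be outlined as follows. This is a schematic sketch under stated assumptions, not the contents of RAG workflow.py: the `answer_question` function, its parameters, the `top_k` and `keep` defaults, and the prompt wording are hypothetical, and the four service calls are passed in as plain callables rather than real HTTP clients.

```python
from typing import Callable, List, Sequence


def answer_question(
    question: str,
    embed: Callable[[str], Sequence[float]],
    search: Callable[[Sequence[float], int], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    generate: Callable[[str], str],
    top_k: int = 20,
    keep: int = 5,
) -> str:
    """Retrieve candidates, rerank them, and ask the LLM using the best ones."""
    vector = embed(question)              # embedding service
    candidates = search(vector, top_k)    # vector search in the collection
    best = rerank(question, candidates)[:keep]  # reranker orders by relevance
    context = "\n\n".join(best)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)               # LLM chat completion
```

Injecting the service calls keeps the flow testable with stubs, for example a `search` that returns a fixed list and a `generate` that echoes the prompt.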

Run (Windows PowerShell):

python ".\rag_inference\RAG workflow.py" <collection_name>

Here <collection_name> is the name of the collection in Qdrant. You can obtain it in advance by calling the preprocessing_data_for_rag tool from proxy_mcp_server with a list of files to index; the tool returns the name of the created collection.

Example:

python ".\rag_inference\RAG workflow.py" collection_for_rag_1

MCP Registry