
mcplex

An MCP server that bridges local Ollama models and ChromaDB vector memory to MCP clients like Claude Code. It enables local text generation, vision-based image analysis, and semantic memory storage without requiring external API keys.

Updated Mar 8, 2026


MCP server for local AI models -- expose Ollama, embeddings, vision, and vector memory to Claude Code and other MCP clients.

License: MIT · Python 3.10+ · MCP


What is this?

mcplex is a Model Context Protocol server that bridges your local AI models to any MCP client. It gives Claude Code (or any MCP-compatible tool) direct access to:

  • Ollama models -- generate text, chat, and list available models
  • Embeddings -- generate vector embeddings via local embedding models
  • Vision -- analyze images and extract text using local vision models (LLaVA, etc.)
  • Vector memory -- store and semantically search text using ChromaDB

Everything runs locally. No API keys needed. No data leaves your machine.

Features

| Category | Tool | Description |
| --- | --- | --- |
| Text Generation | generate | One-shot text generation with any Ollama model |
| Chat | chat | Multi-turn conversation with message history |
| Embeddings | embed | Generate vector embeddings for text |
| Model Management | list_models | List all available Ollama models |
| Vision | analyze_image | Describe/analyze images with a vision model |
| OCR | ocr_image | Extract text from images |
| Memory Store | memory_store | Store text + metadata in ChromaDB |
| Memory Search | memory_search | Semantic search over stored memories |
| Memory List | memory_list_collections | List all memory collections |

Requirements

  • Python 3.10+
  • Ollama running locally (default: http://localhost:11434)
  • At least one Ollama model pulled (e.g., ollama pull qwen3:8b)

Installation

# From PyPI (when published)
pip install mcplex

# With vector memory support
pip install mcplex[memory]

# From source
git clone https://github.com/dbhavery/mcplex.git
cd mcplex
pip install -e ".[memory,dev]"

Claude Code Integration

Add mcplex to your Claude Code MCP configuration:

{
  "mcpServers": {
    "mcplex": {
      "command": "mcplex",
      "args": []
    }
  }
}

Or if running from source:

{
  "mcpServers": {
    "mcplex": {
      "command": "python",
      "args": ["-m", "mcplex.server"]
    }
  }
}

Once configured, Claude Code can use your local models directly:

"Use the generate tool to summarize this file with qwen3:8b"

"Embed these three paragraphs and store them in the 'research' collection"

"Analyze this screenshot and extract all visible text"

Tool Reference

generate

Send a prompt to a local Ollama model.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | str | required | The text prompt |
| model | str | qwen3:8b | Ollama model name |
| temperature | float | 0.7 | Sampling temperature (0.0-2.0) |
| max_tokens | int | 2048 | Maximum tokens to generate |
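For context, Ollama's /api/generate endpoint takes a JSON payload with the model, prompt, and an options object. The sketch below is illustrative only, not mcplex's actual internals; in particular, the mapping of max_tokens to Ollama's num_predict option is an assumption.

```python
def build_generate_request(
    prompt: str,
    model: str = "qwen3:8b",
    temperature: float = 0.7,
    max_tokens: int = 2048,
) -> dict:
    # Shape of an Ollama /api/generate payload a generate call could
    # translate to. stream=False requests a single complete response.
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature, "num_predict": max_tokens},
    }
```

POSTing this payload to `{MCPLEX_OLLAMA_URL}/api/generate` returns JSON whose `response` field holds the generated text.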

chat

Multi-turn chat with message history.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| messages | list[{role, content}] | required | Message history |
| model | str | qwen3:8b | Ollama model name |
| temperature | float | 0.7 | Sampling temperature |
| max_tokens | int | 2048 | Maximum tokens |
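The messages parameter is a list of {role, content} dicts, oldest first, following the usual system/user/assistant role convention. For example:

```python
# A minimal message history for the chat tool: each turn is a dict with
# exactly two keys, "role" and "content".
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is MCP?"},
    {"role": "assistant", "content": "The Model Context Protocol, a standard for exposing tools to LLM clients."},
    {"role": "user", "content": "Name one thing an MCP server can expose."},
]
```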

embed

Generate vector embeddings.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str \| list[str] | required | Text to embed |
| model | str | nomic-embed-text | Embedding model |
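Embedding vectors returned by embed are typically compared with cosine similarity; a minimal, self-contained sketch:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine of the angle between two embedding vectors:
    # 1.0 = same direction (very similar text), 0.0 = orthogonal (unrelated).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```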

list_models

List all available Ollama models. No parameters.

analyze_image

Analyze an image with a local vision model.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| image_path | str | required | Path to image file |
| prompt | str | "Describe this image in detail." | Question/instruction |
| model | str | llava | Vision model name |
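Ollama's API accepts images as base64-encoded strings, so a vision call presumably reads the file at image_path and encodes it along these lines (an illustrative sketch, not mcplex's actual code):

```python
import base64

def encode_image(path: str) -> str:
    # Read the image bytes and base64-encode them, as expected by the
    # "images" field of Ollama's generate/chat API.
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")
```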

ocr_image

Extract text from an image.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| image_path | str | required | Path to image file |
| model | str | llava | Vision model name |

memory_store

Store text in vector memory.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | required | Text to store |
| metadata | dict | None | Optional key-value metadata |
| collection | str | "default" | ChromaDB collection name |

memory_search

Semantic search over stored memories.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | required | Search query |
| n_results | int | 5 | Max results to return |
| collection | str | "default" | ChromaDB collection name |

memory_list_collections

List all ChromaDB collections. No parameters.

Configuration

All configuration is via environment variables (or a .env file):

| Variable | Default | Description |
| --- | --- | --- |
| MCPLEX_OLLAMA_URL | http://localhost:11434 | Ollama server URL |
| MCPLEX_DEFAULT_MODEL | qwen3:8b | Default text model |
| MCPLEX_EMBED_MODEL | nomic-embed-text | Default embedding model |
| MCPLEX_VISION_MODEL | llava | Default vision model |
| MCPLEX_CHROMA_PATH | ./mcplex_data/chroma | ChromaDB storage path |
| MCPLEX_DEFAULT_TEMPERATURE | 0.7 | Default sampling temperature |
| MCPLEX_DEFAULT_MAX_TOKENS | 2048 | Default max tokens |
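Settings like these are typically resolved at startup as "environment variable if set, otherwise the documented default." A hedged sketch of that pattern (not necessarily how mcplex structures its config internally):

```python
import os

# Resolve each setting from the environment, falling back to the
# documented default when the variable is unset.
OLLAMA_URL = os.getenv("MCPLEX_OLLAMA_URL", "http://localhost:11434")
DEFAULT_MODEL = os.getenv("MCPLEX_DEFAULT_MODEL", "qwen3:8b")
DEFAULT_TEMPERATURE = float(os.getenv("MCPLEX_DEFAULT_TEMPERATURE", "0.7"))
```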

Architecture

MCP Client (Claude Code, etc.)
    |
    | stdio (JSON-RPC)
    |
mcplex server (FastMCP)
    |
    +-- ollama_tools -----> Ollama API (HTTP)
    |                        localhost:11434
    +-- vision_tools -----> Ollama API (with images)
    |
    +-- memory_tools -----> ChromaDB (local persistent)

  • Transport: stdio (standard for CLI-based MCP clients)
  • Ollama communication: async HTTP via httpx
  • Vector storage: ChromaDB with persistent client (lazy-loaded)
  • No API keys required -- everything runs locally
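Over the stdio transport, tool invocations travel as JSON-RPC 2.0 messages. Schematically, a client invoking the generate tool sends something like the following (exact framing and fields may vary by client and MCP version):

```python
import json

# Illustrative JSON-RPC 2.0 request an MCP client might write to the
# server's stdin, one JSON object per line.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "generate",
        "arguments": {"prompt": "Say hello.", "model": "qwen3:8b"},
    },
}
wire_line = json.dumps(request)
```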

Development

git clone https://github.com/dbhavery/mcplex.git
cd mcplex
pip install -e ".[memory,dev]"

# Run tests
python -m pytest tests/ -v

# Run the server
mcplex
# or
python -m mcplex.server

License

MIT -- Copyright (c) 2026 Donald Havery
