
English | 简体中文 | 繁體中文 | 日本語

🧠 MiniMind Docker


All-in-One Docker deployment for MiniMind LLM with UI, API & MCP support

Live Demo · API Docs · Original Project


✨ Features

  • 🐳 One-Click Docker Deployment - All dependencies bundled, ready to run
  • 🎨 Modern Web UI - Responsive design with dark mode & multi-language support
  • 🔌 OpenAI-Compatible API - Drop-in replacement for existing applications
  • 🤖 MCP Integration - Model Context Protocol for AI agent workflows
  • 🎮 Smart GPU Management - Auto-select idle GPU, auto-release memory
  • 📊 Real-time Streaming - SSE-based streaming responses
  • 🌍 Multi-language UI - English, 简体中文, 繁體中文, 日本語

🚀 Quick Start

Docker (Recommended)

# Pull and run
docker run -d --gpus all -p 8998:8998 neosun/minimind:latest

# Access
# UI: http://localhost:8998
# API: http://localhost:8998/v1/chat/completions
# Docs: http://localhost:8998/apidocs/

Docker Compose

git clone https://github.com/neosun/minimind-docker.git
cd minimind-docker
./start.sh

📦 Installation

Prerequisites

  • Docker 20.10+
  • Docker Compose 2.0+
  • NVIDIA GPU with CUDA 12.1+ (optional, CPU fallback available)
  • nvidia-container-toolkit (for GPU support)

Method 1: Docker Run

# Basic (CPU)
docker run -d -p 8998:8998 neosun/minimind:latest

# With GPU
docker run -d --gpus all -p 8998:8998 neosun/minimind:latest

# With custom model path
docker run -d --gpus all -p 8998:8998 \
  -v /path/to/models:/app/models \
  -e MODEL_PATH=/app/models/MiniMind2 \
  neosun/minimind:latest

Method 2: Docker Compose

# docker-compose.yml
services:
  minimind:
    image: neosun/minimind:latest
    ports:
      - "8998:8998"
    environment:
      - NVIDIA_VISIBLE_DEVICES=0
      - GPU_IDLE_TIMEOUT=60
      - MODEL_PATH=MiniMind2-Small
    volumes:
      - /tmp/minimind:/app/uploads
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

# Start the stack
docker compose up -d

Method 3: Local Development

# Clone
git clone https://github.com/neosun/minimind-docker.git
cd minimind-docker

# Install dependencies
pip install -r requirements.txt

# Download model
python -c "from huggingface_hub import snapshot_download; snapshot_download('jingyaogong/MiniMind2-Small', local_dir='MiniMind2-Small')"

# Run
python app.py

⚙️ Configuration

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| PORT | 8998 | Server port |
| MODEL_PATH | MiniMind2-Small | Model path or HuggingFace ID |
| GPU_IDLE_TIMEOUT | 60 | Seconds before auto-releasing GPU memory |
| NVIDIA_VISIBLE_DEVICES | 0 | GPU device ID |
| MAX_SEQ_LEN | 8192 | Maximum sequence length |
| TEMPERATURE | 0.85 | Default generation temperature |
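
The defaults above can be read with a small helper. This is an illustrative sketch only: `load_config` and its dictionary keys are made-up names, not part of the project's code.

```python
import os

def load_config(env=None):
    """Read server settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "port": int(env.get("PORT", "8998")),
        "model_path": env.get("MODEL_PATH", "MiniMind2-Small"),
        "gpu_idle_timeout": int(env.get("GPU_IDLE_TIMEOUT", "60")),
        "gpu_device": env.get("NVIDIA_VISIBLE_DEVICES", "0"),
        "max_seq_len": int(env.get("MAX_SEQ_LEN", "8192")),
        "temperature": float(env.get("TEMPERATURE", "0.85")),
    }

print(load_config({}))  # all defaults
```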

.env Example

PORT=8998
GPU_IDLE_TIMEOUT=60
NVIDIA_VISIBLE_DEVICES=0
MODEL_PATH=MiniMind2-Small

📖 Usage

Web UI

Visit http://localhost:8998 for the interactive chat interface.

Features:

  • Adjustable parameters (Temperature, Max Tokens, Top P)
  • GPU status monitoring
  • One-click GPU memory release
  • Multi-language support (EN/CN/TW/JP)
  • Dark mode support

REST API

Chat Completion (OpenAI Compatible)

curl -X POST http://localhost:8998/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimind",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,
    "max_tokens": 512,
    "stream": false
  }'
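
The same request can be issued from Python. The sketch below only builds the JSON body (the `build_chat_request` helper is a hypothetical name, not project code); any HTTP client can POST it, since the endpoint follows the OpenAI chat completions shape.

```python
import json

def build_chat_request(prompt, temperature=0.7, max_tokens=512, stream=False):
    """Build the OpenAI-style body shown in the curl example above."""
    return {
        "model": "minimind",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
    }

# To send it (requires the `requests` package and a running container):
# r = requests.post("http://localhost:8998/v1/chat/completions",
#                   json=build_chat_request("Hello!"), timeout=60)
# print(r.json()["choices"][0]["message"]["content"])

print(json.dumps(build_chat_request("Hello!"), indent=2))
```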

Streaming Response

curl -X POST http://localhost:8998/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimind",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'
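
With `"stream": true` the endpoint emits Server-Sent Events. Assuming the chunks follow the standard OpenAI streaming schema (`choices[0].delta.content`, terminated by `data: [DONE]`), a minimal line parser looks like this:

```python
import json

def iter_sse_deltas(lines):
    """Yield text deltas from OpenAI-style 'data:' SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue                      # skip comments / keep-alives
        data = line[len("data:"):].strip()
        if data == "[DONE]":              # stream terminator
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {}).get("content")
        if delta:
            yield delta

# Example chunks in the shape OpenAI-compatible servers emit:
sample = [
    'data: {"choices":[{"delta":{"content":"Once"}}]}',
    'data: {"choices":[{"delta":{"content":" upon"}}]}',
    'data: [DONE]',
]
print("".join(iter_sse_deltas(sample)))  # -> Once upon
```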

GPU Status

# Check status
curl http://localhost:8998/api/gpu/status

# Release GPU memory
curl -X POST http://localhost:8998/api/gpu/offload
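
The auto-release behaviour driven by `GPU_IDLE_TIMEOUT` can be pictured with a toy model. This is a sketch of the idea only; the server's actual implementation may differ.

```python
import time

class IdleReleaser:
    """Toy model of GPU_IDLE_TIMEOUT: if no request has touched the model
    for `idle_timeout` seconds, weights are dropped from the GPU."""

    def __init__(self, idle_timeout=60):
        self.idle_timeout = idle_timeout
        self.last_used = time.monotonic()
        self.on_gpu = False

    def touch(self):
        # A chat request arrived: load to GPU (lazily) and reset the clock.
        self.on_gpu = True
        self.last_used = time.monotonic()

    def maybe_release(self, now=None):
        # Called periodically; returns True while the model stays on the GPU.
        now = time.monotonic() if now is None else now
        if self.on_gpu and now - self.last_used >= self.idle_timeout:
            self.on_gpu = False  # real server would move weights off-device
        return self.on_gpu

r = IdleReleaser(idle_timeout=60)
r.touch()
print(r.maybe_release(r.last_used + 61))  # past the timeout -> False
```

The `/api/gpu/offload` endpoint is the manual equivalent of this timer firing.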

MCP Integration

Configure in your MCP client:

{
  "mcpServers": {
    "minimind": {
      "command": "python",
      "args": ["mcp_server.py"],
      "env": {
        "MODEL_PATH": "MiniMind2-Small",
        "GPU_IDLE_TIMEOUT": "600"
      }
    }
  }
}

Available Tools:

  • chat - Single-turn conversation
  • multi_turn_chat - Multi-turn conversation
  • get_gpu_status - Query GPU status
  • get_model_info - Get model information
  • release_gpu - Release GPU memory

See MCP_GUIDE.md for detailed documentation.


🔌 API Reference

| Endpoint | Method | Description |
|----------|--------|-------------|
| / | GET | Web UI |
| /health | GET | Health check |
| /api/gpu/status | GET | GPU status |
| /api/gpu/offload | POST | Release GPU memory |
| /v1/chat/completions | POST | Chat API (OpenAI compatible) |
| /apidocs/ | GET | Swagger documentation |

📁 Project Structure

minimind-docker/
├── app.py              # Main application (UI + API)
├── mcp_server.py       # MCP server
├── Dockerfile          # Docker build file
├── docker-compose.yml  # Docker Compose config
├── start.sh            # One-click start script
├── requirements.txt    # Python dependencies
├── .env.example        # Environment template
├── MCP_GUIDE.md        # MCP documentation
├── model/              # Tokenizer files
├── trainer/            # Training scripts
└── scripts/            # Utility scripts

🛠️ Tech Stack

  • Framework: Flask + FastMCP
  • Model: MiniMind2 (Transformer-based LLM)
  • GPU: CUDA 12.1 + PyTorch 2.6
  • Container: Docker + nvidia-container-toolkit
  • API: OpenAI-compatible REST API
  • Docs: Swagger/Flasgger

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📝 Changelog

v1.0.0 (2026-01-04)

  • 🎉 Initial release
  • 🐳 Docker all-in-one deployment
  • 🎨 Web UI with multi-language support
  • 🔌 OpenAI-compatible API
  • 🤖 MCP integration
  • 🎮 Smart GPU management

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Based on MiniMind by Jingyao Gong.

