MCP4DRL - Model Context Protocol for Deep Reinforcement Learning

An MCP server that exposes a trained Deep Q-Network (DQN) agent for business process resource allocation through a conversational interface. It lets users inspect an otherwise "black box" RL system's state, Q-values, and decisions via natural language queries.

Features

Environment State Queries - View simulation state, waiting/active cases, resources
Q-Value Analysis - Inspect Q-values for all actions
Action Recommendations - Get agent's top choice with justification
Explainability - Detailed explanations of why actions are chosen
Heuristic Comparison - Compare with FIFO, SPT, EDF, LST baselines
Simulation Control - Step through episodes, reset, run full episodes
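The heuristic baselines can be read as simple dispatching rules over the waiting cases. The following is a minimal sketch, not the project's implementation: it assumes the abbreviations carry their usual scheduling meanings (first in first out, shortest processing time, earliest deadline first, least slack time), and the `Case` class, its field names, and `pick_case` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical waiting case; field names are illustrative only."""
    case_id: str
    arrival_time: float      # when the case entered the queue
    processing_time: float   # estimated remaining work
    due_time: float          # deadline of the case


def pick_case(cases, rule, now=0.0):
    """Return the case that the given dispatching rule would serve next."""
    keys = {
        "FIFO": lambda c: c.arrival_time,                       # oldest arrival first
        "SPT": lambda c: c.processing_time,                     # shortest job first
        "EDF": lambda c: c.due_time,                            # earliest deadline first
        "LST": lambda c: c.due_time - now - c.processing_time,  # least slack first
    }
    return min(cases, key=keys[rule])


waiting = [
    Case("A", arrival_time=0.0, processing_time=5.0, due_time=20.0),
    Case("B", arrival_time=1.0, processing_time=2.0, due_time=30.0),
    Case("C", arrival_time=2.0, processing_time=9.0, due_time=12.0),
    Case("D", arrival_time=3.0, processing_time=4.0, due_time=10.0),
]

# Each rule picks a different case for this queue at time now=3.0.
choices = {rule: pick_case(waiting, rule, now=3.0).case_id
           for rule in ("FIFO", "SPT", "EDF", "LST")}
print(choices)  # {'FIFO': 'A', 'SPT': 'B', 'EDF': 'D', 'LST': 'C'}
```

Comparing the agent's choice against each of these picks in the same state shows where the learned policy departs from the rule-based baselines.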

Installation

pip install -r requirements.txt

Requirements: Python 3.9+ and TensorFlow 2.16+ (TensorFlow 2.16 does not support Python 3.8)

Quick Start

Test locally

python -m mcp4drl.test_integration

Run MCP server

# Windows
run_server.bat

# Linux/Mac
chmod +x run_server.sh
./run_server.sh

Claude Desktop Integration

Add the following to claude_desktop_config.json (Windows example; replace the path with the location of your checkout):

{
  "mcpServers": {
    "mcp4drl": {
      "command": "cmd.exe",
      "args": ["/c", "C:\\path\\to\\mcp4drl_repo\\run_server.bat"],
      "shell": true
    }
  }
}
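The entry above targets Windows. As a rough sketch, the same entry could be generated per platform with a small helper. The Windows branch mirrors the `command` and `args` shown above; the Linux/Mac branch, which launches `run_server.sh` through `bash`, is an assumption and has not been confirmed against the project. The function name `mcp_server_entry` is hypothetical.

```python
import json
import sys


def mcp_server_entry(repo_path, platform=sys.platform):
    """Build the "mcp4drl" server entry for claude_desktop_config.json."""
    if platform.startswith("win"):
        # Mirrors the Windows command and args documented above.
        script = repo_path.rstrip("\\/") + "\\run_server.bat"
        return {"command": "cmd.exe", "args": ["/c", script]}
    # Assumption: on Linux/Mac the shell script is run through bash.
    script = repo_path.rstrip("/") + "/run_server.sh"
    return {"command": "bash", "args": [script]}


# Hypothetical checkout location; substitute your own path.
config = {"mcpServers": {"mcp4drl": mcp_server_entry("/opt/mcp4drl_repo", "linux")}}
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into an existing config file by hand; the helper does not touch the file itself.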

Available MCP Tools

Tool	Description
`get_environment_state`	Current simulation state
`get_eligible_actions`	All possible actions with validity
`get_q_values`	Q-values for all actions
`get_recommended_action`	Agent's best action
`explain_action`	Detailed action explanation
`compare_with_heuristic`	Compare with FIFO/SPT/EDF/LST
`step_simulation`	Execute one step
`reset_simulation`	Reset to initial state
`run_episode`	Run full episode with policy
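To illustrate how the eligibility, Q-value, and recommendation tools relate, here is a minimal sketch of greedy action selection restricted to valid actions. It is an assumption about the general technique, not the server's actual code; `recommend_action`, `q_margin`, and the sample numbers are hypothetical.

```python
def recommend_action(q_values, valid_mask):
    """Pick the valid action with the highest Q-value.

    Returns (best_index, ranking), where ranking lists the valid action
    indices from highest to lowest Q-value.
    """
    valid = [i for i, ok in enumerate(valid_mask) if ok]
    if not valid:
        raise ValueError("no eligible actions in this state")
    ranking = sorted(valid, key=lambda i: q_values[i], reverse=True)
    return ranking[0], ranking


def q_margin(q_values, ranking):
    """Gap between the best and second-best valid action (0.0 if only one)."""
    if len(ranking) < 2:
        return 0.0
    return q_values[ranking[0]] - q_values[ranking[1]]


q = [1.5, 4.0, 3.25, -0.5]           # one Q-value per action
mask = [True, False, True, True]     # action 1 is not eligible in this state

best, ranking = recommend_action(q, mask)
margin = q_margin(q, ranking)
print(best, ranking, margin)  # 2 [2, 0, 3] 1.75
```

Action 1 has the highest raw Q-value but is skipped because it is invalid. The margin between the top two valid actions is one simple quantity an explanation could report to indicate how decisive the choice is.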

Project Structure

mcp4drl_repo/
├── mcp4drl/           # Main Python package
│   ├── core/          # Wrappers (simulator, agent)
│   ├── models/        # Pydantic schemas
│   └── tools/         # MCP tool implementations
├── simprocess/        # Business process simulation engine
├── data/              # Model and event log
└── mcp4drl_server.py  # Standalone launcher

Configuration

Environment variables (optional):

MCP4DRL_MODEL_PATH - Path to trained model (.h5)
MCP4DRL_EVENT_LOG - Path to XES event log
MCP4DRL_TRANSPORT - stdio (default) or sse
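A minimal sketch of how these variables might be read at startup follows. The variable names come from the list above; the `ServerConfig` class, the `load_config` function, and the default file paths are placeholders chosen for illustration and are not taken from the project.

```python
import os
from dataclasses import dataclass

# Placeholder defaults; the real file names under data/ may differ.
DEFAULT_MODEL_PATH = "data/model.h5"
DEFAULT_EVENT_LOG = "data/event_log.xes"


@dataclass
class ServerConfig:
    model_path: str
    event_log: str
    transport: str


def load_config(env=None):
    """Read the optional MCP4DRL_* variables, falling back to defaults."""
    env = os.environ if env is None else env
    transport = env.get("MCP4DRL_TRANSPORT", "stdio").lower()
    if transport not in ("stdio", "sse"):
        raise ValueError(f"unsupported transport: {transport!r}")
    return ServerConfig(
        model_path=env.get("MCP4DRL_MODEL_PATH", DEFAULT_MODEL_PATH),
        event_log=env.get("MCP4DRL_EVENT_LOG", DEFAULT_EVENT_LOG),
        transport=transport,
    )


# Passing a dict instead of os.environ keeps the example reproducible.
cfg = load_config({"MCP4DRL_TRANSPORT": "sse"})
print(cfg.transport, cfg.model_path)
```

Validating the transport value early means a typo fails at startup rather than when a client first connects.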

Context

Part of a doctoral dissertation on intelligent automation of business process management. The project demonstrates that RL systems can be made transparent through conversational interfaces.

License

Research prototype.
