# Scalene-MCP

A FastMCP v2 server providing LLMs with structured access to Scalene's comprehensive CPU, GPU, and memory profiling for Python code, including C/C++ extension modules.
## Installation

### Prerequisites

- Python 3.10+
- uv (recommended) or pip

### From Source

```bash
git clone https://github.com/plasma-umass/scalene-mcp.git
cd scalene-mcp
uv venv
uv sync
```

### As a Package

```bash
pip install scalene-mcp
```
## Quick Start: Running the Server

### Development Mode

```bash
# Using uv
uv run -m scalene_mcp.server

# Using pip
python -m scalene_mcp.server
```

### Production Mode

```bash
python -m scalene_mcp.server
```
## 🎯 Native Integration with LLM Agents

Works seamlessly with:

- ✅ GitHub Copilot - direct integration
- ✅ Claude Code - including the Claude VSCode extension
- ✅ Cursor - all-in-one IDE
- ✅ Any MCP-compatible LLM client
### Zero-Friction Setup (3 Steps)

1. **Install**

   ```bash
   pip install scalene-mcp
   ```

2. **Configure** - choose one method:

   **Automated (recommended):**

   ```bash
   python scripts/setup_vscode.py
   ```

   The interactive setup script auto-detects your editor and configures it.

   **Manual - GitHub Copilot** (in `.vscode/settings.json`):

   ```json
   {
     "github.copilot.chat.mcp.servers": {
       "scalene": {
         "command": "uv",
         "args": ["run", "-m", "scalene_mcp.server"]
       }
     }
   }
   ```

   **Manual - Claude Code / Cursor:** see the editor-specific setup guides.

3. **Restart** VSCode/Cursor and start profiling!
### Start Profiling Immediately

Open any Python project and ask your LLM:

> "Profile main.py and show me the bottlenecks"

The LLM automatically:

- 🔍 Detects your project structure
- 📄 Finds and profiles your code
- 📊 Analyzes CPU, memory, and GPU usage
- 💡 Suggests optimizations

No fiddling with paths. No manual configuration. Zero friction.

📚 Editor-specific setup guides: see the Documentation section below.

📚 Full docs: SETUP_VSCODE.md | QUICKSTART.md | TOOLS_REFERENCE.md
## Available Serving Methods (FastMCP)

Scalene-MCP can be served in multiple ways using FastMCP's built-in serving capabilities:

### 1. Standard Server (Default)

```bash
# Starts an MCP-compatible server on stdio
python -m scalene_mcp.server
```
### 2. With Claude Desktop

Configure in your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "scalene": {
      "command": "python",
      "args": ["-m", "scalene_mcp.server"]
    }
  }
}
```

Then restart Claude Desktop.
### 3. With HTTP/SSE Endpoint

FastMCP supports HTTP-based transports in addition to stdio; check the FastMCP documentation for the options your version supports.
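As a rough sketch (assuming FastMCP 2.x's `transport` argument to `run()` and the `create_scalene_server()` factory shown later in this README), an SSE endpoint might be started like this:

```python
# Hedged sketch: serve over SSE rather than stdio. The transport name,
# host/port kwargs, and the factory import path are assumptions; consult
# the FastMCP docs for the exact options.
from scalene_mcp.server import create_scalene_server

server = create_scalene_server()
server.run(transport="sse", host="127.0.0.1", port=8000)
```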
### 4. With Environment Variables

```bash
# Configure via environment
export SCALENE_PYTHON_EXECUTABLE=python3.11
export SCALENE_TIMEOUT=30
python -m scalene_mcp.server
```
### 5. Programmatically

```python
# The create_scalene_server() factory is assumed to be exported by the package.
from scalene_mcp.server import create_scalene_server

# Create and run the server programmatically
server = create_scalene_server()
# Configure and start...
server.run()
```
## Programmatic Usage

Use Scalene-MCP directly in your Python code:

```python
import asyncio

from scalene_mcp.profiler import ScaleneProfiler


async def main():
    profiler = ScaleneProfiler()

    # Profile a script
    result = await profiler.profile(
        type="script",
        script_path="fibonacci.py",
        include_memory=True,
        include_gpu=False,
    )

    print(f"Profile ID: {result['profile_id']}")
    print(f"Peak memory: {result['summary'].get('total_memory_mb', 'N/A')} MB")


asyncio.run(main())
```
## Overview

Scalene-MCP transforms Scalene's powerful profiling output into an LLM-friendly format through a clean, minimal set of well-designed tools. Get detailed performance insights without images or excessive context overhead.
### What Scalene-MCP Does

- ✅ Profile Python scripts with the full Scalene feature set
- ✅ Analyze profiles for hotspots, bottlenecks, and memory leaks
- ✅ Compare profiles to detect regressions
- ✅ Pass arguments to profiled scripts
- ✅ Emit structured JSON output for LLMs
- ✅ Execute asynchronously for non-blocking profiling
### What Scalene-MCP Doesn't Do

- ❌ In-process profiling (`Scalene.start()`/`stop()`) - uses a subprocess instead, for isolation
- ❌ Process attachment (`--pid`-based profiling) - profiles scripts, not running processes
- ❌ Single-function profiling - designed for whole-script analysis

Note: The subprocess-based approach was chosen for reliability and simplicity. LLM workflows typically profile complete scripts, which is a perfect fit. See SCALENE_MODES_ANALYSIS.md for a detailed scope analysis.
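For intuition, the core of the subprocess approach looks roughly like this (a sketch: `--cli`, `--json`, and `--outfile` are standard Scalene CLI flags, but the server's actual invocation logic lives in `profiler.py` and handles many more options):

```python
import asyncio
import json
import tempfile


async def run_scalene(script_path: str) -> dict:
    """Minimal sketch of subprocess-based profiling via the Scalene CLI."""
    with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as tmp:
        outfile = tmp.name
    # The real server adds further flags (memory, GPU, thresholds)
    # based on the profile() arguments it receives.
    proc = await asyncio.create_subprocess_exec(
        "scalene", "--cli", "--json", "--outfile", outfile, script_path
    )
    await proc.wait()
    with open(outfile) as f:
        return json.load(f)
```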
## Key Features

- **Complete CPU profiling**: Line-by-line Python/C time, system time, CPU utilization
- **Memory profiling**: Peak/average memory per line, leak detection with velocity metrics
- **GPU profiling**: NVIDIA and Apple GPU support with per-line attribution
- **Advanced analysis**: Stack traces, bottleneck identification, performance recommendations
- **Profile comparison**: Track performance changes across runs
- **LLM-optimized**: Structured JSON output, summaries before details, context-aware formatting
## Available Tools (7 Consolidated Tools)

Scalene-MCP provides a clean, LLM-optimized set of 7 tools:

### Discovery (3 tools)

- `get_project_root()` - Auto-detect project structure
- `list_project_files(pattern, max_depth)` - Find files by glob pattern
- `set_project_context(project_root)` - Override auto-detection
### Profiling (1 unified tool)

- `profile(type, script_path/code, ...)` - Profile scripts or code snippets (example below)
  - `type="script"` for script profiling
  - `type="code"` for code snippet profiling
### Analysis (1 mega tool)

- `analyze(profile_id, metric_type, ...)` - 9 analysis modes in one tool:
  - `metric_type="all"` - Comprehensive analysis
  - `metric_type="cpu"` - CPU hotspots
  - `metric_type="memory"` - Memory hotspots
  - `metric_type="gpu"` - GPU hotspots
  - `metric_type="bottlenecks"` - Performance bottlenecks
  - `metric_type="leaks"` - Memory leak detection
  - `metric_type="file"` - File-level metrics
  - `metric_type="functions"` - Function-level metrics
  - `metric_type="recommendations"` - Optimization suggestions
Comparison & Storage (2 tools)
- compare_profiles(before_id, after_id) - Compare two profiles
- list_profiles() - View all captured profiles
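A typical regression check chains these tools together, roughly like this (continuing inside the `async with Client(server) as client:` block from the profiling example above; how `profile_id` is pulled out of a tool result depends on your client's result format, shown here as plain dict access for readability):

```python
# Profile before the change
before = await client.call_tool(
    "profile", {"type": "script", "script_path": "main.py"}
)
# ... apply an optimization to main.py ...
# Profile after the change
after = await client.call_tool(
    "profile", {"type": "script", "script_path": "main.py"}
)
# Compare the two runs for regressions
diff = await client.call_tool(
    "compare_profiles",
    {"before_id": before["profile_id"], "after_id": after["profile_id"]},
)
```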
Full reference: See TOOLS_REFERENCE.md
## Configuration

### Profiling Options

The unified `profile()` tool supports these options:

| Option | Type | Default | Description |
|---|---|---|---|
| `type` | str | required | `"script"` or `"code"` |
| `script_path` | str | None | Required if `type="script"` |
| `code` | str | None | Required if `type="code"` |
| `include_memory` | bool | true | Profile memory |
| `include_gpu` | bool | false | Profile GPU usage |
| `cpu_only` | bool | false | Skip memory/GPU profiling |
| `reduced_profile` | bool | false | Only report high-activity lines |
| `cpu_percent_threshold` | float | 1.0 | Minimum CPU % to report |
| `malloc_threshold` | int | 100 | Minimum allocation size (bytes) |
| `profile_only` | str | "" | Profile only paths containing this substring |
| `profile_exclude` | str | "" | Exclude paths containing this substring |
| `use_virtual_time` | bool | false | Use virtual time instead of wall time |
| `script_args` | list | [] | Command-line arguments for the script |
### Environment Variables

- `SCALENE_CPU_PERCENT_THRESHOLD`: Override the default CPU threshold
- `SCALENE_MALLOC_THRESHOLD`: Override the default malloc threshold
## Architecture

### Components

- **ScaleneProfiler**: Async wrapper around the Scalene CLI
- **ProfileParser**: Converts Scalene JSON to structured models
- **ProfileAnalyzer**: Extracts insights and hotspots
- **ProfileComparator**: Compares profiles for regressions
- **FastMCP Server**: Exposes tools via the MCP protocol
### Data Flow

```text
Python Script
      ↓
ScaleneProfiler (subprocess)
      ↓
Scalene CLI (--json)
      ↓
Temp JSON File
      ↓
ProfileParser
      ↓
Pydantic Models (ProfileResult)
      ↓
Analyzer / Comparator
      ↓
MCP Tools
      ↓
LLM Client
```
## Troubleshooting

### GPU Permission Error

If you see a `PermissionError` when profiling with GPU:

```python
# Disable GPU profiling in test environments
result = await profiler.profile(
    type="script",
    script_path="script.py",
    include_gpu=False,
)
```
### Profile Not Found

Profiles are stored in memory for the duration of the server session. For persistence across sessions, implement the storage interface (see the sketch below).
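A persistent store might look roughly like this (a sketch only: the actual interface lives in `src/scalene_mcp/storage.py`, and the class and method names here are hypothetical):

```python
import json
from pathlib import Path


class FileProfileStorage:
    """Hypothetical file-backed profile store; names are illustrative,
    not the actual interface defined in scalene_mcp/storage.py."""

    def __init__(self, directory: str = ".scalene_profiles") -> None:
        self.directory = Path(directory)
        self.directory.mkdir(exist_ok=True)

    def save(self, profile_id: str, profile: dict) -> None:
        # One JSON file per profile, keyed by its ID.
        (self.directory / f"{profile_id}.json").write_text(json.dumps(profile))

    def load(self, profile_id: str) -> dict:
        return json.loads((self.directory / f"{profile_id}.json").read_text())
```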
### Timeout Issues

Adjust the `timeout` parameter when using the profiler directly:

```python
result = await profiler.profile(
    type="script",
    script_path="slow_script.py",
    timeout=120,  # illustrative value; see also SCALENE_TIMEOUT above
)
```
## Development

### Running Tests

```bash
# All tests with coverage
uv run pytest -v --cov=src/scalene_mcp

# Specific test file
uv run pytest tests/test_profiler.py -v

# With an HTML coverage report
uv run pytest --cov=src/scalene_mcp --cov-report=html
```
### Code Quality

```bash
# Type checking
uv run mypy src/

# Linting
uv run ruff check src/

# Formatting
uv run ruff format src/
```
## Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass and coverage ≥ 85%
5. Submit a pull request
## License

MIT License - see LICENSE file for details.
## Citation

If you use Scalene-MCP in research, please cite both this project and Scalene:

```bibtex
@software{scalene_mcp,
  title = {Scalene-MCP: LLM-Friendly Profiling Server},
  year = {2026}
}

@inproceedings{berger2020scalene,
  title = {Scalene: Scripting-Language Aware Profiling for Python},
  author = {Berger, Emery},
  year = {2020}
}
```
## Support

- **Issues**: GitHub Issues for bug reports and feature requests
- **Discussions**: GitHub Discussions for questions and ideas
- **Documentation**: See the `docs/` directory
Made with ❤️ for the Python performance community.
## Manual Installation

```bash
pip install -e .
```
## Development

### Prerequisites

- Python 3.10+
- uv (recommended) or pip

### Setup

```bash
# Install dependencies
uv sync

# Run tests
just test

# Run tests with coverage
just test-cov

# Lint and format
just lint
just format

# Type check
just typecheck

# Full build (sync + lint + typecheck + test)
just build
```
## Project Structure

```text
scalene-mcp/
├── src/scalene_mcp/       # Main package
│   ├── server.py          # FastMCP server with tools/resources/prompts
│   ├── models.py          # Pydantic data models
│   ├── profiler.py        # Scalene execution wrapper
│   ├── parser.py          # JSON output parser
│   ├── analyzer.py        # Analysis engine
│   ├── comparator.py      # Profile comparison
│   ├── recommender.py     # Optimization recommendations
│   ├── storage.py         # Profile persistence
│   └── utils.py           # Shared utilities
├── tests/                 # Test suite (100% coverage goal)
│   ├── fixtures/          # Test data
│   │   ├── profiles/      # Sample profile outputs
│   │   └── scripts/       # Test Python scripts
│   └── conftest.py        # Shared test fixtures
├── examples/              # Usage examples
├── docs/                  # Documentation
├── pyproject.toml         # Project configuration
├── justfile               # Task runner commands
└── README.md              # This file
```
## Usage

### Running the Server

```bash
# Development mode with auto-reload
fastmcp dev src/scalene_mcp/server.py

# Production mode
fastmcp run src/scalene_mcp/server.py

# Install to MCP config
fastmcp install src/scalene_mcp/server.py
```
### Example: Profile a Script

```python
# Through an MCP client (arguments match the Profiling Options table above)
result = await client.call_tool(
    "profile",
    arguments={
        "type": "script",
        "script_path": "my_script.py",
        "include_memory": True,
        "include_gpu": False,
    },
)
```
### Example: Analyze Results

```python
# Get analysis and recommendations
analysis = await client.call_tool(
    "analyze",
    arguments={"profile_id": result["profile_id"]},
)
```
## Testing

The project maintains 100% test coverage with comprehensive test suites:

```bash
# Run all tests
uv run pytest

# Run with a coverage report
uv run pytest --cov=src --cov-report=html

# Run a specific test file
uv run pytest tests/test_server.py

# Run with verbose output
uv run pytest -v
```
Test fixtures include:

- Sample profiling scripts (fibonacci, memory-intensive, leaky; see the sketch below)
- Realistic Scalene JSON outputs
- Edge cases and error conditions
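For illustration, a "leaky" fixture script could be as simple as this (a sketch only; the actual fixtures live in `tests/fixtures/scripts/`):

```python
"""Illustrative 'leaky' fixture: grows a module-level list so that
Scalene's leak detection has something to flag. Not the actual fixture."""

leaked: list[bytes] = []


def leak(iterations: int = 1000) -> None:
    for _ in range(iterations):
        # Allocations retained forever: memory grows monotonically.
        leaked.append(b"x" * 100_000)


if __name__ == "__main__":
    leak()
```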
## Code Quality

This project follows strict code quality standards:

- **Type Safety**: 100% mypy strict mode compliance
- **Linting**: ruff with a comprehensive rule set
- **Testing**: 100% coverage requirement
- **Style**: Sleek, modern documentation; minimal, functional emoji usage
- **Patterns**: FastMCP best practices throughout
## Development Phases

Current Status: Phase 1.1 - Project Setup ✓
## Documentation

Editor Setup Guides:

- GitHub Copilot Setup - Using Copilot Chat with VSCode
- Claude Code Setup - Using the Claude Code VSCode extension
- Cursor Setup - Using the Cursor IDE
- General VSCode Setup - General VSCode configuration

API & Usage:

- Tools Reference - Complete API documentation (7 tools)
- Quick Start - 3-step setup and basic workflows
- Examples - Real-world profiling examples
## Development Roadmap
- Phase 1: Project Setup & Infrastructure ✓
- Phase 2: Core Data Models (In Progress)
- Phase 3: Profiler Integration
- Phase 4: Analysis & Insights
- Phase 5: Comparison Features
- Phase 6: Resources Implementation
- Phase 7: Prompts & Workflows
- Phase 8: Testing & Quality
- Phase 9: Documentation
- Phase 10: Polish & Release
See development-plan.md for detailed roadmap.
## Contributing

Contributions are welcome! Please ensure:

- All tests pass (`just test`)
- Linting passes (`just lint`)
- Type checking passes (`just typecheck`)
- Code coverage remains at 100%