
Scalene-MCP

A FastMCP v2 server providing LLMs with structured access to Scalene's comprehensive CPU, GPU, and memory profiling capabilities for Python packages and C/C++ bindings.

Installation

Prerequisites

  • Python 3.10+
  • uv (recommended) or pip

From Source

git clone https://github.com/plasma-umass/scalene-mcp.git
cd scalene-mcp
uv venv
uv sync

As a Package

pip install scalene-mcp

Quick Start: Running the Server

Development Mode

# Using uv
uv run -m scalene_mcp.server

# Using pip
python -m scalene_mcp.server

Production Mode

python -m scalene_mcp.server

🎯 Native Integration with LLM Agents

Works seamlessly with:

  • GitHub Copilot - Direct integration
  • Claude Code - CLI and VS Code extension
  • Cursor - All-in-one IDE
  • Any MCP-compatible LLM client

Zero-Friction Setup (3 Steps)

  1. Install

    pip install scalene-mcp
    
  2. Configure - Choose one method:

    Automated (Recommended):

    python scripts/setup_vscode.py
    

    Interactive setup script auto-finds your editor and configures it.

    Manual - GitHub Copilot:

    // .vscode/settings.json
    {
      "github.copilot.chat.mcp.servers": {
        "scalene": {
          "command": "uv",
          "args": ["run", "-m", "scalene_mcp.server"]
        }
      }
    }
    

    Manual - Claude Code / Cursor: See editor-specific setup guides

  3. Restart VSCode/Cursor and start profiling!

Start Profiling Immediately

Open any Python project and ask your LLM:

"Profile main.py and show me the bottlenecks"

The LLM automatically:

  • 🔍 Detects your project structure
  • 📄 Finds and profiles your code
  • 📊 Analyzes CPU, memory, GPU usage
  • 💡 Suggests optimizations

No hunting for paths. No manual configuration. Zero friction.


📚 Full docs: SETUP_VSCODE.md | QUICKSTART.md | TOOLS_REFERENCE.md

Available Serving Methods (FastMCP)

Scalene-MCP can be served in multiple ways using FastMCP's built-in serving capabilities:

1. Standard Server (Default)

# Starts an MCP-compatible server on stdio
python -m scalene_mcp.server

2. With Claude Desktop

Configure in your claude_desktop_config.json:

{
  "mcpServers": {
    "scalene": {
      "command": "python",
      "args": ["-m", "scalene_mcp.server"]
    }
  }
}

Then restart Claude Desktop.

3. With HTTP/SSE Endpoint

# Scalene-MCP serves on stdio by default; see the FastMCP
# documentation for enabling HTTP or SSE transports.

4. With Environment Variables

# Configure via environment
export SCALENE_PYTHON_EXECUTABLE=python3.11
export SCALENE_TIMEOUT=30
python -m scalene_mcp.server

5. Programmatically

# Sketch: assumes a create_scalene_server() factory exposed by scalene_mcp.server
from scalene_mcp.server import create_scalene_server

server = create_scalene_server()
server.run()  # stdio transport by default

Programmatic Usage

Use Scalene-MCP directly in your Python code:

from scalene_mcp.profiler import ScaleneProfiler
import asyncio

async def main():
    profiler = ScaleneProfiler()
    
    # Profile a script
    result = await profiler.profile(
        type="script",
        script_path="fibonacci.py",
        include_memory=True,
        include_gpu=False
    )
    
    print(f"Profile ID: {result['profile_id']}")
    print(f"Peak memory: {result['summary'].get('total_memory_mb', 'N/A')}MB")
    
asyncio.run(main())

Overview

Scalene-MCP transforms Scalene's powerful profiling output into an LLM-friendly format through a clean, minimal set of well-designed tools. Get detailed performance insights without images or excessive context overhead.

What Scalene-MCP Does

  • Profile Python scripts with full Scalene feature set
  • Analyze profiles for hotspots, bottlenecks, memory leaks
  • Compare profiles to detect regressions
  • Pass arguments to profiled scripts
  • Structured output in JSON format for LLMs
  • Async execution for non-blocking profiling

What Scalene-MCP Doesn't Do

  • In-process profiling (Scalene.start()/stop()) - uses subprocess instead for isolation
  • Process attachment (--pid based profiling) - profiles scripts, not running processes
  • Single-function profiling - designed for complete script analysis

Note: The subprocess-based approach was chosen for reliability and simplicity. LLM workflows typically profile complete scripts, which is a perfect fit. See SCALENE_MODES_ANALYSIS.md for detailed scope analysis.

Key Features

  • Complete CPU profiling: Line-by-line Python/C time, system time, CPU utilization
  • Memory profiling: Peak/average memory per line, leak detection with velocity metrics
  • GPU profiling: NVIDIA and Apple GPU support with per-line attribution
  • Advanced analysis: Stack traces, bottleneck identification, performance recommendations
  • Profile comparison: Track performance changes across runs
  • LLM-optimized: Structured JSON output, summaries before details, context-aware formatting

Available Tools (7 Consolidated Tools)

Scalene-MCP provides a clean, LLM-optimized set of 7 tools:

Discovery (3 tools)

  • get_project_root() - Auto-detect project structure
  • list_project_files(pattern, max_depth) - Find files by glob pattern
  • set_project_context(project_root) - Override auto-detection

Profiling (1 unified tool)

  • profile(type, script_path/code, ...) - Profile scripts or code snippets
    • type="script" for script profiling
    • type="code" for code snippet profiling

Analysis (1 mega tool)

  • analyze(profile_id, metric_type, ...) - 9 analysis modes in one tool:
    • metric_type="all" - Comprehensive analysis
    • metric_type="cpu" - CPU hotspots
    • metric_type="memory" - Memory hotspots
    • metric_type="gpu" - GPU hotspots
    • metric_type="bottlenecks" - Performance bottlenecks
    • metric_type="leaks" - Memory leak detection
    • metric_type="file" - File-level metrics
    • metric_type="functions" - Function-level metrics
    • metric_type="recommendations" - Optimization suggestions
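
One way to picture how a single analyze tool can serve many modes is a dispatch table keyed on metric_type. The sketch below is purely illustrative (the handler names and the profile dictionary shape are invented, not the server's actual internals):

```python
# Illustrative sketch of a metric_type dispatch table, as a unified
# analysis tool might route requests internally. All names hypothetical.
from typing import Any, Callable

Profile = dict[str, Any]

def _cpu_hotspots(p: Profile) -> list[str]:
    # Report lines above the CPU threshold, busiest first.
    hot = [l for l in p["lines"] if l["cpu_percent"] >= p["cpu_threshold"]]
    hot.sort(key=lambda l: l["cpu_percent"], reverse=True)
    return [f"{l['file']}:{l['lineno']} ({l['cpu_percent']:.1f}% CPU)" for l in hot]

def _memory_hotspots(p: Profile) -> list[str]:
    top = sorted(p["lines"], key=lambda l: l["peak_mb"], reverse=True)[:3]
    return [f"{l['file']}:{l['lineno']} ({l['peak_mb']:.1f} MB peak)" for l in top]

HANDLERS: dict[str, Callable[[Profile], list[str]]] = {
    "cpu": _cpu_hotspots,
    "memory": _memory_hotspots,
    # ...one entry per metric_type
}

def analyze(profile: Profile, metric_type: str = "all") -> dict[str, list[str]]:
    """Run one analysis mode, or every registered mode for "all"."""
    if metric_type == "all":
        return {name: fn(profile) for name, fn in HANDLERS.items()}
    return {metric_type: HANDLERS[metric_type](profile)}
```

Consolidating modes behind one tool keeps the tool list small, which matters for LLM clients with limited tool-schema context.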

Comparison & Storage (2 tools)

  • compare_profiles(before_id, after_id) - Compare two profiles
  • list_profiles() - View all captured profiles
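
Conceptually, a profile comparison reduces to diffing two summaries and flagging large relative changes. This toy sketch (field names and threshold invented) shows the kind of regression signal compare_profiles can surface:

```python
def diff_summaries(before: dict, after: dict, threshold: float = 0.10) -> dict:
    """Flag metrics that changed by more than `threshold` (10% by default)."""
    flagged = {}
    for key, b in before.items():
        a = after.get(key)
        if a is None or b == 0:
            continue  # metric missing or no baseline to compare against
        change = (a - b) / b
        if abs(change) > threshold:
            flagged[key] = f"{change:+.0%}"
    return flagged

# Example: runtime regressed 30%, memory stayed within noise
before = {"elapsed_s": 2.0, "peak_memory_mb": 100.0}
after = {"elapsed_s": 2.6, "peak_memory_mb": 101.0}
```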

Full reference: See TOOLS_REFERENCE.md

Configuration

Profiling Options

The unified profile() tool supports these options:

| Option | Type | Default | Description |
|---|---|---|---|
| type | str | required | "script" or "code" |
| script_path | str | None | Required if type="script" |
| code | str | None | Required if type="code" |
| include_memory | bool | true | Profile memory |
| include_gpu | bool | false | Profile GPU usage |
| cpu_only | bool | false | Skip memory/GPU profiling |
| reduced_profile | bool | false | Only report high-activity lines |
| cpu_percent_threshold | float | 1.0 | Minimum CPU% to report |
| malloc_threshold | int | 100 | Minimum allocation size (bytes) |
| profile_only | str | "" | Profile only paths containing this |
| profile_exclude | str | "" | Exclude paths containing this |
| use_virtual_time | bool | false | Use virtual time instead of wall time |
| script_args | list | [] | Command-line arguments for the script |

Environment Variables

  • SCALENE_CPU_PERCENT_THRESHOLD: Override default CPU threshold
  • SCALENE_MALLOC_THRESHOLD: Override default malloc threshold
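
A typical way such overrides are read, falling back to the documented defaults (sketch; the server's exact logic may differ):

```python
import os

def threshold_defaults() -> tuple[float, int]:
    """Read threshold overrides from the environment, else use the defaults."""
    cpu = float(os.environ.get("SCALENE_CPU_PERCENT_THRESHOLD", "1.0"))
    malloc = int(os.environ.get("SCALENE_MALLOC_THRESHOLD", "100"))
    return cpu, malloc
```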

Architecture

Components

  • ScaleneProfiler: Async wrapper around Scalene CLI
  • ProfileParser: Converts Scalene JSON to structured models
  • ProfileAnalyzer: Extracts insights and hotspots
  • ProfileComparator: Compares profiles for regressions
  • FastMCP Server: Exposes tools via MCP protocol

Data Flow

Python Script
    ↓
ScaleneProfiler (subprocess)
    ↓
Scalene CLI (--json)
    ↓
Temp JSON File
    ↓
ProfileParser
    ↓
Pydantic Models (ProfileResult)
    ↓
Analyzer / Comparator
    ↓
MCP Tools
    ↓
LLM Client
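
The same flow can be sketched as plain functions over simplified stand-ins for the real components (all types and field names below are invented for illustration; the real models are Pydantic classes in models.py):

```python
import json
from dataclasses import dataclass

# Toy stand-ins for the real ProfileResult models (illustrative only).
@dataclass
class LineMetric:
    lineno: int
    cpu_percent: float

@dataclass
class ProfileResult:
    filename: str
    lines: list[LineMetric]

def parse(raw_json: str) -> ProfileResult:
    """ProfileParser step: Scalene JSON -> structured model."""
    data = json.loads(raw_json)
    lines = [LineMetric(l["lineno"], l["cpu_percent"]) for l in data["lines"]]
    return ProfileResult(data["filename"], lines)

def hottest(profile: ProfileResult) -> LineMetric:
    """Analyzer step: pick the busiest line."""
    return max(profile.lines, key=lambda l: l.cpu_percent)

# End-to-end: subprocess JSON -> model -> insight for the MCP tool layer
raw = '{"filename": "fib.py", "lines": [{"lineno": 2, "cpu_percent": 12.5}, {"lineno": 5, "cpu_percent": 80.0}]}'
top = hottest(parse(raw))  # LineMetric(lineno=5, cpu_percent=80.0)
```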

Troubleshooting

GPU Permission Error

If you see PermissionError when profiling with GPU:

# Disable GPU profiling in test environments
result = await profiler.profile(
    type="script",
    script_path="script.py",
    include_gpu=False
)

Profile Not Found

Profiles are stored in memory during the server session. For persistence, implement the storage interface.
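
If profiles need to survive a server restart, a minimal file-backed store could look like this (sketch only; the actual interface expected by storage.py may differ):

```python
import json
from pathlib import Path

class FileProfileStore:
    """Persist each profile payload as one JSON file per profile_id (illustrative)."""

    def __init__(self, root: str = ".scalene_profiles"):
        self.root = Path(root)
        self.root.mkdir(exist_ok=True)

    def save(self, profile_id: str, payload: dict) -> None:
        (self.root / f"{profile_id}.json").write_text(json.dumps(payload))

    def load(self, profile_id: str) -> dict:
        path = self.root / f"{profile_id}.json"
        if not path.exists():
            raise KeyError(f"profile not found: {profile_id}")
        return json.loads(path.read_text())

    def list_ids(self) -> list[str]:
        return sorted(p.stem for p in self.root.glob("*.json"))
```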

Timeout Issues

Adjust the timeout parameter (if using profiler directly):

result = await profiler.profile(
    type="script",
    script_path="slow_script.py",
    timeout=120,  # seconds
)

Development

Running Tests

# All tests with coverage
uv run pytest -v --cov=src/scalene_mcp

# Specific test file
uv run pytest tests/test_profiler.py -v

# With coverage report
uv run pytest --cov=src/scalene_mcp --cov-report=html

Code Quality

# Type checking
uv run mypy src/

# Linting
uv run ruff check src/

# Formatting
uv run ruff format src/

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass and coverage ≥ 85%
  5. Submit a pull request

License

MIT License - see LICENSE file for details.

Citation

If you use Scalene-MCP in research, please cite both this project and Scalene:

@software{scalene_mcp,
  title={Scalene-MCP: LLM-Friendly Profiling Server},
  year={2026}
}

@inproceedings{berger2020scalene,
  title={Scalene: Scripting-Language Aware Profiling for Python},
  author={Berger, Emery},
  year={2020}
}

Support

  • Issues: GitHub Issues for bug reports and feature requests
  • Discussions: GitHub Discussions for questions and ideas
  • Documentation: See docs/ directory

Made with ❤️ for the Python performance community.

Manual Installation

pip install -e .

Development

Prerequisites

  • Python 3.10+
  • uv (recommended) or pip

Setup

# Install dependencies
uv sync

# Run tests
just test

# Run tests with coverage
just test-cov

# Lint and format
just lint
just format

# Type check
just typecheck

# Full build (sync + lint + typecheck + test)
just build

Project Structure

scalene-mcp/
├── src/scalene_mcp/     # Main package
│   ├── server.py        # FastMCP server with tools/resources/prompts
│   ├── models.py        # Pydantic data models
│   ├── profiler.py      # Scalene execution wrapper
│   ├── parser.py        # JSON output parser
│   ├── analyzer.py      # Analysis engine
│   ├── comparator.py    # Profile comparison
│   ├── recommender.py   # Optimization recommendations
│   ├── storage.py       # Profile persistence
│   └── utils.py         # Shared utilities
├── tests/               # Test suite (100% coverage goal)
│   ├── fixtures/        # Test data
│   │   ├── profiles/    # Sample profile outputs
│   │   └── scripts/     # Test Python scripts
│   └── conftest.py      # Shared test fixtures
├── examples/            # Usage examples
├── docs/                # Documentation
├── pyproject.toml       # Project configuration
├── justfile             # Task runner commands
└── README.md            # This file

Usage

Running the Server

# Development mode with auto-reload
fastmcp dev src/scalene_mcp/server.py

# Production mode
fastmcp run src/scalene_mcp/server.py

# Install to MCP config
fastmcp install src/scalene_mcp/server.py

Example: Profile a Script

# Through MCP client
result = await client.call_tool(
    "profile",
    arguments={
        "type": "script",
        "script_path": "my_script.py",
        "include_memory": True,
        "include_gpu": False,
    }
)

Example: Analyze Results

# Get analysis and recommendations
analysis = await client.call_tool(
    "analyze",
    arguments={"profile_id": result["profile_id"], "metric_type": "all"}
)

Testing

The project maintains 100% test coverage with comprehensive test suites:

# Run all tests
uv run pytest

# Run with coverage report
uv run pytest --cov=src --cov-report=html

# Run specific test file
uv run pytest tests/test_server.py

# Run with verbose output
uv run pytest -v

Test fixtures include:

  • Sample profiling scripts (fibonacci, memory-intensive, leaky)
  • Realistic Scalene JSON outputs
  • Edge cases and error conditions

Code Quality

This project follows strict code quality standards:

  • Type Safety: 100% mypy strict mode compliance
  • Linting: ruff with comprehensive rules
  • Testing: 100% coverage requirement
  • Style: Clean, modern documentation with minimal, functional emoji use
  • Patterns: FastMCP best practices throughout

Development Phases

Current Status: Phase 1.1 - Project Setup


Development Roadmap

  1. Phase 1: Project Setup & Infrastructure ✓
  2. Phase 2: Core Data Models (In Progress)
  3. Phase 3: Profiler Integration
  4. Phase 4: Analysis & Insights
  5. Phase 5: Comparison Features
  6. Phase 6: Resources Implementation
  7. Phase 7: Prompts & Workflows
  8. Phase 8: Testing & Quality
  9. Phase 9: Documentation
  10. Phase 10: Polish & Release

See development-plan.md for detailed roadmap.

Contributing

Contributions are welcome! Please ensure:

  • All tests pass (just test)
  • Linting passes (just lint)
  • Type checking passes (just typecheck)
  • Code coverage remains at 100%

