AI Image MCP Server
A Model Context Protocol (MCP) server that provides AI-powered image analysis and image generation using OpenAI's Vision API and image generation models.
System Requirements
Tested on:
- macOS 14.3.0 (Darwin 23.3.0, ARM64)
- Python 3.13.0
- uv 0.7.13
- OpenAI API access
Features
🔍 Image Analysis & Description
- Smart Image Analysis: Analyze images using OpenAI's GPT-4o vision model
- Targeted Analysis: Analyze specific aspects (objects, text, colors, composition, emotions)
- Image Comparisons: Compare two images and highlight similarities/differences
- Metadata Extraction: Get technical information about image files
- Intelligent Caching: Cache analysis results to avoid repeated API calls
- Multiple Formats: Support for PNG, JPEG, GIF, and WebP formats
🎨 Image Generation & Editing
- Text-to-Image Generation: Create images from text prompts using DALL-E 2, DALL-E 3, or GPT-Image-1
- Image Editing: Edit existing images with text prompts using GPT-Image-1 or DALL-E 2
- Image Variations: Create variations of existing images using DALL-E 2
- Flexible Output: Save generated images locally with custom naming and directories
- Model Support: Works with DALL-E 2, DALL-E 3, and GPT-Image-1, including each model's specific options
MCP Tools
- describe_image(image_path, prompt): Get detailed image descriptions
- analyze_image_content(image_path, analysis_type): Analyze specific aspects
- compare_images(image1_path, image2_path, comparison_focus): Compare two images
- get_image_metadata(image_path): Extract technical metadata
- get_cache_info(): View cache statistics
- clear_image_cache(): Clear cached results
Installation
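- Install uv, then add the dependencies (the quotes keep shells such as zsh from treating the square brackets as a glob pattern):
curl -LsSf https://astral.sh/uv/install.sh | sh
uv add "mcp[cli]" openai pillow requests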
- Set your OpenAI API key:
export OPENAI_API_KEY="your-api-key-here"
- Run the server:
uv run main.py
Running the Server
uv run main.py
MCP Integration
Claude Desktop
{
"mcpServers": {
"ai-image-mcp": {
"command": "uv",
"args": [
"--directory",
"/absolute/path/to/ai-image-mcp",
"run",
"main.py"
],
"env": {
"OPENAI_API_KEY": "your-api-key-here"
}
}
}
}
Cursor
Configure MCP in Cursor settings:
{
"servers": {
"ai-image-mcp": {
"command": "uv",
"args": ["run", "main.py"],
"cwd": "/absolute/path/to/ai-image-mcp",
"env": {
"OPENAI_API_KEY": "your-api-key-here"
}
}
}
}
Analysis Types
- general: Overall image description
- objects: Object detection and identification
- text: Text extraction and OCR
- colors: Color analysis and palette
- composition: Visual composition and layout
- emotions: Emotional content and mood
Project Structure
ai-image-mcp/
├── test_data/ # Sample images (gitignored)
├── tools/ # MCP tool definitions
├── utils/ # Utilities (caching, OpenAI client)
├── main.py # Server entry point
└── server.py # MCP server instance
Caching
- Automatic file change detection via SHA-256 hashes
- 30-day cache expiration
- Separate cache entries for different prompts/analysis types
- Cache hits return without calling the API (reported as 1000x+ faster than a fresh API call)
Available Tools
Image Analysis Tools
describe_image
Analyze an image and provide a detailed description.
- Parameters:
  - image_path (str): Path to the image file
  - prompt (str, optional): Custom analysis prompt
- Supports: PNG, JPEG, GIF, WebP
- Features: Caching, file validation, comprehensive error handling
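To illustrate the file validation step, here is a minimal sketch of how an image could be checked and encoded as a base64 data URL before being sent to a vision model. The helper name `image_to_data_url` and the extension table are assumptions made for illustration, not the server's actual code.

```python
import base64
from pathlib import Path

# Extensions for the four formats listed as supported (PNG, JPEG, GIF, WebP).
SUPPORTED_MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_to_data_url(image_path: str) -> str:
    """Validate the file and return it as a base64 data URL (hypothetical helper)."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {image_path}")
    # The format is decided by extension only; the bytes are not inspected.
    mime_type = SUPPORTED_MIME_TYPES.get(path.suffix.lower())
    if mime_type is None:
        raise ValueError(f"Unsupported image format: {path.suffix or '(none)'}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Rejecting unsupported extensions before any network call keeps bad inputs from costing an API request.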
analyze_image_content
Perform targeted analysis of specific image aspects.
- Parameters:
  - image_path (str): Path to the image file
  - analysis_type (str): Type of analysis: "general", "objects", "text", "colors", "composition", or "emotions"
- Features: Specialized prompts for different analysis types
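One way to picture the specialized prompts is a lookup table keyed by analysis type. The sketch below is an assumption: the prompt wording and the `build_analysis_prompt` name are hypothetical, and only the six type names come from this README.

```python
# Hypothetical prompt text; only the six keys are taken from the README.
ANALYSIS_PROMPTS = {
    "general": "Describe this image in detail.",
    "objects": "List and identify the objects visible in this image.",
    "text": "Extract any text that appears in this image.",
    "colors": "Describe the dominant colors and overall palette of this image.",
    "composition": "Describe the visual composition and layout of this image.",
    "emotions": "Describe the emotional content and mood of this image.",
}


def build_analysis_prompt(analysis_type: str = "general") -> str:
    """Return the prompt for an analysis type, rejecting unknown types."""
    try:
        return ANALYSIS_PROMPTS[analysis_type]
    except KeyError:
        valid = ", ".join(sorted(ANALYSIS_PROMPTS))
        raise ValueError(
            f"Unknown analysis_type {analysis_type!r}; expected one of: {valid}"
        ) from None
```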
compare_images
Compare two images and highlight similarities and differences.
- Parameters:
  - image1_path (str): Path to first image
  - image2_path (str): Path to second image
  - comparison_focus (str): What to focus on in comparison
get_image_metadata
Get technical metadata about an image file.
- Returns: File size, dimensions, format, color mode, aspect ratio, etc.
Image Generation Tools
generate_image
Generate images from text prompts using OpenAI's image generation models.
- Parameters:
  - prompt (str): Text description of desired image
  - model (str): "dall-e-2", "dall-e-3", or "gpt-image-1" (default: dall-e-3)
  - size (str, optional): Image dimensions (varies by model)
  - quality (str, optional): Quality setting (varies by model)
  - style (str, optional): "vivid" or "natural" (DALL-E 3 only)
  - n (int, optional): Number of images (1-10; DALL-E 3 supports only 1)
  - output_dir (str): Directory to save images (default: "./generated_images")
  - filename_prefix (str): Prefix for filenames (default: "generated")
Model-Specific Features:
- DALL-E 2: Basic generation, sizes: 256x256, 512x512, 1024x1024
- DALL-E 3: High quality, styles (vivid/natural), sizes: 1024x1024, 1792x1024, 1024x1792
- GPT-Image-1: Advanced features, transparency support, compression control
edit_image
Edit existing images using text prompts.
- Parameters:
  - image_path (str): Path to image to edit
  - prompt (str): Description of desired edit
  - mask_path (str, optional): Path to mask image (PNG with transparent edit areas)
  - model (str): "gpt-image-1" or "dall-e-2" (default: gpt-image-1)
  - size, quality, n: Model-specific options
  - output_dir, filename_prefix: Output configuration
Supported Models: GPT-Image-1 (up to 16 images, 50MB each) and DALL-E 2 (1 square PNG, 4MB max)
create_image_variations
Create variations of existing images using DALL-E 2.
- Parameters:
  - image_path (str): Path to source image (must be square PNG, <4MB)
  - n (int): Number of variations (1-10, default: 2)
  - size (str): Variation size: "256x256", "512x512", or "1024x1024"
  - output_dir, filename_prefix: Output configuration
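The source image constraints (square PNG under 4MB) can be checked locally before calling the API. The following is a rough sketch that reads the PNG header with the standard library only; `check_variation_source` is a hypothetical name, and the server itself may well use Pillow for this instead.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_SOURCE_BYTES = 4 * 1024 * 1024  # the 4MB limit stated above


def check_variation_source(image_path: str) -> tuple[int, int]:
    """Return (width, height) if the file is a square PNG under 4MB."""
    path = Path(image_path)
    if path.stat().st_size >= MAX_SOURCE_BYTES:
        raise ValueError("Source image must be smaller than 4MB")
    with path.open("rb") as handle:
        header = handle.read(24)
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type, then width and height as big-endian uint32.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("Source image must be a PNG file")
    width, height = struct.unpack(">II", header[16:24])
    if width != height:
        raise ValueError(f"Source image must be square, got {width}x{height}")
    return width, height
```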
list_generated_images
List all generated images in a directory with metadata.
- Parameters:
  - directory (str): Directory to scan (default: "./generated_images")
- Returns: File listing with sizes, dimensions, modification dates
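As a rough illustration of the listing, the sketch below scans a directory and reports each image's size and modification time. It is an assumption about how such a tool could work, not the server's implementation; pixel dimensions are left out so the example stays standard-library only, and `list_images` is a hypothetical name.

```python
from datetime import datetime
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def list_images(directory: str = "./generated_images") -> list[dict]:
    """Return name, size, and modification time for each image in a directory."""
    root = Path(directory)
    if not root.is_dir():
        return []
    entries = []
    for path in sorted(root.iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            stat = path.stat()
            entries.append({
                "name": path.name,
                "size_bytes": stat.st_size,
                "modified": datetime.fromtimestamp(stat.st_mtime).isoformat(timespec="seconds"),
            })
    return entries
```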
Cache Management Tools
get_cache_info
Get information about the analysis cache (file count, size, location).
clear_image_cache
Clear all cached analysis results.
Model Comparison
| Feature | DALL-E 2 | DALL-E 3 | GPT-Image-1 |
|---|---|---|---|
| Generation | ✅ Basic | ✅ High Quality | ✅ Advanced |
| Editing | ✅ Limited | ❌ | ✅ Advanced |
| Variations | ✅ | ❌ | ❌ |
| Max Images | 10 | 1 | 10 |
| Sizes | 256x256, 512x512, 1024x1024 | 1024x1024, 1792x1024, 1024x1792 | 1024x1024, 1536x1024, 1024x1536 |
| Styles | ❌ | vivid, natural | ❌ |
| Quality | standard | standard, hd | auto, high, medium, low |
| Transparency | ❌ | ❌ | ✅ |
| Max Prompt | 1000 chars | 4000 chars | 32000 chars |
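The table above maps naturally onto a lookup structure for parameter validation. This is a minimal sketch that encodes the sizes, quality values, image counts, and prompt limits from the table; the `MODEL_LIMITS` layout and the `validate_generation_request` function are hypothetical and not taken from the server's code.

```python
# Limits copied from the Model Comparison table above.
MODEL_LIMITS = {
    "dall-e-2": {
        "sizes": {"256x256", "512x512", "1024x1024"},
        "qualities": {"standard"},
        "max_images": 10,
        "max_prompt_chars": 1000,
    },
    "dall-e-3": {
        "sizes": {"1024x1024", "1792x1024", "1024x1792"},
        "qualities": {"standard", "hd"},
        "max_images": 1,
        "max_prompt_chars": 4000,
    },
    "gpt-image-1": {
        "sizes": {"1024x1024", "1536x1024", "1024x1536"},
        "qualities": {"auto", "high", "medium", "low"},
        "max_images": 10,
        "max_prompt_chars": 32000,
    },
}


def validate_generation_request(model, prompt, size=None, quality=None, n=1):
    """Return a list of problems; an empty list means the request looks valid."""
    limits = MODEL_LIMITS.get(model)
    if limits is None:
        return [f"unknown model: {model}"]
    problems = []
    if len(prompt) > limits["max_prompt_chars"]:
        problems.append(f"prompt exceeds {limits['max_prompt_chars']} characters")
    if size is not None and size not in limits["sizes"]:
        problems.append(f"size {size} is not supported by {model}")
    if quality is not None and quality not in limits["qualities"]:
        problems.append(f"quality {quality} is not supported by {model}")
    if not 1 <= n <= limits["max_images"]:
        problems.append(f"n must be between 1 and {limits['max_images']} for {model}")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller report all invalid parameters in a single response.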
Usage Examples
Generate a Basic Image
# Generate an image with DALL-E 3
generate_image(
prompt="A serene mountain landscape at sunset with a crystal clear lake",
model="dall-e-3",
size="1792x1024",
quality="hd",
style="natural"
)
Edit an Existing Image
# Add elements to an image
edit_image(
image_path="./photos/room.png",
prompt="Add a beautiful bookshelf filled with colorful books to the left wall",
model="gpt-image-1",
quality="high"
)
Create Image Variations
# Create variations of a logo
create_image_variations(
image_path="./logos/logo.png",
n=5,
size="1024x1024"
)
Analyze Generated Images
# Analyze a generated image
describe_image(
image_path="./generated_images/generated_1234567890_1.png",
prompt="Describe the artistic style and composition of this generated image"
)
File Organization
Generated images are automatically organized in separate directories:
- ./generated_images/: Text-to-image generations
- ./edited_images/: Image edits
- ./image_variations/: Image variations
Files are named with timestamps to avoid conflicts:
- generated_1234567890_1.png
- edited_1234567890_1.png
- variation_1234567890_1.png
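The naming pattern shown here appears to be prefix, Unix timestamp, and a 1-based index. A small sketch of how such paths could be built follows; the function name `build_output_paths`, the fixed `.png` extension, and the optional `timestamp` argument are assumptions for illustration.

```python
import time
from pathlib import Path


def build_output_paths(output_dir, filename_prefix, count, timestamp=None):
    """Build paths such as generated_1234567890_1.png inside output_dir."""
    # One timestamp is shared by every image in the batch; the index tells them apart.
    stamp = int(time.time()) if timestamp is None else int(timestamp)
    directory = Path(output_dir)
    directory.mkdir(parents=True, exist_ok=True)
    return [
        directory / f"{filename_prefix}_{stamp}_{index}.png"
        for index in range(1, count + 1)
    ]
```

Because the timestamp has one-second resolution, two batches written with the same prefix within the same second would produce the same names under this scheme.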
Error Handling
The server includes comprehensive error handling for:
- Invalid image formats and file paths
- Model-specific parameter validation
- File size and dimension limits
- API quota and rate limiting
- Network connectivity issues
- Malformed prompts and parameters
Cache System
The analysis tools use an intelligent caching system:
- File Change Detection: Uses SHA-256 hashes to detect file changes
- 30-Day Expiration: Automatically expires old cache entries
- Safe Operation: Cache failures don't affect main functionality
- Efficient Storage: Uses MD5 hashes for safe cache key generation
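The properties above can be sketched roughly as follows. This is an illustration under assumptions: the JSON-file-per-entry layout, the way the SHA-256 content hash and the prompt are combined into an MD5 key, and all function names are hypothetical rather than the server's actual code.

```python
import hashlib
import json
import time
from pathlib import Path

CACHE_TTL_SECONDS = 30 * 24 * 60 * 60  # 30-day expiration


def file_sha256(image_path):
    """Hash the file contents so that any change to the image changes the key."""
    digest = hashlib.sha256()
    with open(image_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_key(image_path, prompt):
    """Combine content hash and prompt into a short, filename-safe key."""
    raw = f"{file_sha256(image_path)}:{prompt}".encode("utf-8")
    return hashlib.md5(raw, usedforsecurity=False).hexdigest()


def load_cached(cache_dir, key, now=None):
    """Return the cached result, or None if missing, expired, or unreadable."""
    now = time.time() if now is None else now
    try:
        entry = json.loads((Path(cache_dir) / f"{key}.json").read_text(encoding="utf-8"))
        if now - entry["created"] > CACHE_TTL_SECONDS:
            return None
        return entry["result"]
    except (OSError, ValueError, KeyError, TypeError):
        return None  # a broken cache is treated as a miss


def store_cached(cache_dir, key, result, now=None):
    """Write a cache entry; failures are ignored so analysis still succeeds."""
    now = time.time() if now is None else now
    try:
        directory = Path(cache_dir)
        directory.mkdir(parents=True, exist_ok=True)
        payload = json.dumps({"created": now, "result": result})
        (directory / f"{key}.json").write_text(payload, encoding="utf-8")
    except OSError:
        pass
```

Including the prompt in the key is what gives each prompt or analysis type its own entry, and swallowing I/O errors in both directions is one way to get the "cache failures don't affect main functionality" behavior.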
Requirements
- Python 3.13+
- OpenAI API key with access to Vision API and Image Generation
- Required packages:
  - mcp[cli]>=1.9.4
  - openai>=1.90.0
  - pillow>=11.2.1
  - requests>=2.32.4
License
This project is licensed under the MIT License - see the LICENSE file for details.