Updated: Jan 6, 2026 · Validated: Jan 9, 2026

Link Scan MCP Server 🚀

A comprehensive Model Context Protocol (MCP) server that scans links and provides summaries. It automatically detects and analyzes video links (YouTube, Instagram Reels) and text links (blogs, articles), producing concise summaries of three sentences or fewer. All features work without requiring API keys!

Python 3.11+ | MCP Compatible | License: MIT

✨ Features

🎥 Video Link Analysis

  • YouTube Support
    • Comprehensive metadata extraction (title, description)
    • Subtitle extraction for the first 7 seconds (yt-dlp)
    • Audio transcription using OpenAI Whisper
    • Integrated summarization combining all text sources
  • Instagram Reels Support
    • Audio download and transcription (first 7 seconds)
    • Automatic content summarization
  • Smart Link Detection
    • Automatic video/text link type detection
    • Error handling for unsupported URLs
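
The automatic link-type detection can be sketched as a hostname check. This is a hypothetical `detect_link_type` helper for illustration; the project's real logic lives in `src/utils/link_detector.py` and may differ:

```python
from urllib.parse import urlparse

# Illustrative sketch only; the actual detection logic is in
# src/utils/link_detector.py and may differ.
VIDEO_HOSTS = {
    "youtube.com": "youtube",
    "youtu.be": "youtube",
    "instagram.com": "instagram",
}

def detect_link_type(url: str) -> str:
    """Return 'youtube', 'instagram', or 'text' for a given URL."""
    host = urlparse(url).netloc.lower().removeprefix("www.")
    for domain, kind in VIDEO_HOSTS.items():
        if host == domain or host.endswith("." + domain):
            return kind
    return "text"  # anything non-video falls back to text handling
```

Unsupported or malformed URLs simply fall through to the text path, where the HTTP fetch itself surfaces the error.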

📝 Text Link Analysis

  • Web Content Extraction
    • BeautifulSoup-based HTML parsing
    • Main content area detection
    • Automatic navigation/ad removal
  • Intelligent Summarization
    • Llama3-powered text summarization
    • 3-sentence limit enforcement
    • Natural Korean output
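
A minimal sketch of the extraction step, assuming BeautifulSoup with the built-in `html.parser` backend (the function and tag choices here are illustrative, not the project's actual code in `src/tools/text_handler.py`):

```python
from bs4 import BeautifulSoup

def extract_main_text(html: str) -> str:
    """Illustrative sketch: strip navigation/ads, keep the main content."""
    soup = BeautifulSoup(html, "html.parser")
    # Drop elements that are usually noise rather than content.
    for tag in soup(["script", "style", "nav", "header", "footer", "aside"]):
        tag.decompose()
    # Prefer an explicit main/article region when the page provides one.
    main = soup.find("main") or soup.find("article") or soup.body or soup
    return " ".join(main.get_text(separator=" ").split())
```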

🤖 AI-Powered Summarization

  • Llama3 Integration
    • Local LLM via Ollama (no API keys required)
    • Separate prompts for video and text content
    • Fallback to original text on errors
  • Whisper Transcription
    • High-quality speech-to-text conversion
    • Optimized for speed and accuracy
    • Supports multiple languages
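
The "fallback to original text on errors" behavior can be sketched with a plain HTTP call to Ollama's `/api/generate` endpoint (the endpoint and response shape follow Ollama's HTTP API; the function itself is an illustrative assumption, not the project's code):

```python
import json
import urllib.request

def summarize(text: str, api_url: str = "http://localhost:11434",
              model: str = "llama3:latest", timeout: float = 300) -> str:
    """Ask a local Ollama model for a 3-sentence summary; on any
    failure, fall back to returning the original text unchanged."""
    payload = json.dumps({
        "model": model,
        "prompt": f"Summarize in at most 3 sentences:\n\n{text}",
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        f"{api_url}/api/generate", data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read())["response"]
    except Exception:
        return text  # fallback: original text on any error
```

If Ollama is unreachable or the model errors out, the caller still receives usable (unsummarized) text rather than an exception.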

🐳 Docker Support

  • One-Command Setup
    • Docker Compose configuration
    • Automatic Ollama service setup
    • Llama3 model auto-download
    • Development mode with hot reload

🔧 Developer-Friendly

  • Type-safe with Pydantic models
  • Async/await support for better performance
  • Comprehensive error handling
  • Extensible architecture
  • Hot reload in development mode

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/your-username/mcp-link-scan.git
cd mcp-link-scan

# Install dependencies
pip install -r requirements.txt

System Dependencies

ffmpeg (required for audio processing):

Ollama (required for summarization):
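
Typical install commands, shown for macOS (Homebrew) and Debian/Ubuntu as examples; see each project's own docs for other platforms:

```shell
# ffmpeg
brew install ffmpeg              # macOS (Homebrew)
sudo apt-get install -y ffmpeg   # Debian/Ubuntu

# Ollama (official Linux install script; see https://ollama.com for other platforms)
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3:latest        # fetch the model used for summarization
```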

Configuration

Create a .env file:

# Server settings
PORT=8000                    # Server port (default: 8000)
HOST=0.0.0.0                 # Server host (default: 0.0.0.0)
DEBUG=False                  # Debug mode (default: False)

# API path prefix (optional)
# Used when hosting multiple MCP servers on the same machine
# Default: /link-scan
API_PREFIX=/link-scan

# Ollama settings (optional)
# Set automatically when using Docker Compose
OLLAMA_API_URL=http://localhost:11434    # Ollama API URL (default: http://localhost:11434)
OLLAMA_MODEL=llama3:latest               # Ollama model to use (default: llama3)

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| PORT | 8000 | Port number the server listens on |
| HOST | 0.0.0.0 | Host address the server binds to |
| DEBUG | False | Enable debug mode (True/False) |
| API_PREFIX | /link-scan | Path prefix for API endpoints |
| OLLAMA_API_URL | http://localhost:11434 | Ollama API server URL |
| OLLAMA_MODEL | llama3 | Name of the Ollama model to use |
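
Loading these variables can be sketched with the stdlib, mirroring the documented defaults (the names match the table above; the loader itself is illustrative, not the project's actual settings code):

```python
import os

# Illustrative settings loader mirroring the documented defaults.
def load_settings(env=os.environ) -> dict:
    return {
        "port": int(env.get("PORT", "8000")),
        "host": env.get("HOST", "0.0.0.0"),
        "debug": env.get("DEBUG", "False").lower() == "true",
        "api_prefix": env.get("API_PREFIX", "/link-scan"),
        "ollama_api_url": env.get("OLLAMA_API_URL", "http://localhost:11434"),
        "ollama_model": env.get("OLLAMA_MODEL", "llama3:latest"),
    }
```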

Running as MCP Server

Local Mode (stdio):

python -m src.server

Remote Mode (HTTP):

python run_server.py

Or with uvicorn directly:

uvicorn src.server_http:app --host 0.0.0.0 --port 8000

Docker Setup (Recommended)

Using Docker Compose:

# Start all services (link-scan + Ollama)
docker-compose up -d

# Check logs
docker-compose logs -f

# Stop services
docker-compose down

Docker Compose automatically:

  • Sets up the Ollama service with an 8 GB memory limit
  • Downloads Llama3 model
  • Configures link-scan service
  • Enables development mode with hot reload

Development Mode: The docker-compose.yml is configured for development with:

  • Source code volume mounting
  • Hot reload enabled (DEBUG=True)
  • Automatic detection of code changes

Testing with MCP Inspector

You can test the server using the MCP Inspector tool:

# Test with Python
npx @modelcontextprotocol/inspector python run_server.py

# Or test stdio mode
npx @modelcontextprotocol/inspector python -m src.server

The MCP Inspector provides a web interface to:

  • View available tools and their schemas
  • Test tool execution with sample inputs
  • Debug server responses and error handling
  • Validate MCP protocol compliance

🛠️ Available Tools

1. scan_video_link

Scan and summarize video links (YouTube, Instagram Reels, etc.).

Parameters:

  • url (string, required): Video URL to scan

Example:

{
  "name": "scan_video_link",
  "arguments": {
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
  }
}

Process:

  1. Detects link type (YouTube, Instagram, etc.)
  2. For YouTube: Extracts title, description, subtitles (first 7s)
  3. Downloads audio (first 7 seconds)
  4. Transcribes audio with Whisper
  5. Combines all text sources
  6. Summarizes with Llama3 (3 sentences max)
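
Steps 5–6 boil down to concatenating every recovered text source into one summarization prompt; a sketch of the combining step (labels and field names are illustrative assumptions):

```python
def combine_sources(title: str = "", description: str = "",
                    subtitles: str = "", transcript: str = "") -> str:
    """Join all recovered text sources for the summarization prompt,
    skipping any source that came back empty."""
    parts = [
        ("Title", title),
        ("Description", description),
        ("Subtitles (first 7s)", subtitles),
        ("Transcript (first 7s)", transcript),
    ]
    return "\n".join(f"{label}: {text}" for label, text in parts if text.strip())
```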

2. scan_text_link

Scan and summarize text links (blogs, articles, etc.).

Parameters:

  • url (string, required): Text URL to scan

Example:

{
  "name": "scan_text_link",
  "arguments": {
    "url": "https://example.com/blog/article"
  }
}

Process:

  1. Fetches HTML content
  2. Extracts main text content
  3. Removes navigation, ads, and noise
  4. Summarizes with Llama3 (3 sentences max)
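
The 3-sentence cap in the final step can be enforced as a post-processing pass. This uses a naive regex sentence splitter, assumed for illustration; the project may instead rely on the prompt alone:

```python
import re

def limit_sentences(text: str, max_sentences: int = 3) -> str:
    """Keep at most `max_sentences` sentences, splitting on ., !, ?"""
    sentences = re.findall(r"[^.!?]+[.!?]", text)
    if not sentences:
        return text.strip()  # no sentence terminators found; pass through
    return " ".join(s.strip() for s in sentences[:max_sentences])
```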

📊 Example Outputs

Video Link Summary

Input: YouTube video URL

Output:

This video introduces the basic concepts of the Python programming language.
It explains core syntax such as variables, functions, and classes with hands-on examples.
It is structured step by step so that even beginners can follow along easily.

Text Link Summary

Input: Blog article URL

Output:

This article analyzes the pros and cons of Docker container technology.
Compared with virtualization, it highlights resource efficiency and ease of deployment as strengths.
It does, however, advise caution with regard to security and complexity.

🏗️ Architecture

mcp-link-scan/
├── src/
│   ├── server.py              # Local server (stdio)
│   ├── server_http.py         # Remote server (HTTP)
│   ├── tools/                  # MCP tools
│   │   ├── link_scanner.py     # Main tool definitions
│   │   ├── media_handler.py    # Video processing (Whisper)
│   │   └── text_handler.py    # Text extraction
│   ├── utils/                  # Utilities
│   │   ├── link_detector.py    # Link type detection
│   │   ├── youtube_extractor.py # YouTube metadata/subtitles
│   │   └── llm_summarizer.py   # Llama3 integration
│   └── prompts/                # LLM prompts
│       └── __init__.py         # Video/text prompt templates
├── docker/
│   └── init-ollama.sh          # Ollama initialization script
├── docker-compose.yml          # Docker services
├── Dockerfile                  # Container build config
├── requirements.txt            # Python dependencies
└── run_server.py               # Server entry point

🔧 Development

Setting up Development Environment

# Clone and install
git clone https://github.com/your-username/mcp-link-scan.git
cd mcp-link-scan
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your settings

# Start Ollama (if not using Docker)
ollama serve
ollama pull llama3:latest

Development Mode with Docker

# Start in development mode (hot reload enabled)
docker-compose up -d

# View logs
docker-compose logs -f link-scan

# Code changes are automatically reloaded

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src

# Run specific test file
pytest tests/test_link_scanner.py

Customizing Prompts

Edit src/prompts/__init__.py to customize LLM prompts:

# Video summarization prompt
VIDEO_SUMMARIZE_SYSTEM = """
Your custom system prompt here...
"""

# Text summarization prompt
TEXT_SUMMARIZE_SYSTEM = """
Your custom system prompt here...
"""

Configuring Whisper Model

Edit src/tools/media_handler.py:

# Change model size (tiny, base, small, medium, large)
_whisper_model = whisper.load_model("base")  # Default: "base"

📋 Requirements

  • Python 3.11+
  • ffmpeg - Audio processing
  • Ollama - LLM runtime (for summarization)
  • yt-dlp - Video/audio download
  • openai-whisper - Speech-to-text
  • torch - PyTorch (for Whisper)
  • aiohttp - Async HTTP client
  • beautifulsoup4 - HTML parsing
  • fastapi - HTTP server framework
  • uvicorn - ASGI server
  • mcp - Model Context Protocol SDK

🌐 Deployment

PlayMCP Registration

  1. Deploy Server: Deploy to cloud hosting (Render, Railway, Fly.io, AWS, GCP, etc.)
  2. Get Server URL: Example: https://your-server.railway.app
  3. Register in PlayMCP: Use URL https://your-server.railway.app/messages

Important: Server URL must be publicly accessible and support HTTPS for production use.

Using with MCP Clients

Amazon Q CLI:

{
  "mcpServers": {
    "link-scan": {
      "command": "python",
      "args": ["run_server.py"],
      "cwd": "/path/to/mcp-link-scan"
    }
  }
}

Other MCP Clients:

{
  "mcpServers": {
    "link-scan": {
      "url": "https://your-server.com/messages"
    }
  }
}

🤝 Contributing

We welcome contributions! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass (pytest)
  6. Commit your changes (git commit -m 'Add amazing feature')
  7. Push to the branch (git push origin feature/amazing-feature)
  8. Open a Pull Request

Development Workflow

# Install in development mode
pip install -e .

# Run tests
pytest

# Format code (if using formatters)
black src/ tests/
isort src/ tests/

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • yt-dlp team for the excellent YouTube extraction library
  • OpenAI Whisper team for the speech-to-text model
  • Ollama team for the local LLM runtime
  • MCP team for the Model Context Protocol specification
  • Pydantic team for the data validation library

📞 Support

🗺️ Roadmap

  • Batch processing for multiple links
  • Caching layer for improved performance
  • Export functionality (JSON, CSV, etc.)
  • Advanced analytics (sentiment analysis, topic extraction)
  • Support for more video platforms (TikTok, Vimeo, etc.)
  • WebSocket support for real-time updates
  • Integration examples with popular MCP clients
  • Custom prompt templates via API
  • Multi-language support for summaries
  • Video thumbnail extraction

📝 Notes

  • Audio downloads are temporarily stored and automatically cleaned up
  • Whisper model is loaded once and reused for better performance
  • Processing time depends on video length and Whisper model size
  • Only the first 7 seconds of each YouTube video are processed, to reduce processing time
  • All text sources (title, description, subtitles, transcription) are combined for YouTube videos
  • Summaries are limited to 3 sentences maximum
  • For production, consider using a GPU for faster Whisper transcription
  • The Ollama timeout for tool calls is set to 5 minutes
