vidlens-mcp

VidLens MCP: the YouTube intelligence layer for AI agents. 40 tools, zero config, three-tier fallback, semantic + visual search.

npm: 1.5k/wk · Updated Mar 17, 2026

Quick Install

npx -y vidlens-mcp

VidLens — The YouTube intelligence layer for AI agents


🔍 What is VidLens?

VidLens is a Model Context Protocol server that gives AI agents deep, reliable access to YouTube. Not just transcripts - full intelligence: sentiment analysis, trend discovery, semantic search, media assets, creator analytics, and image-backed visual search.

No API key required to start. Every tool has a three-tier fallback chain (YouTube API → yt-dlp → page extraction) so nothing breaks when quota runs out or keys aren't configured.


🎯 Core Capabilities

🔎 Semantic Search Across Playlists

Import entire playlists or video sets, index every transcript with Gemini embeddings, and search across hundreds of hours of content by meaning — not just keywords.

"Find every mention of gradient descent across 50 Stanford CS lectures"

"What did the instructor say about backpropagation in any of these videos?"

👁️ Visual Search — See What's In Videos

Extract keyframes, describe them with Gemini Vision, run OCR on slides and whiteboards, and search by what you see — not just what's said. Three layers: Apple Vision feature prints for image similarity, Gemini frame descriptions for scene understanding, and semantic embeddings for text→visual search.

"Find the frame where he draws the system architecture diagram"

"Show me every slide that mentions 'transformer architecture'"

📊 Intelligence Layer — Not Just Data

Sentiment analysis with themes and risk signals. Niche trend discovery with momentum and saturation scoring. Content gap detection. Hook pattern analysis. Upload timing recommendations. The LLM does the thinking — VidLens gives it the right data.

"What's the audience sentiment on this video? Any risk signals?"

"What's trending in the AI coding niche right now?"

⚡ Zero Config, Always Works

No API key needed to start. Three-tier fallback chain on every tool: YouTube API → yt-dlp → page extraction. Nothing breaks when quota runs out. Keys are optional power-ups, not requirements.

🎬 Full Media Pipeline

Download videos/audio/thumbnails. Extract keyframes. Index comments for semantic search. Build a local knowledge base from any YouTube content — all through natural language.


⚡ Why VidLens?

| | VidLens | Other YouTube MCP servers |
|---|---|---|
| 🔑 Setup | ✅ Works immediately - no keys needed | ❌ Most require YouTube API key upfront |
| 🛡️ Reliability | ✅ Three-tier fallback on every tool | ❌ Single point of failure - API down = broken |
| 🧠 Intelligence | ✅ Sentiment, trends, content gaps, hooks | ❌ Raw data dumps - you do the analysis |
| 📦 Token efficiency | ✅ 75-87% smaller responses | ❌ Verbose JSON with thumbnails, etags, junk |
| 🔬 Depth | ✅ 40 tools across 9 modules | ⚠️ 1-5 tools, mostly transcripts only |
| 🖼️ Visual evidence | ✅ Returns actual frame paths + timestamps, not just text hits | ⚠️ Usually transcript-only or raw frame dumps |
| ⚖️ Trademark | ✅ Compliant naming | ⚠️ Most violate YouTube trademark |

🚀 Quick Start

1. Install

npx vidlens-mcp setup

This auto-detects your MCP clients (Claude Desktop, Claude Code) and configures both.

2. Or configure manually

Claude Desktop — add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "vidlens-mcp": {
      "command": "npx",
      "args": ["-y", "vidlens-mcp", "serve"]
    }
  }
}

Claude Code — add to ~/.claude/settings.json:

{
  "mcpServers": {
    "vidlens-mcp": {
      "command": "npx",
      "args": ["-y", "vidlens-mcp", "serve"]
    }
  }
}
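
If the config file already lists other MCP servers, the vidlens-mcp entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on the parsed JSON. `addVidLensServer` and the `other-server` entry are hypothetical names used only for illustration; they are not part of vidlens-mcp, and this is not how `npx vidlens-mcp setup` is known to work internally.

```typescript
// Hypothetical helper for illustration; not part of vidlens-mcp.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpClientConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // any unrelated settings already in the file
}

// Returns a new config object with the vidlens-mcp entry added,
// leaving other servers and unrelated settings untouched.
function addVidLensServer(config: McpClientConfig): McpClientConfig {
  const entry: McpServerEntry = {
    command: "npx",
    args: ["-y", "vidlens-mcp", "serve"],
  };
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), "vidlens-mcp": entry },
  };
}

// Example: a config that already contains another (made-up) server.
const existing: McpClientConfig = {
  mcpServers: { "other-server": { command: "node", args: ["other.js"] } },
};
const merged = addVidLensServer(existing);
console.log(JSON.stringify(merged, null, 2));
```

The function returns a copy instead of mutating its input, so the original file contents can be kept as a backup until the new config is written.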

3. Restart your MCP client

Fully quit and reopen Claude Desktop (⌘Q). Claude Code picks up changes automatically.

4. Try it

"Import this playlist and search across all videos for mentions of machine learning"

"Search this video's visuals for the whiteboard architecture diagram and show me the frame evidence"

"What's trending in the AI coding niche right now?"

"Build a complete dossier for this video — metadata, transcript, sentiment, hooks, everything"

"What's the audience sentiment on this video? Any risk signals?"

"Get the transcript of this video: https://youtube.com/watch?v=dQw4w9WgXcQ"


🧰 Tools - 40 across 9 modules

📺 Core - Video & Channel Intelligence

Always available, no API key needed

| Tool | What it does |
|---|---|
| findVideos | Search YouTube by query with metadata |
| inspectVideo | Deep metadata - tags, engagement, language, category |
| inspectChannel | Channel stats, description, recent uploads |
| listChannelCatalog | Browse a channel's full video library |
| readTranscript | Full transcript with timestamps and chapters |
| readComments | Top comments with likes and engagement |
| expandPlaylist | List all videos in any playlist |

🔎 Knowledge Base - Semantic Search

Index transcripts and search across them with natural language

| Tool | What it does |
|---|---|
| importPlaylist | Index an entire playlist's transcripts |
| importVideos | Index specific videos by URL/ID |
| searchTranscripts | Natural language search across indexed content |
| listCollections | Browse your indexed collections |
| setActiveCollection | Scope searches to one collection |
| clearActiveCollection | Search across all collections |
| removeCollection | Delete a collection and its index |
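
To make "search by meaning" concrete, here is a minimal sketch of embedding-based retrieval: each transcript chunk carries an embedding vector, and a query is answered by ranking chunks by cosine similarity to the query embedding. The README does not say which similarity measure or data layout VidLens uses, so the `TranscriptChunk` shape, the `rankChunks` helper, and the tiny 3-dimensional vectors are all assumptions for illustration (real embeddings would be 768 or 384 dimensions).

```typescript
// Illustrative only: hand-made 3-dim vectors stand in for real embeddings.
interface TranscriptChunk {
  videoId: string;
  startSeconds: number;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and keep the topK best matches.
function rankChunks(query: number[], chunks: TranscriptChunk[], topK: number) {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

const chunks: TranscriptChunk[] = [
  { videoId: "lecture-01", startSeconds: 120, text: "Gradient descent updates the weights", embedding: [0.9, 0.1, 0.0] },
  { videoId: "lecture-02", startSeconds: 45, text: "Course logistics and grading", embedding: [0.0, 1.0, 0.0] },
  { videoId: "lecture-03", startSeconds: 600, text: "Backpropagation computes the gradients", embedding: [0.5, 0.5, 0.0] },
];
const queryEmbedding = [1, 0, 0]; // stand-in for an embedded query
const topMatches = rankChunks(queryEmbedding, chunks, 2);
for (const { chunk, score } of topMatches) {
  console.log(chunk.videoId, chunk.startSeconds, score.toFixed(3));
}
```

Because each chunk keeps its video ID and start time, a match can be reported as a timestamped location in a specific video rather than as a bare text snippet.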

💬 Sentiment & Analysis

Understand what audiences think and feel

| Tool | What it does |
|---|---|
| measureAudienceSentiment | Comment sentiment with themes and risk signals |
| analyzeVideoSet | Compare performance across multiple videos |
| analyzePlaylist | Playlist-level engagement analytics |
| buildVideoDossier | Complete single-video deep analysis |

🎯 Creator Intelligence

Insights for content strategy

| Tool | What it does |
|---|---|
| scoreHookPatterns | Analyze what makes video openings work |
| researchTagsAndTitles | Tag and title optimization insights |
| compareShortsVsLong | Short-form vs long-form performance |
| recommendUploadWindows | Best times to publish for engagement |

📈 Discovery & Trends

Find what's working in any niche

| Tool | What it does |
|---|---|
| discoverNicheTrends | Momentum, saturation, content gaps in any topic |
| exploreNicheCompetitors | Channel landscape and top performers |

🎬 Media Assets

Download and manage video files locally

| Tool | What it does |
|---|---|
| downloadAsset | Download video, audio, or thumbnails |
| listMediaAssets | Browse stored media files |
| removeMediaAsset | Clean up downloaded assets |
| extractKeyframes | Extract key frames from videos |
| mediaStoreHealth | Storage usage and diagnostics |

🖼️ Visual Search

Three-layer visual intelligence. Not transcript reuse.

| Tool | What it does |
|---|---|
| indexVisualContent | Extract frames, run Apple Vision OCR + feature prints, Gemini frame descriptions, and Gemini semantic embeddings |
| searchVisualContent | Search visual frames using semantic embeddings + lexical matching. Returns actual image paths + timestamps as evidence |
| findSimilarFrames | Image-to-image frame similarity using Apple Vision feature prints |

Three layers, all real:

  1. Apple Vision feature prints — image-to-image similarity (find frames that look alike)
  2. Gemini 2.5 Flash frame descriptions — natural language scene understanding per frame
  3. Gemini semantic embeddings — 768-dim embedding retrieval over OCR + description text for true text→visual search

What you always get back: frame path on disk, timestamp, source video URL/title, match explanation, OCR text, visual description.

What is NOT happening: no transcript embeddings are reused for visual search. This is a separate visual index.
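
The sketch below illustrates two ideas from this section: the evidence fields a visual hit carries, and one plausible way to combine semantic similarity with lexical matching. Everything here is an assumption made for illustration: the `VisualHit` field names, the term-overlap measure, the weighted sum, and the 0.7 default weight are not taken from VidLens, which does not document its actual scoring.

```typescript
// Hypothetical result shape mirroring the fields listed above.
interface VisualHit {
  framePath: string;
  timestampSeconds: number;
  videoUrl: string;
  videoTitle: string;
  matchExplanation: string;
  ocrText: string;
  description: string;
}

// Fraction of query terms that appear in the text (case-insensitive).
function lexicalOverlap(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const found = terms.filter((t) => haystack.indexOf(t) !== -1).length;
  return found / terms.length;
}

// Weighted sum of the two signals; the weight is an arbitrary choice.
function hybridScore(semantic: number, lexical: number, semanticWeight = 0.7): number {
  return semanticWeight * semantic + (1 - semanticWeight) * lexical;
}

const query = "transformer architecture";
const hit: VisualHit = {
  framePath: "/path/to/frames/frame_0042.jpg", // made-up path
  timestampSeconds: 754,
  videoUrl: "https://youtube.com/watch?v=dQw4w9WgXcQ",
  videoTitle: "Example lecture",
  matchExplanation: "",
  ocrText: "Transformer Architecture: encoder and decoder stacks",
  description: "A slide with a block diagram",
};
const lexical = lexicalOverlap(query, `${hit.ocrText} ${hit.description}`);
const score = hybridScore(0.5, lexical); // 0.5 stands in for an embedding similarity
hit.matchExplanation = `lexical overlap ${lexical.toFixed(2)}, combined score ${score.toFixed(2)}`;
console.log(hit.framePath, hit.timestampSeconds, hit.matchExplanation);
```

Keeping the lexical signal alongside the embedding score helps exact phrases on a slide (captured by OCR) rank well even when the embedding match is only moderate.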

💭 Comment Knowledge Base

Index and semantically search YouTube comments

| Tool | What it does |
|---|---|
| importComments | Index a video's comments for search |
| searchComments | Natural language search over comment corpus |
| listCommentCollections | Browse comment collections |
| setActiveCommentCollection | Scope comment searches |
| clearActiveCommentCollection | Search all comment collections |
| removeCommentCollection | Delete a comment collection |

🏥 Diagnostics

Health checks and pre-flight validation

| Tool | What it does |
|---|---|
| checkSystemHealth | Full system diagnostic report |
| checkImportReadiness | Validate before importing content |

🔑 API Keys (Optional)

VidLens works without any API keys. Add them to unlock more capabilities:

| Key | What it unlocks | Free? | How to get it |
|---|---|---|---|
| YOUTUBE_API_KEY | Better metadata, comment API, search via YouTube API | ✅ Free tier (10,000 units/day) | Google Cloud Console → APIs → Enable YouTube Data API v3 → Credentials → Create API Key |
| GEMINI_API_KEY | Higher-quality embeddings for semantic search (768d vs 384d) | ✅ Free tier | Google AI Studio → Get API Key |

⚠️ These are separate keys from separate Google services. A Gemini key will NOT work for YouTube API calls and vice versa. Create them independently.

# Configure via setup wizard
npx vidlens-mcp setup --youtube-api-key YOUR_YOUTUBE_KEY --gemini-api-key YOUR_GEMINI_KEY

# Or via environment variables
export YOUTUBE_API_KEY=your_youtube_key
export GEMINI_API_KEY=your_gemini_key

💻 CLI

npx vidlens-mcp               # Start MCP server (stdio)
npx vidlens-mcp serve         # Start MCP server (explicit)
npx vidlens-mcp setup         # Auto-configure Claude Desktop + Claude Code
npx vidlens-mcp doctor        # Run diagnostics
npx vidlens-mcp version       # Print version
npx vidlens-mcp help          # Usage guide

Doctor - diagnose issues

npx vidlens-mcp doctor --no-live

Checks: Node.js version, yt-dlp availability, API key validation, data directory health, MCP client registration (Claude Desktop, Claude Code).


🏗️ Architecture

System Overview

[Diagram: VidLens System Overview]

How the Fallback Chain Works

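Every tool that touches YouTube data uses the same resilience pattern: it tries the YouTube API first, falls back to yt-dlp when the API is unavailable (no key configured, quota exhausted), and falls back to page extraction last.

[Diagram: VidLens Fallback Chain]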

Every response includes a provenance field telling you exactly which tier served the data and whether anything was partial. No silent degradation — you always know what happened.
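
A minimal sketch of this pattern, assuming each tier is an async function that either returns data or throws. The names `withFallback`, `Fetcher`, and the `Provenance` fields (`tier`, `partial`, `attempts`) are hypothetical; the README confirms only that responses carry a provenance field naming the serving tier and whether the data was partial, not its exact shape.

```typescript
// Hypothetical types; the real provenance shape is not documented here.
type Tier = "youtube_api" | "yt_dlp" | "page_extraction";

interface Provenance {
  tier: Tier; // which tier served the data
  partial: boolean; // whether the result is incomplete
  attempts: { tier: Tier; error: string }[]; // tiers that failed first
}

interface TierResult<T> {
  data: T;
  partial: boolean;
}

interface Fetcher<T> {
  tier: Tier;
  fetch: () => Promise<TierResult<T>>;
}

// Try each tier in order; return the first success with its provenance.
async function withFallback<T>(
  fetchers: Fetcher<T>[],
): Promise<{ data: T; provenance: Provenance }> {
  const attempts: { tier: Tier; error: string }[] = [];
  for (const f of fetchers) {
    try {
      const result = await f.fetch();
      return {
        data: result.data,
        provenance: { tier: f.tier, partial: result.partial, attempts },
      };
    } catch (err) {
      // Record the failure and move on to the next tier.
      attempts.push({ tier: f.tier, error: err instanceof Error ? err.message : String(err) });
    }
  }
  const summary = attempts.map((a) => `${a.tier}: ${a.error}`).join("; ");
  throw new Error(`All tiers failed (${summary})`);
}

// Stubbed tiers: the API tier fails, so the yt-dlp tier serves the data.
const demo = withFallback<{ title: string }>([
  { tier: "youtube_api", fetch: async () => { throw new Error("quota exceeded"); } },
  { tier: "yt_dlp", fetch: async () => ({ data: { title: "Example video" }, partial: false }) },
  { tier: "page_extraction", fetch: async () => ({ data: { title: "Example video" }, partial: true }) },
]);
demo.then((r) => console.log(r.provenance.tier, r.provenance.partial, r.provenance.attempts.length));
```

Recording failed attempts alongside the serving tier is what lets a caller distinguish "the API answered" from "the API failed and a weaker source filled in".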

Visual Search Pipeline

Visual search is not transcript reuse. It's a dedicated three-layer index:

[Diagram: VidLens Visual Search Pipeline]

Three layers, all real:

  1. Apple Vision feature prints — image-to-image similarity (find frames that look alike)
  2. Gemini Vision frame descriptions — natural language scene understanding per frame
  3. Gemini semantic embeddings — 768-dim retrieval over OCR + description text

Data Storage

Everything lives in a single directory. No external databases, no Docker, no infrastructure.

[Diagram: VidLens Data Storage]

One directory. Portable. Back it up by copying. Delete it to start fresh.


📋 Requirements

| Requirement | Status | Notes |
|---|---|---|
| Node.js ≥ 22 | Required | Uses node:sqlite; run node --version to check |
| yt-dlp | Recommended | brew install yt-dlp - enables zero-config mode |
| ffmpeg | Optional | Needed for frame extraction and visual indexing |
| YouTube API key | Optional | Unlocks comments, better metadata |
| Gemini API key | Optional | Upgrades transcript embeddings and frame descriptions for visual search |
| macOS Apple Vision | Automatic on macOS | Powers native OCR and image similarity for visual search |
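
The Node.js requirement can be checked programmatically from the output of `node --version`, which prints a string such as `v22.3.0`. The helper below is a hypothetical sketch of such a check, not the implementation used by `npx vidlens-mcp doctor`.

```typescript
// Hypothetical helper: true when the major version is at least minMajor.
function meetsNodeRequirement(versionString: string, minMajor = 22): boolean {
  // Accepts "v22.3.0" or "22.3.0"; anything else is treated as unsupported.
  const match = /^v?(\d+)\./.exec(versionString.trim());
  if (match === null) return false;
  return Number(match[1]) >= minMajor;
}

for (const v of ["v22.3.0", "v20.11.1", "not-a-version"]) {
  console.log(v, meetsNodeRequirement(v) ? "ok" : "too old or unrecognized");
}
```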

🔧 Troubleshooting

"Tool not found" in Claude Desktop

Fully quit Claude Desktop (⌘Q, not just close window) and reopen. MCP servers only load on startup.

"YOUTUBE_API_KEY not configured" warning

This is informational, not an error. VidLens works without it. Add a key only if you need comments/sentiment features.

"API_KEY_SERVICE_BLOCKED" error

Your API key has API restrictions that block the YouTube Data API v3. In Google Cloud Console, either remove the API restriction from the existing key or create a new unrestricted key.

Gemini key doesn't work for YouTube API

These are separate services. You need a YouTube API key from Google Cloud Console AND a Gemini key from Google AI Studio. They are not interchangeable.

Build errors

npx vidlens-mcp doctor     # Run diagnostics
npx vidlens-mcp doctor --no-live  # Skip network checks

📄 License

MIT


GitHub · npm · Model Context Protocol
