YouTube Transcript MCP Server
An MCP (Model Context Protocol) server that provides YouTube transcript fetching capabilities with Google OAuth 2.0 authentication. Only authorized users can access transcript data.
Features
- Fetch YouTube video transcripts from video IDs or full URLs
- Support for both manually created and auto-generated transcripts
- Optional timestamp inclusion for each transcript segment
- List available transcripts for any video
- Google OAuth 2.0 authentication with email-based authorization
- Automatic video ID extraction from various YouTube URL formats
Installation
- Clone this repository
- Install dependencies:
pip install -r requirements.txt
Google OAuth Setup
1. Create Google Cloud Project
- Go to Google Cloud Console
- Create a new project or select an existing one
- Enable the Google Identity API (if not already enabled)
2. Create OAuth 2.0 Credentials
- Navigate to APIs & Services > Credentials
- Click Create Credentials > OAuth client ID
- Select Web application as the application type
- Add authorized redirect URIs:
  - For local development: http://localhost:8080
  - For production: your application's callback URL
- Click Create
- Download the client configuration or note your Client ID
3. Configure Environment Variables
- Copy the example environment file:
cp .env.example .env
- Edit .env and add your configuration:
GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com
AUTHORIZED_EMAILS=user1@gmail.com,user2@example.com
- GOOGLE_CLIENT_ID: Your OAuth 2.0 client ID from Google Cloud Console
- AUTHORIZED_EMAILS: Comma-separated list of email addresses allowed to use the service
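As a rough illustration of how a comma-separated AUTHORIZED_EMAILS value could be turned into a lookup set, here is a minimal sketch. The function name load_authorized_emails is hypothetical, and the whitespace trimming and case-insensitive matching are assumptions, not confirmed behavior of this server.

```python
import os


def load_authorized_emails(raw=None):
    """Parse a comma-separated email list into a normalized set.

    Hypothetical helper: the real server may parse this value differently.
    """
    if raw is None:
        # Fall back to the environment variable named in this README.
        raw = os.environ.get("AUTHORIZED_EMAILS", "")
    # Assumption: trim whitespace, skip empty entries, and lowercase each address.
    return {part.strip().lower() for part in raw.split(",") if part.strip()}
```

Normalizing once at startup keeps the per-request authorization check a simple set membership test.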
Running the Server
Option 1: Docker (Recommended)
- Build and run with docker-compose:
docker-compose up -d
- Or build and run manually:
docker build -t youtube-transcript-mcp .
docker run -p 8000:8000 --env-file .env youtube-transcript-mcp
The server will be available at: http://localhost:8000/sse
Option 2: Direct Python
The server runs with SSE (Server-Sent Events) transport on port 8000 by default:
python youtube_transcript_mcp.py
The server will be available at: http://localhost:8000/sse
For MCP Inspector or other MCP clients, use:
- Transport Type: SSE
- URL: http://localhost:8000/sse
Available Tools
1. youtube_get_transcript
Fetches the transcript for a YouTube video.
Parameters:
- video_input (string, required): YouTube video ID or full URL
  - Examples: dQw4w9WgXcQ or https://youtube.com/watch?v=dQw4w9WgXcQ
- cursor (integer, optional, default: 0): Starting segment index for pagination
Returns: Markdown-formatted transcript with timestamps
How Pagination Works:
- The tool automatically fits as many segments as possible within the MCP response size limit (25,000 characters)
- If the transcript is too long, it returns a chunk and tells you there's more
- Pass the Next cursor value from the response as the cursor parameter to fetch the next chunk
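The pagination behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the server's actual code: paginate_segments is a hypothetical name, each segment is assumed to be one pre-formatted line of text, and the size budget counts only segment text plus one newline per segment.

```python
MAX_RESPONSE_CHARS = 25_000  # response size limit stated in this README


def paginate_segments(lines, cursor=0, max_chars=MAX_RESPONSE_CHARS):
    """Return (page_lines, next_cursor); next_cursor is None when nothing is left."""
    page = []
    used = 0
    index = cursor
    while index < len(lines):
        cost = len(lines[index]) + 1  # +1 for the newline that joins segments
        # Always emit at least one segment so an oversized line cannot stall paging.
        if page and used + cost > max_chars:
            break
        page.append(lines[index])
        used += cost
        index += 1
    next_cursor = index if index < len(lines) else None
    return page, next_cursor
```

A client would call this repeatedly, feeding each returned next_cursor back in as cursor until it comes back as None.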
Example (start from beginning):
{
"video_input": "dQw4w9WgXcQ"
}
Example (fetch next page):
{
"video_input": "dQw4w9WgXcQ",
"cursor": 250
}
Response Format: When the transcript is paginated, the response includes:
- Showing segments X-Y of Z: Current page range
- Has more: Whether there are more segments to fetch
- Next cursor: Value to use for fetching the next batch (only if has_more is true)
2. youtube_list_available_transcripts
Lists all available transcripts for a YouTube video.
Parameters:
- video_input (string, required): YouTube video ID or full URL
- auth_token (string, required): Google OAuth 2.0 ID token
Returns: Markdown-formatted list of available transcripts with language information
Example:
{
"video_input": "https://youtu.be/dQw4w9WgXcQ",
"auth_token": "your-google-oauth-token"
}
Supported URL Formats
The server automatically extracts video IDs from these URL formats:
- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
- https://www.youtube.com/embed/VIDEO_ID
- https://www.youtube.com/v/VIDEO_ID
- Direct video ID: VIDEO_ID
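For readers curious how extraction from the formats listed above might work, here is a minimal sketch using only the standard library. The function name extract_video_id is hypothetical, and the 11-character ID pattern is an assumption based on the usual shape of YouTube IDs; the server's own logic may differ.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters drawn from letters, digits, "_" and "-".
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(video_input):
    """Return the video ID from a bare ID or one of the URL formats listed above."""
    text = video_input.strip()
    if _VIDEO_ID.fullmatch(text):
        return text  # already a direct video ID

    parsed = urlparse(text)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        # https://youtu.be/VIDEO_ID
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parsed.path == "/watch":
            # https://www.youtube.com/watch?v=VIDEO_ID
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            # https://www.youtube.com/embed/VIDEO_ID or /v/VIDEO_ID
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in ("embed", "v"):
                candidate = parts[1]

    if candidate and _VIDEO_ID.fullmatch(candidate):
        return candidate
    raise ValueError(f"Could not extract a video ID from {video_input!r}")
```

Parsing the URL instead of applying one large regular expression keeps each supported format visible as its own branch.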
Authentication Flow
- User authenticates with Google OAuth 2.0
- Client obtains an ID token from Google
- Client passes the ID token in the auth_token parameter
- Server validates the token with Google
- Server checks if the user's email is in the authorized list
- If authorized, the tool executes; otherwise, access is denied
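The server-side half of this flow could look something like the sketch below. The names AuthError, authorize, and verify_token are hypothetical. verify_token stands in for whatever validates the ID token with Google (for example, a wrapper around google-auth's id_token.verify_oauth2_token), and it is assumed to return the token claims as a dict and raise ValueError on an invalid token.

```python
class AuthError(Exception):
    """Raised when a request must be rejected (hypothetical exception type)."""


def authorize(auth_token, verify_token, authorized_emails):
    """Validate the token, then check the email against the allow list.

    verify_token is an injected callable (an assumption of this sketch) that
    returns the decoded claims or raises ValueError for a bad token.
    """
    try:
        claims = verify_token(auth_token)
    except ValueError as exc:
        raise AuthError("Invalid or expired authentication token") from exc

    email = (claims.get("email") or "").lower()
    if email not in authorized_emails:
        raise AuthError(f"Access denied. User {email} is not authorized")
    return email  # the caller may now run the requested tool
```

Injecting the verifier keeps the allow-list logic testable without network access to Google.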
Error Handling
The server provides clear error messages for common issues:
- Invalid/expired token: "Invalid or expired authentication token"
- Unauthorized user: "Access denied. User X is not authorized"
- Video unavailable: "Video is unavailable. It may be private, deleted, or the ID is incorrect"
- No transcripts: "No transcripts available for this video"
- Transcripts disabled: "Transcripts are disabled for this video"
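One way to produce the transcript-related messages above is a small lookup from exception type to message. The sketch below is purely illustrative: the three exception classes and the fallback message are hypothetical stand-ins, since this README does not say which exceptions the server actually catches.

```python
# Hypothetical exception types standing in for whatever the transcript library raises.
class VideoUnavailableError(Exception):
    pass


class NoTranscriptsError(Exception):
    pass


class TranscriptsDisabledError(Exception):
    pass


# Messages copied from the list above.
_ERROR_MESSAGES = {
    VideoUnavailableError: "Video is unavailable. It may be private, deleted, or the ID is incorrect",
    NoTranscriptsError: "No transcripts available for this video",
    TranscriptsDisabledError: "Transcripts are disabled for this video",
}


def error_message(exc):
    """Map an exception to a user-facing message."""
    for exc_type, message in _ERROR_MESSAGES.items():
        if isinstance(exc, exc_type):
            return message
    # Assumption: unknown failures get a generic message that names the exception type.
    return f"Unexpected error: {type(exc).__name__}"
```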
Security Considerations
- OAuth tokens are validated on every request
- Only users in the AUTHORIZED_EMAILS list can access tools
- Client secrets should never be committed to version control
- Store .env securely and never share publicly
- The server is read-only and cannot modify YouTube data
Limitations
- Currently only supports English transcripts
- Requires internet connection to fetch transcripts and validate tokens
- Subject to YouTube's rate limits and availability
- Very long transcripts must be fetched page by page with the cursor parameter to stay within MCP response size limits
License
MIT