MCP Audio RAG Server

A Retrieval-Augmented Generation (RAG) server that transcribes audio using Google Gemini and stores the transcripts in Supabase for natural language search and analysis.

Tools: 6 · Updated: Dec 2, 2025

Transform your audio files into a searchable knowledge base using AI. Ask Claude questions about your meetings, podcasts, lectures, or any audio content.

What is this?

This is an MCP (Model Context Protocol) server that lets you:

  1. Transcribe any audio file using Google's Gemini AI
  2. Store the transcriptions in a searchable database
  3. Search through all your audio content using natural language

Once set up, you can simply ask Claude things like:

  • "What did they discuss about the budget in my meeting recording?"
  • "Find mentions of machine learning in my podcast collection"
  • "What were the key points from yesterday's lecture?"

How It Works

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Audio File  │ ──▶ │   Gemini    │ ──▶ │  Chunking   │ ──▶ │  Supabase   │
│ (.mp3, etc) │     │ Transcribe  │     │ + Embedding │     │  (pgvector) │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                                                   │
┌─────────────┐     ┌─────────────┐     ┌─────────────┐            │
│   Claude    │ ◀── │   Results   │ ◀── │   Search    │ ◀──────────┘
│  Response   │     │ + Snippets  │     │   Query     │
└─────────────┘     └─────────────┘     └─────────────┘
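
The "Chunking" stage in the diagram can be pictured with a minimal sketch like the one below. It assumes word-based windows with a fixed overlap; the function name `chunkTranscript` and the default sizes are hypothetical and are not taken from the project's source, which may chunk by tokens, sentences, or timestamps instead.

```typescript
// Hypothetical helper: split a transcript into overlapping word-based chunks.
// chunkSize and overlap are illustrative values, not the project's settings.
function chunkTranscript(text: string, chunkSize = 200, overlap = 40): string[] {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once the current window already reaches the end of the text.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}
```

Overlapping windows keep a sentence that straddles a boundary visible in both neighbouring chunks, which tends to help retrieval. Each chunk would then be embedded and stored as its own row.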

Quick Start

Prerequisites

Step 1: Clone & Install

git clone https://github.com/matheusslg/mcp-audio-rag.git
cd mcp-audio-rag
npm install

Step 2: Set Up Supabase Database

  1. Create a new project at supabase.com
  2. Go to SQL Editor in your dashboard
  3. Paste and run the contents of supabase/schema.sql

Step 3: Get Your API Keys

Supabase (Settings → API):

  • Copy Project URL → SUPABASE_URL
  • Copy service_role key → SUPABASE_SERVICE_KEY

Google AI Studio:

  • Create an API key → GEMINI_API_KEY

Step 4: Configure

cp .env.example .env

Edit .env:

GEMINI_API_KEY=your-key-here
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=your-service-role-key
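
To check a configuration before starting the server, a small validation sketch along these lines can help. The three variable names come from the .env example above; the function `missingEnvKeys` is hypothetical and is not part of the project.

```typescript
// The three variables listed in the .env example.
const REQUIRED_KEYS = ["GEMINI_API_KEY", "SUPABASE_URL", "SUPABASE_SERVICE_KEY"] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the names of required variables
// that are missing or contain only whitespace.
function missingEnvKeys(env: Env): string[] {
  return REQUIRED_KEYS.filter((key) => (env[key] ?? "").trim() === "");
}
```

In a Node.js script this could be called as missingEnvKeys(process.env); an empty array means all three keys are present.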

Step 5: Add to Claude

For Claude Code CLI (~/.claude.json):

{
  "mcpServers": {
    "audio-rag": {
      "command": "npx",
      "args": ["tsx", "/full/path/to/mcp-audio-rag/src/server.ts"],
      "env": {
        "GEMINI_API_KEY": "your-key",
        "SUPABASE_URL": "https://your-project.supabase.co",
        "SUPABASE_SERVICE_KEY": "your-service-role-key"
      }
    }
  }
}

For Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

Use the same JSON config shown for the Claude Code CLI.

Usage

Transcribe Audio

Just tell Claude to transcribe a file:

Transcribe /path/to/meeting.mp3

Want to use a specific model? Just ask:

Transcribe /path/to/lecture.m4a using gemini-2.5-pro

Search Your Audio

Ask natural questions:

What did they say about the project timeline?
Search for mentions of "budget" in my recordings
Find discussions about AI in my podcasts
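
Behind a natural-language query, vector search ranks stored chunks by how close their embeddings are to the query's embedding. In this project that ranking happens inside Supabase with pgvector; the in-memory sketch below only illustrates the idea. It assumes cosine similarity as the metric, and the names `Chunk`, `cosineSimilarity`, and `topMatches` are hypothetical.

```typescript
// Hypothetical shape of a stored chunk: its text plus an embedding vector.
interface Chunk {
  text: string;
  embedding: number[];
}

// Cosine similarity: 1 for vectors pointing the same way, 0 for orthogonal ones.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best match first.
function topMatches(query: number[], chunks: Chunk[], k = 3): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}
```

A real deployment would run this comparison in the database rather than loading every embedding into memory.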

Manage Your Library

List all my transcribed audio files
Delete the recording from last week
Get the full transcript of meeting.mp3
Summarize the podcast episode

Available Models

Model                    Best For
gemini-2.5-flash         Default - Fast & accurate, great balance
gemini-2.5-flash-lite    Fastest, cheapest - good for bulk processing
gemini-2.5-pro           Best quality - complex audio, multiple speakers
gemini-3-pro-preview     Newest - cutting edge capabilities
gemini-2.0-flash         Reliable - previous generation
gemini-2.0-flash-lite    Fast - previous generation
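
One way to turn a requested model name into a validated choice, falling back to the default when none is given, is sketched below. The model names and the default are taken from the table above; `resolveModel` is a hypothetical helper, and the server's actual handling of unknown names may differ.

```typescript
// Model names from the table above.
const SUPPORTED_MODELS = [
  "gemini-2.5-flash",
  "gemini-2.5-flash-lite",
  "gemini-2.5-pro",
  "gemini-3-pro-preview",
  "gemini-2.0-flash",
  "gemini-2.0-flash-lite",
] as const;

type GeminiModel = (typeof SUPPORTED_MODELS)[number];

const DEFAULT_MODEL: GeminiModel = "gemini-2.5-flash";

// Hypothetical helper: use the default when no model is requested,
// and reject names that are not in the supported list.
function resolveModel(requested?: string): GeminiModel {
  if (requested === undefined) return DEFAULT_MODEL;
  const name = requested.trim().toLowerCase();
  const match = SUPPORTED_MODELS.find((m) => m === name);
  if (match === undefined) {
    throw new Error(`Unsupported model: ${requested}`);
  }
  return match;
}
```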

Supported Audio Formats

.mp3 .mp4 .m4a .wav .webm .mpeg .mpga
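
A quick pre-flight check against this list could look like the sketch below. The extensions are the ones listed above; `isSupportedAudio` is a hypothetical helper that inspects only the file name, not the file contents.

```typescript
// Extensions from the list above.
const SUPPORTED_EXTENSIONS = [".mp3", ".mp4", ".m4a", ".wav", ".webm", ".mpeg", ".mpga"];

// Hypothetical helper: true when the path ends in a supported extension
// (case-insensitive). Handles both "/" and "\" path separators.
function isSupportedAudio(filePath: string): boolean {
  const fileName = filePath.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".mp3"
  return SUPPORTED_EXTENSIONS.includes(fileName.slice(dot).toLowerCase());
}
```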

Available Tools

Tool                   Description
ingest_audio           Transcribe and store an audio file
search_transcripts     Search through your audio using natural language
list_transcripts       List all transcribed audio files
get_full_transcript    Get the complete transcript of a file
summarize_audio        Generate an AI summary of a transcript
delete_transcript      Remove a transcribed file from the database

Troubleshooting

Problem                           Solution
"No relevant segments found"      Try rephrasing your search, or check whether the audio was ingested
"Missing environment variable"    Check that your .env file or Claude config defines all 3 keys
Supabase errors                   Make sure you're using the service_role key, not the anon key
Slow transcription                Use gemini-2.5-flash-lite for faster processing

Support This Project

If this project saved you time or helped you out, consider buying me a coffee!

License

MIT - Use it however you want!


Made with Gemini + Supabase + Claude
