
Amazon Q Web Documentation Reader

This MCP server enables AI agents to intelligently navigate, parse, and extract clean documentation from websites by removing clutter and specifically targeting code examples and page hierarchies.

Tools: 5
Updated: Dec 6, 2025

🌐 Amazon Q Web Documentation Reader

MCP Server for Intelligent Web Content Extraction


A Model Context Protocol (MCP) server that enables Amazon Q to intelligently navigate and extract documentation from websites.
Amazon Q uses Claude 4.5 to make smart decisions about which pages to visit and what content to extract.

Features | Installation | Setup | Usage | Tools


✨ Features

  • 🧠 Intelligent Navigation - Amazon Q (Claude 4.5) decides which documentation pages to visit
  • 🧹 Clean Content Extraction - Removes navigation, ads, scripts, and other non-content elements
  • 📝 Multiple Output Formats - Supports both Markdown and plain text output
  • 💻 Code Block Extraction - Specifically extracts code examples from documentation
  • 📊 Page Structure Analysis - Extracts heading hierarchy and table of contents
  • 🔗 Link Discovery - Finds and filters documentation links
  • 📚 Batch Processing - Read multiple documentation pages at once

🎯 How It Works

User: "I'm having issues with Razorpay routes"
      Documentation: https://razorpay.com/docs

Amazon Q (Claude 4.5):
  1. Reads main docs page
  2. Sees links: ["Payments", "Routes", "Webhooks", ...]
  3. Intelligently decides: "Routes link is relevant!"
  4. Navigates to Routes documentation
  5. Extracts content and solves your problem

All navigation decisions = Amazon Q's Claude brain 🧠
MCP Server = Clean content extraction tool 🛠️

📦 Installation

Prerequisites

  • Python, with either uv or pip
  • Git
  • Amazon Q Developer CLI (provides the q command)

Step 1: Clone the Repository

git clone https://github.com/yourusername/amazon-q-web_search.git
cd amazon-q-web_search

Step 2: Install Dependencies

Using uv (Recommended):

uv sync

Using pip:

pip install -e .

🔧 Setup with Amazon Q

Step 1: Locate Your MCP Configuration File

Amazon Q looks for MCP server configuration in:

  • Linux/WSL and macOS: ~/.aws/amazonq/mcp.json
  • Windows: %USERPROFILE%\.aws\amazonq\mcp.json

Step 2: Create/Edit the Configuration File

Create the directory if it doesn't exist:

mkdir -p ~/.aws/amazonq

Edit or create ~/.aws/amazonq/mcp.json:

For Linux/WSL and macOS:

{
  "mcpServers": {
    "doc_reader": {
      "command": "/full/path/to/amazon-q-web_search/.venv/bin/python",
      "args": ["/full/path/to/amazon-q-web_search/main.py"]
    }
  }
}

For Windows:

{
  "mcpServers": {
    "doc_reader": {
      "command": "C:\\full\\path\\to\\amazon-q-web_search\\.venv\\Scripts\\python.exe",
      "args": ["C:\\full\\path\\to\\amazon-q-web_search\\main.py"]
    }
  }
}

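💡 Tip: Replace /full/path/to/ (C:\\full\\path\\to\\ on Windows) with the absolute path where you cloned the repository. On Windows, keep the backslashes doubled, because JSON treats a single backslash as an escape character.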
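
If you would rather generate the entry than edit it by hand, the following is a minimal sketch that builds the same structure from a clone path. The function name build_mcp_config and the example path /home/alice/... are illustrative assumptions, not part of this project.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def build_mcp_config(repo_path: str, windows: bool = False) -> dict:
    """Return an mcp.json structure pointing at the repo's virtualenv Python."""
    if windows:
        repo = PureWindowsPath(repo_path)
        python = repo / ".venv" / "Scripts" / "python.exe"
    else:
        repo = PurePosixPath(repo_path)
        python = repo / ".venv" / "bin" / "python"
    return {
        "mcpServers": {
            "doc_reader": {
                "command": str(python),
                "args": [str(repo / "main.py")],
            }
        }
    }


# Hypothetical clone location, used only for illustration.
config = build_mcp_config("/home/alice/amazon-q-web_search")
print(json.dumps(config, indent=2))
```

json.dumps takes care of escaping, so a Windows path passed with windows=True is written with doubled backslashes automatically.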

Step 3: Verify Installation

  1. Start Amazon Q CLI:

    q chat
    
  2. Check if MCP server is loaded:

    /mcp
    

    You should see:

    doc_reader
      - read_web_documentation
      - get_documentation_links
      - get_page_structure
      - extract_code_examples
      - read_multiple_docs
    
  3. If not loaded:

    • Check the file path in mcp.json is correct
    • Restart Amazon Q CLI
    • Check logs: q chat logdump

🚀 Usage

Basic Example

In Amazon Q CLI, simply ask about documentation:

I'm having issues with Razorpay routes. Can you help me understand how they work?
Documentation: https://razorpay.com/docs/

Amazon Q will:

  1. ✅ Read the main documentation page
  2. ✅ Extract all available links
  3. ✅ Intelligently identify the "Routes" link
  4. ✅ Navigate to the Routes documentation
  5. ✅ Provide you with accurate information
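
Under the hood, each of the steps above is an MCP tool call. As a rough sketch, the requests a client sends might look like the following. The tools/call method and the name/arguments layout come from the MCP specification; the helper name tool_call, the filter value "route", and the placeholder Routes URL are assumptions for illustration, since the actual link is whichever one the model picks.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


docs_url = "https://razorpay.com/docs/"
calls = [
    # Steps 1-2: read the landing page, then list its links.
    tool_call("read_web_documentation", {"url": docs_url}),
    tool_call("get_documentation_links", {"url": docs_url, "filter_pattern": "route"}),
    # Steps 3-4: placeholder URL standing in for the link the model selects.
    tool_call("read_web_documentation", {"url": docs_url + "<routes-page>"}),
]
print(json.dumps(calls[1], indent=2))
```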

More Examples

Python Documentation:

Can you explain Python asyncio event loops?
Documentation: https://docs.python.org/3/library/asyncio.html

FastAPI Tutorial:

How do I create a basic FastAPI application?
Documentation: https://fastapi.tiangolo.com/

AWS Lambda:

How do I create a Lambda function with Python?
Documentation: https://docs.aws.amazon.com/lambda/

🛠 Available Tools

Amazon Q intelligently chains these tools to navigate documentation:

1. read_web_documentation

Fetches and extracts clean documentation content from a web page.

Parameters:

  • url (required): The URL of the documentation page
  • output_format (optional): "markdown" (default) or "text"

Returns: Extracted documentation content with title and metadata
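
To illustrate the idea of clean extraction, here is a simplified sketch that drops non-content tags and keeps the title and visible text. It uses only the standard library's html.parser, whereas this project depends on beautifulsoup4 and markdownify, so treat it as an approximation of the technique rather than the server's code. The tag list is an assumed example; the real one is REMOVE_TAGS in src/config.py.

```python
from html.parser import HTMLParser

# Assumed tag list for illustration; the real values live in src/config.py.
REMOVE_TAGS = {"script", "style", "nav", "footer", "aside"}


class TextExtractor(HTMLParser):
    """Collect the page title and the text outside of REMOVE_TAGS elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # >0 while inside a removed element
        self.in_title = False
        self.title = ""
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in REMOVE_TAGS:
            self.skip_depth += 1
        elif tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag in REMOVE_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag == "title":
            self.in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.in_title:
            self.title = text
        elif self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> dict:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "content": "\n".join(parser.chunks)}


sample = (
    "<html><head><title>Routes</title><script>var x = 1;</script></head>"
    "<body><nav><a href='/'>Home</a></nav>"
    "<main><h1>Routes</h1><p>Split payments.</p></main>"
    "<footer>Copyright</footer></body></html>"
)
page = extract_text(sample)
print(page["title"])
print(page["content"])
```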


2. get_documentation_links

Extracts all links from a documentation page with optional filtering.

Parameters:

  • url (required): The URL of the documentation page
  • filter_pattern (optional): Pattern to filter links (e.g., "api", "guide")

Returns: List of links found on the page
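
The sketch below shows one way link discovery with a filter could work. It assumes filter_pattern is a case-insensitive substring matched against both the link text and the resolved URL; the server's actual matching rule may differ (for example, it could be a regular expression). The names LinkCollector and get_links are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect <a href> links, resolving relative URLs against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append({"text": text, "url": self._href})
            self._href = None


def get_links(html, base_url, filter_pattern=None):
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    links = collector.links
    if filter_pattern:
        needle = filter_pattern.lower()
        links = [
            link for link in links
            if needle in link["url"].lower() or needle in link["text"].lower()
        ]
    return links


sample = (
    '<a href="/docs/payments">Payments</a>'
    '<a href="routes/">Routes</a>'
    '<a href="#top">Top</a>'
)
routes = get_links(sample, "https://example.com/docs/", filter_pattern="route")
print(routes)
```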


3. get_page_structure

Extracts the heading structure and table of contents from a documentation page.

Parameters:

  • url (required): The URL of the documentation page

Returns: Hierarchical structure of headings on the page
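
As a rough illustration of what a heading hierarchy looks like, this standard-library sketch collects h1 to h6 elements and prints them as an indented outline. The output shape is an assumption; the tool's real return format is not documented here.

```python
from html.parser import HTMLParser

HEADING_LEVELS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every heading on the page."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_LEVELS:
            self._level = HEADING_LEVELS[tag]
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_LEVELS and self._level is not None:
            text = " ".join("".join(self._text).split())
            self.headings.append((self._level, text))
            self._level = None


def outline(html: str) -> str:
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    # Indent two spaces per heading level below h1.
    return "\n".join(
        "  " * (level - 1) + "- " + text for level, text in collector.headings
    )


sample = (
    "<h1>Routes</h1><p>Intro</p>"
    "<h2>Create a <code>transfer</code></h2>"
    "<h3>Parameters</h3>"
    "<h2>Errors</h2>"
)
toc = outline(sample)
print(toc)
```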


4. extract_code_examples

Extracts all code blocks from a documentation page.

Parameters:

  • url (required): The URL of the documentation page

Returns: All code blocks found with their detected languages
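
One common way to detect a code block's language is the language-xxx class that many documentation sites put on pre or code elements. The sketch below assumes that convention and falls back to "unknown"; the server's actual detection logic may be more elaborate.

```python
from html.parser import HTMLParser


class CodeBlockCollector(HTMLParser):
    """Collect the text of each <pre> block and its language-* class, if any."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_pre = False
        self._language = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._in_pre = True
            self._language = None
            self._text = []
        if self._in_pre and tag in ("pre", "code"):
            for cls in (dict(attrs).get("class") or "").split():
                if cls.startswith("language-"):
                    self._language = cls[len("language-"):]

    def handle_data(self, data):
        if self._in_pre:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "pre" and self._in_pre:
            self.blocks.append({
                "language": self._language or "unknown",
                "code": "".join(self._text).strip("\n"),
            })
            self._in_pre = False


def extract_code(html: str) -> list:
    collector = CodeBlockCollector()
    collector.feed(html)
    collector.close()
    return collector.blocks


sample = (
    "<p>Install:</p>"
    '<pre><code class="language-bash">pip install httpx</code></pre>'
    "<pre>plain &lt;text&gt;</pre>"
)
blocks = extract_code(sample)
print(blocks)
```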


5. read_multiple_docs

Reads multiple documentation pages and combines their content.

Parameters:

  • urls (required): List of documentation URLs (max 10)

Returns: Combined content from all pages
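
A batch read could plausibly be implemented by fetching pages concurrently and joining the results. The sketch below assumes that approach, enforces the 10-URL limit, and keeps per-page errors in the output instead of failing the whole batch. The fetcher is a stand-in so the example runs without network access; read_many and fake_fetch are hypothetical names.

```python
import asyncio

MAX_URLS = 10  # batch limit stated for read_multiple_docs


async def read_many(urls, fetch_one):
    """Fetch every URL concurrently and join the results into one document."""
    if len(urls) > MAX_URLS:
        raise ValueError(f"at most {MAX_URLS} URLs per batch, got {len(urls)}")
    results = await asyncio.gather(
        *(fetch_one(url) for url in urls), return_exceptions=True
    )
    sections = []
    for url, result in zip(urls, results):
        # A failed page becomes an inline error note rather than an exception.
        body = f"[error: {result}]" if isinstance(result, Exception) else result
        sections.append(f"Source: {url}\n\n{body}")
    return "\n\n---\n\n".join(sections)


async def fake_fetch(url):
    # Stand-in for the real HTTP fetcher.
    if "missing" in url:
        raise RuntimeError("404 Not Found")
    return f"content of {url}"


combined = asyncio.run(
    read_many(["https://example.com/a", "https://example.com/missing"], fake_fetch)
)
print(combined)
```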


📁 Project Structure

amazon-q-web_search/
├── main.py              # Entry point
├── pyproject.toml       # Project configuration
├── README.md            # This file
├── run_mcp.sh           # Startup script (Linux/macOS)
└── src/
    ├── __init__.py      # Package initialization
    ├── server.py        # MCP server initialization
    ├── config.py        # Configuration constants
    ├── fetcher.py       # HTTP fetching logic
    ├── extractor.py     # HTML content extraction
    ├── formatters.py    # Output formatting
    └── tools.py         # MCP tool definitions

⚙️ Configuration

Edit src/config.py to customize behavior:

Setting              Default   Description
HTTP_TIMEOUT         30.0s     Request timeout in seconds
MAX_CONTENT_LENGTH   10MB      Maximum content size in bytes
USER_AGENT           Custom    HTTP User-Agent string
REMOVE_TAGS          Various   HTML tags to remove during extraction
CONTENT_SELECTORS    Various   Selectors for finding main content

🐛 Troubleshooting

MCP Server Not Loading

Check configuration:

cat ~/.aws/amazonq/mcp.json

Verify paths are correct:

  • Use absolute paths, not relative
  • Check that Python executable exists
  • Check that main.py exists
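
The checks above can also be scripted. This is a minimal sketch that loads mcp.json and reports relative or missing paths for the doc_reader entry. It assumes every item in args is a file path, which holds for the configuration shown in this README but not for MCP servers in general; check_mcp_config is a hypothetical helper name.

```python
import json
from pathlib import Path


def check_mcp_config(config_path, server_name="doc_reader"):
    """Return a list of problems found in the server's mcp.json entry."""
    path = Path(config_path).expanduser()
    if not path.is_file():
        return [f"config file not found: {path}"]
    try:
        config = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    server = config.get("mcpServers", {}).get(server_name)
    if server is None:
        return [f"no '{server_name}' entry under mcpServers"]

    entries = [("command", server.get("command", ""))]
    entries += [("args", arg) for arg in server.get("args", [])]

    problems = []
    for label, value in entries:
        candidate = Path(value)
        if not candidate.is_absolute():
            problems.append(f"{label} is not an absolute path: {value}")
        elif not candidate.exists():
            problems.append(f"{label} does not exist: {value}")
    return problems


for problem in check_mcp_config("~/.aws/amazonq/mcp.json"):
    print(problem)
```

An empty result means the file parsed and every referenced path exists.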

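Test the server manually. Because Amazon Q launches it as a subprocess, a healthy server should start without errors and then wait for input; press Ctrl+C to stop it: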

cd /path/to/amazon-q-web_search
.venv/bin/python main.py

Check Amazon Q logs:

q chat logdump

Server Starts But Tools Don't Work

Verify dependencies are installed:

cd /path/to/amazon-q-web_search
.venv/bin/python -c "import httpx, bs4, markdownify; print('OK')"
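
The one-liner above stops at the first failed import. If you want to see every missing package at once, a small script along these lines may help; run it with the project's virtualenv Python. The module list mirrors the Dependencies section, using import names (beautifulsoup4 is imported as bs4).

```python
from importlib.util import find_spec

# Import names for the packages this project depends on.
REQUIRED_MODULES = ["httpx", "bs4", "lxml", "markdownify", "mcp"]


def missing_modules(names):
    """Return the modules that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("OK" if not missing else "Missing: " + ", ".join(missing))
```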

Reinstall dependencies:

uv sync --reinstall

Connection Timeout

Increase the MCP server initialization timeout (the value is in milliseconds, so 60000 is 60 seconds):

q settings mcp.initTimeout 60000

📚 Dependencies

Package          Purpose
httpx            Async HTTP client for fetching web pages
beautifulsoup4   HTML parsing and navigation
lxml             Fast XML/HTML parser
markdownify      HTML to Markdown conversion
mcp              Model Context Protocol SDK

⚠️ Limitations

Limit                    Value
Maximum content size     10MB per page
Maximum URLs per batch   10
Request timeout          30 seconds
Content type             HTML only
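
To make the size and content-type limits concrete, here is a sketch of how a fetched response might be checked against them. It is an assumption-laden illustration: it interprets 10MB as 10 * 1024 * 1024 bytes and accepts application/xhtml+xml alongside text/html, neither of which is confirmed by this README, and check_response is a hypothetical name.

```python
# Assumed interpretation of the 10MB limit; the real constant is in src/config.py.
MAX_CONTENT_LENGTH = 10 * 1024 * 1024

# Assumed set of media types that count as HTML.
HTML_TYPES = ("text/html", "application/xhtml+xml")


def check_response(content_type: str, body: bytes) -> None:
    """Raise ValueError if a fetched page falls outside the stated limits."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in HTML_TYPES:
        raise ValueError(f"unsupported content type: {media_type or 'unknown'}")
    if len(body) > MAX_CONTENT_LENGTH:
        raise ValueError(
            f"page is {len(body)} bytes, limit is {MAX_CONTENT_LENGTH}"
        )


check_response("text/html; charset=utf-8", b"<html></html>")  # passes silently
```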

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


💬 Support

  • 📫 Open an Issue for bug reports or feature requests
  • ⭐ Star this repo if you find it useful!

Built with ❤️ for Amazon Q Developer
