The Data Collector
Web scraping APIs for Bluesky, Substack, and Hacker News with x402 micropayment support. Built with FastAPI.
Live: https://frog03-20494.wykr.es
Features
- Search Bluesky posts (AT Protocol), Substack newsletters, and Hacker News stories
- Returns structured JSON with engagement metrics
- x402 micropayments ($0.05 USDC on Base per call) — no account needed
- API key authentication for regular use
- A2A Agent Card and MCP discovery endpoints
- OpenAPI spec with x402 payment metadata
API Endpoints
| Method | Endpoint | Description | Price |
|---|---|---|---|
| POST | /api/bluesky/search | Search Bluesky posts by keyword | $0.05 |
| POST | /api/substack/search | Scrape Substack newsletter articles | $0.05 |
| POST | /api/hn/search | Search Hacker News stories | $0.05 |
Quick Start
# Clone
git clone https://github.com/MarcinDudekDev/the-data-collector.git
cd the-data-collector
# Install
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
# Configure
cp .env.example .env
# Edit .env with your APIFY_TOKEN and API_KEY
# Run
uvicorn server:app --host 0.0.0.0 --port 8001
Docker
docker build -t the-data-collector .
docker run -p 8001:8001 --env-file .env the-data-collector
Environment Variables
| Variable | Required | Description |
|---|---|---|
| APIFY_TOKEN | Yes | Apify API token for running scrapers |
| API_KEY | No | API key for authenticated access (X-API-Key header) |
| BASE_URL | No | Public URL of the server (default: https://frog03-20494.wykr.es) |
| PAY_TO | No | Wallet address for x402 payments |
| PRICE_ATOMIC | No | Price per call in USDC atomic units (default: 50000 = $0.05) |
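The `PRICE_ATOMIC` default follows from USDC using six decimal places: 1,000,000 atomic units make one dollar, so 50000 is $0.05. Below is a minimal sketch of reading these variables with the defaults from the table. `Settings` and `load_settings` are hypothetical names used for illustration; the server's actual configuration code may differ.

```python
import os
from dataclasses import dataclass
from decimal import Decimal
from typing import Mapping, Optional

USDC_DECIMALS = 6  # one USDC equals 10**6 atomic units


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables in the table above."""
    apify_token: str
    api_key: Optional[str]
    base_url: str
    pay_to: Optional[str]
    price_atomic: int

    @property
    def price_usd(self) -> Decimal:
        # Convert atomic units to dollars without float rounding.
        return Decimal(self.price_atomic) / Decimal(10 ** USDC_DECIMALS)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, applying the documented defaults."""
    token = env.get("APIFY_TOKEN")
    if not token:
        raise RuntimeError("APIFY_TOKEN is required")
    return Settings(
        apify_token=token,
        api_key=env.get("API_KEY") or None,
        base_url=env.get("BASE_URL", "https://frog03-20494.wykr.es"),
        pay_to=env.get("PAY_TO") or None,
        price_atomic=int(env.get("PRICE_ATOMIC", "50000")),
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check, for example `load_settings({"APIFY_TOKEN": "t"}).price_usd`.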
Authentication
x402 Micropayments (no account needed)
Send a POST request without credentials. The server responds with HTTP 402 and the payment requirements. Pay $0.05 USDC on Base; settlement is instant.
# First call returns 402 with payment details
curl -X POST https://frog03-20494.wykr.es/api/hn/search \
-H "Content-Type: application/json" \
-d '{"searchTerms": ["AI agents"]}'
API Key
curl -X POST https://frog03-20494.wykr.es/api/hn/search \
-H "Content-Type: application/json" \
-H "X-API-Key: your-key" \
-d '{"searchTerms": ["AI agents"], "maxResults": 5}'
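The same authenticated call can be prepared from Python with the standard library. This is a sketch: `build_search_request` is a hypothetical helper, and it assumes the `searchTerms` and `maxResults` fields shown for `/api/hn/search` are also accepted by the Bluesky and Substack endpoints, which this README does not confirm.

```python
import json
import urllib.request

# Sources taken from the API Endpoints table.
SOURCES = ("bluesky", "substack", "hn")


def build_search_request(base_url, source, search_terms, api_key, max_results=None):
    """Build (but do not send) a POST request for one of the search endpoints."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    payload = {"searchTerms": list(search_terms)}
    if max_results is not None:
        payload["maxResults"] = max_results
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/{source}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call.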
Discovery Endpoints
| Endpoint | Protocol |
|---|---|
| /.well-known/mcp.json | MCP (Model Context Protocol) |
| /.well-known/agent-card.json | A2A (Agent-to-Agent) |
| /.well-known/x402 | x402 payment discovery |
| /.well-known/openapi.json | OpenAPI 3.1 spec |
| /health | Health check |
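A client that probes these endpoints needs the full URLs for a given deployment. A small sketch follows, using the paths from the table; `discovery_urls` and the dictionary keys are hypothetical names chosen for illustration.

```python
# Paths copied from the Discovery Endpoints table.
DISCOVERY_PATHS = {
    "mcp": "/.well-known/mcp.json",
    "a2a": "/.well-known/agent-card.json",
    "x402": "/.well-known/x402",
    "openapi": "/.well-known/openapi.json",
    "health": "/health",
}


def discovery_urls(base_url):
    """Map each discovery name to its absolute URL for the given server."""
    root = base_url.rstrip("/")
    return {name: root + path for name, path in DISCOVERY_PATHS.items()}
```

For a self-hosted instance started as in Quick Start, `discovery_urls("http://localhost:8001")` yields the local equivalents.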
MCP Client Configuration
{
"mcpServers": {
"the-data-collector": {
"url": "https://frog03-20494.wykr.es/.well-known/mcp.json"
}
}
}
License
MIT