GCP Sales Analytics POC
A proof of concept demonstrating an intelligent AI agent that can query both Google Cloud SQL (PostgreSQL) and BigQuery public datasets, automatically choosing the right data source based on the question.
Architecture
┌─────────────────────────────────────────────────────────────┐
│ ADK Agent (Claude) │
│ Intelligently routes queries to appropriate data source │
└────────────────┬────────────────────────────────────────────┘
│
├─────────────────┬──────────────────────┐
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ MCP Server │ │ MCP Server │ │ Tools │
│ (Cloud SQL) │ │ (BigQuery) │ │ (Schema) │
└──────┬───────┘ └──────┬───────┘ └──────────────┘
│ │
▼ ▼
┌──────────────┐ ┌──────────────┐
│ Cloud SQL │ │ BigQuery │
│ PostgreSQL │ │ thelook_ │
│ │ │ ecommerce │
│ • customers │ │ • products │
│ • orders │ │ • users │
│ • vendors │ │ • events │
└──────────────┘ │ • inventory │
│ • orders │
└──────────────┘
Features
- Dual Data Source Access: Query both Cloud SQL and BigQuery seamlessly
- Intelligent Routing: Agent automatically determines which data source to use
- MCP Server: Standards-compliant Model Context Protocol server
- Synthetic Data: Pre-populated with 50 customers, 50 vendors, and 50 orders
- Infrastructure as Code: Terraform for Cloud SQL deployment
- Production-Ready: Error handling, logging, and security best practices
Data Sources
Cloud SQL (PostgreSQL)
Contains transactional data with three tables:
- customers: Customer information (name, email, address, etc.)
- orders: Order details (amounts, dates, status, products)
- vendors: Vendor information
Use for queries about:
- Specific customer information
- Recent order details
- Vendor data
- Current sales transactions
BigQuery (thelook_ecommerce)
Public e-commerce analytics dataset with comprehensive data:
- products, users, events
- inventory_items, order_items
- distribution_centers
Use for queries about:
- Product analytics
- User behavior patterns
- Inventory analysis
- Historical trends
- Large-scale analytics
Prerequisites
- Google Cloud Platform account with billing enabled
- GCP Project with appropriate permissions
- Tools installed:
gcloudCLIterraform(>= 1.0)python3(>= 3.9)psql(PostgreSQL client)
- Anthropic API key for Claude
Setup
1. Clone and Configure
cd /Users/matthewiames/Desktop/gcp_mcp
# Copy environment template
cp .env.example .env
# Edit .env with your values
# Required: ANTHROPIC_API_KEY, GCP_PROJECT_ID
nano .env
2. Install Python Dependencies
# Create virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
3. Authenticate with GCP
# Login to GCP
gcloud auth login
# Set application default credentials
gcloud auth application-default login
# Set your project
gcloud config set project YOUR_PROJECT_ID
4. Deploy Infrastructure
# Run the automated deployment script
./scripts/deploy.sh
The deployment script will:
- Deploy Cloud SQL instance with Terraform
- Create database schema (customers, orders, vendors)
- Seed database with synthetic data
- Verify BigQuery access
This process takes approximately 5-10 minutes.
Usage
Running the MCP Server
# In terminal 1
python3 mcp_server/server.py
The MCP server provides tools for:
query_cloudsql- Execute SQL against Cloud SQLquery_bigquery- Execute SQL against BigQuerylist_cloudsql_tables- List Cloud SQL tableslist_bigquery_tables- List BigQuery tablesget_cloudsql_schema- Get table schema from Cloud SQLget_bigquery_schema- Get table schema from BigQuery
Running the Agent (Interactive Mode)
# In terminal 2
python3 adk_agent/agent.py
Example interactions:
📊 You: What are the total sales from our database?
🤖 Agent: I'll query the Cloud SQL database to calculate total sales...
[Uses query_cloudsql tool]
The total sales amount from all orders is $24,567.89 across 50 orders.
📊 You: Show me the top 5 products from BigQuery
🤖 Agent: I'll query the BigQuery thelook_ecommerce dataset...
[Uses query_bigquery tool]
Here are the top 5 products by sales:
1. Product A - $15,234
2. Product B - $12,456
...
Running Tests
# Run automated tests
python3 scripts/test_agent.py
Project Structure
gcp_mcp/
├── README.md # This file
├── requirements.txt # Python dependencies
├── .env.example # Environment template
├── .gitignore # Git ignore rules
│
├── terraform/ # Infrastructure as Code
│ ├── main.tf # Cloud SQL resources
│ ├── variables.tf # Input variables
│ ├── outputs.tf # Output values
│ └── terraform.tfvars.example
│
├── data/ # Database schema
│ └── schema.sql # Table definitions
│
├── scripts/ # Deployment and utilities
│ ├── deploy.sh # Automated deployment
│ ├── cleanup.sh # Resource cleanup
│ ├── generate_seed_data.py # Synthetic data generation
│ └── test_agent.py # Agent tests
│
├── mcp_server/ # MCP Server implementation
│ └── server.py # Database MCP server
│
└── adk_agent/ # ADK Agent implementation
└── agent.py # Sales analytics agent
How It Works
Agent Decision Making
The agent uses Claude's function calling capabilities with a specialized system prompt that guides data source selection:
- Question Analysis: Agent analyzes the user's question
- Schema Discovery: May first list tables to understand available data
- Source Selection: Chooses Cloud SQL or BigQuery based on:
- Keywords (customers, vendors → Cloud SQL)
- Query type (analytics, trends → BigQuery)
- Data recency requirements
- Query Execution: Formulates and executes appropriate SQL
- Result Presentation: Formats and explains results
Example Decision Flow
User: "What are my recent orders?"
↓
Agent thinks: "recent orders" + "my" suggests transactional data
↓
Decision: Use Cloud SQL
↓
Tool: query_cloudsql
↓
SQL: SELECT * FROM orders ORDER BY order_date DESC LIMIT 10
Configuration
Environment Variables (.env)
# GCP Configuration
GCP_PROJECT_ID=your-project-id
GCP_REGION=us-central1
CLOUDSQL_INSTANCE_NAME=sales-poc-db
CLOUDSQL_DATABASE=salesdb
CLOUDSQL_USER=salesuser
CLOUDSQL_PASSWORD=your-secure-password
# BigQuery
BIGQUERY_DATASET=bigquery-public-data.thelook_ecommerce
# Anthropic
ANTHROPIC_API_KEY=your-api-key
# Database connection (set after deployment)
DB_HOST=your-cloudsql-ip
Terraform Variables
See terraform/terraform.tfvars.example
Security Considerations
This is a proof of concept with simplified security:
- Cloud SQL has public IP (uses authorized networks)
- Database credentials in environment variables
- No VPC or private networking
- SQL injection prevention (SELECT-only queries)
For production:
- Use Cloud SQL Proxy or private IP
- Store credentials in Secret Manager
- Implement VPC and private networking
- Add query validation and sanitization
- Enable Cloud SQL backups
- Use IAM authentication
- Implement rate limiting
Cost Estimation
Approximate costs for running this POC:
- Cloud SQL (db-f1-micro): ~$7-10/month
- BigQuery: Pay per query (~$5/TB scanned, public datasets may be free)
- Anthropic API: Pay per token
Important: Run ./scripts/cleanup.sh when done to avoid ongoing charges.
Cleanup
To destroy all resources:
./scripts/cleanup.sh
This will:
- Destroy the Cloud SQL instance
- Remove all data
- Clean up Terraform state
Troubleshooting
Connection Issues
# Test Cloud SQL connection
psql -h $DB_HOST -U $CLOUDSQL_USER -d $CLOUDSQL_DATABASE
# Check instance status
gcloud sql instances describe sales-poc-db-XXXX
API Access Issues
# Enable required APIs
gcloud services enable sqladmin.googleapis.com
gcloud services enable bigquery.googleapis.com
# Check authentication
gcloud auth list
Terraform Issues
cd terraform
# Re-initialize
terraform init
# Check state
terraform show
Extending the POC
Ideas for enhancement:
-
Add More Data Sources
- Cloud Spanner
- Firestore
- External APIs
-
Enhanced Agent Capabilities
- Data visualization
- Report generation
- Predictive analytics
-
Production Features
- Caching layer
- Query optimization
- Audit logging
- Monitoring and alerts
-
Advanced MCP Features
- Streaming responses
- Batch operations
- Transaction support
Resources
- Model Context Protocol (MCP)
- Anthropic Claude API
- Google Cloud SQL
- BigQuery Public Datasets
- Terraform GCP Provider
License
This is a proof of concept for demonstration purposes.
Support
For issues or questions:
- Check the troubleshooting section
- Review logs in the terminal
- Check GCP Console for resource status