Overview
The MABQ BigQuery Agent integrates with Google BigQuery through the BigQueryToolset from Google ADK. This integration enables natural language to SQL translation with automated query execution and validation.
BigQueryToolset Setup
The toolset is configured in agent.py:32-35.
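The referenced lines are not reproduced on this page. As a minimal sketch of what such a setup usually looks like, the following follows the public ADK BigQuery sample rather than this project's agent.py. The function name `build_bigquery_toolset` is hypothetical, and the import paths may differ between ADK versions. The third-party imports sit inside the function so the snippet can be loaded without the ADK installed.

```python
def build_bigquery_toolset():
    """Assemble a read-only BigQueryToolset (sketch, not the project's agent.py)."""
    # Deferred imports: these require google-adk and google-auth to be installed.
    import google.auth
    from google.adk.tools.bigquery import BigQueryCredentialsConfig, BigQueryToolset
    from google.adk.tools.bigquery.config import BigQueryToolConfig, WriteMode

    # Tool configuration: reject every write operation.
    tool_config = BigQueryToolConfig(write_mode=WriteMode.BLOCKED)

    # Credentials configuration: Application Default Credentials.
    credentials, _project = google.auth.default()
    credentials_config = BigQueryCredentialsConfig(credentials=credentials)

    return BigQueryToolset(
        credentials_config=credentials_config,
        bigquery_tool_config=tool_config,
    )
```

The returned toolset would then be passed to the agent's tool list.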
Components
Tool Configuration
Controls security settings and operational behavior (see Configuration).
Credentials Configuration
The agent uses Application Default Credentials (ADC) for authentication.
Automatic Authentication
Defined in agent.py:28-30.
How It Works
google.auth.default() Behavior
The google.auth.default() function automatically discovers credentials in the following order:
- Environment Variable: GOOGLE_APPLICATION_CREDENTIALS pointing to a service account key
- gcloud CLI: User credentials from gcloud auth application-default login
- Cloud Run/Cloud Functions: Service identity attached to the runtime
- Compute Engine/GKE: Metadata server credentials
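To see which source is likely to be picked on a given machine, the lookup can be approximated with the standard library alone. This is a rough sketch, not the google-auth implementation: `likely_adc_source` is a hypothetical helper, and the file path is the default Linux/macOS location written by the gcloud CLI. It checks the environment variable first, then the gcloud credentials file, and otherwise assumes the runtime's metadata server.

```python
import os
from pathlib import Path


def likely_adc_source(env=None, home=None):
    """Guess which credential source google.auth.default() would use (sketch)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. Explicit key file named by the environment variable.
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "environment variable"

    # 2. File written by `gcloud auth application-default login`
    #    (assumed default location on Linux/macOS).
    adc_file = home / ".config" / "gcloud" / "application_default_credentials.json"
    if adc_file.is_file():
        return "gcloud CLI"

    # 3. Remaining option: the identity attached to the runtime, served by the
    #    metadata server. Outside Google Cloud this lookup fails with
    #    DefaultCredentialsError.
    return "metadata server"
```

Calling `likely_adc_source()` with no arguments inspects the current process environment and home directory.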
Deployment Environments
Required Permissions
The service account or user credentials must have:
| Role | Purpose | Required |
|---|---|---|
| roles/bigquery.dataViewer | Read access to datasets and tables | Yes |
| roles/bigquery.jobUser | Execute BigQuery jobs (queries) | Yes |
| roles/bigquery.dataEditor | Write/modify data | No (blocked by WriteMode) |
| roles/bigquery.admin | Full BigQuery administration | No |
Dataset and Project Configuration
The agent operates on a specific project and dataset combination.
Project Initialization
Vertex AI is initialized with the target project in agent.py:21.
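A minimal sketch of such an initialization, assuming the project and region come from environment variables. The variable names `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION`, the `AgentSettings` class, and both helper functions are illustrative assumptions, not taken from agent.py. The defaults reuse the example values from this page.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSettings:
    project: str   # Google Cloud project ID containing the BigQuery datasets
    location: str  # region for Vertex AI model endpoints


def load_settings(env=None):
    """Read project and region from the environment (variable names are assumed)."""
    env = os.environ if env is None else env
    return AgentSettings(
        project=env.get("GOOGLE_CLOUD_PROJECT", "datawarehouse-des"),
        location=env.get("GOOGLE_CLOUD_LOCATION", "us-east4"),
    )


def init_vertex_ai(settings):
    """Initialize Vertex AI for the target project and region."""
    import vertexai  # deferred: requires google-cloud-aiplatform

    vertexai.init(project=settings.project, location=settings.location)
```

Keeping the settings in one small object makes it easier to reuse the same project ID when building the instruction prompt.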
- Project ID: The Google Cloud project ID containing BigQuery datasets (e.g., datawarehouse-des).
- Region: The region for Vertex AI model endpoints (e.g., us-east4).
Dataset Scoping
The agent is scoped to a specific dataset through the instruction prompt.
Multi-Dataset Configuration
Working with Multiple Datasets
To allow the agent to query multiple datasets, modify the instruction prompt:
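The original prompt text is not reproduced on this page. As an illustration only, the scoping sentence could be generated from a list of dataset names. The helper `build_dataset_instruction` and the wording it produces are assumptions, not the project's actual prompt.

```python
def build_dataset_instruction(project, datasets):
    """Return the dataset-scoping part of an instruction prompt (illustrative wording)."""
    if not datasets:
        raise ValueError("at least one dataset is required")

    # Fully qualify each dataset so the model cannot confuse projects.
    qualified = [f"`{project}.{name}`" for name in datasets]
    if len(qualified) == 1:
        scope = f"the dataset {qualified[0]}"
    else:
        scope = "the datasets " + ", ".join(qualified)

    return (
        f"You may only query tables in {scope}. "
        "Always use fully qualified table names."
    )
```

For example, `build_dataset_instruction("datawarehouse-des", ["assets", "finance"])` yields a sentence naming both datasets; the dataset names here are hypothetical.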
Ensure the service account has bigquery.dataViewer permissions on all target datasets.
Read-Only Security Controls
The agent implements multiple layers of read-only enforcement.
Layer 1: BigQuery Tool Configuration
Tool-level write blocking through WriteMode.BLOCKED.
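What "blocked" means in practice can be illustrated with a simplified check. This is not how ADK implements WriteMode.BLOCKED. It is a hypothetical, deliberately conservative pre-filter that accepts only statements starting with SELECT or WITH and containing no write keywords, to show the kind of statement the setting is meant to reject.

```python
import re

# SQL comments: "-- to end of line" and "/* ... */".
_COMMENTS = re.compile(r"--[^\n]*|/\*.*?\*/", re.DOTALL)
_WRITE_KEYWORDS = re.compile(
    r"\b(INSERT|UPDATE|DELETE|MERGE|CREATE|DROP|ALTER|TRUNCATE)\b", re.IGNORECASE
)


def is_read_only(sql):
    """Conservatively decide whether a statement only reads data (sketch)."""
    body = _COMMENTS.sub(" ", sql).strip()
    starts_as_query = re.match(r"(SELECT|WITH)\b", body, re.IGNORECASE) is not None
    # A write keyword anywhere, even inside a string literal, causes rejection.
    return starts_as_query and _WRITE_KEYWORDS.search(body) is None
```

A keyword scan like this can reject harmless queries, for example one that selects a column named `update`, which is why it only serves as an illustration and not as a replacement for the toolset setting.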
Layer 2: Instruction Prompt Guardrails
Prompt-level prevention through LLM instructions.
Layer 3: IAM Permissions
Cloud-level access control through service account roles.
Defense in Depth
Even if a malicious prompt bypasses the LLM layer, WriteMode.BLOCKED and IAM permissions provide redundant protection.
Query Execution Flow
This section describes how the agent executes queries.
Tool Usage in Practice
The agent uses bigquery_toolset to:
- Validate Syntax: Ensure generated SQL is valid
- Test Execution: Confirm the query runs without errors
- Verify Schema: Check that table/column names exist
- Preview Results: Ensure the query returns expected data types
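The checks above can be sketched as a small validation step around any function that executes SQL. Everything here is an assumption for illustration: `validate_query` and `ValidationReport` are hypothetical names, and `run_query` stands in for whatever executes SQL (in the agent, that role is played by the toolset) and is assumed to return a list of dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationReport:
    ok: bool
    columns: list = field(default_factory=list)
    error: str = ""


def validate_query(sql, run_query, preview_rows=5):
    """Run a generated query through an executor and summarize the outcome."""
    # Wrap the query so only a few rows come back for the preview.
    inner = sql.strip().rstrip(";")
    preview_sql = f"SELECT * FROM ({inner}) LIMIT {preview_rows}"
    try:
        rows = run_query(preview_sql)
    except Exception as exc:
        # Syntax errors and unknown tables or columns surface here.
        return ValidationReport(ok=False, error=str(exc))

    # Column names of the first row show what the query returns.
    columns = list(rows[0].keys()) if rows else []
    return ValidationReport(ok=True, columns=columns)
```

Passing the executor in as an argument keeps the sketch independent of any particular BigQuery client and makes it easy to exercise with a fake.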
Example Tool Interaction
User: “Show me all active assets”
Agent Internal Process:
User Receives:
Connection Management
BigQueryToolset handles connection pooling and resource management automatically.
Automatic Features
- Connection Pooling: Reuses BigQuery client connections
- Retry Logic: Automatically retries transient failures
- Timeout Handling: Cancels long-running queries
- Resource Cleanup: Closes connections when the agent terminates
No Manual Configuration Required
Troubleshooting
Authentication Errors
Error: google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials
Solutions:
- Run gcloud auth application-default login locally
- Set the GOOGLE_APPLICATION_CREDENTIALS environment variable
- Verify a service account is attached to the Cloud Run service
Permission Denied Errors
Error: 403 Forbidden: Access Denied: BigQuery BigQuery: Permission denied
Solutions:
- Verify the service account has roles/bigquery.dataViewer
- Ensure roles/bigquery.jobUser is granted
- Check dataset-level permissions
Write Operation Blocked
Error: Query contains write operation but WriteMode.BLOCKED is enabled
Expected Behavior: This is correct. The agent should decline the request instead of executing it.
Action: No action needed. This is the security system working as designed.
Dataset Not Found
Error: 404 Not found: Dataset PROJECT_ID:DATASET_NAME
Solutions:
- Verify the BIGQUERY_DATASET environment variable is correct
- Check that the dataset exists in the project
- Confirm the project ID is correct
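The four cases in this section can be condensed into a small lookup that maps an error message to the suggested steps. This is a hypothetical helper for logs or support scripts: the name `diagnose` and the substring markers are assumptions, and the matching is intentionally naive.

```python
# (marker, category, suggested steps) in the order the cases appear above.
TROUBLESHOOTING = [
    ("DefaultCredentialsError", "authentication", [
        "Run gcloud auth application-default login locally",
        "Set the GOOGLE_APPLICATION_CREDENTIALS environment variable",
        "Verify a service account is attached to the Cloud Run service",
    ]),
    ("403", "permission denied", [
        "Verify the service account has roles/bigquery.dataViewer",
        "Ensure roles/bigquery.jobUser is granted",
        "Check dataset-level permissions",
    ]),
    ("write operation", "write blocked", [
        "No action needed: write blocking is working as designed",
    ]),
    ("404", "dataset not found", [
        "Verify the BIGQUERY_DATASET environment variable is correct",
        "Check that the dataset exists in the project",
        "Confirm the project ID is correct",
    ]),
]


def diagnose(message):
    """Return (category, steps) for the first marker found in the message."""
    lowered = message.lower()
    for marker, category, steps in TROUBLESHOOTING:
        if marker.lower() in lowered:
            return category, steps
    return "unknown", []
```

For instance, `diagnose("404 Not found: Dataset PROJECT_ID:DATASET_NAME")` returns the "dataset not found" category with its three checks.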