Skip to main content

Overview

The MABQ BigQuery Agent uses Google ADK Tools to interact with BigQuery. Tools enable the agent to execute queries, validate SQL syntax, and retrieve schema information.
Tools are functions that the LLM can invoke to perform actions beyond text generation. In this agent, the BigQueryToolset provides query execution capabilities.

BigQueryToolset

The BigQueryToolset is a collection of tools for interacting with BigQuery datasets.

Initialization

Configured in agent.py:32-35:
bigquery_toolset = BigQueryToolset(
  credentials_config=credentials_config, 
  bigquery_tool_config=tool_config
)

Configuration Components

1

Credentials Configuration

Provides authentication to BigQuery:
credentials, _ = google.auth.default()
credentials_config = BigQueryCredentialsConfig(credentials=credentials)
2

Tool Configuration

Controls security and operational settings:
tool_config = BigQueryToolConfig(
    write_mode=WriteMode.BLOCKED,
)

Included Tools

The BigQueryToolset provides multiple tools to the agent:
ToolPurposeUsage
Query ExecutorRuns SQL queries and returns resultsPrimary tool for data retrieval
Schema InspectorLists tables and column definitionsValidates table/column existence
Query ValidatorChecks SQL syntax without executionPre-execution validation
Metadata ReaderRetrieves dataset and table metadataUnderstands data structure
The agent automatically selects which tool to use based on the user’s request:
# User: "What tables are in the dataset?"
# Agent uses: Schema Inspector

# User: "Show me all active assets"
# Agent uses: Query Executor

# User: "Is this SQL valid: SELECT * FROM assets"
# Agent uses: Query Validator
The LLM determines tool selection based on the instruction prompt and user intent.

WriteMode.BLOCKED Security Setting

The most critical security feature of the agent is the WriteMode.BLOCKED configuration.

Configuration

Defined in agent.py:24-26:
tool_config = BigQueryToolConfig(
    write_mode=WriteMode.BLOCKED,  # ← Enforces read-only mode
)

How WriteMode.BLOCKED Works

1

Query Analysis

Before execution, the toolset analyzes the SQL statement to detect write operations.
2

Write Detection

Identifies prohibited keywords:
  • INSERT, UPDATE, DELETE
  • CREATE, ALTER, DROP
  • TRUNCATE, MERGE
  • GRANT, REVOKE
3

Rejection

If write operations are detected, the tool raises an error before executing the query.

WriteMode Comparison

tool_config = BigQueryToolConfig(
    write_mode=WriteMode.BLOCKED,
)

# Behavior:
# SELECT queries: ✓ Allowed
# INSERT/UPDATE/DELETE: ✗ Blocked with error
# CREATE/ALTER/DROP: ✗ Blocked with error
Never use WriteMode.ALLOWED in production. The current configuration with WriteMode.BLOCKED is the only safe setting for production analytics agents.

Error Handling with WriteMode.BLOCKED

When a write operation is attempted:
# User request: "Delete all inactive assets"

# Agent generates SQL:
# DELETE FROM `datawarehouse-des.STG_ACTIVOS.assets`
# WHERE status = 'inactive'

# BigQueryToolset behavior:
# 1. Detects DELETE statement
# 2. Raises WriteOperationBlockedError
# 3. Agent receives error feedback

# Agent response (per instruction prompt):
# "Lo siento, por seguridad corporativa tengo acceso de solo lectura a los datos de TRANSELEC S.A."

Tool Execution in the Agent

The agent uses tools through a structured execution flow.

Execution Flow

1

User Request

User submits a natural language question:
"¿Cuántos activos tenemos en total?"
2

SQL Generation

Agent generates candidate SQL based on instruction prompt:
SELECT COUNT(*) as total_activos
FROM `datawarehouse-des.STG_ACTIVOS.assets`
3

Tool Invocation

Agent calls BigQueryToolset to validate and execute:
result = bigquery_toolset.execute_query(sql_candidate)
4

Result Processing

Tool returns execution results or errors:
{
  "success": true,
  "rows": [{"total_activos": 1523}],
  "schema": [...]
}
5

Response Formatting

Agent formats the response per instruction guidelines:
SELECT COUNT(*) as total_activos
FROM `datawarehouse-des.STG_ACTIVOS.assets`

Mandatory Tool Usage

The instruction prompt requires tool usage for data queries:
# From agent.py:56-57
"""
2. EJECUCIÓN OBLIGATORIA: Para cualquier pregunta de datos, DEBES usar la herramienta `bigquery_toolset` para probar que tu query funciona.
"""
The agent cannot return SQL without first validating it through the BigQueryToolset. This ensures all generated queries are syntactically correct and executable.

Tool Usage Examples

User: “Show all assets”Agent Process:
  1. Generate SQL: SELECT * FROM \datawarehouse-des.STG_ACTIVOS.assets“
  2. Execute via toolset to validate
  3. Receive confirmation query works
  4. Return SQL to user
User Receives:
SELECT * 
FROM `datawarehouse-des.STG_ACTIVOS.assets`
User: “What is the average maintenance cost by asset type?”Agent Process:
  1. Generate SQL:
SELECT 
  asset_type,
  AVG(maintenance_cost) as avg_cost
FROM `datawarehouse-des.STG_ACTIVOS.assets`
GROUP BY asset_type
ORDER BY avg_cost DESC
  1. Execute via toolset
  2. Verify query returns expected schema
  3. Return SQL to user
User: “Delete all old records”Agent Process:
  1. Generate SQL: DELETE FROM \datawarehouse-des.STG_ACTIVOS.assets` WHERE date < ‘2020-01-01’`
  2. Execute via toolset
  3. Tool raises WriteOperationBlockedError
  4. Agent receives error feedback
  5. Instruction prompt triggers rejection message
User Receives:
Lo siento, por seguridad corporativa tengo acceso de solo lectura a los datos de TRANSELEC S.A.

Integration with Instruction Prompt

The instruction prompt explicitly references and guides tool usage.

Tool Reference

From agent.py:50-52:
<INSTRUCTIONS>
  - Tienes acceso a `bigquery_toolset`.
  - Tu prioridad absoluta es la sintaxis correcta y el uso de los nombres de tabla reales.
</INSTRUCTIONS>

Operational Rules

The prompt defines how tools should be used:
# agent.py:56-67
"""
Reglas:
1. SALUDOS: Si es un saludo ("hola"), responde breve y amable.
2. EJECUCIÓN OBLIGATORIA: Para cualquier pregunta de datos, DEBES usar la herramienta `bigquery_toolset` para probar que tu query funciona.
3. SALIDA FINAL ESTRICTA:
   Una vez que la herramienta confirme que la query funciona, tu respuesta final al usuario debe ser **EXCLUSIVAMENTE** el bloque de código SQL.
   
   PROHIBIDO:
   - No escribas "Aquí está la consulta".
   - No expliques qué hace la consulta.
   - No resumas los datos encontrados.
"""

Prompt-Tool Synergy

1

Tool Awareness

The prompt informs the LLM that it has access to bigquery_toolset.
2

Mandatory Usage

The prompt requires tool usage for all data questions.
3

Validation Focus

The prompt emphasizes using tools to verify syntax and table names.
4

Output Formatting

After tool confirmation, the prompt dictates the exact response format.

Tool Configuration Best Practices

Security Practices

tool_config = BigQueryToolConfig(
    write_mode=WriteMode.BLOCKED,  # ✓ Always use in production
)

Credential Management

1

Use Application Default Credentials

Prefer google.auth.default() over service account keys:
credentials, _ = google.auth.default()  # ✓ Recommended
2

Avoid Hardcoded Credentials

Never embed credentials in source code:
# ✗ NEVER do this
credentials = service_account.Credentials.from_service_account_file(
    "/path/to/key.json"
)
3

Use Environment Variables

For service account keys (if required):
export GOOGLE_APPLICATION_CREDENTIALS="/secure/path/to/key.json"

Monitoring Tool Usage

Track tool execution for debugging and optimization:
import logging

# Enable ADK debug logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger('google.adk')
logger.setLevel(logging.DEBUG)

# Tool executions will be logged:
# DEBUG:google.adk.tools.bigquery:Executing query: SELECT ...
# DEBUG:google.adk.tools.bigquery:Query completed in 1.23s
Monitor tool performance using BigQuery job history:
-- Check recent queries executed by the agent
SELECT 
  creation_time,
  user_email,
  query,
  total_bytes_processed,
  total_slot_ms
FROM `region-us-east4`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE user_email = 'bigquery-agent@PROJECT_ID.iam.gserviceaccount.com'
ORDER BY creation_time DESC
LIMIT 100

Adding Custom Tools

While the agent currently uses only BigQueryToolset, you can add additional tools.

Example: Adding a Custom Validation Tool

from google.adk.tools import Tool
from typing import Dict, Any
import sqlglot  # Already in requirements.txt

class SQLValidatorTool(Tool):
    """Custom tool for advanced SQL validation"""
    
    def __init__(self):
        super().__init__(
            name="sql_validator",
            description="Validates SQL syntax and checks for best practices"
        )
    
    def execute(self, sql: str) -> Dict[str, Any]:
        """Validate SQL and provide recommendations"""
        try:
            # Parse SQL
            parsed = sqlglot.parse_one(sql, dialect="bigquery")
            
            # Check for SELECT *
            has_select_star = "SELECT *" in sql.upper()
            
            # Check for LIMIT clause
            has_limit = "LIMIT" in sql.upper()
            
            return {
                "valid": True,
                "warnings": [
                    "Avoid SELECT * in production" if has_select_star else None,
                    "Consider adding LIMIT for large tables" if not has_limit else None
                ],
                "parsed_structure": str(parsed)
            }
        except Exception as e:
            return {
                "valid": False,
                "error": str(e)
            }

# Add to agent
sql_validator = SQLValidatorTool()

root_agent = LlmAgent(
    model=LLM_1_MODELO,
    name=LLM_1_NAME,
    description="Agente para responder preguntas sobre datos y modelos de BigQuery",
    instruction=new_instruction,
    tools=[bigquery_toolset, sql_validator]  # ← Add custom tool
)
When adding custom tools, update the instruction prompt to inform the agent about the new capabilities.

Troubleshooting Tool Execution

Symptom: Agent returns SQL without executing itCauses:
  1. Instruction prompt doesn’t mandate tool usage
  2. Tool execution failed silently
  3. Agent doesn’t recognize tool availability
Solutions:
# Ensure instruction mandates tool usage
"""
2. EJECUCIÓN OBLIGATORIA: Para cualquier pregunta de datos, 
   DEBES usar la herramienta `bigquery_toolset` para probar que tu query funciona.
"""

# Verify tool is registered
print(root_agent.tools)  # Should include bigquery_toolset
Symptom: Tool raises error even for SELECT queriesCause: Query might contain write operations in subqueries or CTEsSolution:
-- Check for hidden write operations
-- ✗ This will be blocked:
WITH temp AS (
  INSERT INTO table VALUES (1, 2, 3)
  RETURNING *
)
SELECT * FROM temp

-- ✓ This is allowed:
WITH temp AS (
  SELECT * FROM existing_table
)
SELECT * FROM temp
Symptom: google.auth.exceptions.DefaultCredentialsErrorSolutions:
# Local development
gcloud auth application-default login

# Verify credentials
gcloud auth application-default print-access-token

# Cloud Run deployment
# Ensure service account is attached to the service
gcloud run services describe SERVICE_NAME --region=REGION

Advanced Configuration

Query Timeout Settings

from google.cloud import bigquery

tool_config = BigQueryToolConfig(
    write_mode=WriteMode.BLOCKED,
    # Custom timeout (requires additional configuration)
)

# Note: Default timeout is typically sufficient (10 minutes)
# For custom timeouts, configure at the BigQuery client level

Result Size Limits

# Update instruction prompt to guide agents on result limits
new_instruction = f"""
# ... existing prompt ...

<QUERY_GUIDELINES>
  - Para tablas grandes, usa LIMIT para restringir resultados
  - Límite recomendado: LIMIT 1000 para exploraciones iniciales
  - Para análisis completos, el usuario puede remover el LIMIT
</QUERY_GUIDELINES>
"""

Summary

The MABQ BigQuery Agent’s tool configuration provides:
  • Secure Query Execution: WriteMode.BLOCKED prevents data modification
  • Automatic Validation: Mandatory tool usage ensures SQL correctness
  • Credential Management: Automatic authentication via google.auth.default()
  • Schema Discovery: Built-in tools for exploring database structure
  • Error Handling: Graceful failure with security-aware messaging
The combination of BigQueryToolset, WriteMode.BLOCKED, and security-focused instruction prompts creates a robust, production-ready analytics agent.

Build docs developers (and LLMs) love