Overview
The MABQ BigQuery Agent is built using Google’s Agent Development Kit (ADK) and provides a read-only conversational interface to BigQuery data.
Agent Configuration
The agent is defined in data_agente/agent.py and configured with Vertex AI and BigQuery tools.
LlmAgent Definition
from google.adk.agents.llm_agent import LlmAgent
root_agent = LlmAgent(
model=LLM_1_MODELO,
name=LLM_1_NAME,
description="Agente para responder preguntas sobre datos y modelos de BigQuery",
instruction=new_instruction,
tools=[bigquery_toolset]
)
model
string
default:"gemini-2.5-pro"
Vertex AI model used for natural language understanding and SQL generation
name
string
default:"bigquery_agent_stg_activos"
Identifier for the agent instance
Agent description: “Agente para responder preguntas sobre datos y modelos de BigQuery”
Array containing the BigQueryToolset for database access
Agent Instructions
The agent follows strict security guardrails and response formatting:
new_instruction = f"""
Eres un motor de generación de SQL para BigQuery.
Tu ÚNICO objetivo es traducir lenguaje natural a código SQL válido para el proyecto **{PROJECT_ID}**, dataset **{BIGQUERY_DATASET}**.
<SECURITY_GUARDRAILS>
1. MODO ESTRICTO: READ-ONLY.
2. COMANDOS PROHIBIDOS: DROP, DELETE, UPDATE, INSERT, CREATE, ALTER, TRUNCATE, MERGE, GRANT, REVOKE.
3. COMPORTAMIENTO: Rechazar cualquier solicitud de modificación de datos.
</SECURITY_GUARDRAILS>
<INSTRUCTIONS>
- Tienes acceso a `bigquery_toolset`.
- Prioridad absoluta: sintaxis correcta y nombres de tabla reales.
</INSTRUCTIONS>
"""
Security Guardrails
The agent is hardcoded to READ-ONLY mode and will reject any DML/DDL operations:
- DROP, DELETE, UPDATE, INSERT
- CREATE, ALTER, TRUNCATE
- MERGE, GRANT, REVOKE
If a user requests data modification, the agent responds:
“Lo siento, por seguridad corporativa tengo acceso de solo lectura a los datos de TRANSELEC S.A.”
The agent is instructed to return only SQL code blocks for data queries:
SELECT column1, column2
FROM `project.dataset.table`
WHERE condition = true
LIMIT 10
The agent is instructed NOT to include explanatory text - only the SQL query.
The agent uses the ADK BigQueryToolset with restricted write permissions:
from google.adk.tools.bigquery import BigQueryCredentialsConfig, BigQueryToolset
from google.adk.tools.bigquery.config import BigQueryToolConfig, WriteMode
tool_config = BigQueryToolConfig(
write_mode=WriteMode.BLOCKED,
)
credentials, _ = google.auth.default()
credentials_config = BigQueryCredentialsConfig(credentials=credentials)
bigquery_toolset = BigQueryToolset(
credentials_config=credentials_config,
bigquery_tool_config=tool_config
)
write_mode
WriteMode
default:"WriteMode.BLOCKED"
Prevents all write operations at the tool level
Uses Google Cloud default credentials (service account in Cloud Run)
Getting the Agent
The get_root_agent() function returns the configured agent instance:
def get_root_agent():
return root_agent
This function is imported in main.py to initialize the ADK wrapper:
from data_agente.agent import get_root_agent
agente_backend = get_root_agent()
ADKAgent Wrapper
The agent is wrapped with ADKAgent for integration with FastAPI and CopilotKit:
from ag_ui_adk import ADKAgent
adk_wrapper = ADKAgent(
adk_agent=agente_backend,
app_name="transelec-mabq",
user_id="verified-user",
use_in_memory_services=True
)
The underlying LlmAgent instance
app_name
string
default:"transelec-mabq"
Application identifier for session management
user_id
string
default:"verified-user"
User identifier (static in this implementation)
Enables in-memory session storage instead of external persistence
Environment Variables
PROJECT_ID
string
default:"datawarehouse-des"
Google Cloud project containing the BigQuery dataset
BIGQUERY_DATASET
string
default:"STG_ACTIVOS"
BigQuery dataset the agent has access to
Region for Vertex AI and BigQuery operations
NOMBRE_EMPRESA
string
default:"TRANSELEC S.A."
Company name used in agent responses
ANALYTICS_AGENT_MODEL
string
default:"gemini-2.5-pro"
Vertex AI model for the analytics agent
LLM_1_NAME
string
default:"bigquery_agent_stg_activos"
Agent name identifier
LLM_1_MODELO
string
default:"gemini-2.5-pro"
Model version for the LlmAgent
Session Management
The agent uses in-memory session services for conversation state:
from google.adk.sessions.in_memory_session_service import InMemorySessionService
in_memory_session_service = InMemorySessionService()
In-memory sessions are lost on container restart. For production, consider using persistent session storage.
Example Usage
Programmatic Access
from data_agente.agent import get_root_agent
agent = get_root_agent()
# The agent is now ready to process queries
# (typically used via ADK endpoint, not called directly)
Query Flow
- User sends natural language query
- Agent analyzes query with Gemini model
- Agent generates SQL using BigQueryToolset
- Agent validates SQL syntax
- Agent executes query (read-only)
- Agent returns SQL code block to user