Skip to main content

Overview

The MABQ BigQuery Agent is built using Google’s Agent Development Kit (ADK) and provides a read-only conversational interface to BigQuery data.

Agent Configuration

The agent is defined in data_agente/agent.py and configured with Vertex AI and BigQuery tools.

LlmAgent Definition

from google.adk.agents.llm_agent import LlmAgent

root_agent = LlmAgent(
    model=LLM_1_MODELO, 
    name=LLM_1_NAME,
    description="Agente para responder preguntas sobre datos y modelos de BigQuery",
    instruction=new_instruction,
    tools=[bigquery_toolset]
)
model
string
default:"gemini-2.5-pro"
Vertex AI model used for natural language understanding and SQL generation
name
string
default:"bigquery_agent_stg_activos"
Identifier for the agent instance
description
string
Agent description: “Agente para responder preguntas sobre datos y modelos de BigQuery”
instruction
string
System prompt defining agent behavior (see Agent Instructions)
tools
list
Array containing the BigQueryToolset for database access

Agent Instructions

The agent follows strict security guardrails and response formatting:
new_instruction = f"""
Eres un motor de generación de SQL para BigQuery.
Tu ÚNICO objetivo es traducir lenguaje natural a código SQL válido para el proyecto **{PROJECT_ID}**, dataset **{BIGQUERY_DATASET}**.

<SECURITY_GUARDRAILS>
  1. MODO ESTRICTO: READ-ONLY.
  2. COMANDOS PROHIBIDOS: DROP, DELETE, UPDATE, INSERT, CREATE, ALTER, TRUNCATE, MERGE, GRANT, REVOKE.
  3. COMPORTAMIENTO: Rechazar cualquier solicitud de modificación de datos.
</SECURITY_GUARDRAILS>

<INSTRUCTIONS>
  - Tienes acceso a `bigquery_toolset`.
  - Prioridad absoluta: sintaxis correcta y nombres de tabla reales.
</INSTRUCTIONS>
"""

Security Guardrails

The agent is hardcoded to READ-ONLY mode and will reject any DML/DDL operations:
  • DROP, DELETE, UPDATE, INSERT
  • CREATE, ALTER, TRUNCATE
  • MERGE, GRANT, REVOKE
If a user requests data modification, the agent responds:
“Lo siento, por seguridad corporativa tengo acceso de solo lectura a los datos de TRANSELEC S.A.”

Response Format

The agent is instructed to return only SQL code blocks for data queries:
SELECT column1, column2
FROM `project.dataset.table`
WHERE condition = true
LIMIT 10
The agent is instructed NOT to include explanatory text - only the SQL query.

BigQuery Toolset

The agent uses the ADK BigQueryToolset with restricted write permissions:
from google.adk.tools.bigquery import BigQueryCredentialsConfig, BigQueryToolset
from google.adk.tools.bigquery.config import BigQueryToolConfig, WriteMode

tool_config = BigQueryToolConfig(
    write_mode=WriteMode.BLOCKED,
)

credentials, _ = google.auth.default()
credentials_config = BigQueryCredentialsConfig(credentials=credentials)

bigquery_toolset = BigQueryToolset(
    credentials_config=credentials_config, 
    bigquery_tool_config=tool_config
)
write_mode
WriteMode
default:"WriteMode.BLOCKED"
Prevents all write operations at the tool level
credentials
Credentials
Uses Google Cloud default credentials (service account in Cloud Run)

Getting the Agent

The get_root_agent() function returns the configured agent instance:
def get_root_agent():
    return root_agent
This function is imported in main.py to initialize the ADK wrapper:
from data_agente.agent import get_root_agent

agente_backend = get_root_agent()

ADKAgent Wrapper

The agent is wrapped with ADKAgent for integration with FastAPI and CopilotKit:
from ag_ui_adk import ADKAgent

adk_wrapper = ADKAgent(
    adk_agent=agente_backend, 
    app_name="transelec-mabq", 
    user_id="verified-user",
    use_in_memory_services=True
)
adk_agent
LlmAgent
The underlying LlmAgent instance
app_name
string
default:"transelec-mabq"
Application identifier for session management
user_id
string
default:"verified-user"
User identifier (static in this implementation)
use_in_memory_services
bool
default:"true"
Enables in-memory session storage instead of external persistence

Environment Variables

PROJECT_ID
string
default:"datawarehouse-des"
Google Cloud project containing the BigQuery dataset
BIGQUERY_DATASET
string
default:"STG_ACTIVOS"
BigQuery dataset the agent has access to
GOOGLE_CLOUD_LOCATION
string
default:"us-east4"
Region for Vertex AI and BigQuery operations
NOMBRE_EMPRESA
string
default:"TRANSELEC S.A."
Company name used in agent responses
ANALYTICS_AGENT_MODEL
string
default:"gemini-2.5-pro"
Vertex AI model for the analytics agent
LLM_1_NAME
string
default:"bigquery_agent_stg_activos"
Agent name identifier
LLM_1_MODELO
string
default:"gemini-2.5-pro"
Model version for the LlmAgent

Session Management

The agent uses in-memory session services for conversation state:
from google.adk.sessions.in_memory_session_service import InMemorySessionService

in_memory_session_service = InMemorySessionService()
In-memory sessions are lost on container restart. For production, consider using persistent session storage.

Example Usage

Programmatic Access

from data_agente.agent import get_root_agent

agent = get_root_agent()

# The agent is now ready to process queries
# (typically used via ADK endpoint, not called directly)

Query Flow

  1. User sends natural language query
  2. Agent analyzes query with Gemini model
  3. Agent generates SQL using BigQueryToolset
  4. Agent validates SQL syntax
  5. Agent executes query (read-only)
  6. Agent returns SQL code block to user

Build docs developers (and LLMs) love