
Environment Variables

The agent reads configuration from environment variables with sensible defaults defined in agent.py:10-18.

Project and Dataset Configuration

PROJECT_ID (string, default: "datawarehouse-des")
The Google Cloud project ID containing the BigQuery datasets.
PROJECT_ID = os.getenv("PROJECT_ID", "datawarehouse-des")
BIGQUERY_DATASET (string, default: "STG_ACTIVOS")
The target BigQuery dataset for agent queries.
BIGQUERY_DATASET = os.getenv("BIGQUERY_DATASET", "STG_ACTIVOS")
GOOGLE_CLOUD_LOCATION (string, default: "us-east4")
The Google Cloud region for Vertex AI operations.
GOOGLE_CLOUD_LOCATION = os.getenv("GOOGLE_CLOUD_LOCATION", "us-east4")
NOMBRE_EMPRESA (string, default: "TRANSELEC S.A.")
The organization name used in security rejection messages.
NOMBRE_EMPRESA = os.getenv("NOMBRE_EMPRESA", "TRANSELEC S.A.")
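Because each variable is read with os.getenv and a fallback, any value set in the environment takes precedence over the default. A minimal sketch of that precedence (the override value here is purely illustrative):

```python
import os

# Simulate a deployment that overrides only PROJECT_ID;
# the other variable falls back to its default.
os.environ["PROJECT_ID"] = "datawarehouse-prod"
os.environ.pop("BIGQUERY_DATASET", None)  # ensure no override is present

PROJECT_ID = os.getenv("PROJECT_ID", "datawarehouse-des")
BIGQUERY_DATASET = os.getenv("BIGQUERY_DATASET", "STG_ACTIVOS")

print(PROJECT_ID)        # value from the environment
print(BIGQUERY_DATASET)  # falls back to the default
```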

Model Configuration

ANALYTICS_AGENT_MODEL (string, default: "gemini-2.5-pro")
Legacy model configuration variable (retained for compatibility).
ANALYTICS_AGENT_MODEL = os.getenv("ANALYTICS_AGENT_MODEL", "gemini-2.5-pro")
LLM_1_NAME (string, default: "bigquery_agent_stg_activos")
The internal name identifier for the agent.
LLM_1_NAME = os.getenv("LLM_1_NAME", "bigquery_agent_stg_activos")
LLM_1_MODELO (string, default: "gemini-2.5-pro")
The Vertex AI model used by the agent. This is the active model configuration.
LLM_1_MODELO = os.getenv("LLM_1_MODELO", "gemini-2.5-pro")

Model Selection

The agent uses Gemini 2.5 Pro as the default language model.

Supported Models

While gemini-2.5-pro is the default, you can configure any Vertex AI model by setting the LLM_1_MODELO environment variable:
export LLM_1_MODELO="gemini-2.5-pro"

Model Selection Criteria

When choosing a model, consider:
  • Gemini 2.5 Pro: Best for complex SQL generation and reasoning
  • Gemini 2.5 Flash: Faster responses, suitable for simpler queries
  • Gemini 1.5 Pro: Balance of performance and cost
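The model is resolved from the LLM_1_MODELO variable at startup. A hedged sketch of that resolution with a sanity check against the models listed above (resolve_model and KNOWN_MODELS are hypothetical helpers, not part of agent.py):

```python
import os

# Models discussed in this guide; any Vertex AI model name is accepted,
# this set only drives an advisory warning.
KNOWN_MODELS = {"gemini-2.5-pro", "gemini-2.5-flash", "gemini-1.5-pro"}

def resolve_model(default: str = "gemini-2.5-pro") -> str:
    """Read the model from the environment, falling back to the default."""
    model = os.getenv("LLM_1_MODELO", default)
    if model not in KNOWN_MODELS:
        print(f"warning: {model!r} is not a model this guide covers")
    return model

os.environ.pop("LLM_1_MODELO", None)  # no override set in this demo
print(resolve_model())  # falls back to the default
```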

Tool Configuration

The agent’s BigQuery integration is configured with strict security controls.

BigQueryToolConfig

Defined in agent.py:24-26:
tool_config = BigQueryToolConfig(
    write_mode=WriteMode.BLOCKED,
)
write_mode (WriteMode, default: WriteMode.BLOCKED)
Controls write access to BigQuery. Set to WriteMode.BLOCKED to enforce read-only operations.

WriteMode Options

The WriteMode enum provides the following security levels:
Mode                 Description                      Use Case
WriteMode.BLOCKED    Prevents all write operations    Production analytics (current setting)
WriteMode.ALLOWED    Permits write operations         Development/testing environments
Security Critical: The agent is configured with WriteMode.BLOCKED to prevent any data modification. Changing this setting could compromise data integrity.
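Since exercising the real toolset requires a live BigQuery connection, the contract that WriteMode.BLOCKED enforces can be illustrated with a stand-in. The member names below match the table above; the enum values and the is_permitted guard are purely illustrative, not the toolset's implementation:

```python
from enum import Enum

class WriteMode(Enum):
    # Member names mirror the table above; values are illustrative.
    BLOCKED = "blocked"
    ALLOWED = "allowed"

# Naive keyword screen, for illustration only; the real toolset
# enforces read-only access at the API level.
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "DROP",
                  "CREATE", "ALTER", "MERGE", "TRUNCATE"}

def is_permitted(sql: str, mode: WriteMode) -> bool:
    """Reject statements containing write keywords when writes are blocked."""
    if mode is WriteMode.ALLOWED:
        return True
    return not any(tok in WRITE_KEYWORDS for tok in sql.upper().split())

print(is_permitted("SELECT * FROM assets", WriteMode.BLOCKED))   # allowed
print(is_permitted("DELETE FROM assets", WriteMode.BLOCKED))     # rejected
```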

Tool Integration

The configured tool is passed to the agent:
root_agent = LlmAgent(
    model=LLM_1_MODELO, 
    name=LLM_1_NAME,
    description="Agente para responder preguntas sobre datos y modelos de BigQuery",
    instruction=new_instruction,
    tools=[bigquery_toolset]  # ← Tool configuration applied here
)

Customizing the Instruction Prompt

The agent’s behavior is primarily controlled by the instruction prompt. You can customize it by modifying the new_instruction variable in agent.py:38-68.

Current Instruction Structure

1. Agent Role Definition

Defines the agent as a SQL generation engine:
new_instruction = f"""
Eres un motor de generación de SQL para BigQuery.
Tu ÚNICO objetivo es traducir lenguaje natural a código SQL válido...
"""
2. Security Guardrails

Specifies prohibited commands and rejection behavior:
<SECURITY_GUARDRAILS>
  1. MODO ESTRICTO: READ-ONLY.
  2. COMANDOS PROHIBIDOS: DROP, DELETE, UPDATE, INSERT...
</SECURITY_GUARDRAILS>
3. Operational Instructions

Defines tool usage and output format requirements:
<INSTRUCTIONS>
  - Tienes acceso a `bigquery_toolset`.
  - Tu prioridad absoluta es la sintaxis correcta...
</INSTRUCTIONS>
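The three sections combine into a single instruction string. A heavily abbreviated sketch of that assembly (the real prompt in agent.py:38-68 is much longer; the section contents here are truncated placeholders):

```python
PROJECT_ID = "datawarehouse-des"
BIGQUERY_DATASET = "STG_ACTIVOS"

# Abbreviated stand-ins for the three prompt sections described above.
role = (f"Eres un motor de generación de SQL para BigQuery "
        f"({PROJECT_ID}.{BIGQUERY_DATASET}).")
guardrails = ("<SECURITY_GUARDRAILS>\n"
              "  1. MODO ESTRICTO: READ-ONLY.\n"
              "</SECURITY_GUARDRAILS>")
instructions = ("<INSTRUCTIONS>\n"
                "  - Tienes acceso a `bigquery_toolset`.\n"
                "</INSTRUCTIONS>")

new_instruction = "\n\n".join([role, guardrails, instructions])
print(new_instruction)
```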

Customization Examples

Add domain context:
new_instruction = f"""
Eres un motor de generación de SQL para BigQuery especializado en datos del sector eléctrico.
Tu ÚNICO objetivo es traducir lenguaje natural a código SQL válido para el proyecto **{PROJECT_ID}**, dataset **{BIGQUERY_DATASET}**.

Contexto del dominio:
- Los datos provienen de operaciones de transmisión eléctrica
- Las tablas contienen información de activos, mantenimiento y operaciones
- Los usuarios son analistas de ingeniería y operaciones

<SECURITY_GUARDRAILS>
  # ... rest of the prompt
"""
Add a response format requirement:
new_instruction = f"""
# ... existing prompt sections ...

FORMATO DE RESPUESTA ACEPTADO:
```sql
-- Query generated for: [user question]
SELECT ...
```
Incluye un comentario breve con la pregunta del usuario.
"""

Add query optimization guidance:
new_instruction = f"""
# ... existing prompt sections ...

<OPTIMIZATION_RULES>
  - Usa particiones de fecha cuando estén disponibles
  - Limita resultados con LIMIT cuando sea apropiado
  - Prefiere agregaciones sobre datos completos
  - Evita SELECT * en producción
</OPTIMIZATION_RULES>
"""

Configuration Best Practices

1. Use Environment Variables

Never hardcode configuration values. Always use environment variables for:
  • Project IDs and dataset names
  • Model selection
  • Region configuration
  • Organization names
2. Maintain Security Defaults

Keep WriteMode.BLOCKED in production environments:
tool_config = BigQueryToolConfig(
    write_mode=WriteMode.BLOCKED,  # ← Never change this in production
)
3. Test Instruction Changes

When modifying the instruction prompt:
  1. Test with diverse query types
  2. Verify security guardrails still work
  3. Ensure output format remains consistent
  4. Check that tool usage is still correct
4. Document Custom Configuration

If you modify defaults, document:
  • What was changed and why
  • Expected behavior differences
  • Any new environment variables
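The checks under "Test Instruction Changes" can be partially automated. A hypothetical smoke-test sketch (check_guardrails and the abbreviated prompt are illustrations, not part of the project):

```python
# Abbreviated prompt stand-in; the real new_instruction is defined
# in agent.py:38-68.
new_instruction = """
Eres un motor de generación de SQL para BigQuery.
<SECURITY_GUARDRAILS>
  1. MODO ESTRICTO: READ-ONLY.
  2. COMANDOS PROHIBIDOS: DROP, DELETE, UPDATE, INSERT
</SECURITY_GUARDRAILS>
<INSTRUCTIONS>
  - Tienes acceso a `bigquery_toolset`.
</INSTRUCTIONS>
"""

def check_guardrails(prompt: str) -> list[str]:
    """Return a list of problems; an empty list means the prompt passes."""
    problems = []
    if "<SECURITY_GUARDRAILS>" not in prompt:
        problems.append("missing security guardrails section")
    for cmd in ("DROP", "DELETE", "UPDATE", "INSERT"):
        if cmd not in prompt:
            problems.append(f"prohibited command {cmd} no longer listed")
    if "bigquery_toolset" not in prompt:
        problems.append("tool reference missing")
    return problems

print(check_guardrails(new_instruction))  # expect no problems
```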

Example Configuration File

For deployment, create a .env file:
# Project Configuration
PROJECT_ID=datawarehouse-prod
BIGQUERY_DATASET=STG_ACTIVOS
GOOGLE_CLOUD_LOCATION=us-east4
NOMBRE_EMPRESA=TRANSELEC S.A.

# Model Configuration
LLM_1_NAME=bigquery_agent_stg_activos
LLM_1_MODELO=gemini-2.5-pro
Load environment variables using python-dotenv (already included in requirements.txt).
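python-dotenv performs this loading via load_dotenv(). The sketch below mimics its core behavior (KEY=VALUE lines, comments ignored, already-set environment variables win) purely for illustration; use the library itself in practice:

```python
import os

def load_env_text(text: str) -> None:
    """Illustrative .env parser; python-dotenv handles this properly."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # setdefault: existing environment variables take precedence
        os.environ.setdefault(key.strip(), value.strip())

os.environ.pop("PROJECT_ID", None)  # clean slate for the demo
load_env_text("""
# Project Configuration
PROJECT_ID=datawarehouse-prod
""")
print(os.environ["PROJECT_ID"])  # datawarehouse-prod
```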
