Overview
SIAA v2.1.25 is configured through a combination of constants defined in siaa_proxy.py and environment variables. This page covers all configuration parameters and their recommended values.
Core Configuration Constants
Ollama Connection
These settings control how SIAA connects to the Ollama AI service:
Base URL for the Ollama API endpoint. Must be accessible from the SIAA server.
Ollama model identifier. The 3B-parameter model provides an optimal balance of speed and quality for Spanish judicial text.
SIAA system version. Displayed in status endpoint and startup banner.
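Inside siaa_proxy.py, these settings would look something like the sketch below. The constant names and values shown are assumptions, not copied from the source; only the version string is documented above.

```python
# Sketch of the Ollama connection constants. Names and values are
# assumptions, not the actual identifiers in siaa_proxy.py.

# Base URL for the Ollama API endpoint (11434 is Ollama's default port).
OLLAMA_URL = "http://localhost:11434"

# A 3B-parameter model, balancing speed and quality for Spanish judicial text.
OLLAMA_MODEL = "llama3.2:3b"

# Shown in the status endpoint and the startup banner.
VERSION = "2.1.25"
```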
File Paths
Root directory containing source documents (.md and .txt files). Subdirectories are treated as separate collections.
Path to the quality monitoring log file. Contains one JSON record per query.
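As a sketch, the two path settings might be declared as below. CARPETA_FUENTES is the documented constant name; the log constant's name and both path values are assumptions.

```python
# Root directory for source documents (.md / .txt); subdirectories
# become separate collections. The value shown is an assumed example.
CARPETA_FUENTES = "/opt/siaa/fuentes"

# Quality-monitoring log, one JSON record per query. Constant name and
# path are assumptions.
RUTA_LOG_CALIDAD = "/var/log/siaa/calidad.jsonl"
```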
Document Processing
These parameters control how documents are chunked and processed:
Maximum size of each text chunk in characters. Larger values provide more context but increase processing time.
Number of overlapping characters between consecutive chunks. Prevents splitting articles or procedures across chunk boundaries.
Maximum chunks sent to the AI model per document. Automatically increased to 4 for documents with >80 chunks.
- Francotirador mode (ratio ≥3.0): 1 chunk ≈ 800 chars
- Binóculo mode (ratio ≥1.8): 2 chunks ≈ 1,600 chars
- Escopeta mode (ratio <1.8): 3 chunks ≈ 2,400 chars
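The three thresholds above can be sketched as a small selection function; the function name and the way the ratio is computed upstream are assumptions:

```python
def select_mode(ratio: float) -> tuple[str, int]:
    """Map the relevance ratio to a retrieval mode and chunk count.

    Thresholds follow the documented modes; the function name is assumed.
    """
    if ratio >= 3.0:
        return ("francotirador", 1)  # ~800 chars of context
    if ratio >= 1.8:
        return ("binoculo", 2)       # ~1,600 chars
    return ("escopeta", 3)           # ~2,400 chars
```

For example, `select_mode(3.4)` returns `("francotirador", 1)`, sending a single ~800-character chunk to the model.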
Maximum number of documents retrieved per query. Set to 1 when specific document patterns (PSAA, PCSJA, acuerdo) are detected.
Cache Configuration
Maximum number of responses stored in cache. Uses LRU (Least Recently Used) eviction policy.
Time-to-live for cache entries in seconds. Default is 1 hour (3600s). Expired entries are automatically removed.
When true, only document-based queries are cached; conversational queries (greetings, etc.) are never cached.
A cache hit rate of 30-40% is expected across the 26 judicial offices making similar queries. Cache hits return responses in ~5 ms, versus ~44 s without the cache.
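The LRU-with-TTL behaviour described above can be sketched with an `OrderedDict`; the class and method names are assumptions, not SIAA's actual code:

```python
import time
from collections import OrderedDict

class ResponseCache:
    """Sketch of an LRU cache with TTL expiry, as described above."""

    def __init__(self, max_size=100, ttl=3600):
        self.max_size = max_size
        self.ttl = ttl
        self._store = OrderedDict()  # key -> (timestamp, value)

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        ts, value = item
        if time.time() - ts > self.ttl:  # expired entries are removed lazily
            del self._store[key]
            return None
        self._store.move_to_end(key)     # mark as most recently used
        return value

    def put(self, key, value):
        self._store[key] = (time.time(), value)
        self._store.move_to_end(key)
        while len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict least recently used
```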
Network & Timeout Settings
Timeout for establishing connection to Ollama in seconds.
Maximum time to wait for Ollama response in seconds (3 minutes).
Timeout for Ollama health check requests in seconds.
Maximum number of concurrent requests to Ollama. Prevents resource exhaustion on the AI service.
Thread pool size for the Waitress WSGI server. Controls maximum concurrent client connections.
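A minimal sketch of the concurrency cap, using a semaphore; the limit value and function name here are assumptions:

```python
import threading

MAX_CONCURRENT_OLLAMA = 4  # assumed value for illustration
_slots = threading.BoundedSemaphore(MAX_CONCURRENT_OLLAMA)

def call_ollama(prompt: str) -> str:
    """Block until a slot is free, so Ollama never sees more than
    MAX_CONCURRENT_OLLAMA in-flight requests at once (sketch)."""
    with _slots:
        # A real implementation would POST to the Ollama API here, using
        # the documented connect/read timeouts.
        return f"respuesta para: {prompt}"
```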
Logging Configuration
Maximum number of entries in the quality log before automatic rotation. When limit is reached, only the most recent 4,000 entries are kept.
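Rotation can be sketched as trimming a JSON-lines file; the function name and rotation trigger are assumptions (only the 4,000-entry retention is documented):

```python
KEEP_ENTRIES = 4000  # documented: the most recent 4,000 entries survive rotation

def rotate_quality_log(path: str, max_entries: int, keep: int = KEEP_ENTRIES) -> None:
    """Trim the quality log (one JSON record per line) once it exceeds
    max_entries, keeping only the newest `keep` records (sketch)."""
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
    if len(lines) <= max_entries:
        return  # below the rotation threshold; nothing to do
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(lines[-keep:])
```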
Environment Variables
SIAA reads optional environment variables for deployment-specific configuration:
Real IP address of the SIAA server. Used for generating document links in responses. If empty, SIAA falls back to request.host (which may fail behind a reverse proxy).
Port exposed to browsers. Use "80" if behind an Nginx reverse proxy, or "5000" for direct access.
Example: Setting Environment Variables
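A sketch of how these variables could be consumed; the names SIAA_REAL_IP and SIAA_PUBLIC_PORT are assumptions, so check siaa_proxy.py for the actual ones:

```python
import os

def public_base_url(host_from_request: str) -> str:
    """Build the base URL used for document links (sketch). Variable
    names are assumed; an empty SIAA_REAL_IP falls back to request.host."""
    host = os.environ.get("SIAA_REAL_IP") or host_from_request
    port = os.environ.get("SIAA_PUBLIC_PORT", "5000")
    return f"http://{host}" if port == "80" else f"http://{host}:{port}"
```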
Flask Application Configuration
SIAA runs as a Flask application with CORS enabled:
Directory Structure Requirements
Document Source Directory
The CARPETA_FUENTES directory should follow this structure:
- Root-level files belong to the “general” collection
- Subdirectories create named collections (e.g., “normativa”)
- Only .md and .txt files are processed
- File encoding should be UTF-8
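The rules above can be sketched as a collection scan; the function name is an assumption:

```python
import os

def discover_collections(root: str) -> dict[str, list[str]]:
    """Map the CARPETA_FUENTES layout to collections: root-level .md/.txt
    files go to "general", each subdirectory becomes a named collection."""
    collections: dict[str, list[str]] = {"general": []}
    for entry in sorted(os.listdir(root)):
        full = os.path.join(root, entry)
        if os.path.isfile(full) and entry.endswith((".md", ".txt")):
            collections["general"].append(entry)
        elif os.path.isdir(full):
            collections[entry] = [
                f for f in sorted(os.listdir(full)) if f.endswith((".md", ".txt"))
            ]
    return collections
```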
Log Directory
Create the log directory with appropriate permissions:
Performance Tuning
Optimizing for Response Speed
For faster responses (target <30s):
Optimizing for Accuracy
For more comprehensive responses:
Configuration Verification
Verify your configuration at startup by checking the banner output:
Next Steps
Monitoring
Learn how to monitor system health and performance
Log Analysis
Analyze quality logs and query performance