Overview
This document traces data flows for key operations in Aurora, from user interaction to backend processing and storage.Authentication Flow
OAuth 2.0 Flow (GCP Example)
Key Files:- OAuth URL:
server/connectors/gcp_connector/auth/oauth.py:13-31 - Token exchange:
server/connectors/gcp_connector/auth/oauth.py:34-71 - Vault storage:
server/utils/vault/vault_client.py - Post-auth tasks:
server/connectors/gcp_connector/gcp_post_auth_tasks.py
- Frontend requests OAuth URL from Flask API
- Flask generates authorization URL with scopes
- User approves permissions in GCP consent screen
- GCP redirects back with authorization code
- Flask exchanges code for access/refresh tokens
- Vault stores tokens securely (never in database)
- Postgres stores reference:
vault:kv/data/aurora/users/{user_id}/gcp_token - Celery runs background task to fetch initial project list
- Frontend receives success and updates UI
Chat Request Flow
AI Agent Query Processing
Key Files:- WebSocket handler:
server/main_chatbot.py:604-1000 - Workflow execution:
server/chat/backend/agent/workflow.py:59-113 - Agent logic:
server/chat/backend/agent/agent.py:155-400 - Tool execution:
server/chat/backend/agent/tools/cloud_tools.py
- User types question in chat interface
- Frontend sends WebSocket message with query, session_id, user_id
- WebSocket server receives message and initializes LangGraph state
- Agent loads conversation history from PostgreSQL
- Agent builds dynamic system prompt based on connected providers
- LLM streams response tokens back through WebSocket
- Frontend renders tokens in real-time
- LLM decides to call
list_gke_clusterstool - Agent executes tool with user context
- Tool calls GCP API with credentials from Vault
- Tool returns structured result to agent
- Agent sends result back to LLM
- LLM generates final answer incorporating tool data
- WebSocket streams final answer to frontend
- Agent saves full conversation to PostgreSQL
- Agent indexes conversation in Weaviate for future retrieval
Infrastructure Deployment Flow
Terraform Resource Creation
Key Files:- IaC write:
server/chat/backend/agent/tools/iac/iac_write_tool.py - IaC deploy:
server/chat/backend/agent/tools/iac/iac_deploy_tool.py - Confirmations:
server/utils/cloud/infrastructure_confirmation.py - Storage:
server/utils/storage/storage.py
- User requests infrastructure (“Create a GKE cluster”)
- Agent asks LLM to plan the operation
- LLM calls
iac_writetool to generate Terraform code - IaC Tool writes
.tffiles to session-isolated directory in SeaweedFS - Agent displays Terraform code to user for review
- LLM calls
iac_deploytool to apply changes - IaC Tool requests user confirmation via WebSocket
- Frontend shows confirmation dialog with resource details
- User approves or rejects deployment
- Frontend sends
confirmation_responseback - IaC Tool executes
terraform init,terraform plan,terraform apply - Terraform creates resources in GCP/AWS/Azure
- IaC Tool saves Terraform state to SeaweedFS
- IaC Tool logs deployment metadata to PostgreSQL
- Agent streams success message to frontend
Knowledge Base Ingestion Flow
Document Upload & Indexing
Key Files:- Upload endpoint:
server/routes/knowledge_base.py - Indexing task:
server/routes/knowledge_base/tasks.py - Storage manager:
server/utils/storage/storage.py - Weaviate client:
server/chat/backend/agent/weaviate_client.py
- User selects PDF/DOCX file and clicks Upload
- Frontend sends multipart form data to Flask API
- Flask validates file type and size (max 100MB)
- Storage saves file to SeaweedFS with key
{user_id}/kb/{doc_id}.pdf - Postgres creates record in
kb_documentswith status=‘processing’ - Celery picks up indexing task from Redis queue
- Celery downloads file from Storage
- Celery extracts text using PyPDF2/python-docx
- Celery splits text into 500-token chunks with 50-token overlap
- Weaviate generates embeddings using
all-MiniLM-L6-v2model - Weaviate stores vectors in
UserKnowledgecollection - Postgres updates document status to ‘indexed’
- Frontend receives SSE update and shows “Ready” badge
Incident Detection & RCA Flow
Background Chat Analysis
Key Files:- Incident creation:
server/routes/incidents_routes.py - Background task:
server/chat/background/task.py - RCA workflow:
server/chat/backend/agent/agent.py - Thought saving:
server/main_chatbot.py:229-286
- Monitoring system (Grafana/Datadog) sends webhook on alert
- Flask creates incident record in PostgreSQL
- Celery receives background RCA task from Redis
- Celery sends internal HTTP request to Chatbot service
- Chatbot initializes LangGraph workflow with incident context
- Agent instructs LLM to investigate the alert
- LLM decides to gather evidence:
- Calls
get_gcp_logsfor recent error logs - Calls
github_search_codefor recent deployments - Calls
splunk_searchfor metrics and traces
- Calls
- Agent saves incremental “thoughts” to
incident_thoughtstable - LLM synthesizes evidence into root cause analysis
- Agent updates
incidentstable with RCA summary - Chatbot returns success to Celery task
- Frontend polls for incident updates and displays RCA
Service Discovery Flow
Periodic Resource Scanning
Key Files:- Beat schedule:
server/celery_config.py:79-82 - Discovery tasks:
server/services/discovery/tasks.py - Memgraph client:
server/services/discovery/memgraph_client.py
- Celery Beat triggers
run_full_discoverytask every hour - Celery queries PostgreSQL for all users with connected providers
- For each user:
- Retrieve credentials from Vault for each provider
- Call cloud provider APIs to list resources:
- GCP: Projects, GKE clusters, Compute instances, Cloud SQL
- AWS: EC2 instances, EKS clusters, RDS databases, Lambda functions
- Azure: VMs, AKS clusters, SQL databases, App Services
- Celery creates graph nodes in Memgraph:
- Celery creates relationships:
- Celery caches service list in PostgreSQL
graph_servicestable - Frontend queries graph for visualization
Secret Storage Flow
Vault Secret Lifecycle
Key Files:- Vault client:
server/utils/vault/vault_client.py - Token management:
server/utils/auth/token_management.py
- Never store secrets in database - Only references
- Centralized secret rotation - Update in one place
- Audit trail - Vault logs all access
- Encryption at rest - Vault encrypts data
File Storage Flow
SeaweedFS S3 Operations
Key Files:- Storage abstraction:
server/utils/storage/storage.py - S3 configuration:
config/seaweedfs/s3.json
Cache Flow
Redis Cache Patterns
Cache Keys:server/utils/billing/billing_cache.py