
Architecture Overview

Aurora is a microservices application orchestrated with Docker Compose. Its components work together to provide a natural-language interface for cloud infrastructure management.

System Components

Backend Services

Aurora Server (Flask API)

  • Location: server/
  • Entry Point: main_compute.py
  • Port: 5080
  • Framework: Flask 3.1.3
  • Purpose: REST API for all compute operations, user management, and cloud provider integrations

Chatbot WebSocket Server

  • Location: server/
  • Entry Point: main_chatbot.py
  • Port: 5006
  • Protocol: WebSocket
  • Purpose: Real-time conversational AI interface powered by LangGraph

Celery Worker

  • Location: server/
  • Purpose: Background task processing for long-running operations
  • Broker: Redis
  • Tasks: Cloud resource discovery, billing updates, infrastructure provisioning

Celery Beat

  • Purpose: Periodic task scheduler
  • Tasks: Scheduled billing updates, resource synchronization

Frontend

  • Location: client/
  • Framework: Next.js 15
  • Language: TypeScript
  • UI Components: shadcn/ui (Radix UI primitives)
  • Styling: Tailwind CSS
  • Authentication: Auth.js (NextAuth v5 beta)
  • Port: 3000

Data Layer

PostgreSQL

  • Port: 5432
  • Database: aurora_db
  • Purpose: Primary relational database for users, projects, infrastructure state
  • Driver: psycopg2 (backend)

Weaviate

  • Port: 8080
  • Purpose: Vector database for semantic search and RAG (Retrieval-Augmented Generation)
  • Use Cases: Knowledge base search, context retrieval for AI agent

Redis

  • Port: 6379
  • Purpose: Message broker for Celery, caching layer

Infrastructure Services

HashiCorp Vault

  • Port: 8200
  • Purpose: Secrets management for cloud provider credentials, API keys
  • Storage: File-based with Docker volumes (vault-data, vault-init)
  • Mount: KV v2 engine at aurora mount
  • Auto-initialization: vault-init container handles setup and unsealing

SeaweedFS

  • S3 API Port: 8333
  • File Browser: 8888
  • Cluster Status: 9333
  • Purpose: S3-compatible object storage for Terraform state files, artifacts
  • License: Apache 2.0
  • Alternatives: AWS S3, Cloudflare R2, MinIO, GCS (S3 interop)
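
The ports listed for each component can be collected into one map for a quick reachability check during local development. This is a minimal sketch, not part of Aurora: the names `SERVICE_PORTS`, `is_port_open`, and `check_services` are hypothetical, and it assumes every service is published on `localhost` at its default port.

```python
import socket
from typing import Dict

# Default ports as listed in this section; the service keys are illustrative.
SERVICE_PORTS: Dict[str, int] = {
    "aurora-server": 5080,
    "chatbot-websocket": 5006,
    "frontend": 3000,
    "postgres": 5432,
    "weaviate": 8080,
    "redis": 6379,
    "vault": 8200,
    "seaweedfs-s3": 8333,
    "seaweedfs-file-browser": 8888,
    "seaweedfs-cluster-status": 9333,
}


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> Dict[str, bool]:
    """Probe every known service port and report reachability."""
    return {name: is_port_open(host, port) for name, port in SERVICE_PORTS.items()}
```

A TCP connect only shows that something is listening; it does not confirm that the service behind the port is healthy.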

Tech Stack

Backend Technologies

# Core Framework
Flask==3.1.3
Flask-Cors==6.0.0
flask-jwt-extended>=4.4.4

# AI/ML Stack
langchain==1.2.6
langchain_community==0.3.31
langchain_core==1.2.11
langchain_openai==1.1.7
langchain-anthropic>=0.3.0
langchain-google-genai>=2.0.0
langgraph==1.0.6

# LLM Providers
openai>=1.109.1,<3.0.0
anthropic>=0.18.0

# Cloud Providers
boto3==1.36.26  # AWS
google-cloud-*  # GCP
azure-identity==1.16.1  # Azure
azure-mgmt-resource==23.0.1
ovh==1.1.0  # OVH Cloud

# Background Tasks
celery==5.3.6
redis==5.0.1

# Database
psycopg2-binary
weaviate_client>=4.15.0

# Secrets Management
hvac>=2.1.0  # Vault client

# Infrastructure as Code
kubernetes==26.1.0
paramiko>=3.4.0  # SSH

# Web Server
gunicorn==23.0.0
gevent==25.9.1

Frontend Technologies

{
  "framework": "next@^15.5.12",
  "react": "^18.2.0",
  "typescript": "5.7.2",
  
  "ui-components": [
    "@radix-ui/react-*",
    "shadcn/ui",
    "lucide-react"
  ],
  
  "styling": [
    "tailwindcss@^3.4.1",
    "tailwindcss-animate",
    "class-variance-authority"
  ],
  
  "authentication": "next-auth@^5.0.0-beta.30",
  
  "visualization": [
    "recharts@^2.15.0",
    "@xyflow/react@^12.10.0"
  ],
  
  "code-editor": "@monaco-editor/react@^4.7.0"
}

Project Structure

Backend Structure (server/)

server/
├── main_compute.py          # Flask API entry point
├── main_chatbot.py          # WebSocket chatbot entry point
├── celery_config.py         # Celery task configuration
├── requirements.txt         # Python dependencies

├── routes/                  # Flask blueprints (API endpoints)
│   ├── aws/                # AWS-specific routes
│   ├── azure/              # Azure-specific routes
│   ├── gcp/                # GCP-specific routes
│   ├── github/             # GitHub integration
│   ├── slack/              # Slack integration
│   ├── terraform/          # Terraform operations
│   └── ...

├── chat/                    # AI chatbot implementation
│   └── backend/
│       └── agent/
│           ├── agent.py    # Main agent logic
│           ├── workflow.py # LangGraph workflow
│           ├── llm.py      # LLM configuration
│           ├── tools/      # Agent tools
│           ├── prompt/     # Prompt templates
│           └── providers/  # LLM provider implementations

├── connectors/              # Cloud provider connectors
│   ├── aws_connector/
│   ├── azure_connector/
│   ├── gcp_connector/
│   ├── github_connector/
│   ├── slack_connector/
│   └── ...

├── utils/                   # Utility modules
│   ├── auth/               # Authentication utilities
│   ├── db/                 # Database utilities
│   ├── secrets/            # Vault integration
│   ├── storage/            # S3-compatible storage
│   ├── terraform/          # Terraform helpers
│   ├── kubectl/            # Kubernetes utilities
│   └── billing/            # Cost tracking

├── services/                # Business logic services
│   ├── discovery/          # Resource discovery
│   ├── correlation/        # Resource correlation
│   └── graph/              # Graph database operations

└── tests/                   # Test suite

Frontend Structure (client/)

client/
├── src/
│   ├── app/                # Next.js 15 App Router
│   │   ├── api/           # API routes (Next.js API)
│   │   │   ├── auth/      # Authentication endpoints
│   │   │   ├── aws-*/     # AWS proxy endpoints
│   │   │   ├── azure-*/   # Azure proxy endpoints
│   │   │   └── gcp-*/     # GCP proxy endpoints
│   │   ├── (dashboard)/   # Dashboard layouts
│   │   ├── chat/          # Chat interface
│   │   └── layout.tsx     # Root layout
│   │
│   ├── components/         # React components
│   │   ├── ui/            # shadcn/ui components
│   │   ├── chat/          # Chat-specific components
│   │   ├── dashboard/     # Dashboard widgets
│   │   └── providers/     # Cloud provider components
│   │
│   ├── lib/               # Utility libraries
│   │   ├── auth.ts        # Auth.js configuration
│   │   └── utils.ts       # Helper functions
│   │
│   └── types/             # TypeScript type definitions

├── public/                # Static assets
├── next.config.ts         # Next.js configuration
├── tailwind.config.ts     # Tailwind CSS configuration
└── tsconfig.json          # TypeScript configuration

Data Flow

Chat Workflow

  1. User Input: User sends message via WebSocket from frontend
  2. WebSocket Server: main_chatbot.py receives message
  3. LangGraph Agent: Message flows through LangGraph workflow
  4. Tool Execution: Agent calls cloud provider tools, database queries
  5. LLM Processing: OpenRouter/Anthropic/OpenAI generates response
  6. Streaming Response: Response streamed back to frontend via WebSocket
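
The receive-then-stream shape of the chat workflow can be sketched with plain `asyncio`. Everything here is a stand-in: `run_agent` fakes the LangGraph agent, `send` represents whatever WebSocket send call the server uses, and the `[done]` marker is a hypothetical end-of-stream signal, not Aurora's actual protocol.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List


async def run_agent(message: str) -> AsyncIterator[str]:
    """Stand-in for the agent: yields response chunks as they are produced."""
    for word in f"You asked: {message}".split():
        await asyncio.sleep(0)  # hand control back to the event loop between chunks
        yield word + " "


async def handle_chat_message(
    message: str, send: Callable[[str], Awaitable[None]]
) -> None:
    """Process one user message and stream the reply back chunk by chunk."""
    async for chunk in run_agent(message):
        await send(chunk)
    await send("[done]")  # hypothetical end-of-stream marker


async def demo() -> List[str]:
    sent: List[str] = []

    async def fake_send(chunk: str) -> None:
        # A real server would write to the WebSocket connection here.
        sent.append(chunk)

    await handle_chat_message("list my buckets", fake_send)
    return sent
```

Passing `send` as a parameter keeps the handler testable without opening a real socket.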

Infrastructure Provisioning

  1. User Request: User requests infrastructure via chat or UI
  2. Agent Analysis: LangGraph agent analyzes requirements
  3. Terraform Generation: Agent generates Terraform configurations
  4. State Storage: Terraform state stored in SeaweedFS
  5. Approval Flow: User confirms changes via WebSocket
  6. Celery Task: Background task executes Terraform apply
  7. Result Notification: User notified of completion
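
The seven provisioning steps form a mostly linear sequence with one gate: nothing is applied until the user approves. A minimal sketch of that sequence as a state machine follows. The `Stage` names and the `advance` helper are hypothetical and only mirror the list above; they are not taken from Aurora's code.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    REQUESTED = "requested"                    # 1. User Request
    ANALYZED = "analyzed"                      # 2. Agent Analysis
    PLAN_GENERATED = "plan_generated"          # 3. Terraform Generation
    STATE_STORED = "state_stored"              # 4. State Storage
    AWAITING_APPROVAL = "awaiting_approval"    # 5. Approval Flow
    APPLYING = "applying"                      # 6. Celery Task
    COMPLETED = "completed"                    # 7. Result Notification
    REJECTED = "rejected"                      # user declined the change


# Unconditional transitions; the approval step is handled separately.
_NEXT = {
    Stage.REQUESTED: Stage.ANALYZED,
    Stage.ANALYZED: Stage.PLAN_GENERATED,
    Stage.PLAN_GENERATED: Stage.STATE_STORED,
    Stage.STATE_STORED: Stage.AWAITING_APPROVAL,
    Stage.APPLYING: Stage.COMPLETED,
}


def advance(stage: Stage, approved: Optional[bool] = None) -> Stage:
    """Return the next stage; the approval gate needs an explicit decision."""
    if stage is Stage.AWAITING_APPROVAL:
        if approved is None:
            raise ValueError("user approval is required before applying")
        return Stage.APPLYING if approved else Stage.REJECTED
    if stage in _NEXT:
        return _NEXT[stage]
    raise ValueError(f"{stage.value} is a terminal stage")
```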

Authentication Flow

  1. Login: User submits credentials to Next.js API route
  2. Auth.js: NextAuth validates credentials against PostgreSQL
  3. JWT Token: Stateless JWT token issued (flask-jwt-extended)
  4. Session: Session stored in secure HTTP-only cookie
  5. API Requests: JWT included in Authorization header
  6. Validation: Flask middleware validates JWT on each request
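
To make the validation step concrete, the sketch below shows what checking a stateless HS256 JWT involves: recompute the signature, compare it in constant time, then check expiry. It uses only the standard library for illustration. In Aurora this work is done by flask-jwt-extended, and the function names, claim set, and secret handling here are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> str:
    return _b64url(hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest())


def issue_token(user_id: str, secret: bytes, ttl_seconds: int = 3600) -> str:
    """Build a signed HS256 JWT carrying `sub` and `exp` claims."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": user_id, "exp": int(time.time()) + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(f'{header}.{payload}', secret)}"


def validate_token(token: str, secret: bytes) -> dict:
    """Return the claims if signature, algorithm, and expiry check out."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header, payload, signature = parts
    expected = _sign(f"{header}.{payload}", secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims
```

Because the token carries its own claims and signature, the API can validate each request without a session lookup, which is what "stateless" means in step 3.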

Agent Architecture (LangGraph)

Aurora uses LangGraph for orchestrating the AI agent workflow:
# Simplified workflow structure
from langgraph.graph import StateGraph, END

workflow = StateGraph(State)

# Nodes
workflow.add_node("agent", agent_node)
workflow.add_node("tools", tool_execution_node)
workflow.add_node("human_approval", approval_node)

# Edges
workflow.add_edge("agent", "tools")
workflow.add_conditional_edges(
    "tools",
    should_continue,
    {
        "continue": "agent",
        "approval_required": "human_approval",
        "end": END
    }
)
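
The conditional edge depends on a routing function that returns one of the three keys in the mapping. The source does not show `should_continue`, so the following is only a sketch of how such a function could look. It assumes messages are plain dicts and that a non-empty `tool_calls` entry on the last message means the agent needs another turn; both assumptions are illustrative.

```python
from typing import Any, Dict


def should_continue(state: Dict[str, Any]) -> str:
    """Pick the next hop after the tools node.

    Returns "approval_required", "continue", or "end", matching the keys
    of the conditional-edge mapping.
    """
    # A pending infrastructure change always takes priority over further turns.
    if state.get("pending_approval") is not None:
        return "approval_required"
    messages = state.get("messages", [])
    last = messages[-1] if messages else {}
    # Assumption: outstanding tool calls mean the agent must see their results.
    if last.get("tool_calls"):
        return "continue"
    return "end"
```

Checking for a pending approval first keeps the human gate from being skipped when tool calls and an approval request arrive in the same turn.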

Agent State

from typing import Dict, List, Optional, TypedDict
from langchain_core.messages import BaseMessage

class State(TypedDict):
    messages: List[BaseMessage]  # Conversation history
    user_id: str                 # Current user
    session_id: str              # Chat session ID
    cloud_context: Dict          # Active cloud connections
    pending_approval: Optional[Dict]  # Infrastructure changes awaiting approval

Agent Tools

The agent has access to various tools:
  • Cloud Provider Tools: List resources, create/modify/delete resources
  • Database Tools: Query infrastructure state, user data
  • Terraform Tools: Generate, validate, apply IaC
  • Knowledge Base Tools: Search documentation, retrieve context
  • Billing Tools: Get cost estimates, analyze spending

Secrets Management

Aurora uses HashiCorp Vault for secure secrets storage:

Secret References

Secrets stored in the database use a special format:
# Example: Cloud provider token in database
user_credential = "vault:kv/data/aurora/users/aws-token-user123"

# At runtime, resolved via Vault API
from utils.secrets.vault_client import get_secret
actual_token = get_secret(user_credential)
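
How `get_secret` interprets the reference is not shown here. As a rough illustration, a reference of this shape can be split into a mount and a secret path by following the KV v2 convention of `<mount>/data/<path>`. The helper names below are hypothetical, and the layout is an assumption based only on the single example above.

```python
from typing import NamedTuple

VAULT_PREFIX = "vault:"


class VaultRef(NamedTuple):
    mount: str
    path: str


def is_vault_ref(value: str) -> bool:
    """True if the stored value is a Vault reference rather than a raw secret."""
    return value.startswith(VAULT_PREFIX)


def parse_vault_ref(value: str) -> VaultRef:
    """Split 'vault:<mount>/data/<path>' into its mount and secret path."""
    if not is_vault_ref(value):
        raise ValueError("not a Vault reference")
    mount, sep, path = value[len(VAULT_PREFIX):].partition("/data/")
    if not sep or not mount or not path:
        raise ValueError(f"unexpected Vault reference layout: {value!r}")
    return VaultRef(mount=mount, path=path)
```

Storing only the reference in the database means a database leak exposes secret locations, not the secrets themselves.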

Vault Structure

aurora (mount)/
└── users/
    ├── aws-token-{user_id}
    ├── gcp-sa-{user_id}
    ├── azure-token-{user_id}
    └── ...

Storage Architecture

Aurora uses S3-compatible storage via SeaweedFS:
from utils.storage.storage import get_storage_manager

storage = get_storage_manager()

# Upload Terraform state
storage.upload_file(
    bucket="terraform-state",
    key=f"projects/{project_id}/terraform.tfstate",
    file_path="/tmp/terraform.tfstate"
)

# Download state
storage.download_file(
    bucket="terraform-state",
    key=f"projects/{project_id}/terraform.tfstate",
    destination="/tmp/terraform.tfstate"
)
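
For unit tests that should not depend on a running SeaweedFS instance, a local stand-in with the same two method signatures can be useful. The class below is a hypothetical test double, not part of Aurora. It assumes the storage manager exposes only `upload_file` and `download_file` with the keyword arguments used above, and it does no validation of keys.

```python
import shutil
from pathlib import Path


class LocalStorageManager:
    """Hypothetical test double mapping bucket/key pairs onto a local directory."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)

    def _object_path(self, bucket: str, key: str) -> Path:
        return self._root / bucket / key

    def upload_file(self, bucket: str, key: str, file_path: str) -> None:
        """Copy a local file into the fake bucket, creating prefixes as needed."""
        target = self._object_path(bucket, key)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(file_path, target)

    def download_file(self, bucket: str, key: str, destination: str) -> None:
        """Copy an object out of the fake bucket; raises FileNotFoundError if absent."""
        shutil.copyfile(self._object_path(bucket, key), destination)


def state_key(project_id: str) -> str:
    """Object key layout used for Terraform state in the example above."""
    return f"projects/{project_id}/terraform.tfstate"
```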

Code Style Guidelines

Python (Backend)

  • Naming: snake_case for functions, variables, files
  • Imports: Group imports (stdlib, third-party, local)
  • Error Handling: Use try/except with logging
  • Async: Use async/await with LangChain/LangGraph
  • Logging: Use logging.INFO level, no emojis in logs
  • Database: Use connection pooling via db_pool
  • Routes: Organize as Flask blueprints in routes/
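
A short function written to these guidelines might look like the sketch below: snake_case names, grouped imports, try/except with logging, and plain log messages. The function itself is a made-up example and does not come from the Aurora codebase.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def parse_instance_count(raw_value: str) -> Optional[int]:
    """Convert user input to a positive instance count, or None if invalid."""
    try:
        count = int(raw_value)
    except ValueError:
        logger.warning("Invalid instance count: %r", raw_value)
        return None
    if count < 1:
        logger.warning("Instance count must be positive, got %d", count)
        return None
    logger.info("Parsed instance count: %d", count)
    return count
```

Passing values as logger arguments instead of formatting them into the string defers formatting until the record is actually emitted.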

TypeScript (Frontend)

  • Naming: camelCase for variables/functions, PascalCase for components
  • Imports: Use path alias @/* for ./src/*
  • Components: Functional components with TypeScript
  • Hooks: Follow React hooks best practices
  • Error Handling: Use try/catch with user-friendly messages
  • Styling: Use Tailwind CSS utility classes
  • URLs: Use kebab-case for routes

General

  • Keep functions small and focused
  • Avoid deep nesting
  • Write self-documenting code
  • Add comments for complex logic only
  • No commented-out code in commits
  • No emojis in code or logs

Configuration

Docker Compose Files

  • docker-compose.yaml: Development environment
  • docker-compose.prod-local.yml: Production builds for local testing
Important: Always update both files together to keep environment variables in sync.
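
One way to catch drift between the two files is to compare the environment variables each one references. The sketch below is a hypothetical helper that works on the raw file text and assumes variables appear in `${VAR}` form. It does not parse YAML, and it ignores bare `$VAR` references and literal `environment:` keys.

```python
import re
from typing import Dict, Set

# Matches ${VAR}, ${VAR:-default}, and ${VAR-default}; bare $VAR is not handled.
_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)")


def referenced_vars(compose_text: str) -> Set[str]:
    """Collect the names of all ${...} variables referenced in a compose file."""
    return set(_VAR_PATTERN.findall(compose_text))


def diff_compose_vars(dev_text: str, prod_text: str) -> Dict[str, Set[str]]:
    """Report variables referenced in one file but not the other."""
    dev, prod = referenced_vars(dev_text), referenced_vars(prod_text)
    return {"only_in_dev": dev - prod, "only_in_prod": prod - dev}
```

To use it, read `docker-compose.yaml` and `docker-compose.prod-local.yml` as text and pass both strings to `diff_compose_vars`; two empty sets mean the files reference the same variables.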

Environment Variables

Configuration is managed via .env file. See .env.example for all available options. Key configuration areas:
  • Database credentials
  • LLM API keys
  • Cloud provider credentials (or use Vault)
  • Service URLs and ports
  • Feature flags
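
Reading these values into one typed object at startup keeps defaults and parsing in a single place. The sketch below is illustrative only. Apart from `VAULT_ADDR`, which is Vault's conventional address variable, the variable names are hypothetical, and the authoritative list is in `.env.example`. The defaults reuse the ports and database name documented above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    vault_addr: str
    debug: bool


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, falling back to local defaults."""
    return Settings(
        # Hypothetical variable names; check .env.example for the real ones.
        database_url=env.get("DATABASE_URL", "postgresql://localhost:5432/aurora_db"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379/0"),
        vault_addr=env.get("VAULT_ADDR", "http://localhost:8200"),
        debug=_as_bool(env.get("AURORA_DEBUG", "false")),
    )
```

Accepting the environment as a parameter lets tests pass a plain dict instead of mutating `os.environ`.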

Next Steps

Setup Guide

Set up your development environment

Contributing

Learn how to contribute to Aurora

Testing

Write and run tests
