Skip to main content

Introduction to Umbra

Umbra is Concrete Security’s Confidential AI platform—a production system that routes sensitive documents into trusted execution environments (TEEs) for processing. The platform combines a Next.js frontend, Python backend services running in Confidential Virtual Machines (CVMs) on Phala Cloud, and a comprehensive monitoring stack.

What is Confidential AI?

Confidential AI ensures that your sensitive data remains private and secure even while being processed by AI models. Unlike traditional cloud AI services, Umbra uses Intel TDX (Trust Domain Extensions) to create isolated execution environments where:
  • Data is encrypted in memory during processing
  • The hosting provider cannot access your data
  • You can cryptographically verify the integrity of the execution environment
  • AI model processing happens inside hardware-enforced security boundaries

Key Features

Remote Attestation TLS

Client-side verification of Intel TDX attestation using DCAP (Data Center Attestation Primitives). No server trust required—your browser cryptographically verifies the TEE before sending data.

End-to-End Encryption

Secure communication channels using RA-TLS (Remote Attestation TLS) and EKM (Exported Key Material) channel binding with HMAC-SHA256 for TLS security.

Hardware-Enforced Security

Processing runs inside Intel TDX Confidential VMs on Phala Cloud, providing hardware-level isolation that protects against both external attackers and infrastructure providers.

Streaming AI Chat

Real-time streaming chat interface with support for document uploads (text and PDFs), reasoning panels, and OpenAI-compatible API proxy for vLLM integration.

How It Works

Umbra establishes a cryptographically verified connection between your browser and AI models running in a TEE:
1

Submit your request

Enter your prompt and upload sensitive documents through the web interface
2

Establish secure connection

Your browser connects to the TEE server using RA-TLS and performs Intel TDX attestation verification locally
3

Verify the environment

The frontend verifies the TDX quote using DCAP QVL (Quote Verification Library) running entirely in your browser via WebAssembly
4

Process in the TEE

Your data is processed by vLLM inside the TEE, with all computation happening in encrypted memory
5

Receive encrypted response

Results stream back over the secure channel, with metrics collected by Prometheus
The attestation verification happens client-side in your browser. This means you don’t need to trust any server—you cryptographically verify the TEE yourself.

Architecture Overview

Umbra consists of three main components:
  1. Frontend - Next.js 16 application with React 19, TypeScript, Tailwind CSS, and shadcn/ui components
  2. CVM Services - Python/FastAPI microservices for attestation, authentication, and certificate management
  3. Monitoring Stack - Prometheus and Grafana for vLLM metrics and observability

Security Mechanisms

Intel TDX Attestation

DCAP verification with client-side quote validation

EKM Channel Binding

RFC 9266 compliant channel binding with HMAC-SHA256

Supabase RLS

Row-level security for database access control
Want to see the technical details? Check out the Architecture page for a deep dive into the system design and data flow.

Next Steps

Quickstart Guide

Get Umbra running locally in minutes

Architecture Deep Dive

Understand the system architecture and security model

Build docs developers (and LLMs) love