System Architecture
PDF AI is built as a full-stack Next.js application that leverages AI and vector databases to enable intelligent conversations with PDF documents. The architecture follows a serverless, edge-first approach for optimal performance and scalability.
High-Level Components
The system consists of four main architectural layers:
- Frontend Layer - Next.js 13+ with React Server Components
- API Layer - Edge Runtime API routes for low-latency responses
- Data Layer - PostgreSQL (via Drizzle ORM) for structured data
- AI/ML Layer - OpenAI embeddings and GPT-4 for semantic understanding
The application runs on Vercel’s Edge Runtime, ensuring global distribution and sub-100ms response times for most operations.
Tech Stack
Core Framework
- Next.js 13+ - React framework with App Router
- TypeScript - Type-safe development
- Edge Runtime - Deployed on Vercel Edge Network
AI & Vector Database
- OpenAI API - text-embedding-ada-002 for embeddings, gpt-4-1106-preview for chat
- Pinecone - Serverless vector database for semantic search
- LangChain - Document loading and text splitting utilities
Data & Storage
- PostgreSQL - Relational database for chats, messages, and user data
- Drizzle ORM - Type-safe database queries
- AWS S3 - Object storage for PDF files
Authentication & Payments
- Clerk - User authentication and management
- Stripe - Subscription billing
Data Flow
Document Upload Flow
When a user uploads a PDF document, the following sequence occurs:
Step-by-step document processing
1. Upload to S3: PDF file is uploaded to AWS S3 bucket
2. Download: Server downloads PDF from S3 (src/lib/s3-server.ts:3)
3. Load PDF: LangChain PDFLoader extracts text from all pages
4. Split Documents: Text is chunked using RecursiveCharacterTextSplitter
5. Generate Embeddings: Each chunk is converted to a 1536-dimension vector
6. Store in Pinecone: Vectors are upserted with metadata (page number, text)
7. Create Chat: Database record created linking user to document
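Step 4 can be sketched without LangChain as a fixed-size character window with overlap, so sentences cut at a chunk boundary still appear whole in at least one chunk (chunk sizes here are illustrative; the real code uses LangChain's RecursiveCharacterTextSplitter):

```typescript
// Naive stand-in for RecursiveCharacterTextSplitter: slide a window of
// `chunkSize` characters over the text, stepping back by `overlap`
// characters each time so adjacent chunks share context.
export function splitText(
  text: string,
  chunkSize = 1000,
  overlap = 200
): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
    start += chunkSize - overlap;
  }
  return chunks;
}
```

Each resulting chunk is what gets embedded into a 1536-dimension vector in step 5.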
Query Flow
When a user asks a question about their document, the question is embedded with the same model, the most similar chunks are retrieved from Pinecone via semantic search, and GPT-4 generates an answer grounded in that retrieved context.
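The retrieval step can be illustrated in isolation as a top-k cosine-similarity search over chunk vectors (a simplified, in-memory stand-in for the Pinecone query; the types and function names are illustrative):

```typescript
// Toy stand-in for the Pinecone query: rank stored chunk vectors by
// cosine similarity to the question vector and return the top-k texts.
type Chunk = { text: string; vector: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

export function topK(query: number[], chunks: Chunk[], k: number): string[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.vector) - cosine(query, x.vector))
    .slice(0, k)
    .map((c) => c.text);
}
```

The texts returned by the real query are concatenated into the GPT-4 prompt as context.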
Database Schema
The application uses three primary tables, covering chats, messages, and user data.
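The exact columns are not reproduced here; as a sketch, the row shapes might look like the following TypeScript interfaces (table and field names are assumptions inferred from the features described above — in the real codebase these would be Drizzle ORM table definitions):

```typescript
// Hypothetical row shapes for the three primary tables.
interface Chat {
  id: number;
  userId: string;       // Clerk user id
  pdfName: string;
  pdfUrl: string;       // S3 object URL
  fileKey: string;      // S3 key for the uploaded PDF
  createdAt: Date;
}

interface Message {
  id: number;
  chatId: number;       // references Chat.id
  role: "user" | "system";
  content: string;
  createdAt: Date;
}

interface UserSubscription {
  id: number;
  userId: string;
  stripeCustomerId: string;
  stripeSubscriptionId: string | null;
  stripeCurrentPeriodEnd: Date | null;
}
```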
Performance Optimizations
The system is optimized for both cold starts and sustained performance.
Edge Runtime
All API routes use export const runtime = "edge" to run on Vercel’s Edge Network, providing:
- Sub-100ms cold starts
- Global distribution
- Automatic scaling
- Lower costs compared to serverless functions
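A minimal route on the Edge Runtime looks like this (the file path and endpoint are hypothetical, not part of the documented codebase):

```typescript
// Hypothetical app/api/health/route.ts — opting a route into the
// Edge Runtime is a one-line export; the handler itself uses only
// web-standard Request/Response APIs.
export const runtime = "edge";

export async function GET(): Promise<Response> {
  return new Response(JSON.stringify({ ok: true }), {
    headers: { "content-type": "application/json" },
  });
}
```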
Streaming Responses
The chat API streams responses using the Vercel AI SDK.
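The underlying pattern can be sketched with web-standard streams (in the real route, the Vercel AI SDK pipes OpenAI tokens into the response; here a plain ReadableStream of encoded chunks stands in for the model output):

```typescript
// Simplified sketch of a streaming response: enqueue each token as an
// encoded chunk so the client can render text as it arrives instead of
// waiting for the full completion.
function streamTokens(tokens: string[]): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    start(controller) {
      for (const t of tokens) controller.enqueue(encoder.encode(t));
      controller.close();
    },
  });
  return new Response(stream, { headers: { "content-type": "text/plain" } });
}
```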
Pinecone Client Singleton
The Pinecone client is initialized once and reused across requests.
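The lazy-singleton pattern looks like this (the real code would construct a client from @pinecone-database/pinecone; a stand-in class is used here so the pattern stays self-contained):

```typescript
// Stand-in for the real Pinecone client class.
class FakePineconeClient {
  constructor(readonly apiKey: string) {}
}

let client: FakePineconeClient | null = null;

// Construct the client on first use only; every later call returns the
// same instance, avoiding per-request connection setup.
export function getPineconeClient(): FakePineconeClient {
  if (!client) {
    client = new FakePineconeClient(process.env.PINECONE_API_KEY ?? "test-key");
  }
  return client;
}
```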
Security Considerations
Authentication
Clerk middleware protects all authenticated routes.
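A typical setup follows Clerk's authMiddleware pattern (a sketch: the public route list and matcher are illustrative, and exact helper names depend on the Clerk version in use):

```typescript
// middleware.ts — protect everything except explicitly public routes.
import { authMiddleware } from "@clerk/nextjs";

export default authMiddleware({
  publicRoutes: ["/", "/sign-in", "/sign-up"],
});

export const config = {
  // Run the middleware on all routes except static assets.
  matcher: ["/((?!.+\\.[\\w]+$|_next).*)", "/", "/(api|trpc)(.*)"],
};
```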
Data Isolation
Pinecone namespaces ensure users can only access their own documents.
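One plausible scheme, sketched below, derives the namespace from the document's S3 file key so every query is scoped to a single document (the helper names are assumptions, not confirmed parts of the codebase):

```typescript
// Pinecone namespace names must be plain strings, so strip any
// non-ASCII characters from the S3 file key before using it.
export function convertToAscii(input: string): string {
  return input.replace(/[^\x00-\x7F]/g, "");
}

// Scoping every upsert and query to this namespace means one user's
// question can never match chunks from another user's document.
export function namespaceFor(fileKey: string): string {
  return convertToAscii(fileKey);
}
```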
Environment Variables
Sensitive credentials are stored as environment variables:
- PINECONE_API_KEY - Pinecone authentication
- OPEN_AI_KEY - OpenAI API access
- NEXT_PUBLIC_S3_ACCESS_KEY_ID - AWS S3 credentials
- NEXT_PUBLIC_S3_SECRET_ACCESS_KEY - AWS S3 secret
- Database connection strings
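A small guard that fails fast when a required variable is missing can make misconfiguration obvious at startup (this helper is an illustrative assumption, not part of the documented codebase):

```typescript
// Read a required environment variable, throwing immediately if it is
// unset so the deployment fails loudly instead of at first request.
export function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

Usage would look like `const pineconeKey = requireEnv("PINECONE_API_KEY");`.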
Scalability
The architecture supports horizontal scaling:
- Stateless API routes - No server-side session state
- Serverless vector database - Pinecone auto-scales
- Edge distribution - Globally distributed compute
- Object storage - S3 handles unlimited PDFs
The system can handle thousands of concurrent users and millions of documents with no architectural changes required.