# Project Overview
- Timeline: 24 hours from first commit to deployed demo
- Stack: Next.js 14, TypeScript, PostgreSQL, Anthropic Claude API
- Deployment: Vercel with serverless functions
- Source: `examples/decagon-assistant/`
## Key Features
- Real-time streaming responses via Claude API with token-by-token delivery
- Multi-turn context retention with sliding window + summarization
- Cross-session memory - assistant remembers user preferences and context
- Persistent conversation history stored in PostgreSQL
- Production-ready deployment on Vercel with proper error handling
## Product Requirements

### Product Statement

From `SPEC.md`:
> We are building a developer workflow copilot — a conversational AI assistant that feels humanlike and natural — for the Decagon AI challenge. The assistant is a full-stack Next.js application powered by the Anthropic Claude API, deployed publicly, featuring real-time streaming responses, persistent cross-session memory, and deep multi-turn context understanding.
### Success Criteria (Ranked)
1. Naturalness: Conversations feel indistinguishable from chatting with a sharp, friendly senior engineer
2. Context retention: Assistant remembers everything within a session and key facts across sessions
3. Utility: Genuinely helps developers with code review, debugging, architecture guidance
### Hard Limits
- Time budget: 24 hours from first commit to deployed demo
- Resources: Vercel hobby tier, Neon/Supabase free tier Postgres
- External services: Anthropic Claude API only (no other paid APIs)
- Runtime mode: Deployed publicly on Vercel with HTTPS
## Project Structure
## Acceptance Tests
From `SPEC.md`, the project defines objective, runnable acceptance criteria:
- Opening deployed URL shows chat interface within 2 seconds
- Sending “Hello, I’m working on a React project” then “Can you help debug useEffect?” produces response referencing React context without re-asking
- Refreshing page and asking “What were we talking about?” returns response referencing previous conversation
- Streaming tokens appear within 500ms (no full-response delay)
- 10-message conversation maintains context - 10th response references details from messages 1-3
- Sending code snippet with bug produces response identifying bug, explaining why, and providing correction
## Specification Documents

### SPEC.md - Complete Product Specification
**Scope Model:**

Must Have (7):
- Real-time streaming chat responses with incremental token delivery
- Multi-turn conversation with context window management
- Persistent conversation history in Postgres
- Cross-session user memory
- Dynamic system prompt with memory injection
- Polished chat UI with markdown and code highlighting
- Deployed to Vercel

Nice to Have:
- Conversation title auto-generation
- Code block copy button
- Keyboard shortcuts (Enter, Shift+Enter, Ctrl+N)
- Dark mode toggle
- Conversation search/filter

Out of Scope:
- User authentication / multi-user (single-user demo)
- Voice input/output
- File upload or image analysis
- Tool use / function calling
- Rate limiting (demo context)
- Analytics
- Mobile native app
### AGENTS.md - Execution Policies

**Key Conventions:**
- Chat endpoint uses `ReadableStream` and `TransformStream`
- Response content type: `text/event-stream` or `application/octet-stream`
- Client uses `fetch` with `response.body.getReader()` (no polling/WebSocket)
- Handle backpressure - client disconnect closes stream cleanly
- Each chunk must be a valid parseable unit
- Messages stored in Postgres immediately (not batched)
- Sliding window: last 20 messages verbatim + summary of older messages
- Memory extraction runs after each assistant response (async)
- Memory retrieval before system prompt assembly
- Memory storage: simple key-value `{ userId, key, value, updatedAt }`
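The streaming conventions above can be sketched end-to-end in plain Node 18+ (no Next.js required): a `TransformStream` wraps model tokens into SSE-style `data:` frames, and the consumer drains them with `getReader()`, mirroring the client-side `fetch` flow. The frame shape and token values here are illustrative assumptions, not the project's exact wire format.

```typescript
// Illustrative sketch: token stream -> SSE frames -> reader loop.
function sseFrames() {
  return new TransformStream<string, string>({
    transform(token, controller) {
      // Each enqueued chunk is a self-contained, parseable SSE event.
      controller.enqueue(`data: ${JSON.stringify({ token })}\n\n`);
    },
    flush(controller) {
      // Signal end-of-stream so the client can close cleanly.
      controller.enqueue("data: [DONE]\n\n");
    },
  });
}

async function demo(): Promise<string> {
  // Stand-in for tokens arriving from the model API.
  const source = new ReadableStream<string>({
    start(controller) {
      for (const t of ["Hello", ", ", "world"]) controller.enqueue(t);
      controller.close();
    },
  });

  // Client side: read frames incrementally, as with response.body.getReader().
  const reader = source.pipeThrough(sseFrames()).getReader();
  let out = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    out += value;
  }
  return out;
}

demo().then((out) => console.log(out));
```

In the real route handler the source stream would carry tokens from the Claude API, and the `Response` would set `Content-Type: text/event-stream`.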
## Architecture Decisions

From `DECISIONS.md`:
### Direct Anthropic SDK over Vercel AI SDK
- Date: 2026-02-14
- Decision: Use `@anthropic-ai/sdk` directly instead of Vercel AI SDK or LangChain
- Why: Full control over streaming, system prompts, and context window management. Vercel AI SDK abstracts away details needed for natural conversation tuning
- Alternatives: Vercel AI SDK (simpler but less control), LangChain (too heavy), raw fetch (too low-level)
- Status: active
### Next.js App Router over Pages Router
- Date: 2026-02-14
- Decision: Use Next.js 14 App Router with server components and route handlers
- Why: Route handlers provide native `ReadableStream` streaming. Server components reduce bundle size. Current standard for modern Next.js
- Alternatives: Pages Router (stable but older), Express + React SPA (more infra), Remix (less Vercel-native)
- Status: active
### Postgres with Prisma over In-Memory
- Date: 2026-02-14
- Decision: PostgreSQL (Neon) with Prisma ORM for all persistence
- Why: Cross-session memory requires durability. localStorage doesn’t survive incognito. In-memory doesn’t survive serverless cold starts. Prisma provides type-safe database access
- Alternatives: SQLite (no Vercel serverless), Supabase (heavier), Redis (not primary storage), Vercel KV (limited queries)
- Status: active
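A hypothetical Prisma schema matching the persistence described here - model and field names are illustrative assumptions, not the project's actual schema:

```prisma
// Illustrative sketch only; the real schema may differ.
model Conversation {
  id        String    @id @default(cuid())
  title     String?
  createdAt DateTime  @default(now())
  messages  Message[]
}

model Message {
  id             String       @id @default(cuid())
  role           String
  content        String
  createdAt      DateTime     @default(now())
  conversation   Conversation @relation(fields: [conversationId], references: [id])
  conversationId String
}

model Memory {
  userId    String
  key       String
  value     String
  updatedAt DateTime @updatedAt

  @@id([userId, key])
}
```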
### SSE-Style Streaming over WebSockets
- Date: 2026-02-14
- Decision: Server-sent events pattern (ReadableStream) instead of WebSockets
- Why: Vercel serverless doesn’t support persistent WebSocket connections. SSE works with native fetch API, streams through Vercel edge cleanly. Request-response pattern doesn’t need full duplex
- Alternatives: WebSocket via socket.io (Vercel incompatible), polling (bad UX), Edge Runtime WebSocket (experimental)
- Status: active
### Sliding Window + Summary for Context
- Date: 2026-02-14
- Decision: Keep last 20 messages verbatim, summarize older with Claude Haiku
- Why: Preserves conversational flow while managing context window. Haiku summarization is fast and cheap. 20 messages provides enough recency
- Alternatives: Full truncation (loses context), embedding retrieval (over-engineered), no summarization (loses early context)
- Status: active
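A minimal, self-contained sketch of this policy - the last 20 messages pass through verbatim and everything older collapses into one summary message. Here `summarize` is a placeholder for the Claude Haiku call; the message shape is an assumption.

```typescript
// Sliding-window context builder: keep recent messages, summarize the rest.
type Msg = { role: "user" | "assistant" | "system"; content: string };

const WINDOW = 20;

function summarize(older: Msg[]): string {
  // Placeholder: the real project summarizes older messages with Claude Haiku.
  return `Summary of ${older.length} earlier messages.`;
}

function buildContext(history: Msg[]): Msg[] {
  if (history.length <= WINDOW) return history;
  const older = history.slice(0, history.length - WINDOW);
  const recent = history.slice(-WINDOW);
  // One synthetic system message stands in for everything outside the window.
  return [{ role: "system", content: summarize(older) }, ...recent];
}

const history: Msg[] = Array.from({ length: 25 }, (_, i): Msg => ({
  role: i % 2 === 0 ? "user" : "assistant",
  content: `message ${i + 1}`,
}));

console.log(buildContext(history).length); // 21: one summary + last 20 verbatim
```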
### Key-Value Memory over Vector Store
- Date: 2026-02-14
- Decision: Store memories as simple key-value pairs in Postgres
- Why: Memory needs are structured (name, tech stack, preferences). Can inject all memories into system prompt. Vector search is overkill
- Alternatives: Pinecone/Weaviate (unnecessary), JSON blob (harder to query), pgvector (adds extension dependency)
- Status: active
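Because all memories fit in the system prompt, assembly reduces to string formatting. A sketch under assumed names - the record shape mirrors the `{ userId, key, value, updatedAt }` convention above, but the prompt wording is invented; the real project loads rows like these from Postgres.

```typescript
// Inject key-value memories into the system prompt before each request.
type Memory = { userId: string; key: string; value: string; updatedAt: Date };

function buildSystemPrompt(base: string, memories: Memory[]): string {
  if (memories.length === 0) return base;
  const facts = memories.map((m) => `- ${m.key}: ${m.value}`).join("\n");
  return `${base}\n\nKnown facts about this user:\n${facts}`;
}

const memories: Memory[] = [
  { userId: "u1", key: "name", value: "Sam", updatedAt: new Date() },
  { userId: "u1", key: "stack", value: "Next.js + Postgres", updatedAt: new Date() },
];

console.log(buildSystemPrompt("You are a developer workflow copilot.", memories));
```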
## Setup and Execution
### Prerequisites
### Quick Start
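The original setup commands were not preserved here; the following is an assumed sequence based on the stack described above (`DATABASE_URL` and `npx prisma db push` appear elsewhere in this doc; the env file layout and `ANTHROPIC_API_KEY` name are assumptions).

```bash
cd examples/decagon-assistant
npm install

# Configure environment (values are placeholders)
cat > .env <<'EOF'
ANTHROPIC_API_KEY=sk-ant-...
DATABASE_URL=postgres://user:pass@host/db
EOF

# Push the Prisma schema, then start the dev server
npx prisma db push
npm run dev
```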
### Run Development Server
### Verification
- Open http://localhost:3000, send message, verify streaming response within 500ms
- Send 3 messages in sequence, verify 3rd response references context from 1st
- Refresh page, send “What were we talking about?” - verify it recalls prior conversation
## Deployment

### Vercel Deployment

**Connect to Vercel:**
1. Go to Vercel dashboard
2. Import repository
3. Vercel auto-detects Next.js configuration
**Verify:**
Open deployed URL and run the demo flow:
1. Type “Hey, I’m working on a Next.js app with hydration errors”
2. Watch real-time streaming response
3. Follow up: “It happens when I use useEffect to fetch on mount”
4. Verify assistant builds on prior context
5. Send 3-4 more messages
6. Refresh browser
7. Ask “What were we working on?” - confirm memory retention
## Operational Runbook

### Monitoring

**Key Logs:**
- Vercel function logs: Dashboard > Project > Functions > Logs
- Local dev: stdout from `npm run dev`
- Database queries: set `DEBUG=prisma:query` in `.env`
**Key Metrics:**
- Anthropic API response latency (first token) - target < 500ms
- Database query latency - watch for connection pool exhaustion
- Vercel function duration - must stay under 10s (streaming keeps it alive)
- Error rate on `/api/chat` - any 500s indicate streaming/API issues
**Common Issues:**
- “Something went wrong” error → check Vercel logs for Anthropic API errors
- Messages load but no streaming → check the `Content-Type` header and `ReadableStream` usage
- Empty conversation history → database connection issue, verify `DATABASE_URL`
- Slow first message after idle → Vercel cold start + Prisma connection (expected 2-3s)
### Recovery Procedures

**Anthropic API Failure:** Check the Anthropic API status page.

**Database Failure:**
- Check Neon/Supabase dashboard for database status
- Verify `DATABASE_URL` in Vercel environment variables
- If connection pool exhausted: redeploy on Vercel to reset serverless instances
- If schema drift: run `npx prisma db push` against the production database
- Verify recovery: open URL, send message, confirm persistence after refresh
## Key Learnings

### What Worked Well

- Direct SDK control: Using `@anthropic-ai/sdk` directly provided precise control over streaming and prompts without framework abstractions
- Sliding window + summary: Balances context retention with token efficiency
- Prisma + Postgres: Type-safe database access that survives serverless cold starts
- SSE over WebSocket: Works seamlessly with Vercel serverless
### Challenges Addressed
- Vercel serverless cold starts: Mitigated with Prisma connection pooling and streaming to keep functions alive
- Context window management: Solved with 20-message verbatim window + Haiku summarization of older messages
- Memory persistence: Key-value model simpler and more maintainable than vector store for structured preferences
## Next Steps

- **Minecraft Browser Example** - See how Longshot handles 3D game development
- **Project Structure** - Learn the specification-driven approach
- **Swarm Execution** - Understand multi-agent coordination
- **Basic Template** - Start your own project from the template