UploadThing (File Storage)
UploadThing handles PDF file uploads and storage for Uxie.
Setup
- Go to UploadThing Dashboard
- Sign in and create a new app
- Copy your API token from the dashboard
- Add to `.env`:
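The original snippet isn't reproduced here; a sketch, assuming the token variable is named `UPLOADTHING_TOKEN` (recent UploadThing SDKs read a single token — check your dashboard and SDK version for the exact name):

```bash
# UploadThing API token (variable name assumed; verify against your SDK version)
UPLOADTHING_TOKEN="sk_live_..."
```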
Configuration
Uxie’s UploadThing configuration lives in `src/server/uploadthing.ts`:
- Maximum PDF file size per upload
- Number of files per upload request
- Only PDF files are accepted
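The limits above can be mirrored in a small validation helper. This is an illustrative sketch, not Uxie's actual code; the concrete size and count values are assumptions:

```typescript
// Hypothetical mirror of the upload limits (values are assumptions,
// not Uxie's real configuration).
interface UploadLimits {
  maxFileSizeMB: number;   // maximum PDF file size per upload
  maxFileCount: number;    // number of files per upload request
  acceptedTypes: string[]; // accepted MIME types
}

const pdfLimits: UploadLimits = {
  maxFileSizeMB: 16,
  maxFileCount: 1,
  acceptedTypes: ["application/pdf"],
};

interface IncomingFile {
  sizeBytes: number;
  mimeType: string;
}

// Returns null when the upload is allowed, or a rejection reason.
function validateUpload(files: IncomingFile[], limits: UploadLimits): string | null {
  if (files.length > limits.maxFileCount) return "too many files";
  for (const f of files) {
    if (!limits.acceptedTypes.includes(f.mimeType)) return "only PDF files are accepted";
    if (f.sizeBytes > limits.maxFileSizeMB * 1024 * 1024) return "file too large";
  }
  return null;
}
```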
Plan Limits
Upload limits are enforced based on the user’s subscription plan.
Upload Flow
- Middleware: Checks authentication and plan limits
- Upload: File is uploaded to UploadThing servers
- Processing: PDF is loaded and page count extracted
- Cover Generation: First page is converted to cover image
- Database: Document record is created in Prisma
UploadThing automatically handles CDN distribution and provides optimized file delivery.
Pinecone (Vector Database)
Pinecone stores document embeddings for semantic search and chat functionality.
Setup
- Go to Pinecone Console
- Sign up and create a new index
- Index configuration:
  - Name: `uxie`
  - Dimensions: `768` (for BAAI/bge-base-en-v1.5 embeddings)
  - Metric: `cosine`
  - Environment: choose your region (e.g., `us-east-1-aws`)
- Copy your API key
- Add to `.env`:
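The original snippet isn't reproduced here; a sketch with assumed variable names (match them against the code):

```bash
# Pinecone credentials (variable names assumed)
PINECONE_API_KEY="..."
PINECONE_INDEX="uxie"
```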
Integration
Uxie uses Pinecone through the LangChain integration.
Document Vectorization
When a document is uploaded:
- PDF is split into chunks using `RecursiveCharacterTextSplitter`
- Text chunks are embedded using Hugging Face embeddings
- Vectors are stored in Pinecone with document metadata
- Document is marked as `isVectorised: true` in the database
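LangChain's `RecursiveCharacterTextSplitter` splits on progressively finer separators (paragraphs, then sentences, then characters); a simplified fixed-window version conveys the size/overlap mechanics. The chunk size and overlap values here are assumptions:

```typescript
// Simplified character-window chunker. LangChain's
// RecursiveCharacterTextSplitter is smarter (it prefers paragraph and
// sentence boundaries), but the size/overlap mechanics are the same.
function chunkText(text: string, chunkSize = 1000, chunkOverlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - chunkOverlap;
  }
  return chunks;
}
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of storing some text twice.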
Metadata Structure
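The original metadata snippet isn't reproduced here. A plausible shape, inferred from the vectorization flow above (all field names are assumptions):

```typescript
// Hypothetical metadata attached to each vector in Pinecone
// (field names are assumptions, not Uxie's actual schema).
interface ChunkMetadata {
  documentId: string; // Prisma document record id
  pageNumber: number; // page the chunk came from
  text: string;       // raw chunk text, used to build chat context
}

const example: ChunkMetadata = {
  documentId: "doc_123",
  pageNumber: 4,
  text: "…chunk contents…",
};
```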
Hugging Face (Embeddings)
Hugging Face provides text embeddings for document vectorization.
Setup
- Go to Hugging Face
- Sign up and go to Settings > Tokens
- Create a new token with read access
- Add to `.env`:
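The original snippet isn't reproduced here; a sketch with an assumed variable name:

```bash
# Hugging Face access token with read scope (variable name assumed)
HUGGINGFACE_API_KEY="hf_..."
```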
Embedding Model
Uxie uses the BAAI/bge-base-en-v1.5 model:
- Dimensions: 768
- Max sequence length: 512 tokens
- Performance: High-quality embeddings for semantic search
- Use case: Document vectorization and similarity search
Implementation
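The actual implementation isn't shown above. As a sketch, an embedding call to the Hugging Face Inference API reduces to a POST with the text inputs; the endpoint path and helper names here are assumptions to verify against the current HF docs. Cosine similarity is included because it is the metric the Pinecone index is configured with:

```typescript
// Builds a request for the Hugging Face Inference API
// (endpoint path is an assumption; check the current HF docs).
const HF_MODEL = "BAAI/bge-base-en-v1.5";
const HF_URL = `https://api-inference.huggingface.co/models/${HF_MODEL}`;

function buildEmbeddingRequest(texts: string[], apiKey: string) {
  return {
    url: HF_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ inputs: texts }),
    },
  };
}

// Cosine similarity over embedding vectors — the metric used by the
// Pinecone index (dimension 768 for this model).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```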
Hugging Face Inference API has rate limits on the free tier. For production use, consider upgrading or hosting your own embedding model.
Google Gemini (AI Models)
Google’s Gemini models power chat, summarization, and flashcard generation.
Setup
- Go to Google AI Studio
- Create a new API key
- Add to `.env`:
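The original snippet isn't reproduced here; a sketch with an assumed variable name:

```bash
# Google AI Studio API key (variable name assumed)
GOOGLE_API_KEY="..."
```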
Models Used
Gemini 2.5 Flash
Used for:
- Document chat (`/api/chat/route.ts`)
- Document summarization (`/lib/summarize.ts`)
- Flashcard generation (`/lib/flashcard.ts`)
- Text completion (`/api/completion/route.ts`)

Strengths:
- Fast inference
- Cost-effective
- Good for real-time interactions
Gemini 1.5 Pro
Used for:
- Flashcard answer evaluation (`/api/evaluate/route.ts`)

Strengths:
- Higher reasoning capability
- Better for complex analysis
- More accurate evaluation
Usage Example
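The original example isn't reproduced here. As a sketch against the Gemini REST API's `generateContent` route (treat the endpoint details as assumptions to verify against Google's docs):

```typescript
// Builds the JSON body for a Gemini generateContent REST call.
function buildGeminiBody(prompt: string) {
  return { contents: [{ role: "user", parts: [{ text: prompt }] }] };
}

// Sends the request (not executed here; requires a valid API key).
async function generate(prompt: string, apiKey: string): Promise<string> {
  const url =
    `https://generativelanguage.googleapis.com/v1beta/models/` +
    `gemini-2.5-flash:generateContent?key=${apiKey}`;
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGeminiBody(prompt)),
  });
  const data = await res.json();
  // Extract the first candidate's text, if any.
  return data.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```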
Gemini API has usage quotas. Monitor your usage in Google AI Studio and upgrade if needed.
Liveblocks (Real-time Collaboration)
Liveblocks enables real-time collaboration features for document annotations and sharing.
Setup
- Go to Liveblocks Dashboard
- Create a new project
- Copy your public API key (not secret key)
- Add to `.env`:
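The original snippet isn't reproduced here; a sketch with an assumed variable name (the `NEXT_PUBLIC_` prefix is what makes it available to the client in Next.js):

```bash
# Liveblocks public key — safe to expose to the client (variable name assumed)
NEXT_PUBLIC_LIVEBLOCKS_PUBLIC_KEY="pk_..."
```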
Configuration
Liveblocks is configured in `liveblocks.config.ts`:
Features Enabled
- Presence: See who else is viewing the document
- Live cursors: Track user positions in real-time
- Collaboration: Multiple users editing annotations
- Comments: Real-time commenting on documents
Usage
Liveblocks automatically handles connection management, reconnection, and conflict resolution.
Service Comparison
| Service | Purpose | Pricing Tier | Rate Limits |
|---|---|---|---|
| UploadThing | File storage | Free: 2GB storage | 100 uploads/month (free) |
| Pinecone | Vector database | Free: 1 index | 100K vectors (free) |
| Hugging Face | Embeddings | Free API | Rate limited |
| Google Gemini | AI models | Pay per token | Generous free quota |
| Liveblocks | Real-time sync | Free: 100 MAU | Unlimited rooms (free) |
MAU = Monthly Active Users. Liveblocks counts users who connect to a room in a given month.
Cost Optimization Tips
UploadThing
- Compress PDFs before upload when possible
- Delete unused documents to free up storage
- Consider upgrading for production use
Pinecone
- Use namespace filtering to reduce vector count
- Delete old document vectors when documents are removed
- Monitor index size in Pinecone dashboard
Hugging Face
- Cache embeddings for frequently accessed documents
- Batch embedding requests when possible
- Consider self-hosting for high-volume use
Google Gemini
- Use Gemini 2.5 Flash for most operations (cheaper)
- Reserve Gemini 1.5 Pro for complex tasks
- Implement token limits on user inputs
- Cache common responses when appropriate
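The token-limit tip above can be approximated client-side with the common ~4 characters-per-token heuristic. This is an estimate only, not Gemini's actual tokenizer:

```typescript
// Rough token estimate: ~4 characters per token (heuristic only;
// real tokenizer counts vary by language and content).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Truncates user input to stay under a token budget before it is
// sent to the model, capping per-request cost.
function clampToTokenBudget(text: string, maxTokens: number): string {
  const maxChars = maxTokens * 4;
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}
```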
Liveblocks
- Clean up inactive rooms periodically
- Use presence awareness to optimize updates
- Monitor MAU usage in dashboard
Environment Variables Summary
.env
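The original summary block isn't reproduced here. A consolidated sketch of the variables referenced on this page (all names are assumptions to check against the code):

```bash
# UploadThing
UPLOADTHING_TOKEN="..."

# Pinecone
PINECONE_API_KEY="..."
PINECONE_INDEX="uxie"

# Hugging Face
HUGGINGFACE_API_KEY="hf_..."

# Google Gemini
GOOGLE_API_KEY="..."

# Liveblocks (public key, exposed to the client)
NEXT_PUBLIC_LIVEBLOCKS_PUBLIC_KEY="pk_..."
```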
