Endpoint
Authentication
Requires a valid Clerk session. The endpoint extracts the userId from the authentication context.
Request Body
file_key: The S3 file key for the uploaded PDF. This should be the key returned after uploading a file to S3.
Example: "uploads/user123/document-abc123.pdf"
file_name: The display name for the PDF file. This will be shown to users in the chat interface.
Example: "Q4 Financial Report.pdf"
Response
chat_id: The unique identifier for the newly created chat session. Use this ID for subsequent chat operations.
Success Response (200)
Error Responses
Error message describing what went wrong
401 Unauthorized
Returned when the user is not authenticated.
500 Internal Server Error
Returned when PDF processing or database operations fail.
Example Request
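A minimal client-side sketch of the request, under stated assumptions: the route is passed in as `endpointUrl` because it is not given here, and the POST method, the JSON content type, and the numeric `chat_id` are inferred from the request body and schema descriptions rather than confirmed. The helper names are hypothetical.

```typescript
// Shapes inferred from the documented fields; treat them as assumptions.
interface CreateChatRequest {
  file_key: string;
  file_name: string;
}

interface CreateChatResponse {
  chat_id: number; // assumed numeric, matching the integer id in the schema
}

interface CreateChatInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the fetch options separately so they can be inspected without a network call.
function buildCreateChatInit(body: CreateChatRequest): CreateChatInit {
  return {
    method: "POST", // assumption: the endpoint accepts a JSON body via POST
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// endpointUrl is supplied by the caller; the Clerk session is assumed to travel
// with the request (for example as a same-origin cookie).
async function createChat(
  endpointUrl: string,
  body: CreateChatRequest
): Promise<CreateChatResponse> {
  const res = await fetch(endpointUrl, buildCreateChatInit(body));
  if (res.status === 401) {
    throw new Error("Not authenticated: sign in before creating a chat");
  }
  if (!res.ok) {
    throw new Error(`Create chat failed with status ${res.status}`);
  }
  return (await res.json()) as CreateChatResponse;
}
```

Because processing can take a while, callers would typically show a loading state while awaiting `createChat`.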
Example Response
How It Works
- Authentication Check: Verifies the user is authenticated via Clerk
- PDF Processing: Downloads the PDF from S3 using the file_key
- Text Extraction: Extracts text content from the PDF
- Chunking: Splits the text into manageable chunks for embedding
- Embedding: Generates vector embeddings using OpenAI’s embedding model
- Vector Storage: Stores embeddings in Pinecone for semantic search
- Database Record: Creates a chat record in the database with:
  - fileKey: S3 file key
  - pdfName: Display name
  - pdfUrl: Public S3 URL for the PDF
  - userId: Authenticated user’s ID
- Response: Returns the newly created chat ID
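The chunking step can be illustrated with a generic sketch. The chunking strategy used by this endpoint is not stated here, so the fixed-size character window with overlap below is an assumption, and `chunkText` and its default sizes are hypothetical.

```typescript
// Split text into fixed-size character windows; consecutive windows share
// `overlap` characters so that sentences cut at a boundary keep some context.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}
```

Each chunk would then be embedded and stored individually. Production pipelines often split on sentence or paragraph boundaries instead of raw character offsets.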
The PDF processing and embedding generation may take 10-30 seconds depending on document size. Consider implementing a loading state or webhook for completion notifications.
Chat Record Schema
When a chat is created, the following data is stored:

| Field | Type | Description |
|---|---|---|
| id | integer | Unique chat identifier (auto-generated) |
| pdfName | string | Display name of the PDF |
| pdfUrl | string | S3 URL for accessing the PDF |
| fileKey | string | S3 file key |
| userId | string | Clerk user ID (max 255 chars) |
| createdAt | timestamp | Chat creation timestamp |
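For client or server code that handles these records, the schema could be mirrored in a type like the one below. This is a sketch: `ChatRecord`, `NewChat`, and `validateNewChat` are hypothetical names, and treating `createdAt` as set by the database is an assumption based on it not being among the fields written at creation.

```typescript
// TypeScript mirror of the chat record schema.
interface ChatRecord {
  id: number;        // auto-generated
  pdfName: string;
  pdfUrl: string;
  fileKey: string;
  userId: string;    // Clerk user ID, at most 255 characters
  createdAt: Date;   // assumed to be populated by the database
}

// Fields the caller supplies when a chat is created.
type NewChat = Omit<ChatRecord, "id" | "createdAt">;

// Return a list of problems; an empty list means the record looks insertable.
function validateNewChat(chat: NewChat): string[] {
  const problems: string[] = [];
  for (const field of ["pdfName", "pdfUrl", "fileKey", "userId"] as const) {
    if (chat[field].trim() === "") {
      problems.push(`${field} must not be empty`);
    }
  }
  if (chat.userId.length > 255) {
    problems.push("userId exceeds 255 characters");
  }
  return problems;
}
```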
Prerequisites
Before calling this endpoint:
- User Authentication: Ensure the user is signed in via Clerk
- PDF Upload: Upload the PDF to S3 and obtain the file_key
- S3 Configuration: Verify S3 bucket permissions allow the API to read the file
- Pinecone Setup: Ensure Pinecone index is configured and accessible
Best Practices
- Validate the PDF file before uploading to S3
- Use descriptive file_name values for better UX
- Store the returned chat_id in your application state
- Implement error handling for failed PDF processing
- Consider file size limits (recommended maximum: 50 MB per PDF)
- Show upload progress and processing status to users
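The first and fifth practices above can be combined into a small pre-upload check. This is a sketch, not part of the API: `checkPdf` is a hypothetical helper, the 50 MB limit is read as 50 × 1024 × 1024 bytes, and the only content check is the `%PDF-` signature that PDF files begin with.

```typescript
const MAX_PDF_BYTES = 50 * 1024 * 1024; // recommended 50 MB limit, read as MiB

interface PdfCheck {
  ok: boolean;
  reason?: string;
}

// `header` holds the first bytes of the file; `sizeBytes` is the full file size.
function checkPdf(header: Uint8Array, sizeBytes: number): PdfCheck {
  if (sizeBytes <= 0) {
    return { ok: false, reason: "File is empty" };
  }
  if (sizeBytes > MAX_PDF_BYTES) {
    return { ok: false, reason: "File exceeds the 50 MB limit" };
  }
  const signature = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  const hasSignature = signature.every((byte, i) => header[i] === byte);
  if (!hasSignature) {
    return { ok: false, reason: "File does not start with the PDF signature" };
  }
  return { ok: true };
}
```

In a browser, the header bytes could be read with `file.slice(0, 5).arrayBuffer()` and the size taken from `file.size`. A passing check does not guarantee the PDF is well formed, so server-side error handling is still needed.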
Related Operations
After creating a chat:
- Use the returned chat_id to send messages via /api/chat
- Retrieve message history via /api/get-messages
- Display the PDF using the stored pdfUrl