The AI Chat feature enables real-time conversations with your PDF documents using OpenAI's GPT-4 model. Responses stream token by token, and the AI provides context-aware answers by retrieving relevant sections of your document with Pinecone vector search.
The AI retrieves relevant context from your PDF using vector similarity search:
src/lib/context.ts
```typescript
import { getEmbeddings } from "./embeddings";
import { getMatchesFromEmbeddings } from "./pinecone";

export async function getContext(query: string, fileKey: string) {
  // Embed the user's question, then fetch the most similar chunks
  // (the top 5 nearest vectors) from Pinecone for this file
  const queryEmbeddings = await getEmbeddings(query);
  const matches = await getMatchesFromEmbeddings(queryEmbeddings, fileKey);

  // Keep only chunks with a similarity score above 0.7
  const qualifyingDocs = matches.filter(
    (match) => match.score && match.score > 0.7
  );

  type Metadata = {
    text: string;
    pageNumber: number;
  };

  const docs = qualifyingDocs.map((match) => (match.metadata as Metadata).text);

  // Join the qualifying chunks, capped at 3,000 characters
  return docs.join("\n").substring(0, 3000);
}
```
The system retrieves the five most similar document chunks and keeps only those with a similarity score above 0.7 to ensure relevance. The combined context is capped at 3,000 characters so it fits efficiently within GPT-4's context window.
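To see the score threshold and character cap in isolation, here is a minimal, self-contained sketch of the same filtering logic with invented sample matches (the `Match` shape and sample data are illustrative, not taken from the codebase):

```typescript
type Match = {
  score?: number;
  metadata?: { text: string; pageNumber: number };
};

// Keep chunks scoring above the threshold, join them, and cap the length
function buildContext(matches: Match[], maxChars = 3000): string {
  const qualifying = matches.filter((m) => m.score !== undefined && m.score > 0.7);
  const docs = qualifying.map((m) => m.metadata?.text ?? "");
  return docs.join("\n").substring(0, maxChars);
}

// Invented sample data: only the first chunk clears the 0.7 threshold
const sample: Match[] = [
  { score: 0.91, metadata: { text: "Chapter 1 summary", pageNumber: 1 } },
  { score: 0.55, metadata: { text: "Unrelated footer text", pageNumber: 9 } },
];
console.log(buildContext(sample)); // "Chapter 1 summary"
```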
The chat API uses OpenAI’s streaming capabilities via Vercel Edge Runtime:
src/app/api/chat/route.ts
```typescript
import { Configuration, OpenAIApi } from "openai-edge";
import { Message, OpenAIStream, StreamingTextResponse } from "ai";
import { NextRequest, NextResponse } from "next/server";
import { eq } from "drizzle-orm";
import { db } from "@/lib/db";
import { chats, messages as dbMessages } from "@/lib/db/schema";
import { getContext } from "@/lib/context";

export const runtime = "edge";

const config = new Configuration({
  apiKey: process.env.OPEN_AI_KEY,
});
const openai = new OpenAIApi(config);

export async function POST(req: NextRequest) {
  try {
    const { messages, chatId } = await req.json();

    // Verify the chat exists before answering
    const _chats = await db.select().from(chats).where(eq(chats.id, chatId));
    if (_chats.length !== 1) {
      return NextResponse.json({ error: "chat not found" }, { status: 404 });
    }
    const fileKey = _chats[0].fileKey;

    // Retrieve document context relevant to the latest message
    const lastMessage = messages[messages.length - 1];
    const context = await getContext(lastMessage.content, fileKey);

    const prompt = {
      role: "system",
      content: `AI assistant is a brand new, powerful, human-like artificial intelligence.
      The traits of AI include expert knowledge, helpfulness, cleverness, and articulateness.
      AI is a well-behaved and well-mannered individual.
      AI is always friendly, kind, and inspiring, and he is eager to provide vivid and thoughtful responses to the user.
      AI has the sum of all knowledge in their brain, and is able to accurately answer nearly any question about any topic in conversation.
      AI assistant is a big fan of Pinecone and Vercel.
      START CONTEXT BLOCK
      ${context}
      END OF CONTEXT BLOCK
      AI assistant will take into account any CONTEXT BLOCK that is provided in a conversation.
      If the context does not provide the answer to the question, the AI assistant will say, "I'm sorry, but I don't know the answer to that question".
      AI assistant will not apologize for previous responses, but instead will indicate that new information was gained.
      AI assistant will not invent anything that is not drawn directly from the context.`,
    };

    const response = await openai.createChatCompletion({
      model: "gpt-4-1106-preview",
      messages: [
        prompt,
        ...messages.filter((message: Message) => message.role === "user"),
      ],
      stream: true,
    });

    const stream = OpenAIStream(response, {
      onStart: async () => {
        // Save the user's message to the database
        await db.insert(dbMessages).values({
          chatId,
          content: lastMessage.content,
          role: "user",
        });
      },
      onCompletion: async (completion) => {
        // Save the AI's full response to the database
        await db.insert(dbMessages).values({
          chatId,
          content: completion,
          role: "system",
        });
      },
    });

    return new StreamingTextResponse(stream);
  } catch (error) {
    return NextResponse.json(
      { error: "internal server error" },
      { status: 500 }
    );
  }
}
```
The API runs on Vercel's Edge Runtime for low-latency streaming responses. Messages are persisted at both ends of the stream: the user's message when streaming begins (onStart) and the AI's complete response when it finishes (onCompletion).
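The save-on-start / save-on-completion flow can be illustrated with a small stand-in for the stream's callback contract (this is illustrative only, not the Vercel AI SDK's implementation): onStart fires once before any tokens are emitted, and onCompletion fires once with the full text after the stream ends.

```typescript
type StreamCallbacks = {
  onStart?: () => Promise<void>;
  onCompletion?: (completion: string) => Promise<void>;
};

// Minimal stand-in for OpenAIStream's callback ordering (illustration only)
async function streamWithCallbacks(
  chunks: string[],
  callbacks: StreamCallbacks
): Promise<string> {
  await callbacks.onStart?.(); // e.g. persist the user's message
  let full = "";
  for (const chunk of chunks) {
    full += chunk; // each chunk would be flushed to the client here
  }
  await callbacks.onCompletion?.(full); // e.g. persist the AI's response
  return full;
}

// Usage: record what would be written to the database, in order
(async () => {
  const saved: string[] = [];
  await streamWithCallbacks(["Hel", "lo"], {
    onStart: async () => { saved.push("user message"); },
    onCompletion: async (c) => { saved.push(`assistant: ${c}`); },
  });
  console.log(saved); // ["user message", "assistant: Hello"]
})();
```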
The AI is configured with specific behavioral guidelines:
- **Context-bound responses**: Only answers based on the provided document context
- **Honest limitations**: States when information isn't available in the context
- **No hallucination**: Never invents information not present in the document
- **Incremental learning**: Acknowledges new information rather than apologizing
If your question cannot be answered from the PDF content, the AI will respond with “I’m sorry, but I don’t know the answer to that question” rather than making up information.
The AI performs best with specific, focused questions. Instead of “What is this document about?”, try “What are the key findings in section 3?” or “How does the author define X?”
Reference page numbers
Since the context includes page numbers in metadata, you can ask questions like “What does page 5 say about…?” for more targeted responses.
Follow-up questions
The chat maintains conversation history, so you can ask follow-up questions that reference previous answers.
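Because the full message array is posted to the API on every turn, conversation history can be sketched as a simple client-side accumulator. This is a hand-rolled illustration; the real app most likely relies on a hook such as the Vercel AI SDK's `useChat` for this (an assumption), and the `ChatHistory` class below is invented for demonstration.

```typescript
type Message = { role: "user" | "assistant" | "system"; content: string };

// Hypothetical client-side history accumulator (illustration only)
class ChatHistory {
  private messages: Message[] = [];

  addUser(content: string) {
    this.messages.push({ role: "user", content });
  }

  addAssistant(content: string) {
    this.messages.push({ role: "assistant", content });
  }

  // Body posted to /api/chat on every turn: the full history plus the
  // chat id, so follow-up questions can reference earlier answers
  requestBody(chatId: number): string {
    return JSON.stringify({ chatId, messages: this.messages });
  }
}

// Usage: a follow-up question carries the earlier exchange with it
const history = new ChatHistory();
history.addUser("What are the key findings in section 3?");
history.addAssistant("Section 3 reports three main findings...");
history.addUser("Can you expand on the second one?");
console.log(JSON.parse(history.requestBody(42)).messages.length); // 3
```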