The RAG Chat feature provides a persistent, session-based chat interface similar to ChatGPT, where users can have multi-turn conversations with AI that’s grounded in uploaded knowledge base documents. Unlike the stateless knowledge base query API, RAG Chat maintains conversation history and context across messages.

Overview

RAG Chat sessions allow users to:
  • Create persistent chat sessions with custom titles
  • Have multi-turn conversations with full message history
  • Pin important sessions for quick access
  • Switch between different knowledge bases mid-conversation
  • Stream AI responses in real time over Server-Sent Events (SSE)
  • Manage and organize chat sessions

Persistent Context

Full conversation history maintained across sessions

Knowledge Base Integration

Query multiple knowledge bases with vector search

Real-time Streaming

Server-Sent Events for typewriter-style responses

Session Management

Organize chats with titles, pinning, and deletion

Session Lifecycle

Creating a Session

Create a new RAG Chat session with initial knowledge bases:
interface CreateSessionRequest {
  knowledgeBaseIds: number[];  // Knowledge bases to query
  title?: string;               // Optional custom title
}

// Example
const response = await fetch('http://localhost:8080/api/rag-chat/sessions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    knowledgeBaseIds: [1, 2],
    title: 'Java Spring Boot Questions'
  })
});

const { data } = await response.json();
// data: { sessionId: 123, title: 'Java Spring Boot Questions', ... }

Sending Messages

Send a message and receive a streaming AI response:
const sessionId = 123;
const question = "How does Spring AI handle embeddings?";

// Create EventSource for SSE
const eventSource = new EventSource(
  `http://localhost:8080/api/rag-chat/sessions/${sessionId}/messages/stream?` +
  new URLSearchParams({ question })
);

let fullResponse = '';

eventSource.onmessage = (event) => {
  const chunk = event.data.replace(/\\n/g, '\n').replace(/\\r/g, '\r');
  fullResponse += chunk;
  console.log('Chunk:', chunk);
};

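eventSource.onerror = (error) => {
  // Also fires when the server closes the stream after the last chunk.
  // Closing here stops EventSource from reconnecting and re-sending the question.
  console.error('Stream error:', error);
  eventSource.close();
};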

// EventSource doesn't emit a 'complete' event, so track readyState
const checkComplete = setInterval(() => {
  if (eventSource.readyState === EventSource.CLOSED) {
    clearInterval(checkComplete);
    console.log('Full response:', fullResponse);
  }
}, 100);
The streaming endpoint uses Server-Sent Events (SSE) with escaped newlines: \n is sent as \\n and \r as \\r. Unescape each chunk before displaying it.
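The unescaping step can be pulled into a small helper. This is a minimal sketch: `unescapeSseChunk` and `appendChunk` are hypothetical names, not part of the API. It also assumes the backend escapes only newlines and carriage returns as described above, so a literal backslash followed by `n` in the model output cannot be told apart from an escaped newline.

```typescript
// Hypothetical helper: reverses the \n -> \\n and \r -> \\r escaping
// applied by the streaming endpoint.
export function unescapeSseChunk(raw: string): string {
  // A single pass rewrites each escape sequence exactly once.
  return raw.replace(/\\([nr])/g, (_match: string, ch: string) =>
    ch === 'n' ? '\n' : '\r'
  );
}

// Hypothetical helper: appends one unescaped chunk to the text received so far.
export function appendChunk(current: string, rawChunk: string): string {
  return current + unescapeSseChunk(rawChunk);
}
```

In the `onmessage` handler, `fullResponse = appendChunk(fullResponse, event.data)` would replace the two chained `replace` calls.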

Managing Sessions

List All Sessions

curl 'http://localhost:8080/api/rag-chat/sessions'

Response:
{
  "code": 200,
  "message": "success",
  "data": [
    {
      "sessionId": 123,
      "title": "Java Spring Boot Questions",
      "isPinned": true,
      "messageCount": 15,
      "lastMessageAt": "2024-03-10T10:30:00",
      "createdAt": "2024-03-10T09:00:00"
    }
  ]
}
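The list response above can be typed on the client as follows. This is a sketch based only on the fields shown in the sample payload. The interface names and `sortSessions` are hypothetical, and the ordering (pinned first, then most recent activity) is a client-side assumption, since the server's ordering is not documented here.

```typescript
// Envelope shape inferred from the sample responses; `message` is optional
// because not every sample includes it.
interface ApiResponse<T> {
  code: number;
  message?: string;
  data: T;
}

// Fields taken from the sample list payload.
interface SessionSummary {
  sessionId: number;
  title: string;
  isPinned: boolean;
  messageCount: number;
  lastMessageAt: string; // ISO-8601 local timestamp, as in the sample
  createdAt: string;
}

// Hypothetical client-side ordering: pinned sessions first, then newest activity.
export function sortSessions(sessions: SessionSummary[]): SessionSummary[] {
  return [...sessions].sort((a, b) => {
    if (a.isPinned !== b.isPinned) return a.isPinned ? -1 : 1;
    // Timestamps in the same ISO format compare correctly as strings.
    return b.lastMessageAt.localeCompare(a.lastMessageAt);
  });
}

// Unwraps the envelope and returns sessions ready for a sidebar.
export function sessionsFromResponse(res: ApiResponse<SessionSummary[]>): SessionSummary[] {
  return sortSessions(res.data);
}
```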

Get Session Details

Retrieve full conversation history:
curl 'http://localhost:8080/api/rag-chat/sessions/123'

Response:
{
  "code": 200,
  "data": {
    "sessionId": 123,
    "title": "Java Spring Boot Questions",
    "isPinned": true,
    "knowledgeBaseIds": [1, 2],
    "messages": [
      {
        "messageId": 1,
        "role": "user",
        "content": "How does Spring AI work?",
        "createdAt": "2024-03-10T09:05:00"
      },
      {
        "messageId": 2,
        "role": "assistant",
        "content": "Spring AI provides...",
        "createdAt": "2024-03-10T09:05:15"
      }
    ]
  }
}
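When rendering the history, it can help to group the flat `messages` array into question/answer turns. The sketch below assumes messages arrive in chronological order and that roles are limited to `user` and `assistant`, as in the sample. `ChatMessage`, `Turn`, and `toTurns` are hypothetical names.

```typescript
// Fields taken from the sample session detail payload.
interface ChatMessage {
  messageId: number;
  role: 'user' | 'assistant';
  content: string;
  createdAt: string;
}

// One user question and, if it exists yet, the assistant's answer.
interface Turn {
  question: ChatMessage;
  answer?: ChatMessage;
}

// Pairs each user message with the assistant message that follows it.
// An assistant message with no open question before it is skipped.
export function toTurns(messages: ChatMessage[]): Turn[] {
  const turns: Turn[] = [];
  for (const message of messages) {
    if (message.role === 'user') {
      turns.push({ question: message });
      continue;
    }
    const last = turns[turns.length - 1];
    if (last && !last.answer) {
      last.answer = message;
    }
  }
  return turns;
}
```

A turn without an `answer` corresponds to a question whose response is still streaming or was never completed.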

Update Session Title

curl -X PUT 'http://localhost:8080/api/rag-chat/sessions/123/title' \
  -H 'Content-Type: application/json' \
  -d '{"title":"Spring AI Deep Dive"}'

Pin/Unpin Session

Toggle pin status for quick access:
curl -X PUT 'http://localhost:8080/api/rag-chat/sessions/123/pin'

Switch Knowledge Bases

Change which knowledge bases the session queries:
curl -X PUT 'http://localhost:8080/api/rag-chat/sessions/123/knowledge-bases' \
  -H 'Content-Type: application/json' \
  -d '{"knowledgeBaseIds":[3,4,5]}'

Delete Session

curl -X DELETE 'http://localhost:8080/api/rag-chat/sessions/123'
Deleting a session permanently removes all messages and conversation history.
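The four management calls above share one URL pattern, so a client could describe them with a single request builder. This is a minimal sketch: `SessionAction`, `RequestSpec`, and `buildSessionRequest` are hypothetical names, and the base URL is the local development address used in the curl examples.

```typescript
// Hypothetical union of the management operations shown in the curl examples.
type SessionAction =
  | { kind: 'rename'; title: string }
  | { kind: 'togglePin' }
  | { kind: 'switchKnowledgeBases'; knowledgeBaseIds: number[] }
  | { kind: 'delete' };

interface RequestSpec {
  method: 'PUT' | 'DELETE';
  url: string;
  body?: string; // JSON string, present only when the endpoint takes a payload
}

// Assumption: same host and port as the curl examples.
const BASE_URL = 'http://localhost:8080/api/rag-chat/sessions';

// Maps an action to the HTTP method, URL, and body used in the curl examples.
export function buildSessionRequest(sessionId: number, action: SessionAction): RequestSpec {
  const base = `${BASE_URL}/${sessionId}`;
  switch (action.kind) {
    case 'rename':
      return { method: 'PUT', url: `${base}/title`, body: JSON.stringify({ title: action.title }) };
    case 'togglePin':
      return { method: 'PUT', url: `${base}/pin` };
    case 'switchKnowledgeBases':
      return {
        method: 'PUT',
        url: `${base}/knowledge-bases`,
        body: JSON.stringify({ knowledgeBaseIds: action.knowledgeBaseIds }),
      };
    case 'delete':
      return { method: 'DELETE', url: base };
  }
}
```

The returned spec can be passed to `fetch(spec.url, { method: spec.method, body: spec.body, headers: ... })`, setting `Content-Type: application/json` whenever `body` is present.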

Streaming Implementation Details

  1. Request: Client sends the question to the /messages/stream endpoint
  2. Prepare: Backend saves user message and creates AI message placeholder
  3. Stream: AI response chunks streamed as SSE events with escaped newlines
  4. Complete: After stream ends, full response saved to database
  5. Error Handling: If stream fails, partial content still saved
SSE Format:
data: This is a chunk\\nwith escaped newlines

data: Next chunk here

The backend escapes \n as \\n and \r as \\r. SSE uses line breaks to delimit fields and events, so a raw newline inside a chunk would split it across events.
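For clients that read the stream manually (for example through `fetch` and a stream reader rather than `EventSource`), the format above can be split into events with a small buffer parser. This is a sketch under stated assumptions: `parseSseBuffer` is a hypothetical name, only `data:` fields are handled, and lone `\r` line terminators are not supported.

```typescript
export interface SseParseResult {
  events: string[]; // data payloads of the complete events found in the buffer
  rest: string;     // trailing incomplete event, to be prepended to the next read
}

// Splits buffered SSE text into complete events separated by a blank line.
export function parseSseBuffer(buffer: string): SseParseResult {
  const normalized = buffer.replace(/\r\n/g, '\n');
  const frames = normalized.split('\n\n');
  // The last piece has not been terminated by a blank line yet.
  const rest = frames.pop() ?? '';
  const events: string[] = [];
  for (const frame of frames) {
    const dataLines = frame
      .split('\n')
      .filter((line) => line.startsWith('data:'))
      .map((line) => {
        const value = line.slice('data:'.length);
        // SSE strips a single leading space after the colon.
        return value.startsWith(' ') ? value.slice(1) : value;
      });
    if (dataLines.length > 0) {
      // Multiple data lines in one event are joined with a newline.
      events.push(dataLines.join('\n'));
    }
  }
  return { events, rest };
}
```

Each returned payload still contains the escaped `\\n` and `\\r` sequences and needs unescaping before display.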
  • User messages are saved as soon as the request is received
  • AI messages are created as placeholders before streaming starts
  • The placeholder is updated with the full AI response after the stream completes
  • Partial responses are saved even if the stream errors
  • Message IDs are returned by the session detail endpoint

Use Cases

Research & Learning

Students exploring documentation with follow-up questions and clarifications

Technical Support

Engineers querying internal knowledge bases with conversation context

Documentation Q&A

Developers asking questions about codebases with multi-turn debugging

Interview Prep

Job seekers practicing with knowledge base of common interview questions

Comparison with Knowledge Base Query

| Feature  | RAG Chat Sessions             | Knowledge Base Query      |
|----------|-------------------------------|---------------------------|
| State    | Stateful (persistent history) | Stateless (one-off query) |
| Context  | Multi-turn conversation       | Single question           |
| History  | Full message history saved    | No history                |
| Use Case | Interactive exploration       | Quick lookups             |
| API      | /api/rag-chat/*               | /api/knowledgebase/query  |
Use RAG Chat for interactive, exploratory conversations. Use Knowledge Base Query for quick, one-off lookups.

See Also

Knowledge Base

Upload and manage knowledge base documents

RAG Chat API

Complete API reference for RAG Chat endpoints
