This quickstart guide will help you go from installation to asking your first question about a PDF document in just a few steps.

Prerequisites

Before you begin, make sure you have:
  • Python 3.8 or higher installed
  • An OpenAI API key (available from the OpenAI platform)
  • A PDF document to test with

Quick Setup

Step 1: Clone and navigate to the project

git clone <your-repo-url>
cd rag-chat

Step 2: Create a virtual environment

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Step 3: Install dependencies

pip install -r requirements.txt
This installs the core dependencies:
  • streamlit - Web interface
  • langchain - RAG framework
  • langchain-chroma - Vector database
  • langchain-openai - OpenAI integration
  • pypdf - PDF processing

Step 4: Configure your OpenAI API key

Create a .env file from the example:
cp .env.example .env
Then edit .env and add your API key:
OPENAI_API_KEY='sk-your-actual-api-key-here'
Never commit your .env file to version control. The .gitignore file is already configured to exclude it.
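If you prefer to load the key programmatically, here is a minimal stdlib sketch. The app itself may rely on a library such as python-dotenv instead, and `load_env` is a hypothetical helper name:

```python
import os

def load_env(path=".env"):
    """Minimal .env reader: export KEY=VALUE pairs into os.environ.

    A hypothetical stand-in for python-dotenv's load_dotenv(); surrounding
    quotes on values (as in the example above) are stripped, and variables
    already set in the environment are left untouched.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip().strip("'\""))
```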

Step 5: Launch the application

streamlit run app.py
Your browser will automatically open to http://localhost:8501

Your First Query

Once the application is running, you can immediately start asking questions about your documents.

Step 1: Upload a PDF document

  1. In the sidebar, click the file uploader under “Upload de arquivos” (“File upload”)
  2. Select one or more PDF files from your computer
  3. Wait for processing to complete; a “Carregando arquivos…” (“Loading files…”) spinner is shown
The application will:
  • Extract text from your PDF using PyPDFLoader
  • Split it into chunks (1000 characters with a 400-character overlap)
  • Create embeddings using OpenAI’s embedding model
  • Store them in a local ChromaDB vector database
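The chunking step can be illustrated in plain Python. This is a simplified sketch using fixed-size windows; the app's actual splitter (LangChain's, with chunk_size=1000 and chunk_overlap=400) additionally tries to break on natural boundaries:

```python
def split_text(text, chunk_size=1000, overlap=400):
    # Fixed-size chunking with overlap: each window starts
    # (chunk_size - overlap) characters after the previous one, so
    # consecutive chunks share `overlap` characters of context.
    chunks = []
    step = chunk_size - overlap
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

The overlap means a sentence falling on a chunk boundary still appears whole in at least one chunk, which keeps retrieval from missing answers that straddle a split.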

Step 2: Select your AI model

Choose from the available models in the sidebar:
  • gpt-3.5-turbo - Fast and economical
  • gpt-4o-mini - Balanced performance
  • gpt-4o - Most capable
  • gpt-4-turbo or gpt-4 - Previous generation

Step 3: Ask a question

Type your question in the chat input at the bottom of the page:
What are the main topics covered in this document?
The RAG system will:
  1. Retrieve relevant chunks from the vector store
  2. Send them as context to the LLM
  3. Generate a response based on your documents
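Conceptually, those three steps look like the toy sketch below. Note the real app scores relevance with embedding similarity in ChromaDB, not word overlap, and `retrieve`/`build_prompt` are illustrative names, not functions from app.py:

```python
def retrieve(chunks, query, k=2):
    # Toy relevance score: count of lowercase words shared with the query.
    # The app instead embeds the query and runs a vector similarity search.
    q = set(query.lower().split())
    ranked = sorted(chunks, key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(context_chunks, question):
    # The retrieved chunks are injected into the system prompt's {context}
    # placeholder before the question is sent to the LLM.
    context = "\n\n".join(context_chunks)
    return f"Use the context to answer.\n\nContext: {context}\n\nQuestion: {question}"
```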

Expected Output

When you ask a question, you’ll see:
  1. Your question displayed in the chat interface
  2. A “Buscando resposta…” (“Searching for an answer…”) spinner while processing
  3. An AI response formatted in markdown with visualizations
The AI will only answer based on the content in your uploaded documents. If the answer isn’t found, it will explain that no information is available.
The vector database persists in a local db/ directory. Documents you upload remain available even after restarting the application.
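You can check for an existing store with a small stdlib snippet (`vector_store_exists` is just an illustrative helper; db/ is the directory name the app uses):

```python
from pathlib import Path

def vector_store_exists(path="db"):
    # ChromaDB writes its index files under this directory; if it is
    # present, previously ingested documents are reloaded on startup.
    return Path(path).is_dir()
```

Deleting the db/ directory resets the application to an empty state.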

How It Works

Under the hood, RAG Chat uses a retrieval-augmented generation pipeline:
# From app.py:55-82
def ask_question(model, query, vector_store):
    llm = ChatOpenAI(model=model)
    retriever = vector_store.as_retriever()

    # System prompt (in Portuguese): "Use the context to answer the
    # questions. If no answer is found in the context, explain that no
    # information is available. Respond in markdown format with
    # elaborate, interactive visualizations."
    system_prompt = '''
    Use o contexto para responder as perguntas.
    Se não encontrar uma resposta no contexto,
    explique que não há informações disponíveis.
    Responda em formato de markdown e com visualizações
    elaboradas e interativas.
    Contexto: {context}
    '''

    # Replay the conversation history so follow-up questions keep context
    messages = [('system', system_prompt)]
    for message in st.session_state.messages:
        messages.append((message.get('role'), message.get('content')))

    prompt = ChatPromptTemplate.from_messages(messages)

    # The retriever fills {context}; the raw query is passed through
    chain = (
        {
            'context': retriever,
            'input': RunnablePassthrough()
        }
        | prompt
        | llm
    )
    response = chain.invoke(query)
    return response.content
This chain:
  1. Retrieves relevant document chunks using similarity search
  2. Injects them into the prompt as context
  3. Maintains conversation history for follow-up questions
  4. Returns markdown-formatted responses

Next Steps

Installation

Learn about advanced installation options and troubleshooting

RAG Overview

Understand how the RAG pipeline works in detail

Configuration

Customize chunking, embeddings, and model settings

API Reference

Explore the core functions and their parameters
