
Overview

RAG Chat supports multiple OpenAI GPT models, each with different capabilities, speeds, and costs. Choose the model that best fits your requirements.

Available Models

The model selector is located in the sidebar:
app.py
import streamlit as st

model_options = [
    'gpt-3.5-turbo',
    'gpt-4',
    'gpt-4-turbo',
    'gpt-4o-mini',
    'gpt-4o',
]
selected_model = st.selectbox(
    label='Choose the LLM you want:',
    options=model_options,
)

Model Comparison

GPT-3.5 Turbo

Best for: Fast responses, simple queries, high volume

Fast and cost-effective option for straightforward questions

GPT-4

Best for: Complex analysis, detailed reasoning

Most capable model with superior reasoning abilities

GPT-4 Turbo

Best for: Balance of speed and capability

Faster than GPT-4 with similar performance

GPT-4o

Best for: Latest features and optimizations

Newest optimized model with enhanced capabilities

GPT-4o Mini

Best for: Cost-conscious applications

Smaller, faster, cheaper variant of GPT-4o

Detailed Model Characteristics

GPT-3.5 Turbo

  • ⚡ Fastest response time
  • 💰 Lowest cost per request
  • 📊 Good for straightforward questions
  • 🔄 High throughput capability

GPT-4

  • 🧠 Superior reasoning
  • 🎯 Excellent accuracy
  • 📖 Better context understanding
  • 🔍 Complex query handling

GPT-4 Turbo

  • ⚡ Faster than GPT-4
  • 🧠 Near GPT-4 quality
  • 💰 Lower cost than GPT-4
  • 🎯 Great balance

GPT-4o (Optimized)

  • 🚀 Latest model
  • ⚡ Optimized performance
  • 🎯 Enhanced capabilities
  • 📊 Better efficiency

GPT-4o Mini

  • 💰 Very cost-effective
  • ⚡ Fast responses
  • 🎯 Better than GPT-3.5
  • 📊 Good balance for most tasks

Speed vs. Accuracy Tradeoffs

There’s typically a tradeoff between speed and quality. Faster models are generally less expensive but may produce less nuanced responses.
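One way to quantify this tradeoff for your own queries is simply to time each call. The helper below is a generic sketch; in practice `fn` would be the app's `ask_question` with your question and vector store.

```python
import time

def time_call(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call.

    A simple way to compare model latency: run the same question
    through each model and record how long the call takes.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

Averaging over several runs gives a fairer comparison, since individual API calls can vary widely in latency.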

Cost Considerations

Relative Cost Comparison

| Model         | Relative Cost | Speed      | Quality    |
| ------------- | ------------- | ---------- | ---------- |
| GPT-3.5 Turbo | $             | ⚡⚡⚡⚡⚡ | ⭐⭐⭐     |
| GPT-4o Mini   | $$            | ⚡⚡⚡⚡   | ⭐⭐⭐⭐   |
| GPT-4 Turbo   | $$$           | ⚡⚡⚡     | ⭐⭐⭐⭐⭐ |
| GPT-4         | $$$$          | ⚡⚡       | ⭐⭐⭐⭐⭐ |
| GPT-4o        | $$$           | ⚡⚡⚡     | ⭐⭐⭐⭐⭐ |
Actual costs vary based on OpenAI’s current pricing. Check OpenAI’s pricing page for current rates.
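To see how the relative costs play out per request, here is an illustrative estimator. The per-token prices in the dictionary are placeholders for this sketch, not OpenAI's actual rates; check OpenAI's pricing page before relying on any numbers.

```python
# Placeholder prices in USD per 1K tokens: (input, output).
# These are illustrative only -- NOT OpenAI's current rates.
ILLUSTRATIVE_PRICES = {
    'gpt-3.5-turbo': (0.0005, 0.0015),
    'gpt-4o-mini': (0.00015, 0.0006),
    'gpt-4-turbo': (0.01, 0.03),
    'gpt-4': (0.03, 0.06),
    'gpt-4o': (0.005, 0.015),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under the placeholder prices."""
    in_price, out_price = ILLUSTRATIVE_PRICES[model]
    return (input_tokens / 1000) * in_price + (output_tokens / 1000) * out_price
```

Even with rough prices, this kind of estimate makes the orders-of-magnitude gap between models concrete: a GPT-4 request can cost many times more than the same request on GPT-4o Mini.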

Recommendation Guide

Choose GPT-3.5 Turbo If:

  • You need fast responses
  • You’re working with simple questions
  • Cost is a primary concern
  • You have high query volume

Choose GPT-4o Mini If:

  • You want better quality than GPT-3.5
  • Cost efficiency is important
  • You need good general-purpose performance
  • You’re in development/testing phase

Choose GPT-4 Turbo If:

  • You need production-grade quality
  • You want a balanced option
  • Your queries are moderately complex
  • This is the recommended default for most users

Choose GPT-4 If:

  • You need maximum accuracy
  • You’re analyzing complex technical content
  • Quality is more important than speed
  • You’re working with critical applications

Choose GPT-4o If:

  • You want the latest model
  • You need optimized performance
  • You want cutting-edge capabilities
  • Speed and quality are both important
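The guide above can be condensed into a small helper. The priority labels and the `recommend_model` function are illustrative, not part of the app; they just encode the recommendations listed in this section.

```python
def recommend_model(priority: str) -> str:
    """Map a primary concern to a model, following the recommendation guide."""
    recommendations = {
        'speed': 'gpt-3.5-turbo',   # fast responses, high query volume
        'cost': 'gpt-4o-mini',      # cost efficiency, development/testing
        'balanced': 'gpt-4-turbo',  # recommended default for most users
        'accuracy': 'gpt-4',        # maximum accuracy, critical applications
        'latest': 'gpt-4o',         # newest model, cutting-edge capabilities
    }
    # Fall back to the recommended default when the priority is unrecognized.
    return recommendations.get(priority, 'gpt-4-turbo')
```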

Changing Models

You can change the selected model at any time using the sidebar dropdown:
app.py
response = ask_question(
    model=selected_model,  # uses the currently selected model
    query=question,
    vector_store=vector_store,
)
Try different models with the same question to compare responses and find the best fit for your use case.
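One way to make that comparison systematic is a small helper that runs the same question through each model and collects the answers. The `ask_fn` adapter here is an assumption; in the app it would wrap `ask_question` with your `vector_store`, e.g. `lambda m, q: ask_question(model=m, query=q, vector_store=vector_store)`.

```python
def compare_models(question, models, ask_fn):
    """Run one question through several models and return {model: answer}.

    ask_fn takes (model, question) and returns the model's answer --
    typically a thin wrapper around the app's ask_question.
    """
    return {model: ask_fn(model, question) for model in models}
```

Comparing answers side by side makes quality differences much easier to judge than switching models one at a time.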

How Models Are Used

The selected model is passed to ChatOpenAI:
app.py
from langchain_openai import ChatOpenAI

def ask_question(model, query, vector_store):
    llm = ChatOpenAI(model=model)  # instantiate the LLM with the selected model
    retriever = vector_store.as_retriever()
    # ... rest of the function
This creates an instance of the LLM with your selected model for each question.

Next Steps

Asking Questions

Learn how to ask effective questions

Configuration

Configure API keys and other settings
