Overview

Studley AI leverages Groq’s fast AI inference API to power all AI-driven features, including:
  • Quiz generation
  • Flashcard creation
  • Study guide generation
  • AI tutor chat
  • Note-taking assistance
  • Audio transcription
This guide covers Groq setup, model configuration, and rate limiting.

Groq API Setup

Step 1: Create Groq Account

  1. Visit console.groq.com
  2. Sign up with email or GitHub
  3. Verify your email address
Step 2: Generate API Key

  1. Navigate to API Keys in the dashboard
  2. Click Create API Key
  3. Name your key (e.g., “Studley AI Production”)
  4. Copy the key immediately (it won’t be shown again)
Step 3: Add to Environment Variables

Add your Groq API key to .env.local:
GROQ_API_KEY="gsk_xxxxxxxxxxxxxxxxxxxxx"
For production (Vercel), add the same variable in the project’s Environment Variables settings.
Step 4: Verify Setup

Test the connection (e.g., run with npx tsx test-groq.ts):
test-groq.ts
import Groq from 'groq-sdk'

const groq = new Groq({
  apiKey: process.env.GROQ_API_KEY,
})

const completion = await groq.chat.completions.create({
  messages: [{ role: 'user', content: 'Hello!' }],
  model: 'llama-3.3-70b-versatile',
})

console.log(completion.choices[0].message.content)

Available AI Models

Studley AI uses different Groq models optimized for specific tasks:

Primary Models

llama-3.3-70b-versatile

Main model for most AI features. Best for:
  • Quiz generation
  • Study guide creation
  • General chat
Specs:
  • Context: 128k tokens
  • Speed: ~300 tokens/sec
  • Quality: High
llama-3.1-70b-versatile

Alternative versatile model for complex reasoning. Best for:
  • Complex explanations
  • Multi-step problems
  • Detailed analysis
Specs:
  • Context: 128k tokens
  • Speed: ~300 tokens/sec
mixtral-8x7b-32768

Fast model for simpler tasks. Best for:
  • Quick responses
  • Simple Q&A
  • Flashcard generation
Specs:
  • Context: 32k tokens
  • Speed: ~500 tokens/sec
whisper-large-v3

Audio transcription model. Best for:
  • Lecture transcription
  • Audio note processing
  • Voice input
Specs:
  • Languages: 100+
  • Quality: High accuracy
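The recommendations above can be captured in a small task-to-model map. This is an illustrative sketch — the `Task` union and `pickModel` helper are hypothetical names, not part of the Studley AI codebase:

```typescript
// Maps each Studley AI task to the Groq model recommended above.
type Task =
  | 'quiz'
  | 'study_guide'
  | 'chat'
  | 'flashcards'
  | 'qa'
  | 'transcription'

const MODEL_FOR_TASK: Record<Task, string> = {
  quiz: 'llama-3.3-70b-versatile',        // high-quality generation
  study_guide: 'llama-3.3-70b-versatile',
  chat: 'llama-3.3-70b-versatile',
  flashcards: 'mixtral-8x7b-32768',       // faster, simpler tasks
  qa: 'mixtral-8x7b-32768',
  transcription: 'whisper-large-v3',      // audio only
}

function pickModel(task: Task): string {
  return MODEL_FOR_TASK[task]
}
```

Centralizing the choice in one map makes it easy to swap models later without touching each route.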

AI SDK Integration

Studley AI uses the Vercel AI SDK for streamlined AI operations:
import { createGroq } from '@ai-sdk/groq'
import { generateText, streamText } from 'ai'

const groq = createGroq({
  apiKey: process.env.GROQ_API_KEY,
})

// Generate text (non-streaming)
const { text } = await generateText({
  model: groq('llama-3.3-70b-versatile'),
  prompt: 'Create a quiz about photosynthesis',
})

// Stream text (real-time)
const { textStream } = await streamText({
  model: groq('llama-3.3-70b-versatile'),
  messages: [
    { role: 'user', content: 'Explain quantum physics' },
  ],
})

for await (const chunk of textStream) {
  process.stdout.write(chunk)
}

Feature-Specific Configuration

Quiz Generation

app/api/generators/quiz/route.ts
import { createGroq } from '@ai-sdk/groq'
import { generateText } from 'ai'

const groq = createGroq({
  apiKey: process.env.GROQ_API_KEY,
})

export async function POST(request: Request) {
  const { topic, difficulty, count } = await request.json()
  
  const { text } = await generateText({
    model: groq('llama-3.3-70b-versatile'),
    prompt: `Generate ${count} ${difficulty} quiz questions about ${topic}`,
    temperature: 0.7,  // Creativity balance
    maxTokens: 2000,   // Limit output
  })
  
  return Response.json({ quiz: text })
}
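The inline prompt above can be factored into a small, unit-testable helper. A minimal sketch — `buildQuizPrompt` and the JSON output instruction are illustrative, not from the actual route:

```typescript
// Hypothetical prompt builder for the quiz route.
interface QuizRequest {
  topic: string
  difficulty: 'easy' | 'medium' | 'hard'
  count: number
}

function buildQuizPrompt({ topic, difficulty, count }: QuizRequest): string {
  // Asking for structured JSON makes the response easier to parse downstream
  return [
    `Generate ${count} ${difficulty} quiz questions about ${topic}.`,
    'Return a JSON array of objects with "question",',
    '"options" (4 strings), and "answer" fields.',
  ].join(' ')
}
```

Keeping prompt construction separate from the route handler lets you iterate on wording without redeploying route logic.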

Flashcard Generation

app/api/generators/flashcards/route.ts
const { text } = await generateText({
  model: groq('llama-3.3-70b-versatile'),
  prompt: `Create flashcards from: ${content}`,
  temperature: 0.5,  // More focused
  maxTokens: 1500,
})

AI Tutor Chat

app/api/ai-tutor/chat/route.ts
const result = streamText({
  model: groq('llama-3.3-70b-versatile'),
  messages: conversationHistory,
  temperature: 0.8,  // Natural conversation
  maxTokens: 1000,
})

return result.toTextStreamResponse()

Audio Transcription

app/api/transcribe-audio/route.ts
import Groq from 'groq-sdk'

const groq = new Groq({
  apiKey: process.env.GROQ_API_KEY,
})

const transcription = await groq.audio.transcriptions.create({
  file: audioFile,
  model: 'whisper-large-v3',
  language: 'en',
  response_format: 'json',
})

return Response.json({ text: transcription.text })

Rate Limiting

Studley AI implements rate limiting to prevent abuse and manage API costs.

Database-Based Rate Limiting

lib/rateLimit.ts
import { neon } from '@neondatabase/serverless'

const sql = neon(process.env.DATABASE_URL!)

export async function checkRateLimit(
  identifier: string,
  limit: number,
  windowSeconds: number
): Promise<boolean> {
  // Clean old entries (make_interval keeps the window parameterized;
  // interpolating into an INTERVAL string literal would not work here)
  await sql`
    DELETE FROM generation_rate_limits
    WHERE identifier = ${identifier}
    AND created_at < NOW() - make_interval(secs => ${windowSeconds})
  `
  
  // Count recent requests
  const [{ count }] = await sql`
    SELECT COUNT(*) as count
    FROM generation_rate_limits
    WHERE identifier = ${identifier}
  `
  
  if (Number(count) >= limit) {
    return false  // Rate limited (COUNT(*) arrives as a string)
  }
  
  // Record new request
  await sql`
    INSERT INTO generation_rate_limits (identifier)
    VALUES (${identifier})
  `
  
  return true  // Allowed
}
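For local development, the same fixed-window check can be sketched without a database. This in-memory variant mirrors the DELETE/COUNT/INSERT logic above; `checkRateLimitLocal` is a hypothetical helper, not part of the Studley AI codebase:

```typescript
// In-memory fixed-window rate limiter (per-process only; not for production).
const windows = new Map<string, number[]>()

function checkRateLimitLocal(
  identifier: string,
  limit: number,
  windowSeconds: number,
  now: number = Date.now()
): boolean {
  const cutoff = now - windowSeconds * 1000

  // Drop timestamps outside the window (mirrors the DELETE above)
  const recent = (windows.get(identifier) ?? []).filter((t) => t > cutoff)

  if (recent.length >= limit) {
    windows.set(identifier, recent)
    return false // rate limited
  }

  // Record the new request (mirrors the INSERT above)
  recent.push(now)
  windows.set(identifier, recent)
  return true
}
```

Because the state lives in one process, this only works for a single dev server; the database-backed version is what multi-instance deployments need.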

Apply Rate Limits

import { checkRateLimit } from '@/lib/rateLimit'

export async function POST(request: Request) {
  const session = await getSession()
  
  // 10 requests per hour per user
  const allowed = await checkRateLimit(
    session.userId,
    10,
    3600
  )
  
  if (!allowed) {
    return Response.json(
      { error: 'Rate limit exceeded. Try again later.' },
      { status: 429 }
    )
  }
  
  // Proceed with AI generation...
}
Free tier:
  • Quiz Generation: 5 per hour
  • Flashcards: 10 per hour
  • AI Chat: 20 messages per hour
  • Study Guides: 3 per day
Paid tier:
  • Quiz Generation: 50 per hour
  • Flashcards: 100 per hour
  • AI Chat: 200 messages per hour
  • Study Guides: Unlimited

Credit System

Studley AI uses a credit system to manage AI usage:

Credit Costs

const CREDIT_COSTS = {
  quiz: 100,           // 100 credits per quiz
  flashcards: 50,      // 50 credits per set
  study_guide: 150,    // 150 credits per guide
  ai_chat: 10,         // 10 credits per message
  transcription: 75,   // 75 credits per audio file
}
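A small helper can total the cost of a batch of actions against this table before any deduction happens. A sketch — `totalCreditCost` is an illustrative name:

```typescript
// Credit costs per action, as defined above.
const CREDIT_COSTS: Record<string, number> = {
  quiz: 100,
  flashcards: 50,
  study_guide: 150,
  ai_chat: 10,
  transcription: 75,
}

// Sums the cost of a list of actions; unknown actions cost 0.
function totalCreditCost(actions: string[]): number {
  return actions.reduce((sum, action) => sum + (CREDIT_COSTS[action] ?? 0), 0)
}
```

Computing the total up front lets the UI warn a user before a batch operation drains their balance.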

Check and Deduct Credits

app/actions.ts
import { sql } from '@/lib/auth/db-client'

export async function deductCredits(
  userId: string,
  amount: number,
  type: string
) {
  // Deduct atomically; the credits >= amount guard prevents
  // concurrent requests from driving the balance negative
  const result = await sql`
    UPDATE users
    SET credits = credits - ${amount}
    WHERE id = ${userId} AND credits >= ${amount}
    RETURNING credits
  `
  
  if (result.length === 0) {
    throw new Error('Insufficient credits')
  }
  
  // Log usage
  await sql`
    INSERT INTO credit_usage ("userId", amount, type)
    VALUES (${userId}, ${amount}, ${type})
  `
}

Error Handling

Implement robust error handling for AI operations:
import { createGroq } from '@ai-sdk/groq'
import { generateText } from 'ai'

const groq = createGroq({
  apiKey: process.env.GROQ_API_KEY,
})

try {
  const { text } = await generateText({
    model: groq('llama-3.3-70b-versatile'),
    prompt: userPrompt,
  })
  
  return Response.json({ result: text })
  
} catch (error) {
  console.error('AI generation failed:', error)
  
  // `error` is `unknown` in TypeScript; narrow before reading .message
  const message = error instanceof Error ? error.message : ''
  
  // Check for specific errors
  if (message.includes('rate limit')) {
    return Response.json(
      { error: 'Rate limit exceeded. Please try again later.' },
      { status: 429 }
    )
  }
  
  if (message.includes('invalid API key')) {
    return Response.json(
      { error: 'AI service configuration error.' },
      { status: 500 }
    )
  }
  
  // Generic error
  return Response.json(
    { error: 'AI generation failed. Please try again.' },
    { status: 500 }
  )
}

Content Safety

Implement content moderation to ensure safe AI outputs:
lib/policyCheck.ts
import Groq from 'groq-sdk'

const groq = new Groq({
  apiKey: process.env.GROQ_API_KEY,
})

export async function checkContentPolicy(
  content: string
): Promise<boolean> {
  const response = await groq.chat.completions.create({
    messages: [
      {
        role: 'system',
        content: 'Analyze if content violates educational policies.',
      },
      {
        role: 'user',
        content: content,
      },
    ],
    model: 'llama-3.3-70b-versatile',
  })
  
  // Parse response; treat a missing reply as unsafe rather than safe
  const verdict = response.choices[0].message.content ?? 'violation'
  return !verdict.includes('violation')
}
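A cheap local pre-filter can run before the model-based check above, avoiding an API call for obviously disallowed input. This is a sketch only — the blocklist and `passesLocalPolicy` name are illustrative, and a real deployment would use a maintained moderation list:

```typescript
// Hypothetical keyword pre-filter run before the model-based policy check.
const BLOCKED_TERMS = ['write my exam for me', 'leaked answer key']

function passesLocalPolicy(content: string): boolean {
  const lowered = content.toLowerCase()
  // Reject if any blocked phrase appears; otherwise defer to the model check
  return !BLOCKED_TERMS.some((term) => lowered.includes(term))
}
```

Layering a string check in front of the LLM call saves tokens on clear-cut cases and keeps the model check as the slower, smarter second line.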

Performance Optimization

Use streaming for better UX:
const result = streamText({
  model: groq('llama-3.3-70b-versatile'),
  messages: messages,
})

return result.toTextStreamResponse()
Users see responses as they’re generated, reducing perceived latency.
Cache common system prompts:
const systemPrompts = {
  quiz: 'You are an expert quiz generator...',
  flashcards: 'You create effective flashcards...',
}

// Reuse prompts across requests
Limit tokens to reduce costs and latency:
maxTokens: 1000,  // Shorter responses
temperature: 0.7, // Balance creativity/consistency

Monitoring and Logging

Track AI usage for optimization:
// Log AI requests
await sql`
  INSERT INTO generations ("userId", type, topic, content)
  VALUES (${userId}, ${type}, ${topic}, ${result})
`

// Track token usage
console.log('Tokens used:', completion.usage)

// Monitor costs
const cost = (completion.usage.total_tokens / 1000000) * 0.59  // Groq pricing
console.log('Estimated cost:', cost)
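The inline calculation above generalizes to a small helper. The $0.59-per-million-token figure is carried over from the snippet as an example rate — check Groq’s current pricing for the model you actually use:

```typescript
// Estimates request cost in USD from total token usage.
// pricePerMillionUSD defaults to the example rate used above.
function estimateCostUSD(
  totalTokens: number,
  pricePerMillionUSD: number = 0.59
): number {
  return (totalTokens / 1_000_000) * pricePerMillionUSD
}
```

Logging this per request makes it easy to aggregate spend per user or per feature later.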

Troubleshooting

Error: GROQ_API_KEY is undefined
Solutions:
  • Verify GROQ_API_KEY is in .env.local
  • Check environment variable is set in Vercel dashboard
  • Restart dev server after adding variable
  • Ensure no typos in variable name
Error: Rate limit exceeded
Solutions:
  • Check Groq dashboard for rate limits
  • Implement request queuing
  • Add user-facing rate limiting
  • Consider upgrading Groq plan
Slow responses
Solutions:
  • Use streaming responses
  • Reduce maxTokens
  • Switch to faster model (Mixtral)
  • Optimize prompts for conciseness
Poor output quality
Solutions:
  • Adjust temperature (0.7-0.9 for creativity)
  • Improve prompt engineering
  • Use examples in prompts (few-shot)
  • Switch to larger model (70B)

Groq Pricing

As of 2024, Groq offers competitive pricing, including a generous free tier.
Free tier limits:
  • 14,400 requests/day
  • 7,200,000 tokens/day
  • Rate: 30 requests/minute
Best for: Development, testing, small projects

Next Steps

File Storage

Set up file uploads for documents

API Reference

View AI generation endpoints

Quiz Features

Learn about quiz generation

Credit System

Manage credits and usage
