Overview

Studley AI leverages Groq’s fast AI inference API to power all AI-driven features, including:
  • Quiz generation
  • Flashcard creation
  • Study guide generation
  • AI tutor chat
  • Note-taking assistance
  • Audio transcription
This guide covers Groq setup, model configuration, and rate limiting.

Groq API Setup

Step 1: Create Groq Account

  1. Visit console.groq.com
  2. Sign up with email or GitHub
  3. Verify your email address
Step 2: Generate API Key

  1. Navigate to API Keys in the dashboard
  2. Click Create API Key
  3. Name your key (e.g., “Studley AI Production”)
  4. Copy the key immediately (it won’t be shown again)
Step 3: Add to Environment Variables

Add your Groq API key to .env.local:
GROQ_API_KEY="gsk_xxxxxxxxxxxxxxxxxxxxx"
For production (Vercel), add the same variable in the project’s Environment Variables settings.
Step 4: Verify Setup

Test the connection (e.g., run with npx tsx test-groq.ts):
test-groq.ts
import Groq from 'groq-sdk'

const groq = new Groq({
  apiKey: process.env.GROQ_API_KEY,
})

const completion = await groq.chat.completions.create({
  messages: [{ role: 'user', content: 'Hello!' }],
  model: 'llama-3.3-70b-versatile',
})

console.log(completion.choices[0].message.content)

Available AI Models

Studley AI uses different Groq models optimized for specific tasks:

Primary Models

llama-3.3-70b-versatile

Main model for most AI features. Best for:
  • Quiz generation
  • Study guide creation
  • General chat
Specs:
  • Context: 128k tokens
  • Speed: ~300 tokens/sec
  • Quality: High
llama-3.1-70b-versatile

Alternative versatile model for complex reasoning. Best for:
  • Complex explanations
  • Multi-step problems
  • Detailed analysis
Specs:
  • Context: 128k tokens
  • Speed: ~300 tokens/sec
mixtral-8x7b-32768

Fast model for simpler tasks. Best for:
  • Quick responses
  • Simple Q&A
  • Flashcard generation
Specs:
  • Context: 32k tokens
  • Speed: ~500 tokens/sec
whisper-large-v3

Audio transcription model. Best for:
  • Lecture transcription
  • Audio note processing
  • Voice input
Specs:
  • Languages: 100+
  • Quality: High accuracy
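The recommendations above can be captured in a small task-to-model map. This is an illustrative sketch — the `Task` union and `pickModel` helper are hypothetical names, not part of the Studley AI codebase:

```typescript
// Maps each Studley AI task to the Groq model recommended above.
type Task =
  | 'quiz'
  | 'study_guide'
  | 'chat'
  | 'flashcards'
  | 'qa'
  | 'transcription'

const MODEL_FOR_TASK: Record<Task, string> = {
  quiz: 'llama-3.3-70b-versatile',        // high-quality generation
  study_guide: 'llama-3.3-70b-versatile',
  chat: 'llama-3.3-70b-versatile',
  flashcards: 'mixtral-8x7b-32768',       // faster, simpler tasks
  qa: 'mixtral-8x7b-32768',
  transcription: 'whisper-large-v3',      // audio only
}

function pickModel(task: Task): string {
  return MODEL_FOR_TASK[task]
}
```

Centralizing the choice in one map makes it easy to swap models later without touching each route.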

AI SDK Integration

Studley AI uses the Vercel AI SDK for streamlined AI operations:
import { createGroq } from '@ai-sdk/groq'
import { generateText, streamText } from 'ai'

const groq = createGroq({
  apiKey: process.env.GROQ_API_KEY,
})

// Generate text (non-streaming)
const { text } = await generateText({
  model: groq('llama-3.3-70b-versatile'),
  prompt: 'Create a quiz about photosynthesis',
})

// Stream text (real-time)
const { textStream } = await streamText({
  model: groq('llama-3.3-70b-versatile'),
  messages: [
    { role: 'user', content: 'Explain quantum physics' },
  ],
})

for await (const chunk of textStream) {
  process.stdout.write(chunk)
}

Feature-Specific Configuration

Quiz Generation

app/api/generators/quiz/route.ts
import { createGroq } from '@ai-sdk/groq'
import { generateText } from 'ai'

const groq = createGroq({
  apiKey: process.env.GROQ_API_KEY,
})

export async function POST(request: Request) {
  const { topic, difficulty, count } = await request.json()
  
  const { text } = await generateText({
    model: groq('llama-3.3-70b-versatile'),
    prompt: `Generate ${count} ${difficulty} quiz questions about ${topic}`,
    temperature: 0.7,  // Creativity balance
    maxTokens: 2000,   // Limit output
  })
  
  return Response.json({ quiz: text })
}
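The inline prompt above can be factored into a small, unit-testable helper. A minimal sketch — `buildQuizPrompt` and the JSON output instruction are illustrative, not from the actual route:

```typescript
// Hypothetical prompt builder for the quiz route.
interface QuizRequest {
  topic: string
  difficulty: 'easy' | 'medium' | 'hard'
  count: number
}

function buildQuizPrompt({ topic, difficulty, count }: QuizRequest): string {
  // Asking for structured JSON makes the response easier to parse downstream
  return [
    `Generate ${count} ${difficulty} quiz questions about ${topic}.`,
    'Return a JSON array of objects with "question",',
    '"options" (4 strings), and "answer" fields.',
  ].join(' ')
}
```

Keeping prompt construction separate from the route handler lets you iterate on wording without redeploying route logic.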

Flashcard Generation

app/api/generators/flashcards/route.ts
const { text } = await generateText({
  model: groq('llama-3.3-70b-versatile'),
  prompt: `Create flashcards from: ${content}`,
  temperature: 0.5,  // More focused
  maxTokens: 1500,
})

AI Tutor Chat

app/api/ai-tutor/chat/route.ts
const result = streamText({
  model: groq('llama-3.3-70b-versatile'),
  messages: conversationHistory,
  temperature: 0.8,  // Natural conversation
  maxTokens: 1000,
})

return result.toTextStreamResponse()

Audio Transcription

app/api/transcribe-audio/route.ts
import Groq from 'groq-sdk'

const groq = new Groq({
  apiKey: process.env.GROQ_API_KEY,
})

const transcription = await groq.audio.transcriptions.create({
  file: audioFile,
  model: 'whisper-large-v3',
  language: 'en',
  response_format: 'json',
})

return Response.json({ text: transcription.text })

Rate Limiting

Studley AI implements rate limiting to prevent abuse and manage API costs.

Database-Based Rate Limiting

lib/rateLimit.ts
import { neon } from '@neondatabase/serverless'

const sql = neon(process.env.DATABASE_URL!)

export async function checkRateLimit(
  identifier: string,
  limit: number,
  windowSeconds: number
): Promise<boolean> {
  // Clean old entries (make_interval keeps the window parameterized;
  // interpolating into an INTERVAL string literal would not work here)
  await sql`
    DELETE FROM generation_rate_limits
    WHERE identifier = ${identifier}
    AND created_at < NOW() - make_interval(secs => ${windowSeconds})
  `
  
  // Count recent requests
  const [{ count }] = await sql`
    SELECT COUNT(*) as count
    FROM generation_rate_limits
    WHERE identifier = ${identifier}
  `
  
  if (Number(count) >= limit) {
    return false  // Rate limited (COUNT(*) arrives as a string)
  }
  
  // Record new request
  await sql`
    INSERT INTO generation_rate_limits (identifier)
    VALUES (${identifier})
  `
  
  return true  // Allowed
}
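For local development, the same fixed-window check can be sketched without a database. This in-memory variant mirrors the DELETE/COUNT/INSERT logic above; `checkRateLimitLocal` is a hypothetical helper, not part of the Studley AI codebase:

```typescript
// In-memory fixed-window rate limiter (per-process only; not for production).
const windows = new Map<string, number[]>()

function checkRateLimitLocal(
  identifier: string,
  limit: number,
  windowSeconds: number,
  now: number = Date.now()
): boolean {
  const cutoff = now - windowSeconds * 1000

  // Drop timestamps outside the window (mirrors the DELETE above)
  const recent = (windows.get(identifier) ?? []).filter((t) => t > cutoff)

  if (recent.length >= limit) {
    windows.set(identifier, recent)
    return false // rate limited
  }

  // Record the new request (mirrors the INSERT above)
  recent.push(now)
  windows.set(identifier, recent)
  return true
}
```

Because the state lives in one process, this only works for a single dev server; the database-backed version is what multi-instance deployments need.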

Apply Rate Limits

import { checkRateLimit } from '@/lib/rateLimit'

export async function POST(request: Request) {
  const session = await getSession()
  
  // 10 requests per hour per user
  const allowed = await checkRateLimit(
    session.userId,
    10,
    3600
  )
  
  if (!allowed) {
    return Response.json(
      { error: 'Rate limit exceeded. Try again later.' },
      { status: 429 }
    )
  }
  
  // Proceed with AI generation...
}
Free tier:
  • Quiz Generation: 5 per hour
  • Flashcards: 10 per hour
  • AI Chat: 20 messages per hour
  • Study Guides: 3 per day
Paid tier:
  • Quiz Generation: 50 per hour
  • Flashcards: 100 per hour
  • AI Chat: 200 messages per hour
  • Study Guides: Unlimited

Credit System

Studley AI uses a credit system to manage AI usage:

Credit Costs

const CREDIT_COSTS = {
  quiz: 100,           // 100 credits per quiz
  flashcards: 50,      // 50 credits per set
  study_guide: 150,    // 150 credits per guide
  ai_chat: 10,         // 10 credits per message
  transcription: 75,   // 75 credits per audio file
}
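A small helper can total the cost of a batch of actions against this table before any deduction happens. A sketch — `totalCreditCost` is an illustrative name:

```typescript
// Credit costs per action, as defined above.
const CREDIT_COSTS: Record<string, number> = {
  quiz: 100,
  flashcards: 50,
  study_guide: 150,
  ai_chat: 10,
  transcription: 75,
}

// Sums the cost of a list of actions; unknown actions cost 0.
function totalCreditCost(actions: string[]): number {
  return actions.reduce((sum, action) => sum + (CREDIT_COSTS[action] ?? 0), 0)
}
```

Computing the total up front lets the UI warn a user before a batch operation drains their balance.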

Check and Deduct Credits

app/actions.ts
import { sql } from '@/lib/auth/db-client'

export async function deductCredits(
  userId: string,
  amount: number,
  type: string
) {
  // Deduct atomically; the credits >= amount guard prevents
  // concurrent requests from driving the balance negative
  const result = await sql`
    UPDATE users
    SET credits = credits - ${amount}
    WHERE id = ${userId} AND credits >= ${amount}
    RETURNING credits
  `
  
  if (result.length === 0) {
    throw new Error('Insufficient credits')
  }
  
  // Log usage
  await sql`
    INSERT INTO credit_usage ("userId", amount, type)
    VALUES (${userId}, ${amount}, ${type})
  `
}

Error Handling

Implement robust error handling for AI operations:
import { createGroq } from '@ai-sdk/groq'
import { generateText } from 'ai'

const groq = createGroq({
  apiKey: process.env.GROQ_API_KEY,
})

try {
  const { text } = await generateText({
    model: groq('llama-3.3-70b-versatile'),
    prompt: userPrompt,
  })
  
  return Response.json({ result: text })
  
} catch (error) {
  console.error('AI generation failed:', error)
  
  // `error` is `unknown` in TypeScript; narrow before reading .message
  const message = error instanceof Error ? error.message : ''
  
  // Check for specific errors
  if (message.includes('rate limit')) {
    return Response.json(
      { error: 'Rate limit exceeded. Please try again later.' },
      { status: 429 }
    )
  }
  
  if (message.includes('invalid API key')) {
    return Response.json(
      { error: 'AI service configuration error.' },
      { status: 500 }
    )
  }
  
  // Generic error
  return Response.json(
    { error: 'AI generation failed. Please try again.' },
    { status: 500 }
  )
}

Content Safety

Implement content moderation to ensure safe AI outputs:
lib/policyCheck.ts
import Groq from 'groq-sdk'

const groq = new Groq({
  apiKey: process.env.GROQ_API_KEY,
})

export async function checkContentPolicy(
  content: string
): Promise<boolean> {
  const response = await groq.chat.completions.create({
    messages: [
      {
        role: 'system',
        content: 'Analyze if content violates educational policies.',
      },
      {
        role: 'user',
        content: content,
      },
    ],
    model: 'llama-3.3-70b-versatile',
  })
  
  // Parse response; treat a missing reply as unsafe rather than safe
  const verdict = response.choices[0].message.content ?? 'violation'
  return !verdict.includes('violation')
}
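A cheap local pre-filter can run before the model-based check above, avoiding an API call for obviously disallowed input. This is a sketch only — the blocklist and `passesLocalPolicy` name are illustrative, and a real deployment would use a maintained moderation list:

```typescript
// Hypothetical keyword pre-filter run before the model-based policy check.
const BLOCKED_TERMS = ['write my exam for me', 'leaked answer key']

function passesLocalPolicy(content: string): boolean {
  const lowered = content.toLowerCase()
  // Reject if any blocked phrase appears; otherwise defer to the model check
  return !BLOCKED_TERMS.some((term) => lowered.includes(term))
}
```

Layering a string check in front of the LLM call saves tokens on clear-cut cases and keeps the model check as the slower, smarter second line.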

Performance Optimization

Use streaming for better UX:
const result = streamText({
  model: groq('llama-3.3-70b-versatile'),
  messages: messages,
})

return result.toTextStreamResponse()
Users see responses as they’re generated, reducing perceived latency.
Cache common system prompts:
const systemPrompts = {
  quiz: 'You are an expert quiz generator...',
  flashcards: 'You create effective flashcards...',
}

// Reuse prompts across requests
Limit tokens to reduce costs and latency:
maxTokens: 1000,  // Shorter responses
temperature: 0.7, // Balance creativity/consistency

Monitoring and Logging

Track AI usage for optimization:
// Log AI requests
await sql`
  INSERT INTO generations ("userId", type, topic, content)
  VALUES (${userId}, ${type}, ${topic}, ${result})
`

// Track token usage
console.log('Tokens used:', completion.usage)

// Monitor costs
const cost = (completion.usage.total_tokens / 1000000) * 0.59  // Groq pricing
console.log('Estimated cost:', cost)
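The inline calculation above generalizes to a small helper. The $0.59-per-million-token figure is carried over from the snippet as an example rate — check Groq’s current pricing for the model you actually use:

```typescript
// Estimates request cost in USD from total token usage.
// pricePerMillionUSD defaults to the example rate used above.
function estimateCostUSD(
  totalTokens: number,
  pricePerMillionUSD: number = 0.59
): number {
  return (totalTokens / 1_000_000) * pricePerMillionUSD
}
```

Logging this per request makes it easy to aggregate spend per user or per feature later.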

Troubleshooting

Error: GROQ_API_KEY is undefined
Solutions:
  • Verify GROQ_API_KEY is in .env.local
  • Check environment variable is set in Vercel dashboard
  • Restart dev server after adding variable
  • Ensure no typos in variable name
Error: Rate limit exceeded
Solutions:
  • Check Groq dashboard for rate limits
  • Implement request queuing
  • Add user-facing rate limiting
  • Consider upgrading Groq plan
Slow responses
Solutions:
  • Use streaming responses
  • Reduce maxTokens
  • Switch to faster model (Mixtral)
  • Optimize prompts for conciseness
Poor output quality
Solutions:
  • Adjust temperature (0.7-0.9 for creativity)
  • Improve prompt engineering
  • Use examples in prompts (few-shot)
  • Switch to larger model (70B)

Groq Pricing

As of 2024, Groq offers competitive pricing, including a generous free tier.
Free tier limits:
  • 14,400 requests/day
  • 7,200,000 tokens/day
  • Rate: 30 requests/minute
Best for: Development, testing, small projects

Next Steps

File Storage

Set up file uploads for documents

API Reference

View AI generation endpoints

Quiz Features

Learn about quiz generation

Credit System

Manage credits and usage
