Overview
MedMitra uses Groq for:
- Medical Insights Generation: Case summaries, SOAP notes, and diagnoses
- Radiology Image Analysis: Vision AI using LLaVA model
- Structured Data Extraction: JSON-formatted medical information
- Real-time Processing: Fast inference for responsive user experience
Why Groq?
- Speed: 10-100x faster than traditional GPU inference
- Cost-effective: Competitive pricing per token
- Multiple Models: Access to Llama 3, LLaVA, and other open models
- Reliability: High uptime and consistent performance
Prerequisites
- A Groq account (sign up at console.groq.com)
- Python 3.9+ for backend integration
Setup Instructions
1. Get a Groq API Key
- Visit console.groq.com
- Sign up or log in to your account
- Navigate to API Keys section
- Click Create API Key
- Copy your API key (keep it secure!)
2. Configure Environment Variables
Add to backend/.env:
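The entry uses the GROQ_API_KEY variable name referenced elsewhere in this guide; the value below is a placeholder:

```
GROQ_API_KEY=your_groq_api_key_here
```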
3. Install Groq SDK
The Groq SDK is already included in the project dependencies:
Models Used in MedMitra
Llama 3.3 70B Versatile
Purpose: Medical insights generation, SOAP notes, diagnosis
- 70 billion parameters
- Excellent at structured output
- Strong medical knowledge
- Fast inference on Groq
LLaVA Vision Model
Purpose: Radiology image analysis (X-rays, MRIs, CT scans)
- Multimodal (text + images)
- Specialized for vision tasks
- Can identify medical findings
- Returns structured JSON
Implementation Examples
LLM Manager
Location: backend/utils/llm_utils.py
The LLMManager class provides a reusable interface for Groq LLM operations:
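A minimal sketch of such a wrapper using the Groq Python SDK. The method names (`build_messages`, `generate`) and defaults are illustrative assumptions and may not match the actual class in `backend/utils/llm_utils.py`:

```python
import os


class LLMManager:
    """Reusable wrapper around Groq chat completions (illustrative sketch)."""

    DEFAULT_MODEL = "llama-3.3-70b-versatile"

    def __init__(self, api_key=None, model=DEFAULT_MODEL):
        # Lazy import so the module can be inspected without the SDK installed.
        from groq import Groq

        self.client = Groq(api_key=api_key or os.environ["GROQ_API_KEY"])
        self.model = model

    @staticmethod
    def build_messages(system_prompt, user_prompt):
        # Standard chat-completions message list: system context first, then user input.
        return [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ]

    def generate(self, system_prompt, user_prompt, temperature=0.2, max_tokens=1024):
        # Low default temperature favors the structured output medical tasks need.
        response = self.client.chat.completions.create(
            model=self.model,
            messages=self.build_messages(system_prompt, user_prompt),
            temperature=temperature,
            max_tokens=max_tokens,
        )
        return response.choices[0].message.content
```

Centralizing the client here means model choice, retries, and logging can be changed in one place rather than in every agent.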
Vision Agent
Location: backend/agents/vision_agent.py
The Vision Agent analyzes radiology images using Groq’s multimodal capabilities:
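A sketch of the request shape such an agent sends. The function names, instruction text, and model id are assumptions (Groq rotates its preview vision models, so the id may need updating); the mixed text-plus-`image_url` content list follows the chat-completions multimodal format:

```python
def build_vision_messages(image_url: str, instruction: str) -> list:
    """One user turn mixing an instruction with an image URL (illustrative)."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": instruction},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]


def analyze_radiology_image(client, image_url: str,
                            model: str = "llava-v1.5-7b-4096-preview"):
    # `client` is a groq.Groq instance; the default model id is an assumption.
    response = client.chat.completions.create(
        model=model,
        messages=build_vision_messages(
            image_url,
            "Analyze this radiology image and return the findings as JSON.",
        ),
        temperature=0.1,  # low temperature keeps the JSON structure stable
    )
    return response.choices[0].message.content
```

Note that the image URL must be reachable by Groq's servers, which is why the troubleshooting section below insists on publicly accessible URLs.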
Medical Insights Agent
The Medical Insights Agent uses LangGraph with Groq to generate comprehensive medical analysis:
Prompt Engineering
MedMitra uses specialized prompts for medical analysis:
Radiology Analysis Prompt
Case Summary Prompt
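As an illustration of the style such templates typically take, two hypothetical prompts are sketched below. These exact strings are assumptions, not the templates shipped in MedMitra:

```python
# Hypothetical prompt templates -- illustrative, not MedMitra's actual strings.

RADIOLOGY_ANALYSIS_PROMPT = (
    "You are a radiology assistant. Analyze the provided image and respond "
    "ONLY with JSON of the form: "
    '{"findings": [...], "impression": "...", "confidence": 0.0}'
)

CASE_SUMMARY_PROMPT = (
    "You are a clinical documentation assistant. Summarize the patient case "
    "below in 3-5 sentences, covering chief complaint, key findings, and plan.\n\n"
    "Case: {case_text}"
)
```

Stating the exact JSON shape in the prompt, as the radiology template does, is what makes the downstream parsing described in Troubleshooting reliable.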
Configuration Options
Temperature Settings
Token Limits
Streaming Responses
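The three options above can be sketched together. The per-task preset values and task names are assumptions; `stream=True` and iterating over `choices[0].delta.content` follow the Groq SDK's chat-completions streaming interface:

```python
# Hypothetical per-task presets: lower temperature for structured clinical
# output, tighter token limits for short artifacts.
GENERATION_SETTINGS = {
    "case_summary": {"temperature": 0.3, "max_tokens": 512},
    "soap_note": {"temperature": 0.2, "max_tokens": 1024},
    "diagnosis": {"temperature": 0.1, "max_tokens": 1024},
}


def stream_completion(client, model, messages, **settings):
    """Yield response text incrementally for a responsive UI.

    `client` is a groq.Groq instance. Streamed chunks carry partial text in
    choices[0].delta.content, which is None for control chunks (skipped here).
    """
    for chunk in client.chat.completions.create(
        model=model, messages=messages, stream=True, **settings
    ):
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta
```

Streaming is what keeps multi-second generations (see Performance Metrics below) feeling responsive: the first tokens appear well before the full response completes.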
Best Practices
Prompt Engineering
- Be specific about output format (JSON preferred)
- Include examples in prompts for better results
- Use system prompts to set context and constraints
- Test prompts with various inputs
Error Handling
- Wrap API calls in try/except blocks
- Handle rate limits gracefully (429 errors)
- Implement retry logic with exponential backoff
- Validate JSON responses before using
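The retry guidance above can be sketched as a small helper; the function name, delay values, and the `status_code` attribute check are illustrative assumptions to adapt to the SDK's actual exception types:

```python
import random
import time


def with_backoff(call, max_retries=4, base_delay=1.0, retry_statuses=(429,)):
    """Run `call()`, retrying rate-limit errors with exponential backoff.

    Assumes the raised exception exposes a `status_code` attribute, as SDK
    HTTP errors commonly do; other errors are re-raised immediately.
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status not in retry_statuses or attempt == max_retries:
                raise
            # Double the delay each attempt, plus jitter to avoid retry storms.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Wrapping every Groq call this way turns transient 429s into short delays instead of user-facing failures.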
Cost Optimization
- Monitor token usage in Groq console
- Cache responses when appropriate
- Use appropriate max_tokens limits
- Batch similar requests when possible
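A sketch of the caching idea, keyed on a hash of the prompt and settings. The helper and its name are illustrative, not part of MedMitra:

```python
import hashlib
import json

_cache: dict = {}


def cached_generate(generate, prompt: str, **settings):
    """Memoize completions by a hash of prompt + settings.

    `generate` is any prompt -> text callable. Useful for deterministic,
    low-temperature calls where identical requests recur; do not cache when
    prompts embed fresh, patient-specific clinical context.
    """
    key = hashlib.sha256(
        json.dumps({"prompt": prompt, "settings": settings}, sort_keys=True).encode()
    ).hexdigest()
    if key not in _cache:
        _cache[key] = generate(prompt, **settings)
    return _cache[key]
```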
Medical Safety
- Always review AI-generated diagnoses
- Include confidence scores in outputs
- Log all AI interactions for audit trail
- Never use AI as sole diagnostic tool
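The audit-trail point can be sketched as an append-only JSON Lines log. The field names and JSONL format are assumptions, not MedMitra's actual logging schema:

```python
import datetime
import json


def log_ai_interaction(path, task, prompt, response, confidence=None):
    """Append one AI interaction as a JSON line for later audit.

    Illustrative sketch; avoid writing raw patient identifiers into the log.
    """
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "task": task,
        "prompt": prompt,
        "response": response,
        "confidence": confidence,
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```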
Rate Limits & Quotas
Groq has the following limits (as of 2024):
- Free Tier: 30 requests/minute, 14,400 requests/day
- Paid Tier: Higher limits based on plan
- Token Limits: Varies by model
Troubleshooting
API Key Issues
Error: Invalid API key
- Verify GROQ_API_KEY in .env file
- Check key hasn’t been revoked
- Ensure no extra spaces in key
- Regenerate key if necessary
Rate Limiting
Error: 429 Too Many Requests
- Implement exponential backoff
- Add delays between requests
- Upgrade to paid tier for higher limits
- Cache responses to reduce API calls
Vision Model Issues
Error: Image analysis failed
- Ensure image URL is publicly accessible
- Check image format is supported (JPG, PNG)
- Verify image size is within limits
- Check internet connectivity
JSON Parsing Errors
Error: Invalid JSON in response
- Use extract_json_from_string utility
- Adjust prompt to emphasize JSON format
- Lower temperature for more structured output
- Add validation in prompt
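The extract_json_from_string utility mentioned above might be sketched as follows; this reimplementation is illustrative and the shipped version may differ:

```python
import json
import re


def extract_json_from_string(text: str):
    """Pull the first JSON object out of free-form model output.

    Handles the common failure mode where the model wraps valid JSON in
    prose or a markdown code fence.
    """
    # Strip a markdown code fence if present.
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    candidate = fenced.group(1) if fenced else text
    # Fall back to the outermost brace-delimited span.
    match = re.search(r"\{.*\}", candidate, re.DOTALL)
    if not match:
        raise ValueError("No JSON object found in model output")
    return json.loads(match.group(0))
```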
Performance Metrics
Typical Groq performance for MedMitra:
- Case Summary Generation: 1-2 seconds
- SOAP Note Generation: 1-3 seconds
- Diagnosis Generation: 2-4 seconds
- Radiology Image Analysis: 2-5 seconds
Groq’s LPU (Language Processing Unit) technology provides significantly faster inference than traditional GPU-based solutions.
Next Steps
- LlamaParse Integration: Set up PDF document parsing
- Medical Prompts: Explore medical prompt templates
