Overview
The Evaluator class handles AI-powered lead evaluation using Groq’s LLM API. It analyzes extracted content, matches services against a predefined catalog, generates fit scores, and provides personalized outreach recommendations. The class includes automatic validation, retry logic, and quota tracking.
Constructor
Evaluator()
model
string
default: "llama-3.3-70b-versatile"
The Groq model to use for evaluation. Supported models:
llama-3.3-70b-versatile (default, recommended)
llama-3.1-70b-versatile
mixtral-8x7b-32768
from evaluator import Evaluator
# Use default model
evaluator = Evaluator()
# Use specific model
evaluator = Evaluator(model="mixtral-8x7b-32768")
Environment Variables Required:
GROQ_API_KEY - Your Groq API key (required)
Configuration Files Required:
services/services.json - Service catalog for validation
prompts/system_prompt.md - System prompt template
Raises:
ValueError - If GROQ_API_KEY is not found in environment variables
The Evaluator requires a valid Groq API key. Sign up at console.groq.com to get your API key.
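The constructor's key check behaves roughly like the sketch below. The helper name `require_groq_key` is illustrative, not part of the library; only the `GROQ_API_KEY` variable name and the `ValueError` come from the documented behavior.

```python
import os

def require_groq_key():
    # Mirror the constructor's check: fail fast when the key is missing.
    api_key = os.environ.get("GROQ_API_KEY")
    if not api_key:
        raise ValueError("GROQ_API_KEY not found in environment variables")
    return api_key
```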
Class Properties
The Evaluator maintains class-level tracking across all instances:
status
string
default: "System Online"
Current system status. Updates to “Rate Limited / Quota Reached” on API errors.
quota_ok
boolean
Whether the API quota is available. Set to false on rate limit errors.
last_run_time
string
Timestamp of the last evaluation run in format "YYYY-MM-DD HH:MM:SS".
total_usage
dict
Cumulative token usage across all evaluations, tracking prompt_tokens, completion_tokens, and total_tokens.
Methods
evaluate()
Analyzes content and returns a structured evaluation with service matching and scoring.
content
string
required
The website content to analyze. Typically extracted text from a webpage.
rag_context
list | string
Additional context from the knowledge base to inform the evaluation. Can be a list of strings or a single string.
retry_count
int
Number of retry attempts on failure. Total attempts will be retry_count + 1.
Returns:
Structured evaluation result with the following fields:
business_name - Name of the business identified from the content
business_type - Business category (e.g., "Local Service Business", "E-commerce", "SaaS", "Healthcare")
primary_service - Primary service from the validated services catalog
secondary_service - Secondary service if applicable (must be from validated catalog)
fit_score - Lead quality score from 0-100 indicating how well the business matches your target profile
reasoning - AI-generated explanation for the fit score with specific details
outreach_angle - Personalized outreach recommendation tailored to the business
_usage - Token usage for this specific evaluation, including total tokens for the request
Raises:
Exception - If services.json cannot be loaded (CRITICAL error)
Exception - If system_prompt.md cannot be loaded (CRITICAL error)
Exception - If LLM returns invalid JSON after all retries
ValueError - If selected primary_service is not in the approved services catalog
ValueError - If selected secondary_service is not in the approved services catalog
Exception - If Groq API errors persist after all retries
The method validates all service selections against services.json. Invalid services will raise a ValueError to ensure data integrity.
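The retry behavior (retry_count + 1 total attempts) can be sketched as a small wrapper. The helper name `evaluate_with_retries` and the linear backoff policy are assumptions for illustration; the library's actual internals may differ.

```python
import time

def evaluate_with_retries(call, retry_count=2, backoff=1.0):
    """Run `call` up to retry_count + 1 times, re-raising the last error."""
    last_error = None
    for attempt in range(retry_count + 1):
        try:
            return call()
        except Exception as exc:
            last_error = exc
            if attempt < retry_count:
                time.sleep(backoff * (attempt + 1))  # linear backoff between attempts
    raise last_error
```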
Usage Examples
Basic Evaluation
from evaluator import Evaluator
import json
evaluator = Evaluator()
content = """
Welcome to Austin Premier Plumbing. We are a local plumbing company
serving Austin, Texas and surrounding areas. We offer emergency pipe repairs,
water heater installation, drain cleaning, and 24/7 emergency services.
Call us today for a free quote!
"""
try:
    result = evaluator.evaluate(content)
    print(json.dumps(result, indent=2))

    print(f"\nBusiness: {result['business_name']}")
    print(f"Service: {result['primary_service']}")
    print(f"Score: {result['fit_score']}/100")
    print(f"Reasoning: {result['reasoning']}")
    print(f"Tokens Used: {result['_usage']['total_tokens']}")
except Exception as e:
    print(f"Evaluation failed: {e}")
Evaluation with RAG Context
from evaluator import Evaluator
from rag import RAG
evaluator = Evaluator()
rag = RAG()
content = "We help small businesses grow with digital marketing services."
# Retrieve relevant knowledge
rag_context = rag.retrieve(content, limit=3)

# Evaluate with additional context
result = evaluator.evaluate(content, rag_context=rag_context)

print(f"Evaluation with {len(rag_context)} knowledge items")
print(f"Score: {result['fit_score']}")
print(f"Outreach: {result['outreach_angle']}")
Custom Retry Logic
from evaluator import Evaluator
evaluator = Evaluator()
content = "Medical clinic in downtown Yangon offering general practice services."
# Try up to 3 times (initial + 2 retries)
try:
    result = evaluator.evaluate(content, retry_count=2)
    print(f"Success after potential retries: {result['business_name']}")
except Exception as e:
    print(f"Failed after all retries: {e}")
Monitoring Token Usage
from evaluator import Evaluator
evaluator = Evaluator()
# Process multiple leads
leads = [
    "Local restaurant in Bangkok",
    "E-commerce store selling electronics",
    "Software consulting company"
]

for lead_content in leads:
    result = evaluator.evaluate(lead_content)
    print(f"Processed: {result['business_name']}")
    print(f"Request tokens: {result['_usage']['total_tokens']}")

# Check cumulative usage
print("\nTotal Usage Across All Evaluations:")
print(f"Prompt Tokens: {Evaluator.total_usage['prompt_tokens']}")
print(f"Completion Tokens: {Evaluator.total_usage['completion_tokens']}")
print(f"Total Tokens: {Evaluator.total_usage['total_tokens']}")
Checking System Status
from evaluator import Evaluator
evaluator = Evaluator()
print(f"System Status: {Evaluator.status}")
print(f"Quota OK: {Evaluator.quota_ok}")
print(f"Last Run: {Evaluator.last_run_time}")

try:
    result = evaluator.evaluate("Test content")
except Exception as e:
    # Status updates automatically on rate limit errors
    print(f"Error: {e}")
    print(f"Updated Status: {Evaluator.status}")
    print(f"Quota OK: {Evaluator.quota_ok}")
Service Validation
The Evaluator automatically validates all service selections against the services catalog:
Load Services Catalog
Loads services from services/services.json at evaluation time
Extract Valid Names
Recursively extracts all service names from the JSON structure
Validate Primary Service
Ensures primary_service matches a name in the catalog
Validate Secondary Service
If present, ensures secondary_service also matches the catalog
Raise on Mismatch
Throws ValueError if any service is not in the approved list
JSON Response Parsing
The Evaluator applies layered parsing to handle variations in Groq's JSON output:
Requests response_format={"type": "json_object"} for structured output
Handles cases where Groq adds markdown code blocks around JSON
Extracts JSON from text by finding first { to last }
Validates JSON parsing before returning
The parser is robust against common LLM output variations, including markdown code blocks and surrounding text.
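The first-brace-to-last-brace fallback described above can be sketched in a few lines. The function name `parse_llm_json` is illustrative; slicing from the first `{` to the last `}` handles both markdown fences and surrounding prose in one pass.

```python
import json

def parse_llm_json(raw):
    """Extract a JSON object from LLM output that may include
    markdown code fences or surrounding text."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in LLM output")
    return json.loads(raw[start:end + 1])
```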
Configuration
Required Files
services/services.json
{
  "categories": [
    {
      "name": "Home Services",
      "services": [
        { "name": "Plumbing" },
        { "name": "HVAC" },
        { "name": "Electrical" }
      ]
    }
  ]
}
prompts/system_prompt.md
You are an expert lead evaluator. Analyze the website content and provide a structured evaluation.
Available services:
[SERVICES_JSON]
Provide your response as JSON with these fields:
- business_name
- business_type
- primary_service
- secondary_service (optional)
- fit_score (0-100)
- reasoning
- outreach_angle
The [SERVICES_JSON] placeholder in the system prompt is automatically replaced with the contents of services.json.
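The placeholder substitution amounts to a simple string replacement. This sketch assumes the catalog is serialized with `json.dumps`; the helper name `render_system_prompt` is illustrative, not the library's actual function.

```python
import json

def render_system_prompt(template: str, services: dict) -> str:
    """Replace the [SERVICES_JSON] placeholder with the catalog contents."""
    return template.replace("[SERVICES_JSON]", json.dumps(services, indent=2))
```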
Rate Limits and Quotas
Groq has rate limits that vary by plan:
Free tier: 30 requests/minute, 14,400 requests/day
Paid tier: Higher limits based on your plan
The Evaluator automatically tracks quota status and updates Evaluator.quota_ok on rate limit errors.
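The class-level tracking can be sketched as below. The class name `QuotaTracker` and method names are hypothetical; the attribute names (`status`, `quota_ok`, `last_run_time`, `total_usage`) and the status string mirror the documented properties.

```python
from datetime import datetime

class QuotaTracker:
    """Sketch of Evaluator-style class-level status and usage tracking."""
    status = "System Online"
    quota_ok = True
    last_run_time = None
    total_usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}

    @classmethod
    def record_usage(cls, usage):
        # Accumulate per-request token counts across all instances.
        for key in cls.total_usage:
            cls.total_usage[key] += usage.get(key, 0)
        cls.last_run_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    @classmethod
    def record_rate_limit(cls):
        # Flip the shared status flags on a rate limit error.
        cls.status = "Rate Limited / Quota Reached"
        cls.quota_ok = False
```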
Typical evaluation metrics:
Latency : 1-3 seconds per evaluation
Tokens per request : 1,500-3,000 tokens (varies by content length)
Temperature : 0.1 (optimized for consistent, deterministic outputs)
See Also
LeadEngine - Orchestrates the full pipeline including Evaluator
RAG - Provides context to enhance evaluations
Extractor - Extracts content for evaluation