Installation

Install the required dependencies:
pip install mistralai openai pydantic gradio pandas numpy
PAS2 requires Python 3.8 or higher. The system uses both Mistral and OpenAI APIs for hallucination detection.

API keys

PAS2 requires two API keys. Set them as environment variables:
export MISTRAL_API_KEY="your_mistral_api_key_here"
export OPENAI_API_KEY="your_openai_api_key_here"
For Hugging Face Spaces deployment, use HF_MISTRAL_API_KEY and HF_OPENAI_API_KEY in the Secrets tab.
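The two naming schemes above can be resolved with a small helper — a sketch that checks the standard variable first, then the Hugging Face Spaces secret name (`resolve_key` is illustrative, not part of PAS2):

```python
import os

def resolve_key(primary: str, hf_fallback: str) -> str:
    """Return the first key found: standard env var first, then the HF Spaces secret name."""
    key = os.environ.get(primary) or os.environ.get(hf_fallback)
    if not key:
        raise ValueError(f"Missing API key: set {primary} (or {hf_fallback} in the Spaces Secrets tab)")
    return key

# Usage (uncomment once the keys are set):
# mistral_key = resolve_key("MISTRAL_API_KEY", "HF_MISTRAL_API_KEY")
# openai_key = resolve_key("OPENAI_API_KEY", "HF_OPENAI_API_KEY")
```

This lets the same script run locally and on Spaces without code changes.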

Your first detection

Create a simple hallucination detection script:
from pas2 import PAS2

# Initialize the detector
detector = PAS2(
    mistral_api_key="your_mistral_key",  # Optional if set in environment
    openai_api_key="your_openai_key"     # Optional if set in environment
)

# Run hallucination detection
query = "Who was the first person to land on the moon?"
results = detector.detect_hallucination(query, n_paraphrases=3)

# Check the results
if results["hallucination_detected"]:
    print(f"⚠️  Hallucination detected with {results['confidence_score']:.0%} confidence")
    print(f"Summary: {results['summary']}")
    print(f"\nConflicting facts found: {len(results['conflicting_facts'])}")
else:
    print(f"✓ No hallucination detected ({results['confidence_score']:.0%} confidence)")
    print(f"Summary: {results['summary']}")
Each detection makes multiple API calls (1 for paraphrasing + N for responses + 1 for judging). Monitor your API usage accordingly.
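The call-count formula above (1 + N + 1) can be turned into a quick budgeting helper — a back-of-the-envelope sketch, not part of the PAS2 API:

```python
def estimate_api_calls(n_paraphrases: int) -> int:
    """One paraphrasing call, one response call per paraphrase, one judge call."""
    return 1 + n_paraphrases + 1

# Default configuration uses n_paraphrases=3
print(estimate_api_calls(3))  # → 5
```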

Understanding the output

The detect_hallucination method returns a dictionary with detailed analysis:
{
    "original_query": "Who was the first person to land on the moon?",
    "original_response": "Neil Armstrong was the first person...",
    "paraphrased_queries": [
        "Who was the initial individual to set foot on the moon?",
        "Which person first landed on the lunar surface?",
        "Who made the first moon landing?"
    ],
    "paraphrased_responses": [
        "Neil Armstrong became the first human...",
        "The first person to land on the moon was Neil Armstrong...",
        "Neil Armstrong accomplished the first moon landing..."
    ],
    "hallucination_detected": False,
    "confidence_score": 0.95,
    "conflicting_facts": [],
    "reasoning": "All responses consistently identify Neil Armstrong...",
    "summary": "No factual inconsistencies found across responses"
}

Key fields

  • hallucination_detected: Boolean indicating if inconsistencies were found
  • confidence_score: Judge’s confidence (0.0 to 1.0) in the detection
  • conflicting_facts: List of specific factual contradictions found
  • reasoning: Detailed explanation from the judge model
  • summary: Concise analysis summary
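As an illustration of consuming these fields, the helper below builds a one-line report from a result dictionary. The entries of conflicting_facts are free-form dicts, so it renders whatever keys the judge produced rather than assuming a fixed schema:

```python
def summarize_results(results: dict) -> str:
    """Build a compact report from the key fields of a detection result."""
    status = "HALLUCINATION" if results["hallucination_detected"] else "OK"
    line = f"[{status}] confidence={results['confidence_score']:.0%}: {results['summary']}"
    for fact in results.get("conflicting_facts", []):
        # Each entry is a free-form dict; print every key the judge returned
        line += "\n  conflict: " + ", ".join(f"{k}={v}" for k, v in fact.items())
    return line

example = {
    "hallucination_detected": False,
    "confidence_score": 0.95,
    "conflicting_facts": [],
    "summary": "No factual inconsistencies found across responses",
}
print(summarize_results(example))
# → [OK] confidence=95%: No factual inconsistencies found across responses
```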

Progress tracking

Track detection progress in real-time with a callback:
def progress_callback(stage, **kwargs):
    if stage == "generating_paraphrases":
        print("Generating paraphrases...")
    elif stage == "responses_progress":
        completed = kwargs.get("completed_responses", 0)
        total = kwargs.get("total_responses", 0)
        print(f"Getting responses: {completed}/{total}")
    elif stage == "judging":
        print("Analyzing for hallucinations...")
    elif stage == "complete":
        print("Detection complete!")

detector = PAS2(progress_callback=progress_callback)
results = detector.detect_hallucination("What is the capital of France?")

Customizing paraphrase count

Control the number of paraphrases generated (default is 3):
# More paraphrases = more thorough detection but slower
results = detector.detect_hallucination(
    "What year did World War II end?",
    n_paraphrases=5  # Generate 5 paraphrases
)
Each additional paraphrase increases detection time and API costs but may improve accuracy for complex queries.

Working with the judge model

PAS2 uses OpenAI’s o3-mini as the judge model. The judge is initialized automatically:
# From pas2.py lines 302-366
def judge_hallucination(self, 
                       original_query: str, 
                       original_response: str, 
                       paraphrased_queries: List[str], 
                       paraphrased_responses: List[str]) -> HallucinationJudgment:
    """
    Use OpenAI's o3-mini as a judge to detect hallucinations in the responses
    """
    # The judge analyzes all responses for factual inconsistencies
    response = self.openai_client.chat.completions.create(
        model=self.openai_model,  # "o3-mini"
        messages=[...],
        response_format={"type": "json_object"}
    )
The judge returns a structured HallucinationJudgment object:
class HallucinationJudgment(BaseModel):
    hallucination_detected: bool
    confidence_score: float  # Between 0-1
    conflicting_facts: List[Dict[str, Any]]
    reasoning: str
    summary: str
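Because HallucinationJudgment is a Pydantic model, a raw JSON payload from the judge can be validated by constructing the model from the parsed dict — a minimal sketch using the field definitions shown above (this construction style works in both Pydantic v1 and v2):

```python
from typing import Any, Dict, List
from pydantic import BaseModel

class HallucinationJudgment(BaseModel):
    hallucination_detected: bool
    confidence_score: float  # Between 0-1
    conflicting_facts: List[Dict[str, Any]]
    reasoning: str
    summary: str

# A raw payload as the judge might return it, parsed from JSON
payload = {
    "hallucination_detected": False,
    "confidence_score": 0.95,
    "conflicting_facts": [],
    "reasoning": "All responses agree on the same facts.",
    "summary": "No factual inconsistencies found",
}
judgment = HallucinationJudgment(**payload)
print(judgment.confidence_score)
```

Validation fails loudly if the judge ever omits a field or returns the wrong type, which is easier to debug than a silent KeyError later.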

Error handling

Handle API errors and missing keys gracefully:
import os

try:
    detector = PAS2(
        mistral_api_key=os.environ.get("MISTRAL_API_KEY"),
        openai_api_key=os.environ.get("OPENAI_API_KEY")
    )
    results = detector.detect_hallucination(query)
except ValueError as e:
    print(f"Configuration error: {e}")
except Exception as e:
    print(f"Detection failed: {e}")
PAS2 raises a ValueError if API keys are missing during initialization:
if not self.mistral_api_key:
    raise ValueError("Mistral API key is required. Set it via MISTRAL_API_KEY...")
if not self.openai_api_key:
    raise ValueError("OpenAI API key is required. Set it via OPENAI_API_KEY...")
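To fail fast before constructing the detector at all, you can check the environment up front — a small pre-flight helper sketch, not part of PAS2 itself:

```python
import os

def missing_api_keys() -> list:
    """Return the names of required API keys absent from the environment."""
    required = ["MISTRAL_API_KEY", "OPENAI_API_KEY"]
    return [name for name in required if not os.environ.get(name)]

missing = missing_api_keys()
if missing:
    print(f"Set these before running PAS2: {', '.join(missing)}")
```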

Next steps

How it works

Deep dive into the paraphrase-based detection algorithm

API reference

Explore all methods and parameters

Web interface

Deploy the Gradio web UI

Feedback storage

Learn about persistent feedback collection