Overview

The generate_paraphrases method creates multiple paraphrased versions of an input query while preserving the original meaning. This is a core component of the PAS2 methodology, enabling the detection of hallucinations through response consistency analysis.

Method signature

PAS2.generate_paraphrases(query: str, n_paraphrases: int = 3) -> List[str]

Parameters

query (str, required)
The original query string to paraphrase.

n_paraphrases (int, default: 3)
Number of paraphrases to generate. The method will attempt to generate exactly this many variations.

Return value

return (List[str])
List of query strings with the original query as the first element, followed by n_paraphrases paraphrased versions.
Total length: n_paraphrases + 1 (original + paraphrases).
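Because the original query always comes first, callers can separate it from the paraphrases by position. The sketch below assumes only the list layout described above; split_queries is a hypothetical helper and not part of PAS2. It does not rely on a fixed length, since the number of paraphrases actually returned may differ from the number requested.

```python
from typing import List, Tuple


def split_queries(queries: List[str]) -> Tuple[str, List[str]]:
    """Split the returned list into (original, paraphrases).

    Hypothetical helper: relies only on the original query being element 0.
    Raises ValueError if the list is empty.
    """
    original, *paraphrases = queries
    return original, paraphrases


# Sample data in the same shape as the method's return value.
sample = [
    "Who was the first person to land on the moon?",
    "Can you tell me the name of the first moon landing astronaut?",
]
original, paraphrases = split_queries(sample)
print(f"Original: {original}")
print(f"{len(paraphrases)} paraphrase(s)")
```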

Behavior

Successful generation

When the API call succeeds, the method:
  1. Sends a request to the Mistral API in JSON mode
  2. Parses the response to extract paraphrases
  3. Handles multiple JSON response structures (an object with a paraphrases key, an object with a results key, or a bare array)
  4. Prepends the original query to the list
  5. Returns all queries for analysis
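The parsing steps above can be sketched as follows. This is a minimal illustration and not the library's actual code: extract_paraphrases and build_query_list are hypothetical names, and the sketch assumes the response body arrives as a JSON string in one of the three shapes listed.

```python
import json
from typing import Any, List


def extract_paraphrases(raw: str) -> List[str]:
    """Pull a list of paraphrase strings out of a JSON response body.

    Hypothetical helper: accepts {"paraphrases": [...]}, {"results": [...]},
    or a bare JSON array, and raises ValueError for anything else.
    """
    data: Any = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if isinstance(data, dict):
        for key in ("paraphrases", "results"):
            if isinstance(data.get(key), list):
                data = data[key]
                break
        else:
            raise ValueError("no known paraphrase key in JSON object")
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of paraphrases")
    # Drop empty entries and normalise surrounding whitespace.
    return [str(item).strip() for item in data if str(item).strip()]


def build_query_list(query: str, raw: str) -> List[str]:
    """Prepend the original query to the parsed paraphrases."""
    return [query] + extract_paraphrases(raw)
```

Raising ValueError for unrecognised shapes lets the caller treat a parsing failure the same way as an API failure and switch to the fallback templates.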

Fallback mechanism

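If paraphrase generation fails (API error, parsing error, etc.), the method returns template-based fallback paraphrases. Only three templates exist, so the fallback list holds at most four queries (original + three templates) even when n_paraphrases is greater than 3: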
[
    query,  # original
    f"Could you tell me about {query.strip('?')}?",
    f"I'd like to know: {query}",
    f"Please provide information on {query.strip('?')}."
][:n_paraphrases+1]
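To make the slicing behavior concrete, the fallback expression can be wrapped in a standalone function. fallback_paraphrases is a hypothetical name used only for this sketch; the templates are copied from the snippet above.

```python
from typing import List


def fallback_paraphrases(query: str, n_paraphrases: int = 3) -> List[str]:
    """Return the original query plus up to three template paraphrases."""
    templates = [
        query,  # original
        f"Could you tell me about {query.strip('?')}?",
        f"I'd like to know: {query}",
        f"Please provide information on {query.strip('?')}.",
    ]
    # The slice never yields more than len(templates) == 4 items.
    return templates[: n_paraphrases + 1]


print(len(fallback_paraphrases("What is PAS2?", n_paraphrases=2)))  # 3
print(len(fallback_paraphrases("What is PAS2?", n_paraphrases=5)))  # 4
```

Note that query.strip('?') removes question marks from both ends of the string, not only the trailing one.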

Example usage

from pas2 import PAS2

detector = PAS2(
    mistral_api_key="your-key",
    openai_api_key="your-key"
)

# Generate 3 paraphrases
queries = detector.generate_paraphrases(
    query="Who was the first person to land on the moon?",
    n_paraphrases=3
)

for i, q in enumerate(queries):
    if i == 0:
        print(f"Original: {q}")
    else:
        print(f"Paraphrase {i}: {q}")

Example output

Original: Who was the first person to land on the moon?
Paraphrase 1: Which individual was the initial human to step foot on the lunar surface?
Paraphrase 2: Can you tell me the name of the first moon landing astronaut?
Paraphrase 3: Who achieved the historic milestone of being the first to walk on the moon?

API integration

The method uses the Mistral API with the following configuration:
response = self.mistral_client.chat.complete(
    model=self.mistral_model,  # "mistral-large-latest"
    messages=[
        {
            "role": "system",
            "content": f"You are an expert at creating semantically equivalent paraphrases. Generate {n_paraphrases} different paraphrases of the given query that preserve the original meaning but vary in wording and structure. Return a JSON array of strings, each containing one paraphrase."
        },
        {
            "role": "user",
            "content": query
        }
    ],
    response_format={"type": "json_object"}
)

Error handling

The method catches all exceptions during paraphrase generation and automatically falls back to predefined paraphrase templates. This ensures that the hallucination detection process can continue even if the paraphrase API fails.
Errors are logged with full stack traces:
logger.error("Error generating paraphrases: %s", str(e), exc_info=True)
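The catch-all pattern described here might look like the sketch below. It is an illustration under assumptions, not the method's actual body: paraphrase_with_fallback is a hypothetical name, and the generate callable stands in for the Mistral request and response parsing.

```python
import logging
from typing import Callable, List

logger = logging.getLogger(__name__)


def paraphrase_with_fallback(
    query: str,
    generate: Callable[[str, int], List[str]],  # hypothetical stand-in for the API call
    n_paraphrases: int = 3,
) -> List[str]:
    """Return [query, *paraphrases], using templates if generation fails."""
    try:
        return [query] + generate(query, n_paraphrases)
    except Exception as e:
        # exc_info=True attaches the full stack trace to the log record.
        logger.error("Error generating paraphrases: %s", str(e), exc_info=True)
        fallback = [
            query,
            f"Could you tell me about {query.strip('?')}?",
            f"I'd like to know: {query}",
            f"Please provide information on {query.strip('?')}.",
        ]
        return fallback[: n_paraphrases + 1]
```

Catching Exception rather than a narrower type trades precise diagnostics for availability: the caller always receives a usable list, and the log record is the only sign that the fallback was used.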

Performance considerations

  • Single API call generates all paraphrases at once
  • Typical response time: 1-3 seconds depending on API latency
  • Uses JSON mode for structured output parsing
  • Handles multiple response formats for robustness

Note: the original query is always included as the first element in the returned list. This ensures consistent comparison between the original response and paraphrased responses.
