
Overview

The SMTResults class interprets the output from SMT solvers (like CVC5) to determine whether a first-order logic formula represents a valid statement or contains a logical fallacy. It also uses LLMs to generate human-readable counterexamples when fallacies are detected.

Class Definition

class SMTResults:
    def __init__(self, output_file_path, sentence_data_path)

Parameters

output_file_path
str
required
Path to the SMT solver output file containing SAT/UNSAT result and model
sentence_data_path
str
required
Path to JSON file containing original sentence data with keys:
  • Claim: The claim portion of the sentence
  • Implication: The implication portion
  • Referring expressions: Entities identified
  • Properties: Properties of entities
  • Formula: The generated FOL formula

Instance Attributes

formula_type
str
Classification of the formula. Values:
  • "valid statement": The formula is valid; its negation is unsatisfiable (solver returns UNSAT)
  • "logical fallacy": The formula is not valid; its negation is satisfiable (solver returns SAT with a counterexample model)
counter_example
str | None
Human-readable counterexample generated by LLM (only for logical fallacies)

How It Works

The SMT solver receives the negation of the original formula. The interpretation logic:
  1. UNSAT Result: Negation is unsatisfiable → Original formula is valid
  2. SAT Result: Negation is satisfiable → Original formula is invalid (fallacy)
When the solver returns SAT, it provides a model (counterexample) showing variable assignments that make the negated formula true, thereby disproving the original claim.
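The interpretation logic above can be sketched as a small helper. The function name is hypothetical; the real class performs this mapping internally while reading the solver output file:

```python
def interpret_solver_output(output_text: str) -> str:
    """Map the solver's verdict on the NEGATED formula back to a
    verdict on the original formula (illustrative helper)."""
    first_line = output_text.strip().splitlines()[0].strip()
    if first_line == "unsat":
        # The negation has no model, so the original holds in every model.
        return "valid statement"
    if first_line == "sat":
        # The negation has a model; that model refutes the original claim.
        return "logical fallacy"
    raise ValueError(f"unexpected solver result: {first_line!r}")
```

Note the inversion: SAT from the solver means the *original* statement fails, because the solver was given the negation.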

Methods

get_results

def get_results(get_interpretation=True)
Prints the interpretation of SMT results to stdout. Parameters:
get_interpretation
bool
default: True
Whether to print the counterexample explanation for fallacies
Output Format:
The given statement is a <formula_type>
[If fallacy: <counter_example>]
results = SMTResults("output.txt", "sentence_data.json")
results.get_results()
# Output: The given statement is a valid statement

Input File Formats

SMT Solver Output File

Expected format:
sat
(model
  (define-fun x () BoundSet @uc_BoundSet_0)
  (define-fun P ((x!0 BoundSet)) Bool (= x!0 @uc_BoundSet_0))
)
Or:
unsat
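Since the verdict always occupies the first line and any model follows it, splitting the two is straightforward. A sketch (hypothetical helper name, not part of the documented API):

```python
def split_result_and_model(raw: str):
    """Separate the sat/unsat verdict line from the model text
    that may follow it (illustrative parsing sketch)."""
    lines = raw.strip().splitlines()
    verdict = lines[0].strip()
    # Everything after the first line is the model, if present.
    model_text = "\n".join(lines[1:]).strip() or None
    return verdict, model_text
```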

Sentence Data JSON File

Required structure:
{
  "Claim": "All cats are animals",
  "Implication": "Therefore, all animals are cats",
  "Referring expressions": "cats, animals",
  "Properties": "Cat(x), Animal(x)",
  "Formula": "forall x (Cat(x) -> Animal(x)) -> forall y (Animal(y) -> Cat(y))"
}
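Because a missing key would otherwise surface only later as a `KeyError` deep inside counterexample generation, it can be worth validating the structure up front. A minimal sketch (the helper name is an assumption, not part of the class):

```python
REQUIRED_KEYS = ("Claim", "Implication", "Referring expressions",
                 "Properties", "Formula")

def validate_sentence_data(data: dict) -> dict:
    """Check that all keys SMTResults expects are present
    (illustrative helper; not part of the documented API)."""
    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        raise KeyError(f"sentence data missing keys: {missing}")
    return data
```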

Counterexample Generation

When a fallacy is detected, the class:
  1. Loads the original sentence data from JSON
  2. Reads the prompt template from prompt_counter_example.txt
  3. Formats the prompt with:
    • Original claim and implication
    • Referring expressions and properties
    • The FOL formula
    • The SMT solver’s model (counterexample)
  4. Sends to LLM via get_llm_result()
  5. Stores the natural language explanation in self.counter_example
The get_llm_result() function must be defined in the llm.py module. Ensure that module is available and properly configured with API credentials.
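The prompt-formatting step (3) might look like the sketch below. The placeholder names are assumptions; they must match whatever placeholders prompt_counter_example.txt actually uses:

```python
def build_counterexample_prompt(template: str, data: dict, model_text: str) -> str:
    """Fill the prompt template with the sentence data and the
    solver's model (hypothetical placeholder names)."""
    return template.format(
        claim=data["Claim"],
        implication=data["Implication"],
        referring_expressions=data["Referring expressions"],
        properties=data["Properties"],
        formula=data["Formula"],
        counter_model=model_text,
    )
```

The resulting string would then be passed to get_llm_result() in step 4.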

Complete Usage Example

import json
import subprocess
from nl_to_fol import NL2FOL
from cvc import CVCGenerator
from interpret_smt_results import SMTResults

# Step 1: Convert natural language to FOL
sentence = "All politicians are corrupt. John is a politician. Therefore, everyone is corrupt."
# nli_model and nli_tokenizer are assumed to be loaded beforehand
nl2fol = NL2FOL(
    sentence=sentence,
    model_type='gpt',
    pipeline=None,
    tokenizer=None,
    nli_model=nli_model,
    nli_tokenizer=nli_tokenizer
)
fol_formula, _ = nl2fol.convert_to_first_order_logic()

# Step 2: Generate CVC5 script
generator = CVCGenerator(fol_formula)
cvc_script = generator.generateCVCScript()

with open("formula.smt2", "w") as f:
    f.write(cvc_script)

# Step 3: Run SMT solver
result = subprocess.run(
    ["cvc5", "formula.smt2"],
    capture_output=True,
    text=True
)

with open("solver_output.txt", "w") as f:
    f.write(result.stdout)

# Step 4: Save sentence data
sentence_data = {
    "Claim": nl2fol.claim,
    "Implication": nl2fol.implication,
    "Referring expressions": nl2fol.claim_ref_exp,
    "Properties": nl2fol.claim_properties,
    "Formula": fol_formula
}

with open("sentence_data.json", "w") as f:
    json.dump(sentence_data, f)

# Step 5: Interpret results
results = SMTResults("solver_output.txt", "sentence_data.json")
results.get_results()

# Access results programmatically
if results.formula_type == "logical fallacy":
    print(f"\nCounterexample:\n{results.counter_example}")

Command-Line Usage

# Usage
python interpret_smt_results.py <output_file> <sentence_data_json>

# Example
python interpret_smt_results.py solver_output.txt sentence_data.json
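A minimal entry point consistent with this CLI might look like the following sketch (the actual script's argument handling may differ):

```python
import sys

def main(argv):
    """Expect exactly two arguments: <output_file> <sentence_data_json>."""
    if len(argv) != 3:
        print("usage: interpret_smt_results.py <output_file> <sentence_data_json>",
              file=sys.stderr)
        return 1
    output_file, sentence_data_json = argv[1], argv[2]
    # In the real script:
    # SMTResults(output_file, sentence_data_json).get_results()
    return 0

# Typical invocation from the shell:
# sys.exit(main(sys.argv))
```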

Error Handling

Integration with Evaluation

import pandas as pd
from pathlib import Path

def evaluate_dataset(results_dir):
    """
    Evaluate a batch of SMT results
    """
    results_dir = Path(results_dir)
    evaluations = []
    
    for output_file in results_dir.glob("*_output.txt"):
        # Find corresponding sentence data
        base_name = output_file.stem.replace("_output", "")
        json_file = results_dir / f"{base_name}_data.json"
        
        if not json_file.exists():
            continue
        
        # Interpret results
        results = SMTResults(str(output_file), str(json_file))
        
        evaluations.append({
            "file": base_name,
            "type": results.formula_type,
            "has_counterexample": results.counter_example is not None
        })
    
    return pd.DataFrame(evaluations)

# Run evaluation
df = evaluate_dataset("results/")
print(df['type'].value_counts())

Dependencies

The module requires:
  • llm.py: Must define get_llm_result(prompt) function
  • json: Standard library for JSON parsing
  • sys: Standard library for command-line arguments
  • prompt_counter_example.txt: Prompt template file
Ensure prompt_counter_example.txt exists in the working directory with appropriate placeholders for claim, implication, referring expressions, properties, formula, and counter model.
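As a rough illustration (the placeholder names and wording are assumptions, not the shipped template), such a file might contain:

```text
The following argument was formalized in first-order logic and refuted.
Claim: {claim}
Implication: {implication}
Referring expressions: {referring_expressions}
Properties: {properties}
Formula: {formula}
Counterexample model: {counter_model}
Explain in plain English why this counterexample disproves the implication.
```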

See Also
