
Overview

The SMTResults class interprets the output from SMT solvers (like CVC5) to determine whether a first-order logic formula represents a valid statement or contains a logical fallacy. It also uses LLMs to generate human-readable counterexamples when fallacies are detected.

Class Definition

class SMTResults:
    def __init__(self, output_file_path, sentence_data_path)

Parameters

output_file_path
str
required
Path to the SMT solver output file containing SAT/UNSAT result and model
sentence_data_path
str
required
Path to JSON file containing original sentence data with keys:
  • Claim: The claim portion of the sentence
  • Implication: The implication portion
  • Referring expressions: Entities identified
  • Properties: Properties of entities
  • Formula: The generated FOL formula

Instance Attributes

formula_type
str
Classification of the formula. Values:
  • "valid statement": The formula is valid; its negation is unsatisfiable (solver returns UNSAT)
  • "logical fallacy": The formula is not valid; its negation is satisfiable (solver returns SAT with a counterexample model)
counter_example
str | None
Human-readable counterexample generated by LLM (only for logical fallacies)

How It Works

The SMT solver receives the negation of the original formula. The interpretation logic:
  1. UNSAT Result: Negation is unsatisfiable → Original formula is valid
  2. SAT Result: Negation is satisfiable → Original formula is invalid (fallacy)
When the solver returns SAT, it provides a model (counterexample) showing variable assignments that make the negated formula true, thereby disproving the original claim.
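The interpretation logic above can be sketched as a small helper. The function name is hypothetical; the real class performs this mapping internally while reading the solver output file:

```python
def interpret_solver_output(output_text: str) -> str:
    """Map the solver's verdict on the NEGATED formula back to a
    verdict on the original formula (illustrative helper)."""
    first_line = output_text.strip().splitlines()[0].strip()
    if first_line == "unsat":
        # The negation has no model, so the original holds in every model.
        return "valid statement"
    if first_line == "sat":
        # The negation has a model; that model refutes the original claim.
        return "logical fallacy"
    raise ValueError(f"unexpected solver result: {first_line!r}")
```

Note the inversion: SAT from the solver means the *original* statement fails, because the solver was given the negation.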

Methods

get_results

def get_results(get_interpretation=True)
Prints the interpretation of SMT results to stdout. Parameters:
get_interpretation
bool
default: True
Whether to print the counterexample explanation for fallacies
Output Format:
The given statement is a <formula_type>
[If fallacy: <counter_example>]
results = SMTResults("output.txt", "sentence_data.json")
results.get_results()
# Output: The given statement is a valid statement

Input File Formats

SMT Solver Output File

Expected format:
sat
(model
  (define-fun x () BoundSet @uc_BoundSet_0)
  (define-fun P ((x!0 BoundSet)) Bool (= x!0 @uc_BoundSet_0))
)
Or:
unsat
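Since the verdict always occupies the first line and any model follows it, splitting the two is straightforward. A sketch (hypothetical helper name, not part of the documented API):

```python
def split_result_and_model(raw: str):
    """Separate the sat/unsat verdict line from the model text
    that may follow it (illustrative parsing sketch)."""
    lines = raw.strip().splitlines()
    verdict = lines[0].strip()
    # Everything after the first line is the model, if present.
    model_text = "\n".join(lines[1:]).strip() or None
    return verdict, model_text
```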

Sentence Data JSON File

Required structure:
{
  "Claim": "All cats are animals",
  "Implication": "Therefore, all animals are cats",
  "Referring expressions": "cats, animals",
  "Properties": "Cat(x), Animal(x)",
  "Formula": "forall x (Cat(x) -> Animal(x)) -> forall y (Animal(y) -> Cat(y))"
}
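Because a missing key would otherwise surface only later as a `KeyError` deep inside counterexample generation, it can be worth validating the structure up front. A minimal sketch (the helper name is an assumption, not part of the class):

```python
REQUIRED_KEYS = ("Claim", "Implication", "Referring expressions",
                 "Properties", "Formula")

def validate_sentence_data(data: dict) -> dict:
    """Check that all keys SMTResults expects are present
    (illustrative helper; not part of the documented API)."""
    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        raise KeyError(f"sentence data missing keys: {missing}")
    return data
```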

Counterexample Generation

When a fallacy is detected, the class:
  1. Loads the original sentence data from JSON
  2. Reads the prompt template from prompt_counter_example.txt
  3. Formats the prompt with:
    • Original claim and implication
    • Referring expressions and properties
    • The FOL formula
    • The SMT solver’s model (counterexample)
  4. Sends to LLM via get_llm_result()
  5. Stores the natural language explanation in self.counter_example
The get_llm_result() function must be defined in the llm.py module. Ensure that module is available and properly configured with API credentials.
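The prompt-formatting step (3) might look like the sketch below. The placeholder names are assumptions; they must match whatever placeholders prompt_counter_example.txt actually uses:

```python
def build_counterexample_prompt(template: str, data: dict, model_text: str) -> str:
    """Fill the prompt template with the sentence data and the
    solver's model (hypothetical placeholder names)."""
    return template.format(
        claim=data["Claim"],
        implication=data["Implication"],
        referring_expressions=data["Referring expressions"],
        properties=data["Properties"],
        formula=data["Formula"],
        counter_model=model_text,
    )
```

The resulting string would then be passed to get_llm_result() in step 4.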

Complete Usage Example

import json
import subprocess
from nl_to_fol import NL2FOL
from cvc import CVCGenerator
from interpret_smt_results import SMTResults

# Step 1: Convert natural language to FOL
sentence = "All politicians are corrupt. John is a politician. Therefore, everyone is corrupt."
# nli_model and nli_tokenizer are assumed to be loaded beforehand
nl2fol = NL2FOL(
    sentence=sentence,
    model_type='gpt',
    pipeline=None,
    tokenizer=None,
    nli_model=nli_model,
    nli_tokenizer=nli_tokenizer
)
fol_formula, _ = nl2fol.convert_to_first_order_logic()

# Step 2: Generate CVC5 script
generator = CVCGenerator(fol_formula)
cvc_script = generator.generateCVCScript()

with open("formula.smt2", "w") as f:
    f.write(cvc_script)

# Step 3: Run SMT solver
result = subprocess.run(
    ["cvc5", "formula.smt2"],
    capture_output=True,
    text=True
)

with open("solver_output.txt", "w") as f:
    f.write(result.stdout)

# Step 4: Save sentence data
sentence_data = {
    "Claim": nl2fol.claim,
    "Implication": nl2fol.implication,
    "Referring expressions": nl2fol.claim_ref_exp,
    "Properties": nl2fol.claim_properties,
    "Formula": fol_formula
}

with open("sentence_data.json", "w") as f:
    json.dump(sentence_data, f)

# Step 5: Interpret results
results = SMTResults("solver_output.txt", "sentence_data.json")
results.get_results()

# Access results programmatically
if results.formula_type == "logical fallacy":
    print(f"\nCounterexample:\n{results.counter_example}")

Command-Line Usage

# Usage
python interpret_smt_results.py <output_file> <sentence_data_json>

# Example
python interpret_smt_results.py solver_output.txt sentence_data.json
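A minimal entry point consistent with this CLI might look like the following sketch (the actual script's argument handling may differ):

```python
import sys

def main(argv):
    """Expect exactly two arguments: <output_file> <sentence_data_json>."""
    if len(argv) != 3:
        print("usage: interpret_smt_results.py <output_file> <sentence_data_json>",
              file=sys.stderr)
        return 1
    output_file, sentence_data_json = argv[1], argv[2]
    # In the real script:
    # SMTResults(output_file, sentence_data_json).get_results()
    return 0

# Typical invocation from the shell:
# sys.exit(main(sys.argv))
```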

Error Handling

Integration with Evaluation

import pandas as pd
from pathlib import Path

def evaluate_dataset(results_dir):
    """
    Evaluate a batch of SMT results
    """
    results_dir = Path(results_dir)
    evaluations = []
    
    for output_file in results_dir.glob("*_output.txt"):
        # Find corresponding sentence data
        base_name = output_file.stem.replace("_output", "")
        json_file = results_dir / f"{base_name}_data.json"
        
        if not json_file.exists():
            continue
        
        # Interpret results
        results = SMTResults(str(output_file), str(json_file))
        
        evaluations.append({
            "file": base_name,
            "type": results.formula_type,
            "has_counterexample": results.counter_example is not None
        })
    
    return pd.DataFrame(evaluations)

# Run evaluation
df = evaluate_dataset("results/")
print(df['type'].value_counts())

Dependencies

The module requires:
  • llm.py: Must define get_llm_result(prompt) function
  • json: Standard library for JSON parsing
  • sys: Standard library for command-line arguments
  • prompt_counter_example.txt: Prompt template file
Ensure prompt_counter_example.txt exists in the working directory with appropriate placeholders for claim, implication, referring expressions, properties, formula, and counter model.
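As a rough illustration (the placeholder names and wording are assumptions, not the shipped template), such a file might contain:

```text
The following argument was formalized in first-order logic and refuted.
Claim: {claim}
Implication: {implication}
Referring expressions: {referring_expressions}
Properties: {properties}
Formula: {formula}
Counterexample model: {counter_model}
Explain in plain English why this counterexample disproves the implication.
```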

See Also
