Skip to main content

Overview

The fol_to_cvc.py script converts first-order logic formulas into SMT-LIB format and executes the CVC4 solver to check satisfiability. This determines whether logical arguments are valid or contain fallacies.

How It Works

1

Load FOL Formulas

Reads the CSV file generated by nl_to_fol.py containing logical formulas.
2

Generate SMT Scripts

Uses the CVCGenerator class to convert FOL formulas to SMT-LIB format.
3

Run CVC4 Solver

Executes the SMT solver on each formula to check satisfiability.
4

Handle Unknown Results

If solver returns “unknown”, retries with finite model finding enabled.
5

Interpret Results

Maps solver output to validity classification:
  • unsat = Valid argument
  • sat/unknown = Logical fallacy (LF)
6

Save Results

Outputs results to CSV with validity predictions.

Command Usage

Basic Command

python3 src/fol_to_cvc.py <run_name>

Parameters

run_name
string
required
The name of the run from nl_to_fol.py. The script will:
  • Read from: results/<run_name>.csv
  • Create SMT files in: results/<run_name>_smt/
  • Save results to: results/<run_name>_results.csv

Example Usage

python3 src/fol_to_cvc.py llama_experiment_1

Prerequisites

CVC4 Solver RequiredThe script requires the CVC4 SMT solver binary (./cvc4) in the working directory.Download from: https://github.com/CVC4/CVC4/releasesMake executable:
chmod +x cvc4

SMT-LIB Format

The FOL formulas are converted to SMT-LIB2 format. The system handles:

Operator Conversions

FOL SyntaxSMT-LIB Syntax
ForAllforall
ThereExistsexists
&and
~not
->Implication operator

Example Conversion

Input FOL:
forall a (Human(a) -> Mortal(a))
Output SMT-LIB:
(set-logic ALL)
(declare-sort Entity 0)
(declare-fun Human (Entity) Bool)
(declare-fun Mortal (Entity) Bool)

(assert (not 
  (forall ((a Entity))
    (=> (Human a) (Mortal a))
  )
))

(check-sat)
(get-model)

Output Files

For each datapoint i, the script generates:

SMT Files

results/<run_name>_smt/i.smt2 The SMT-LIB formatted formula file. results/<run_name>_smt/i_out.txt The solver output containing:
  • First line: sat, unsat, or unknown
  • Remaining lines: Counter-model (if sat)

Results CSV

results/<run_name>_results.csv Adds a result column to the original CSV:
  • Valid - Formula is unsatisfiable (valid argument)
  • LF - Formula is satisfiable or unknown (logical fallacy)
  • Empty string - Processing error

Finite Model Finding

When the solver returns unknown, the script automatically retries with finite model finding:
script = CVCGenerator(fol).generateCVCScript(
    finite_model_finding=True
)
This helps resolve formulas that are undecidable in the infinite domain.
Finite model finding can significantly increase solving time but improves coverage for complex formulas.

Understanding Results

UNSAT (Valid Argument)

When the SMT solver returns unsat, it means:
  • The negation of the formula is unsatisfiable
  • No counter-example exists
  • The original argument is logically valid

SAT (Logical Fallacy)

When the SMT solver returns sat, it means:
  • A counter-example exists
  • The argument contains a logical fallacy
  • The model shows why the implication doesn’t hold

Unknown

When the solver returns unknown:
  • The formula is too complex for automatic decision
  • Classified as potential fallacy (conservative approach)
  • Retry with finite model finding may resolve

Processing Flow

Performance Considerations

Processing Time
  • Simple formulas: < 1 second per formula
  • Complex formulas: 5-30 seconds
  • Formulas requiring FMF: 30-120 seconds
Consider parallel processing for large datasets.

Error Handling

The script includes comprehensive error handling:
try:
    # Generate and run SMT
    script = CVCGenerator(fol).generateCVCScript()
    # ...
except Exception as e:
    print("Cannot run this statement : {}".format(e))
    results.append("")  # Mark as error
    pass
Errors result in empty strings in the results column, which are excluded from evaluation metrics.

CVCGenerator Class

The CVCGenerator class (from cvc.py) handles:
  • FOL parsing
  • SMT-LIB syntax generation
  • Sort and function declarations
  • Formula negation (for validity checking)
The solver checks the negation of the formula. If the negation is unsatisfiable, the original formula is valid.

Troubleshooting

Common Issues

CVC4 not found
# Ensure cvc4 binary is in current directory or PATH
ls -l ./cvc4
Permission denied
chmod +x ./cvc4
Directory not found
# Ensure SMT output directory exists
mkdir -p results/<run_name>_smt/
Parsing errors
  • Check that FOL formulas are well-formed
  • Verify all operators are properly converted
  • Review Logical Form 2 column for malformed expressions

Alternative Solvers

While the script uses CVC4, you can adapt it for other SMT solvers: CVC5 (newer version)
./cvc5 --lang smt2 results/<run_name>_smt/0.smt2
Z3
z3 results/<run_name>_smt/0.smt2
Different solvers may produce different results on undecidable formulas. CVC4 is recommended for consistency with the paper’s results.

Next Steps

After generating SMT results:
  1. Interpret individual results with counter-examples using interpret_smt_results.py
  2. Evaluate overall performance using get_metrics.py
  3. Analyze error cases and improve FOL translation
See Interpreting Results for detailed result analysis.

Build docs developers (and LLMs) love