Overview
Thenl_to_fol.py script converts natural language arguments into first-order logic (FOL) formulas. This is the first and most critical step in the NL2FOL pipeline for detecting logical fallacies.
How It Works
The translation process follows a multi-stage pipeline:Claim and Implication Extraction
The system identifies the claim and implication components of the argument using LLM prompts.
Referring Expression Identification
Extracts referring expressions (entities) from both the claim and implication.
Entity Relationship Detection
Determines relationships between entities (equal, subset, or unrelated) using NLI models or LLM reasoning.
FOL Formula Generation
Generates first-order logic formulas for claim and implication with proper quantifiers.
Command Usage
Basic Command
Parameters
The LLM model to use for translation. Supports:
- HuggingFace model paths (e.g.,
meta-llama/Llama-2-7b-hf) - OpenAI models (e.g.,
gpt-4o) - prefix with ‘gpt’ to use GPT mode
The Natural Language Inference model for entity relationship detection.
Example:
microsoft/deberta-v2-xlarge-mnliName for this run. Output will be saved to
results/<run_name>.csvDataset to use for testing. Options:
logic- Standard logical fallacy datasetlogicclimate- Climate-specific logical fallacy datasetfolio- FOLIO dataset for evaluationnli- NLI-based dataset
Number of datapoints to sample from the dataset for processing.
Example Usage
Output Format
The script generates a CSV file inresults/<run_name>.csv with the following columns:
| Column | Description |
|---|---|
articles | Original natural language text |
label | Ground truth label (0=fallacy, 1=valid) |
Claim | Extracted claim statement |
Implication | Extracted implication statement |
Referring Expressions - Claim | Entities in the claim |
Referring Expressions - Implication | Entities in the implication |
Property Implications | Relationships between properties |
Equal Entities | Entities determined to be equal |
Subset Entities | Subset relationships between entities |
Claim Lfs | Logical form of the claim |
Implication Lfs | Logical form of the implication |
Logical Form | Final FOL formula (version 1) |
Logical Form 2 | Final FOL formula (version 2, with existential quantifiers) |
Implementation Details
NL2FOL Class
The core translation logic is implemented in theNL2FOL class:
Debug Mode
The script runs withdebug=True by default, printing intermediate results:
- Extracted claim and implication
- Referring expressions
- Entity relationships
- Property assignments
- Intermediate logical forms
Processing time varies based on:
- Model size and type (LLaMA models require GPU)
- Dataset length
- Complexity of arguments
GPU Requirements
Quantifier Strategies
The system generates two versions of the logical form: Logical Form 1: Uses heuristics for quantifier assignment based on entity relationships- Subset entities: subset gets
exists, superset getsforall - Property implications: universally quantified
- All entity variables get existential quantifiers
- Property implications remain universally quantified
Prompt Engineering
The system uses specialized prompts located inprompts/:
prompt_nl_to_ci.txt- Claim/implication extractionprompt_referring_expressions.txt- Entity identificationprompt_entity_relation.txt- Entity relationship detectionprompt_properties.txt- Property extractionprompt_fol.txt- FOL formula generation
Troubleshooting
Common Issues
CUDA Out of Memory- Reduce model size or batch processing
- Use model quantization or CPU offloading
- Check that prompts directory exists in working directory
- Verify model has necessary permissions and access
- Ensure NLI model is compatible (sequence classification models)
- Check tokenizer compatibility
Next Steps
After generating FOL formulas:- Convert to SMT format using
fol_to_cvc.py - Run SMT solver for satisfiability checking
- Interpret results and evaluate performance
