
Overview

The nl_to_fol.py script converts natural language arguments into first-order logic (FOL) formulas. This is the first and most critical step in the NL2FOL pipeline for detecting logical fallacies.

How It Works

The translation process follows a multi-stage pipeline:

1. Claim and Implication Extraction: the system identifies the claim and implication components of the argument using LLM prompts.
2. Referring Expression Identification: extracts referring expressions (entities) from both the claim and implication.
3. Entity Relationship Detection: determines relationships between entities (equal, subset, or unrelated) using NLI models or LLM reasoning.
4. Entity Mapping: maps entities to logical variables (a, b, c, etc.) for formal representation.
5. Property Extraction: identifies properties and predicates associated with each entity.
6. FOL Formula Generation: generates first-order logic formulas for the claim and implication with proper quantifiers.
7. Final Formula Assembly: combines the claim and implication into a final formula of the form (Claim) → (Implication)
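The stages above can be traced end to end on a toy argument. The sketch below is purely illustrative: every intermediate value (claim, implication, entity mapping, per-part formulas) is fabricated here, whereas the real script derives them via LLM prompts and NLI models.

```python
# Illustrative end-to-end trace of the pipeline stages for one argument.
# All intermediate values are made up; the real pipeline produces them with LLMs.

sentence = "It is raining, so the ground is wet."

# Stages 1-2: claim/implication and referring expressions (illustrative outputs)
claim = "It is raining"
implication = "the ground is wet"
entity_vars = {"rain": "a", "ground": "b"}  # stage 4: entity -> variable mapping

# Stage 6: per-part formulas over the mapped variables
claim_lf = "Raining(a)"
implication_lf = "Wet(b)"

# Stage 7: final assembly as (Claim) -> (Implication)
final_formula = f"({claim_lf}) -> ({implication_lf})"
print(final_formula)  # (Raining(a)) -> (Wet(b))
```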

Command Usage

Basic Command

python3 src/nl_to_fol.py \
  --model_name <your_model_name> \
  --nli_model_name <your_nli_model_name> \
  --run_name <run_name> \
  --dataset <dataset_name> \
  --length <number_of_datapoints>

Parameters

model_name
string
required
The LLM model to use for translation. Supports:
  • HuggingFace model paths (e.g., meta-llama/Llama-2-7b-hf)
  • OpenAI models (e.g., gpt-4o); model names prefixed with 'gpt' enable GPT mode
nli_model_name
string
The Natural Language Inference model for entity relationship detection. Example: microsoft/deberta-v2-xlarge-mnli
run_name
string
required
Name for this run. Output will be saved to results/<run_name>.csv
dataset
string
required
Dataset to use for testing. Options:
  • logic - Standard logical fallacy dataset
  • logicclimate - Climate-specific logical fallacy dataset
  • folio - FOLIO dataset for evaluation
  • nli - NLI-based dataset
length
integer
required
Number of datapoints to sample from the dataset for processing.
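The NLI model given by nli_model_name drives step 3 (entity relationship detection). One plausible decision rule, sketched here in plain Python, checks entailment in both directions; the MNLI label names are standard, but the exact rule is an assumption about the implementation, not the repo's code:

```python
# Hypothetical mapping from bidirectional NLI verdicts to entity relations.
# label_ab is the NLI label for "a is b"; label_ba is the label for "b is a".
def entity_relation(label_ab, label_ba):
    if label_ab == "ENTAILMENT" and label_ba == "ENTAILMENT":
        return "equal"       # each entails the other: same entity
    if label_ab == "ENTAILMENT":
        return "subset"      # a is a subset of b
    if label_ba == "ENTAILMENT":
        return "superset"    # b is a subset of a
    return "unrelated"

print(entity_relation("ENTAILMENT", "NEUTRAL"))  # subset
```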

Example Usage

python3 src/nl_to_fol.py \
  --model_name meta-llama/Llama-2-13b-hf \
  --nli_model_name microsoft/deberta-v2-xlarge-mnli \
  --run_name llama_experiment_1 \
  --dataset logic \
  --length 100

Output Format

The script generates a CSV file in results/<run_name>.csv with the following columns:
Column | Description
articles | Original natural language text
label | Ground truth label (0 = fallacy, 1 = valid)
Claim | Extracted claim statement
Implication | Extracted implication statement
Referring Expressions - Claim | Entities in the claim
Referring Expressions - Implication | Entities in the implication
Property Implications | Relationships between properties
Equal Entities | Entities determined to be equal
Subset Entities | Subset relationships between entities
Claim Lfs | Logical form of the claim
Implication Lfs | Logical form of the implication
Logical Form | Final FOL formula (version 1)
Logical Form 2 | Final FOL formula (version 2, with existential quantifiers)
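The results CSV can be consumed with the standard library. In this sketch the sample row is fabricated and only a subset of the columns is shown; the column names follow the table above:

```python
import csv
import io

# Minimal sketch of reading the results CSV. Normally you would open
# results/<run_name>.csv; here a fabricated row stands in for the file.
sample = io.StringIO(
    "articles,label,Claim,Implication,Logical Form,Logical Form 2\n"
    '"Everyone says it, so it must be true.",0,"Everyone says it","It must be true",'
    '"(Says(a)) -> (True(b))","(exists a. Says(a)) -> (exists b. True(b))"\n'
)
reader = csv.DictReader(sample)
for row in reader:
    # label 0 marks a fallacy, 1 a valid argument
    verdict = "fallacy" if row["label"] == "0" else "valid"
    print(verdict, "|", row["Logical Form 2"])
```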

Implementation Details

NL2FOL Class

The core translation logic is implemented in the NL2FOL class:
# pipeline/tokenizer and nli_model/nli_tokenizer are the previously loaded
# LLM and NLI components; model_type selects the prompting backend.
nl2fol = NL2FOL(
    sentence=text,
    model_type='gpt',  # or 'llama' for local LLaMA models
    pipeline=pipeline,
    tokenizer=tokenizer,
    nli_model=nli_model,
    nli_tokenizer=nli_tokenizer,
    debug=True  # print intermediate results
)

# Returns both quantifier-strategy variants of the final formula
final_lf1, final_lf2 = nl2fol.convert_to_first_order_logic()

Debug Mode

The script runs with debug=True by default, printing intermediate results:
  • Extracted claim and implication
  • Referring expressions
  • Entity relationships
  • Property assignments
  • Intermediate logical forms
Processing time varies based on:
  • Model size and type (LLaMA models require GPU)
  • Dataset length
  • Complexity of arguments
Expect ~30-60 seconds per datapoint for LLM-based processing.
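Given the ~30-60 seconds per datapoint above, total wall-clock time can be estimated directly (a back-of-the-envelope sketch, not a benchmark):

```python
# Rough wall-clock estimate from the per-datapoint range above (30-60 seconds).
length = 100  # the --length value
low_min = length * 30 / 60
high_min = length * 60 / 60
print(f"~{low_min:.0f}-{high_min:.0f} minutes for {length} datapoints")
# ~50-100 minutes for 100 datapoints
```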

GPU Requirements

LLaMA models require GPU acceleration. The script automatically detects CUDA availability and displays GPU information at startup. Minimum recommended: 16GB of GPU memory for 7B models, 24GB+ for 13B models.

Quantifier Strategies

The system generates two versions of the logical form.

Logical Form 1 uses heuristics for quantifier assignment based on entity relationships:
  • Subset entities: the subset gets exists, the superset gets forall
  • Property implications: universally quantified

Logical Form 2 takes a more conservative approach:
  • All entity variables get existential quantifiers
  • Property implications remain universally quantified

Logical Form 2 is typically used for SMT solving because it yields better satisfiability-checking results.
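The two strategies can be sketched as a small helper. This is a hypothetical re-implementation of the heuristics described above, not the repository's actual code:

```python
# Hypothetical sketch of the two quantifier-assignment strategies.
def assign_quantifiers(variables, subset_pairs, strategy):
    """Return a quantifier ('exists' or 'forall') for each entity variable."""
    quants = {}
    if strategy == 1:
        # LF1 heuristic: the subset member gets exists, the superset gets forall
        for sub, sup in subset_pairs:
            quants[sub] = "exists"
            quants[sup] = "forall"
        for v in variables:
            quants.setdefault(v, "exists")
    else:
        # LF2: conservative, every entity variable is existentially quantified
        quants = {v: "exists" for v in variables}
    return quants

vars_ = ["a", "b"]
subsets = [("a", "b")]  # a is a subset of b
print(assign_quantifiers(vars_, subsets, 1))  # {'a': 'exists', 'b': 'forall'}
print(assign_quantifiers(vars_, subsets, 2))  # {'a': 'exists', 'b': 'exists'}
```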

Prompt Engineering

The system uses specialized prompts located in prompts/:
  • prompt_nl_to_ci.txt - Claim/implication extraction
  • prompt_referring_expressions.txt - Entity identification
  • prompt_entity_relation.txt - Entity relationship detection
  • prompt_properties.txt - Property extraction
  • prompt_fol.txt - FOL formula generation
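Templates like the ones above can be loaded and filled with a few lines of standard-library code. The {sentence} placeholder is an assumed convention here, not necessarily the repo's actual template syntax:

```python
from pathlib import Path

# Minimal sketch of loading a prompt template and filling its slots.
# The placeholder syntax ({sentence}) is an assumption for illustration.
def load_prompt(prompt_dir, name, **fields):
    template = (Path(prompt_dir) / name).read_text()
    return template.format(**fields)

# e.g.: load_prompt("prompts", "prompt_nl_to_ci.txt",
#                   sentence="It rains, so the ground is wet.")
```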

Troubleshooting

Common Issues

CUDA Out of Memory
  • Use a smaller model or process fewer datapoints at a time
  • Use model quantization or CPU offloading
Empty Results
  • Check that prompts directory exists in working directory
  • Verify model has necessary permissions and access
NLI Model Errors
  • Ensure NLI model is compatible (sequence classification models)
  • Check tokenizer compatibility

Next Steps

After generating FOL formulas:
  1. Convert to SMT format using fol_to_cvc.py
  2. Run SMT solver for satisfiability checking
  3. Interpret results and evaluate performance
See FOL to SMT Conversion for the next step in the pipeline.
