
Overview

NL2FOL provides utilities to process datasets of natural language statements for logical fallacy detection. This guide shows you how to work with built-in datasets and create your own.

Dataset Structure

Datasets in NL2FOL use a simple CSV format with the following columns:
• articles (string, required): The natural language statement or argument to analyze
• label (integer, required): Binary label: 0 for logical fallacy, 1 for valid logical reasoning
• source_article (string, optional): Alternative field name for the text content (merged with articles)
• sentence (string, optional): Alternative field name for the text content (merged with articles)
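If your file uses one of the alternative column names, a small helper can fold it into `articles` before processing. This is an illustrative sketch (`normalize_columns` is not part of NL2FOL):

```python
import pandas as pd

# Hedged sketch (not part of NL2FOL): fold the alternative text columns
# into `articles`, following the column scheme described above.
def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    for alt in ("source_article", "sentence"):
        if alt not in df.columns:
            continue
        if "articles" not in df.columns:
            df = df.rename(columns={alt: "articles"})
        else:
            # Fill gaps in `articles` from the alternative column
            df["articles"] = df["articles"].fillna(df[alt])
    return df

df = pd.DataFrame({"sentence": ["All birds fly."], "label": [0]})
print(normalize_columns(df).columns.tolist())  # ['articles', 'label']
```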

Built-in Datasets

NL2FOL includes several pre-configured datasets in the data/ directory:

Logic Fallacies

File: data/fallacies.csv
General logical fallacies from various domains with labeled fallacy types.

Climate Fallacies

File: data/fallacies_climate.csv
Climate change-related arguments with identified logical fallacies.

NLI Fallacies

File: data/nli_fallacies_test.csv
Natural Language Inference-based fallacious reasoning examples.

NLI Entailments

File: data/nli_entailments_test.csv
Valid logical entailments for comparison and evaluation.

Using Built-in Datasets

The setup_dataset() function (defined in src/nl_to_fol.py:389) loads and prepares datasets:
import pandas as pd
from nl_to_fol import setup_dataset

# Load a balanced dataset of fallacies and valid arguments
df = setup_dataset(fallacy_set='logic', length=100)

print(df.head())
print(f"Dataset shape: {df.shape}")
print(f"Label distribution:\n{df['label'].value_counts()}")

Available Dataset Types

General logical fallacies dataset
df = setup_dataset(fallacy_set='logic', length=100)
Loads from data/fallacies.csv and data/nli_entailments_test.csv, creating a balanced dataset with:
  • 100 fallacious arguments (label=0)
  • 100 valid arguments (label=1)
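For intuition, the balancing behaviour can be sketched as follows. This is an assumption-laden illustration, not the actual implementation of setup_dataset() (see src/nl_to_fol.py): the real function reads the CSV files itself, while here the two frames are passed in so the balancing step is easy to see.

```python
import pandas as pd

# Illustrative sketch only: setup_dataset() reads data/fallacies.csv and
# data/nli_entailments_test.csv itself; the selection/shuffle details here
# are assumptions.
def balance(fallacies: pd.DataFrame, valid: pd.DataFrame, length: int) -> pd.DataFrame:
    fallacies = fallacies.head(length).assign(label=0)  # fallacious arguments
    valid = valid.head(length).assign(label=1)          # valid arguments
    combined = pd.concat([fallacies, valid], ignore_index=True)
    # Shuffle so the two classes are interleaved
    return combined.sample(frac=1, random_state=0).reset_index(drop=True)

fallacies = pd.DataFrame({"articles": [f"fallacy {i}" for i in range(150)]})
valid = pd.DataFrame({"articles": [f"valid {i}" for i in range(150)]})
df = balance(fallacies, valid, length=100)
print(len(df), df["label"].value_counts().to_dict())
```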

Processing a Dataset

Here’s a complete example of processing a dataset:
process_dataset.py
import pandas as pd
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from nl_to_fol import NL2FOL, setup_dataset

def process_custom_dataset():
    # Initialize models (GPT-4 example)
    model_type = 'gpt'
    pipeline = None
    tokenizer = None
    
    nli_model_name = "microsoft/deberta-large-mnli"
    nli_tokenizer = AutoTokenizer.from_pretrained(nli_model_name)
    nli_model = AutoModelForSequenceClassification.from_pretrained(nli_model_name)
    
    # Load dataset
    df = setup_dataset(fallacy_set='logic', length=10)
    
    # Storage for results
    claims = []
    implications = []
    final_lfs = []
    final_lfs2 = []
    
    # Process each row
    for i, row in df.iterrows():
        print(f"Processing {i+1}/{len(df)}...")
        
        nl2fol = NL2FOL(
            sentence=row['articles'],
            model_type=model_type,
            pipeline=pipeline,
            tokenizer=tokenizer,
            nli_model=nli_model,
            nli_tokenizer=nli_tokenizer,
            debug=False  # Set to True for detailed output
        )
        
        lf1, lf2 = nl2fol.convert_to_first_order_logic()
        
        claims.append(nl2fol.claim)
        implications.append(nl2fol.implication)
        final_lfs.append(lf1)
        final_lfs2.append(lf2)
    
    # Add results to dataframe
    df['Claim'] = claims
    df['Implication'] = implications
    df['Logical Form 1'] = final_lfs
    df['Logical Form 2'] = final_lfs2
    
    # Save results
    df.to_csv('results/processed_dataset.csv', index=False)
    print("\nProcessing complete! Results saved to results/processed_dataset.csv")
    
    return df

if __name__ == "__main__":
    results = process_custom_dataset()
    print(results[['articles', 'label', 'Claim', 'Implication']].head())
The setup_dataset() function automatically balances the dataset by sampling an equal number of fallacies and valid arguments.

Creating Custom Datasets

1. Prepare your CSV file

Create a CSV file with the required columns:
custom_fallacies.csv
articles,label,fallacy_type
"All politicians lie. Sarah is a politician. Therefore Sarah lies.",0,hasty_generalization
"If it rains, the ground is wet. The ground is wet. Therefore it rained.",0,affirming_consequent
"All mammals have lungs. Whales are mammals. Therefore whales have lungs.",1,valid_syllogism
"Either we ban cars or pollution will kill us all.",0,false_dilemma
2. Load your dataset

Use pandas to load and prepare your data:
import pandas as pd

# Load custom dataset
df = pd.read_csv('custom_fallacies.csv')

# Ensure required columns exist
if 'articles' not in df.columns:
    df['articles'] = df['sentence']  # or other text column

if 'label' not in df.columns:
    df['label'] = 0  # default to fallacy if not specified

print(f"Loaded {len(df)} examples")
3. Process your dataset

Use the same processing loop as shown above:
for i, row in df.iterrows():
    nl2fol = NL2FOL(
        sentence=row['articles'],
        model_type='gpt',
        pipeline=None,
        tokenizer=None,
        nli_model=nli_model,
        nli_tokenizer=nli_tokenizer
    )
    
    lf1, lf2 = nl2fol.convert_to_first_order_logic()
    # Store results...

Batch Processing with Command Line

For large datasets, use the command-line interface:
python src/nl_to_fol.py \
  --model_name gpt-4o \
  --nli_model_name microsoft/deberta-large-mnli \
  --run_name my_experiment \
  --length 500 \
  --dataset logic

Command-Line Arguments

• --model_name (string, required): Model name for text generation (gpt-4o, meta-llama/Llama-2-13b-hf, etc.)
• --nli_model_name (string, required): HuggingFace model name for NLI (e.g., microsoft/deberta-large-mnli)
• --run_name (string, required): Name for the output CSV file (saved to results/{run_name}.csv)
• --length (integer, required): Number of examples to process from each class (total will be 2x this)
• --dataset (string, required): Dataset type: logic, logicclimate, nli, or folio

Output Format

Processed datasets are saved with the following additional columns:
Output Columns:
- Claim: Extracted claim from the sentence
- Implication: Extracted implication
- Referring Expressions - Claim: Entities in the claim
- Referring Expressions - Implication: Entities in the implication
- Property Implications: Property relationships
- Equal Entities: Entity equivalences
- Subset Entities: Subset relationships
- Claim Lfs: Logical form of the claim
- Implication Lfs: Logical form of the implication
- Logical Form: Final first-order logic formula (method 1)
- Logical Form 2: Final first-order logic formula (method 2)
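Before running downstream analysis, it can help to verify that a results file actually contains these columns. A small check (`check_output_columns` is a hypothetical helper, not part of NL2FOL):

```python
import pandas as pd

# Hypothetical helper: report which of the documented output columns
# are missing from a processed results frame.
OUTPUT_COLUMNS = [
    "Claim", "Implication",
    "Referring Expressions - Claim", "Referring Expressions - Implication",
    "Property Implications", "Equal Entities", "Subset Entities",
    "Claim Lfs", "Implication Lfs",
    "Logical Form", "Logical Form 2",
]

def check_output_columns(df: pd.DataFrame) -> list:
    return [c for c in OUTPUT_COLUMNS if c not in df.columns]

# Example: a partially processed frame with placeholder values
df = pd.DataFrame({"articles": ["All birds fly."], "label": [0],
                   "Claim": ["birds fly"], "Logical Form": ["all x (B(x) -> F(x))"]})
missing = check_output_columns(df)
print(f"{len(missing)} documented columns missing: {missing}")
```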

Example Output

articles,label
"All birds fly. Penguins are birds. Thus penguins fly.",0

Dataset Statistics

Analyze your processed dataset:
import pandas as pd

df = pd.read_csv('results/my_experiment.csv')

print("Dataset Statistics:")
print(f"Total examples: {len(df)}")
print(f"Fallacies: {(df['label'] == 0).sum()}")
print(f"Valid arguments: {(df['label'] == 1).sum()}")
print(f"\nSuccessful conversions: {df['Logical Form'].notna().sum()}")
print(f"Failed conversions: {df['Logical Form'].isna().sum()}")

# Average formula complexity
df['formula_length'] = df['Logical Form'].str.len()
print(f"\nAverage formula length: {df['formula_length'].mean():.1f} characters")

Working with Multiple Datasets

Combine multiple datasets for comprehensive analysis:
import pandas as pd
from nl_to_fol import setup_dataset

# Load multiple dataset types
logic_df = setup_dataset(fallacy_set='logic', length=100)
climate_df = setup_dataset(fallacy_set='logicclimate', length=50)
nli_df = setup_dataset(fallacy_set='nli', length=150)

# Add source labels
logic_df['source'] = 'logic'
climate_df['source'] = 'climate'
nli_df['source'] = 'nli'

# Combine
combined_df = pd.concat([logic_df, climate_df, nli_df], ignore_index=True)

print(f"Combined dataset size: {len(combined_df)}")
print(combined_df['source'].value_counts())

Performance Considerations

Processing large datasets can be time-consuming:
  • GPT-4: ~15-20 seconds per example (API rate limits apply)
  • Llama 13B: ~5-10 seconds per example (GPU-dependent)
For a 1000-example dataset:
  • GPT-4: ~4-6 hours
  • Llama 13B: ~2-3 hours
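These estimates are simple arithmetic over the per-example timings; a throwaway helper makes them easy to recompute for other dataset sizes:

```python
# The figures above are just examples x seconds-per-example.
def estimated_hours(n_examples: int, seconds_per_example: float) -> float:
    """Rough wall-clock estimate, ignoring retries and rate-limit backoff."""
    return n_examples * seconds_per_example / 3600

# 1000 examples at GPT-4's ~15-20 s/example range
for sec in (15, 20):
    print(f"{sec} s/example -> {estimated_hours(1000, sec):.1f} hours")
# 4.2 and 5.6 hours, matching the ~4-6 hour estimate above
```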

Optimization Tips

1. Batch processing

Process datasets in smaller batches and save intermediate results:
batch_size = 50
for i in range(0, len(df), batch_size):
    batch_df = df.iloc[i:i+batch_size]
    # Process batch...
    batch_df.to_csv(f'results/batch_{i}.csv', index=False)
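To reassemble the per-batch files into a single results file afterwards, something like the following works (`combine_batches` is a hypothetical helper; the batch_*.csv naming matches the snippet above):

```python
import glob
import os
import tempfile
import pandas as pd

# Sketch: collect the per-batch CSVs and concatenate them in numeric order.
def combine_batches(results_dir: str) -> pd.DataFrame:
    paths = sorted(glob.glob(os.path.join(results_dir, "batch_*.csv")),
                   key=lambda p: int(os.path.basename(p)[6:-4]))  # batch_<N>.csv
    return pd.concat([pd.read_csv(p) for p in paths], ignore_index=True)

# Demo with temporary files standing in for real batch output
with tempfile.TemporaryDirectory() as d:
    pd.DataFrame({"articles": ["a", "b"]}).to_csv(os.path.join(d, "batch_0.csv"), index=False)
    pd.DataFrame({"articles": ["c"]}).to_csv(os.path.join(d, "batch_50.csv"), index=False)
    combined = combine_batches(d)
    print(len(combined))  # 3
```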
2. Enable multiprocessing

For Llama models, process multiple examples in parallel if you have multiple GPUs.
3. Cache results

Store intermediate results to avoid reprocessing on failures:
import os

# Resume from cached intermediate results when available
if os.path.exists('cache/intermediate.csv'):
    df = pd.read_csv('cache/intermediate.csv')
else:
    df = setup_dataset(fallacy_set='logic', length=100)  # start fresh
4. Use debug mode selectively

Only enable debug=True for troubleshooting, not production runs.

Next Steps

SMT Solving

Convert logical forms to SMT and verify with CVC5

Evaluation

Measure accuracy and performance metrics

Model Backends

Choose the right model for your dataset

API Reference

Explore advanced configuration options
