Overview
Thehelpers module provides utility functions for processing strings, extracting symbols from logical formulas, and manipulating predicate expressions. These functions are used throughout the NL2FOL pipeline.
String Processing
label_values
input_string(str): Comma-separated string of valuesmap(dict | str): Mapping dictionary or string representation of dict
str - Labeled string in format "value1: a, value2: b"
first_non_empty_line
string(str): Input string with potential multiple lines
str | None - First non-empty line, or None if all lines are empty
This function is particularly useful for extracting clean output from LLM responses that may include leading whitespace or empty lines.
remove_text_after_last_parenthesis
input_string(str): String to process
str - String up to and including the last ), or original string if no ) found
Logical Formula Processing
extract_propositional_symbols
logical_form(str): Logical formula string
set[str] - Set of single-character variable names (a-z)
split_string_except_in_brackets
string(str): String to splitdelimiter(str): Delimiter character
list[str] - List of split segments
This function is essential for parsing comma-separated predicates where commas inside predicate arguments should not trigger splits.
Predicate Manipulation
fix_inconsistent_arities
clauses1(list[str]): First list of predicate clausesclauses2(list[str]): Second list of predicate clauses
tuple[str, str] - Two comma-separated strings with consistent arities
Algorithm:
- Extract all predicates and their arities from both lists
- For predicates with multiple arities, keep the minimum
- Truncate arguments in clauses that exceed the minimum arity
replace_variables
mapping(dict): Dictionary mapping entity names to variable lettersclause(str): Predicate clause with variables
str - Clause with variables replaced by entity names
Process:
- Reverses the mapping (variable → entity)
- Parses predicate name and arguments
- Replaces variables in arguments with entity names
- Reconstructs the clause
substitute_variables
clause1(str): First clauseclause2(str): Second clausestart_char(str): Starting character for new variable names
tuple[str, str, str] - (substituted_clause1, substituted_clause2, next_char)
Process:
- Creates a mapping for variables encountered
- Assigns new variables starting from
start_char - Replaces variables in both clauses
- Returns next available character for further use
Variables that appear in both clauses are assigned the same new variable name, preserving their identity across clauses.
Usage in Pipeline
These helper functions are used throughout the NL2FOL pipeline:Type Handling
Best Practices
-
Arity Consistency: Always use
fix_inconsistent_arities()before comparing properties across claim and implication -
Variable Extraction: Use
extract_propositional_symbols()to find quantifiable variables in generated formulas -
Predicate Parsing: Use
split_string_except_in_brackets()when splitting comma-separated predicates to avoid breaking arguments -
Clean LLM Output: Apply
first_non_empty_line()to LLM responses to get clean, usable text -
Formula Cleanup: Use
remove_text_after_last_parenthesis()to remove trailing text from LLM-generated formulas
Dependencies
The module requires only Python standard library:ast: For safe evaluation of string-represented dictionariesre: For regular expression pattern matching
See Also
- NL2FOL - Main translation class using these helpers
- CVCGenerator - Formula parsing and generation
- SMTResults - Result interpretation
