fc.semantic.*
map
Applies a generation prompt to one or more columns, enabling rich summarization and generation tasks.

Parameters:
- A Jinja2 template for the generation prompt. References column values using {{ column_name }} syntax; each placeholder is replaced with the corresponding value from the current row during execution.
- If True, when any of the provided columns has a None value for a row, the entire row's output is None (the template is not rendered). If False, None values are handled using Jinja2's null rendering behavior.
- Optional few-shot examples to guide the model's output format and style.
- Optional Pydantic model to enforce structured output. Must include descriptions for each field.
- Optional language model alias. If None, uses the default model.
- Language model temperature.
- Maximum tokens to generate.
- Optional timeout in seconds for a single LLM request. If None, uses the default timeout (120 seconds).
- Named column arguments that correspond to template variables. Keys must match the variable names used in the template.

Returns: A column expression representing the semantic mapping operation.
Examples
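A minimal sketch of the placeholder semantics described above, with Jinja2 replaced by a simple regex substitute; the fenic call in the trailing comment is an assumed shape, not a verified signature:

```python
import re

# Simplified stand-in for the per-row Jinja2 rendering that semantic.map
# performs. It also mirrors the documented strict behavior: if any
# referenced column is None, the row's output is None and the template
# is never rendered.
def render(template, row):
    names = re.findall(r"{{\s*(\w+)\s*}}", template)
    if any(row.get(n) is None for n in names):
        return None  # strict null handling
    out = template
    for n in names:
        out = re.sub(r"{{\s*" + n + r"\s*}}", str(row[n]), out)
    return out

template = "Write a one-line tagline for {{ name }}, a {{ category }} product."
print(render(template, {"name": "AeroBuds", "category": "wireless earbuds"}))
print(render(template, {"name": None, "category": "speaker"}))  # None

# Assumed fenic call shape (hypothetical, requires a configured session):
# df.select(fc.semantic.map(template, name=fc.col("name"), category=fc.col("category")))
```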
extract
Extracts structured information from unstructured text using a provided Pydantic model schema.

Parameters:
- Column containing text to extract from.
- A Pydantic model type that defines the output structure, with descriptions for each field.
- Optional parameter to constrain the model to generate at most this many tokens.
- Optional temperature parameter for the language model.
- Optional alias for the language model to use for the extraction. If None, uses the language model configured as the default.
- Optional timeout in seconds for a single LLM request. If None, uses the default timeout (120 seconds).

Returns: A new column with structured values (a struct) based on the provided schema.
Example
Extracting knowledge graph triples
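A sketch of the kind of schema this example implies; the field names and the fenic call in the comment are assumptions for illustration:

```python
from pydantic import BaseModel, Field

# Hypothetical schema for knowledge-graph triples; every field carries a
# description, as the parameter documentation requires.
class Triple(BaseModel):
    subject: str = Field(description="Entity the statement is about")
    predicate: str = Field(description="Relation between subject and object")
    object: str = Field(description="Entity or value the subject relates to")

class Triples(BaseModel):
    triples: list[Triple] = Field(description="All triples found in the text")

# Assumed fenic call shape (hypothetical):
# df.select(fc.semantic.extract(fc.col("text"), Triples).alias("kg"))

# The schema validates the struct the model would return:
raw = {"triples": [{"subject": "Ada Lovelace",
                    "predicate": "wrote",
                    "object": "the first published algorithm"}]}
print(Triples(**raw).triples[0].subject)  # Ada Lovelace
```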
predicate
Applies a boolean predicate to one or more columns, typically used for filtering.

Parameters:
- A Jinja2 template containing a yes/no question or boolean claim. It should reference column values using {{ column_name }} syntax; the model evaluates this condition for each row and returns True or False.
- If True, when any of the provided columns has a None value for a row, the entire row's output is None.
- Optional few-shot examples showing how to evaluate the predicate.
- Optional language model alias. If None, uses the default model.
- Language model temperature.
- Optional timeout in seconds for a single LLM request.
- Named column arguments that correspond to template variables.

Returns: A boolean column expression.
Example
Filtering wireless products
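A toy stand-in for the wireless-products predicate: True/False per row, None for None inputs (the documented strict behavior). The real operator asks an LLM rather than matching keywords, and the fenic call in the comment is an assumed shape:

```python
# Keyword matcher standing in for the LLM's yes/no evaluation.
def is_wireless(description):
    if description is None:
        return None  # strict null handling
    d = description.lower()
    return "wireless" in d or "bluetooth" in d

rows = ["Wireless over-ear headphones", "Wired USB keyboard", None, "Bluetooth speaker"]
print([is_wireless(r) for r in rows])  # [True, False, None, True]

# Assumed fenic call shape (hypothetical):
# df.filter(fc.semantic.predicate(
#     "Is this product wireless? {{ description }}", description=fc.col("description")))
```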
reduce
Aggregate function: reduces a set of strings in a column to a single string using a natural language instruction.

Parameters:
- A string containing the semantic.reduce prompt. The instruction can optionally include Jinja2 template variables (e.g., {{variable}}) that reference columns from the group_context parameter.
- The column containing the documents/strings to reduce.
- Optional dictionary mapping variable names to columns. These columns provide context for each group and can be referenced in the instruction template.
- Optional list of columns to sort the grouped documents by before reduction.
- Optional alias for the language model to use. If None, uses the default model.
- Temperature parameter for the language model.
- Maximum tokens the model can generate.
- Optional timeout in seconds for a single LLM request.

Returns: A column expression representing the semantic reduction operation.
Examples
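A sketch of the reduce mechanics described above: per group, sort the documents, fill group-context variables into the instruction, and hand (instruction, documents) to the model. Column and variable names here are hypothetical, and the final LLM call is omitted; only what the model would receive is shown:

```python
from collections import defaultdict

def prepare_reduction(rows, instruction):
    groups = defaultdict(list)
    for r in rows:
        groups[r["dept"]].append(r)
    prompts = {}
    for dept, items in groups.items():
        items.sort(key=lambda r: r["date"])             # order_by columns
        filled = instruction.replace("{{dept}}", dept)  # group_context variable
        prompts[dept] = (filled, [r["note"] for r in items])
    return prompts

rows = [
    {"dept": "sales", "date": 2, "note": "Closed the Acme deal."},
    {"dept": "sales", "date": 1, "note": "Demoed to Acme."},
    {"dept": "eng", "date": 1, "note": "Shipped v2."},
]
out = prepare_reduction(rows, "Summarize this week's notes for the {{dept}} team.")
print(out["sales"])
```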
classify
Classifies a string column into one of the provided classes.

Parameters:
- Column or column name containing text to classify.
- List of class labels or ClassDefinition objects defining the available classes. Use ClassDefinition objects to provide descriptions for the classes.
- Optional collection of example classifications to guide the model.
- Optional alias for the language model to use.
- Optional temperature parameter for the language model.
- Optional timeout in seconds for a single LLM request.

Returns: Expression containing the classification results.
Examples
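A sketch of classification with labeled classes. The (label, description) pairs stand in for ClassDefinition objects, the keyword matcher stands in for the LLM, and the fenic call in the comment is an assumed shape:

```python
# Hypothetical classes with descriptions, mirroring ClassDefinition objects.
classes = [
    ("billing", "Questions about invoices, charges, or refunds"),
    ("technical", "Bug reports or product malfunctions"),
    ("other", "Anything else"),
]

# Toy keyword classifier standing in for the LLM call; the real operator
# picks one of the provided labels per row.
def classify(text):
    t = text.lower()
    if any(w in t for w in ("invoice", "charge", "refund")):
        return "billing"
    if any(w in t for w in ("crash", "error", "bug")):
        return "technical"
    return "other"

print(classify("I was charged twice, please refund me"))  # billing

# Assumed fenic call shape (hypothetical):
# df.select(fc.semantic.classify(fc.col("ticket"), classes))
```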
analyze_sentiment
Analyzes the sentiment of a string column. Returns one of ‘positive’, ‘negative’, or ‘neutral’.

Parameters:
- Column or column name containing text for sentiment analysis.
- Optional alias for the language model to use.
- Optional temperature parameter for the language model.
- Optional timeout in seconds for a single LLM request.

Returns: Expression containing sentiment results (‘positive’, ‘negative’, or ‘neutral’).
Example
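A toy lexicon stand-in for the three-way sentiment output; the real operator asks an LLM and returns exactly one of the three labels:

```python
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"terrible", "hate", "broken"}

# Word-count scoring stands in for the model's judgment.
def analyze_sentiment(text):
    words = set(text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(analyze_sentiment("I love the excellent battery life"))  # positive
```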
embed
Generates embeddings for the specified string column.

Parameters:
- Column or column name containing the values to generate embeddings for.
- Optional alias for the embedding model to use. If None, uses the embedding model configured as the default.

Returns: A column expression representing the embeddings for each value in the input column.
Example
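A toy deterministic embedding (hashed bag of words) standing in for the vectors a real embedding model would return; only the shape of the operation (text in, unit-length vector out, usable for similarity) is meant to match:

```python
import hashlib
import math

def embed(text, dim=8):
    # Hash each word into a bucket, then L2-normalize the counts.
    v = [0.0] * dim
    for w in text.lower().split():
        h = int(hashlib.sha256(w.encode()).hexdigest(), 16)
        v[h % dim] += 1.0
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def cosine(a, b):
    # Dot product; inputs are already unit-length.
    return sum(x * y for x, y in zip(a, b))

a = embed("wireless noise-cancelling headphones")
print(round(cosine(a, a), 6))  # 1.0
```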
summarize
Summarizes strings from a column.

Parameters:
- Column or column name containing text for summarization.
- Format of the summary to generate: either KeyPoints or Paragraph. If None, defaults to Paragraph with a maximum of 120 words.
- Optional temperature parameter for the language model.
- Optional alias for the language model to use for the summarization.
- Optional timeout in seconds for a single LLM request.

Returns: Expression containing the summarized string.
Example
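A toy stand-in showing the two documented output formats; the real operator asks an LLM, and this sentence-splitting logic is purely illustrative:

```python
# KeyPoints renders a bulleted list; Paragraph (the default) caps the
# summary at roughly 120 words, per the format parameter above.
def summarize(text, format="Paragraph", max_words=120):
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if format == "KeyPoints":
        return "\n".join("- " + s for s in sentences[:3])
    words = " ".join(sentences).split()
    return " ".join(words[:max_words])

doc = ("Fenic adds semantic operators to dataframes. "
       "They call LLMs per row. Results come back as columns.")
print(summarize(doc, format="KeyPoints"))
```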
parse_pdf
Parses a column of PDF paths into markdown.

Parameters:
- Column or column name containing the PDF paths to parse.
- Optional alias for the language model to use for the parsing.
- Optional page separator to use for the parsing. If the separator includes the {page} placeholder, the model replaces it with the current page number.
- Flag controlling whether images in the PDF are described. If True, the prompt asks the model to include a description of each image in the markdown output. If False, the prompt asks the model to ignore images that aren’t tables or charts.
- Optional maximum number of output tokens per ~3 pages of PDF (does not include reasoning tokens). If None, the model’s output is not constrained.
- Optional timeout in seconds for a single LLM request.

Returns: A dataframe with markdown strings for each PDF file.

Note: For Gemini models, this function uses the Google file API, uploading PDF files to Google’s file store and deleting them after each request.
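A small sketch of the documented {page} placeholder in the page separator: the parser substitutes the current page number when joining per-page markdown. Everything else about parse_pdf requires a real model, and the page contents below are hypothetical:

```python
def render_separator(page_separator, page):
    # The documented substitution: {page} becomes the current page number.
    return page_separator.replace("{page}", str(page))

pages_md = ["# Page one content", "# Page two content"]
sep = "\n\n--- end of page {page} ---\n\n"
doc = ""
for i, md in enumerate(pages_md, start=1):
    doc += md + render_separator(sep, i)
print(doc)
```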
