prompt() function in a dbt pipeline to transform unstructured text data (reviews, feedback) into structured data.
Overview
Theprompt() function allows you to extract structured information from text using AI, directly in your SQL queries. This is particularly useful for processing product reviews, customer feedback, or any unstructured text data.
Installation
Configuration
profiles.yml
Configure dbt to connect to MotherDuck:profiles.yml
dbt_project.yml
Configure model materialization:dbt_project.yml
Models
Extract Review Attributes
Thereviews_attributes model uses the prompt() function to extract structured data from review text:
models/reviews/reviews_attributes.sql
The
prompt() function uses the struct parameter to define the output schema and struct_descr to provide descriptions that guide the AI in extracting the right information.Aggregate Attributes by Product
Thereviews_attributes_by_product model aggregates extracted attributes:
models/reviews/reviews_attributes_by_product.sql
Calculate Sentiment Scores
Thereviews_attributes_sentiment_by_product model computes sentiment metrics:
models/reviews/reviews_attributes_sentiment_by_product.sql
How It Works
Understanding the prompt() function
Understanding the prompt() function
The
prompt() function takes three key parameters:- Prompt text: Instructions for the AI on what to extract from the text
- struct: Defines the output schema with field names and types
- struct_descr: Provides descriptions for each field to guide extraction
Data pipeline flow
Data pipeline flow
- Raw review data is sourced from a table
- The
prompt()function extracts structured attributes from each review - Results are aggregated by product to get consolidated views
- Sentiment scores are calculated for overall and service sentiment
Use Cases
- Extract sentiment from customer reviews
- Identify product features mentioned in feedback
- Analyze customer service interactions
- Track competitive mentions
- Understand purchase motivations and use cases