This example uses semantic.with_cluster_labels() and semantic.reduce() to automatically cluster customer feedback into themes and generate a summary for each discovered category.
Overview
Customer feedback analysis is a critical business process that traditionally requires manual categorization. This example shows how semantic clustering can automatically:
- Discover hidden themes in unstructured feedback without predefined categories
- Group similar feedback based on semantic meaning rather than keywords
- Generate actionable insights for each theme using AI-powered summarization
- Prioritize issues based on sentiment and frequency
Key Features
Semantic Clustering
Using semantic.with_cluster_labels() for embedding-based clustering.
AI Summarization
Using semantic.reduce() for intelligent theme analysis.
Automatic Theme Discovery
No manual categorization required - themes emerge from data.
Business Intelligence
Actionable insights for product teams with priority rankings.
How It Works
Implementation
Session Configuration
This example requires both a language model (for summarization) and an embedding model (for clustering).
Sample Data
Create Embeddings
Cluster and Summarize
semantic.reduce() is used as an aggregation function to summarize multiple feedback entries into a coherent theme description.
Sample Results
The system automatically discovered these themes from 12 feedback entries:
Cluster 0: Positive Features & Support (4.75★)
Theme: Praise for specific features and excellent customer support
Key Points:
- Dark mode feature highly appreciated
- Helpful support team
- Effective search functionality
Cluster 1: UI/UX Design Issues (2.0★)
Theme: Design consistency and professional appearance concerns
Key Points:
- Inconsistent button layouts across screens
- Unprofessional appearance
Cluster 2: Technical Performance Problems (1.75★)
Theme: Critical technical issues affecting core functionality
Key Points:
- App crashes during photo uploads
- Slow loading times (30+ seconds)
- Frequent freezes
Cluster 3: Usability & Feature Gaps (2.0★)
Theme: Process complexity and missing functionality
Key Points:
- Confusing checkout process
- Need for offline mode
- Excel export capability requested
Business Value
Automated Insights
No Manual Work
Identifies themes without manual categorization.
Consistent Analysis
Provides uniform analysis across all feedback.
Scales Effortlessly
Handles thousands of feedback entries.
Actionable Intelligence
Priority 1: Fix technical crashes and performance (Cluster 2)
- Highest impact on user satisfaction
- Critical functionality issues
- Lowest average rating (1.75★)
Priority 2: Address UI/UX design issues (Cluster 1)
- Affects professional brand perception
- Relatively quick wins
Priority 3: Close usability and feature gaps (Cluster 3)
- Add requested features (offline mode, Excel export)
- Reduce checkout complexity
Priority 4: Maintain strengths (Cluster 0)
- Highest satisfaction area (4.75★)
- Model for other areas
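One way to arrive at a priority order like this is to sort the clusters by average rating, lowest first. The sketch below uses the averages reported under Sample Results. Ranking on rating alone is an assumption made for illustration; the overview also names frequency and sentiment as inputs.

```python
# Cluster averages as reported under Sample Results.
clusters = [
    {"cluster": 0, "theme": "Positive Features & Support", "avg_rating": 4.75},
    {"cluster": 1, "theme": "UI/UX Design Issues", "avg_rating": 2.0},
    {"cluster": 2, "theme": "Technical Performance Problems", "avg_rating": 1.75},
    {"cluster": 3, "theme": "Usability & Feature Gaps", "avg_rating": 2.0},
]

# Lowest-rated themes first; sorted() is stable, so tied clusters
# keep their input order.
ranked = sorted(clusters, key=lambda c: c["avg_rating"])
priorities = [(rank, c["cluster"]) for rank, c in enumerate(ranked, start=1)]

for rank, c in enumerate(ranked, start=1):
    print(f"Priority {rank}: Cluster {c['cluster']} ({c['theme']}, {c['avg_rating']}★)")
```

With more data, a combined score (for example, weighting low ratings by the number of entries in the cluster) may be a better key than the rating alone.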
Resource Optimization
- Reduces manual analysis time from hours to minutes
- Enables real-time feedback monitoring
- Focuses development efforts on highest-impact issues
Use Cases
- Product Development
  - Identify most requested features
  - Understand user pain points
  - Prioritize bug fixes and improvements
- Customer Success
- Marketing Intelligence
Key Operations
semantic.with_cluster_labels()
- Uses K-means clustering on embedding vectors
- Assigns cluster_label to each row
- Returns DataFrame with added cluster column
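To make the clustering step concrete, here is a minimal, dependency-free sketch of K-means assigning a cluster_label to each row. It is not the library's implementation: the kmeans_labels helper, the deterministic "first k vectors" initialization, and the made-up 2-D vectors are all assumptions for illustration. Real embeddings have hundreds of dimensions.

```python
import math

def kmeans_labels(vectors, k, iterations=20):
    """Toy K-means: returns one cluster label per input vector.

    Hypothetical helper for illustration only. Centroids start at the
    first k vectors so the result is deterministic.
    """
    centroids = [tuple(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: pick the nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels

# Made-up 2-D "embeddings": two rows near (0, 0), two near (5, 5).
rows = [
    {"feedback": "App crashes on upload", "embedding": (0.1, 0.2)},
    {"feedback": "Love the dark mode", "embedding": (5.0, 5.1)},
    {"feedback": "Frequent freezes", "embedding": (0.0, 0.1)},
    {"feedback": "Support team was helpful", "embedding": (5.2, 4.9)},
]
labels = kmeans_labels([r["embedding"] for r in rows], k=2)

# Mirror the "added cluster column" behavior on plain dicts.
labeled_rows = [{**r, "cluster_label": lab} for r, lab in zip(rows, labels)]
```

The two complaint rows end up sharing one label and the two praise rows the other, which is the grouping the later summarization step operates on.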
semantic.reduce()
- Aggregation function that summarizes multiple texts
- Uses LLM to analyze and synthesize insights
- Generates human-readable theme descriptions
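The aggregation pattern itself can be sketched without an LLM: group rows by cluster_label, then collapse each group's texts into a single value. In this sketch, summarize is a hypothetical stand-in that only joins the texts; in the real pipeline that step is the LLM call made by semantic.reduce(). The rows and the reduce_by_cluster helper are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

def summarize(texts):
    """Hypothetical stand-in for the LLM call: joins texts, does no synthesis."""
    return "; ".join(texts)

def reduce_by_cluster(rows, reducer=summarize):
    """Group rows by cluster_label and aggregate each group into one record."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["cluster_label"]].append(row)
    return {
        label: {
            "count": len(members),
            "avg_rating": mean(r["rating"] for r in members),
            "theme": reducer([r["feedback"] for r in members]),
        }
        for label, members in sorted(groups.items())
    }

# Made-up labeled rows standing in for the clustered feedback.
rows = [
    {"feedback": "Love the dark mode", "rating": 5, "cluster_label": 0},
    {"feedback": "App crashes on upload", "rating": 1, "cluster_label": 1},
    {"feedback": "Support team was helpful", "rating": 4, "cluster_label": 0},
    {"feedback": "Frequent freezes", "rating": 2, "cluster_label": 1},
]
themes = reduce_by_cluster(rows)
```

Passing the reducer as a parameter keeps the grouping logic separate from the summarization step, so the stand-in can be swapped for a real model call.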
Running the Example
Expected Output
The script displays:
- Raw Feedback Data: Customer names, feedback text, and ratings
- Clustering Progress: Embedding generation and clustering status
- Theme Analysis: Detailed summaries for each discovered cluster
- Business Insights: Actionable themes ranked by priority
Advanced Usage
Adjusting Cluster Count
Custom Summarization Prompts
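The original code for this section is not reproduced here. As a rough, library-independent sketch, a custom summarization prompt usually amounts to an instruction string combined with one cluster's feedback. The build_prompt helper and the instruction wording below are hypothetical and not part of any library API.

```python
def build_prompt(instruction, texts):
    """Hypothetical helper: combine an instruction with one cluster's feedback."""
    numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(texts, start=1))
    return f"{instruction}\n\nFeedback:\n{numbered}"

# Illustrative instruction; adjust the wording to the insight you need.
instruction = (
    "Summarize the main theme in this customer feedback and "
    "list the top issues a product team should act on."
)
prompt = build_prompt(instruction, ["App crashes on upload", "Frequent freezes"])
print(prompt)
```

Changing only the instruction (for example, asking for root causes instead of themes) changes what each cluster summary emphasizes without touching the clustering step.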
Learning Outcomes
This example teaches:
- How to combine embedding-based clustering with AI summarization
- When to use semantic operations for business intelligence
- Patterns for automated text analysis and insight generation
- Integration of multiple semantic operations in data pipelines
