Semantic joins let you join DataFrames using natural language predicates that are evaluated by a language model. Traditional joins require exact matches on key values, and embedding-based similarity joins match on vector similarity; a semantic join instead asks the model whether a stated relationship holds between two rows, so it can connect records based on meaning and context.
import fenic as fc

config = fc.SessionConfig(
    app_name="semantic_joins",
    semantic=fc.SemanticConfig(
        language_models={
            "mini": fc.OpenAILanguageModel(
                model_name="gpt-4o-mini",
                rpm=500,
                tpm=200_000,
            )
        }
    ),
)
session = fc.Session.get_or_create(config)

# User profiles
users_data = [
    {
        "user_id": "user_001",
        "name": "Sarah",
        "interests": "I love cooking Italian food and trying new pasta recipes"
    },
    {
        "user_id": "user_002",
        "name": "Mike",
        "interests": "I enjoy working on cars and fixing engines in my spare time"
    },
    {
        "user_id": "user_003",
        "name": "Emily",
        "interests": "Gardening is my passion, especially growing vegetables and flowers"
    },
    {
        "user_id": "user_004",
        "name": "David",
        "interests": "I'm interested in learning about car maintenance and automotive repair"
    }
]

# Available articles
articles_data = [
    {
        "article_id": "art_001",
        "title": "Cooking Pasta Recipes",
        "description": "Delicious pasta recipes including spaghetti carbonara and fettuccine alfredo"
    },
    {
        "article_id": "art_002",
        "title": "Car Engine Maintenance",
        "description": "Essential guide to automobile engine care and troubleshooting"
    },
    {
        "article_id": "art_003",
        "title": "Gardening for Beginners",
        "description": "Start your garden with basic techniques for growing vegetables and flowers"
    },
    {
        "article_id": "art_004",
        "title": "Advanced Automotive Repair",
        "description": "Comprehensive automotive repair instructions for experienced mechanics"
    }
]

users_df = session.create_dataframe(users_data)
articles_df = session.create_dataframe(articles_data)
# Use semantic join to match users with articles based on their interests
user_article_matches = users_df.semantic.join(
    articles_df,
    predicate=(
        "A person with interests '{{left_on}}' would be interested in reading about '{{right_on}}'"
    ),
    left_on=fc.col("interests"),
    right_on=fc.col("description")
)

print("User-Article Matches:")
user_article_matches.select(
    "name",
    "interests",
    "title",
    "description"
).show()
Notice that both Mike and David match the automotive articles even though they describe their interests in different words. The model evaluates the predicate on the meaning of each pair of texts, so the match does not depend on shared wording.
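To make the idea concrete, here is a minimal conceptual sketch in plain Python of what a predicate-based join does: render the predicate template for a pair of rows, ask a judge whether it holds, and keep the pairs that pass. This is not fenic's implementation. The names `render_predicate`, `naive_semantic_join`, and `toy_judge` are hypothetical, and `toy_judge` is a keyword-based stand-in for the language model so the example runs offline.

```python
import re
from itertools import product
from typing import Callable

TEMPLATE = (
    "A person with interests '{{left_on}}' would be interested in "
    "reading about '{{right_on}}'"
)

def render_predicate(template: str, left: str, right: str) -> str:
    """Fill the two placeholders with the values from one row pair."""
    return template.replace("{{left_on}}", left).replace("{{right_on}}", right)

def naive_semantic_join(left_rows, right_rows, template, left_key, right_key,
                        judge: Callable[[str], bool]):
    """Evaluate the predicate on every (left, right) pair; keep the matches."""
    matches = []
    for left, right in product(left_rows, right_rows):
        prompt = render_predicate(template, left[left_key], right[right_key])
        if judge(prompt):
            matches.append({**left, **right})
    return matches

# Hypothetical stand-in for the LLM: a pair "matches" when both halves of the
# rendered prompt mention words from the same hand-written topic group.
TOPICS = [
    {"pasta", "cooking", "recipes"},
    {"car", "cars", "engine", "engines", "automotive", "automobile"},
]

def toy_judge(prompt: str) -> bool:
    left_part, _, right_part = prompt.lower().partition("would be interested")
    left_words = set(re.findall(r"[a-z]+", left_part))
    right_words = set(re.findall(r"[a-z]+", right_part))
    return any(topic & left_words and topic & right_words for topic in TOPICS)

users = [
    {"name": "Sarah", "interests": "I love cooking Italian food and trying new pasta recipes"},
    {"name": "Mike", "interests": "I enjoy working on cars and fixing engines in my spare time"},
    {"name": "David", "interests": "I'm interested in learning about car maintenance and automotive repair"},
]
articles = [
    {"title": "Cooking Pasta Recipes", "description": "Delicious pasta recipes including spaghetti carbonara and fettuccine alfredo"},
    {"title": "Car Engine Maintenance", "description": "Essential guide to automobile engine care and troubleshooting"},
    {"title": "Advanced Automotive Repair", "description": "Comprehensive automotive repair instructions for experienced mechanics"},
]

matches = naive_semantic_join(users, articles, TEMPLATE, "interests", "description", toy_judge)
for row in matches:
    print(row["name"], "->", row["title"])
```

The keyword judge only works here because the topic groups were written by hand for this data. A language model in its place can recognize relationships that no fixed word list anticipates, which is the point of a semantic join.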
# Use semantic join for product recommendations
# Note: purchases_df (customer_name, purchased_product) and
# products_df (product_name, category) must already exist in the session.
recommendations = purchases_df.semantic.join(
    products_df,
    predicate=(
        "A customer who bought '{{left_on}}' would also be interested in '{{right_on}}'"
    ),
    left_on=fc.col("purchased_product"),
    right_on=fc.col("product_name")
)

print("Product Recommendations:")
recommendations.select(
    "customer_name",
    "purchased_product",
    "product_name",
    "category"
).show()
How to construct effective natural language join predicates
When semantic joins are preferable to traditional or similarity-based joins
Practical applications in recommendation systems and personalization
Understanding the trade-offs between accuracy, performance, and cost
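As a rough illustration of the cost and performance side of that trade-off, the sketch below estimates how many model calls a join needs and the minimum wall-clock time the rate limits allow. It assumes one model request per row pair and a fixed token count per request; both are simplifying assumptions for illustration, not a description of how fenic batches or schedules requests. The function name `estimate_pairwise_cost` and the figure of 100 tokens per call are hypothetical; `rpm=500` and `tpm=200_000` are taken from the session config above.

```python
def estimate_pairwise_cost(n_left: int, n_right: int, rpm: int, tpm: int,
                           tokens_per_call: int):
    """Return (calls, total_tokens, min_minutes) for a pairwise join.

    Assumes one request per (left, right) pair, which is an illustrative
    simplification rather than documented library behavior.
    """
    calls = n_left * n_right
    total_tokens = calls * tokens_per_call
    # The slower of the two rate limits determines the minimum duration.
    min_minutes = max(calls / rpm, total_tokens / tpm)
    return calls, total_tokens, min_minutes

# The 4 x 4 example from this page versus a hypothetical 1,000 x 1,000 join.
small = estimate_pairwise_cost(4, 4, rpm=500, tpm=200_000, tokens_per_call=100)
large = estimate_pairwise_cost(1_000, 1_000, rpm=500, tpm=200_000, tokens_per_call=100)

print("small join:", small)
print("large join:", large)
```

Under these assumptions the number of calls grows with the product of the two row counts, so filtering either side with a cheap conventional predicate before the semantic join is usually worth considering.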
Semantic joins are well suited to scenarios where the relationship between records is conceptual rather than exact, and where deciding whether two rows match calls for human-like judgment.