Detect editorial bias and analyze news articles using semantic classification and extraction
A comprehensive demonstration of Fenic’s semantic classification capabilities for detecting editorial bias and analyzing news articles across multiple sources.
Stage 1 - Information Extraction: Uses semantic.extract() with Pydantic models to identify bias indicators, emotional language, and opinion markers.
Stage 2 - Grounded Classification: Uses extracted information as context for semantic.classify() to achieve more accurate political bias detection.
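The data flow of the two stages can be sketched without any model calls. In the minimal illustration below, `extract_signals` and `classify_tone` are hypothetical keyword-based stand-ins for semantic.extract() and semantic.classify(), and the word lists, the `ArticleSignals` dataclass, and the "opinionated"/"neutral" labels are assumptions made for the example (the real pipeline uses a Pydantic model and political bias labels). The sketch only shows the shape of the idea: stage 1 produces structured fields, and stage 2 classifies from input that has been grounded with those fields.

```python
from dataclasses import dataclass

# Hypothetical word lists, standing in for the model's judgment.
EMOTIONAL_WORDS = {"outrageous", "disastrous", "heroic", "shocking"}
OPINION_MARKERS = {"clearly", "obviously", "should", "must"}


@dataclass
class ArticleSignals:
    """Stage 1 output: structured fields extracted from one article."""
    emotional_language: list[str]
    opinion_markers: list[str]


def extract_signals(text: str) -> ArticleSignals:
    """Stand-in for semantic.extract(): collect simple keyword signals."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return ArticleSignals(
        emotional_language=[w for w in words if w in EMOTIONAL_WORDS],
        opinion_markers=[w for w in words if w in OPINION_MARKERS],
    )


def build_grounded_input(text: str, signals: ArticleSignals) -> str:
    """Append the stage 1 fields to the article so stage 2 sees both."""
    return (
        f"Article: {text}\n"
        f"Emotional Language: {', '.join(signals.emotional_language) or 'none'}\n"
        f"Opinion Indicators: {', '.join(signals.opinion_markers) or 'none'}"
    )


def classify_tone(grounded_input: str) -> str:
    """Stand-in for semantic.classify(): decide from the two context lines."""
    context_lines = grounded_input.splitlines()[-2:]
    flagged = [line for line in context_lines if not line.endswith(": none")]
    return "opinionated" if flagged else "neutral"


article = "This outrageous policy must end."
signals = extract_signals(article)            # stage 1
grounded = build_grounded_input(article, signals)
label = classify_tone(grounded)               # stage 2
print(grounded)
print("Label:", label)
```

The classifier never inspects the raw article on its own; it works from the extracted fields, which is the grounding step the two-stage design relies on.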
# Prepare article attributes for profiling
results_df = results_df.with_column(
    "article_attributes",
    fc.text.jinja(
        (
            "Primary Topics: {{primary_topic}}\n"
            "Detected Political Bias: {{content_bias}}\n"
            "Detected Bias Indicators: {{bias_indicators}}\n"
            "Opinion Indicators: {{opinion_markers}}\n"
            "Emotional Language: {{emotional_language}}\n"
            "Journalistic Style: {{journalistic_style}}"
        ),
        primary_topic=fc.col("primary_topic"),
        content_bias=fc.col("content_bias"),
        bias_indicators=fc.col("bias_indicators"),
        opinion_markers=fc.col("opinion_markers"),
        emotional_language=fc.col("emotional_language"),
        journalistic_style=fc.col("journalistic_style"),
    ),
)

# Generate semantic summaries for each source
source_language_profiles = results_df.group_by("source").agg(
    fc.semantic.reduce(
        """
        You are given a set of article analyses from {{news_outlet}}.
        Create a concise (3-5 sentence) media profile for {{news_outlet}}.
        Summarize the information provided without explicitly referencing it.
        """,
        column=fc.col("article_attributes"),
        group_context={
            "news_outlet": fc.col("source"),
        },
        max_output_tokens=1024,
    ).alias("source_profile"),
).select(fc.col("source"), fc.col("source_profile")).cache()

print("AI-Generated Media Profiles:")
source_language_profiles.show()
The Balanced Tribune presents a diverse range of topics, primarily focusing on business, technology, climate, and healthcare. It exhibits a right-leaning bias in its business and technology coverage, emphasizing themes like Wall Street stability and American free enterprise, while adopting a far-left perspective on climate issues, critiquing fossil fuel companies. The publication often employs sensationalist and informational journalistic styles, utilizing emotional language to evoke strong reactions.
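To make the inputs to the profiling step more concrete, here is a plain-Python sketch of what gets assembled before summarization: one rendered attribute string per article, grouped by source. It does not call Fenic. The sample rows, the shortened template, and the helpers `render_attributes`, `group_by_source`, and `build_reduce_prompt` are all hypothetical, and how semantic.reduce() actually batches and combines grouped rows internally is not shown in the source, so the final prompt layout is only an assumption.

```python
from collections import defaultdict

# Shortened version of the attribute template, using str.format placeholders.
TEMPLATE = (
    "Primary Topics: {primary_topic}\n"
    "Detected Political Bias: {content_bias}\n"
    "Journalistic Style: {journalistic_style}"
)

# Made-up rows for illustration only; not output from the real pipeline.
ROWS = [
    {"source": "Example Daily", "primary_topic": "business",
     "content_bias": "neutral", "journalistic_style": "informational"},
    {"source": "Example Daily", "primary_topic": "climate",
     "content_bias": "neutral", "journalistic_style": "analytical"},
    {"source": "Sample Post", "primary_topic": "healthcare",
     "content_bias": "neutral", "journalistic_style": "informational"},
]


def render_attributes(row: dict) -> str:
    """Render one article's fields into a single text block."""
    return TEMPLATE.format(**row)  # unused keys such as "source" are ignored


def group_by_source(rows: list[dict]) -> dict[str, list[str]]:
    """Collect the rendered attribute blocks under each source name."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["source"]].append(render_attributes(row))
    return dict(grouped)


def build_reduce_prompt(news_outlet: str, analyses: list[str]) -> str:
    """Assumed layout: instruction first, then the group's analyses."""
    instruction = (
        f"You are given a set of article analyses from {news_outlet}. "
        f"Create a concise (3-5 sentence) media profile for {news_outlet}."
    )
    return instruction + "\n\n" + "\n---\n".join(analyses)


for outlet, analyses in group_by_source(ROWS).items():
    print(build_reduce_prompt(outlet, analyses))
    print()
```

Passing the source name separately from the per-article text mirrors the group_context argument in the snippet above, which makes the outlet name available to the instruction for each group.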
# Set your API key (OpenAI, Google, or Anthropic)
export OPENAI_API_KEY="your-api-key-here"
# export GOOGLE_API_KEY="your-api-key-here"
# export ANTHROPIC_API_KEY="your-api-key-here"

python news_analysis.py
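Before running the script, it can help to confirm that at least one of the keys above is actually visible to Python. The following is a small optional check, not part of news_analysis.py as far as the source shows; the helper name `configured_providers` is hypothetical, and it only inspects the three environment variable names listed above.

```python
import os

# Environment variable names taken from the setup commands above.
PROVIDER_KEYS = {
    "OPENAI_API_KEY": "OpenAI",
    "GOOGLE_API_KEY": "Google",
    "ANTHROPIC_API_KEY": "Anthropic",
}


def configured_providers(env=None) -> list[str]:
    """Return the providers whose API key is set to a non-empty value."""
    env = os.environ if env is None else env
    return [name for key, name in PROVIDER_KEYS.items() if env.get(key)]


if __name__ == "__main__":
    found = configured_providers()
    if found:
        print("API keys found for:", ", ".join(found))
    else:
        print("No API key set; export one of:", ", ".join(PROVIDER_KEYS))
```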
Shows how to improve classification accuracy by first extracting relevant information with semantic.extract(), then using that context for more informed semantic.classify() operations.