Overview
The DataAggregator class combines data from Google Analytics 4, Google Search Console, and DataForSEO into unified reports for comprehensive content analysis and opportunity identification.
Installation
from data_sources.modules.data_aggregator import DataAggregator
Authentication
The aggregator automatically initializes all data source clients using their respective environment variables. See the individual module documentation for authentication setup.
Initialization
aggregator = DataAggregator()
The aggregator will attempt to initialize all three data sources. If any fail (missing credentials, etc.), it will print a warning and continue with available sources.
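The warn-and-continue behavior described above can be sketched as follows. This is a minimal illustration, not the aggregator's actual code: `StubClient`, `init_sources`, and the `DEMO_*` environment variable names are all hypothetical.

```python
import os


class StubClient:
    """Hypothetical stand-in for a real data source client."""

    def __init__(self, name: str, env_var: str) -> None:
        # Fail if the credential is absent, as a real client might.
        if not os.environ.get(env_var):
            raise RuntimeError(f"{env_var} is not set")
        self.name = name


def init_sources(specs: dict[str, str]) -> dict[str, StubClient]:
    """Try to build every client and keep only the ones that succeed."""
    sources: dict[str, StubClient] = {}
    for name, env_var in specs.items():
        try:
            sources[name] = StubClient(name, env_var)
        except Exception as exc:
            # Print a warning and carry on, so one missing credential
            # does not block the remaining sources.
            print(f"Warning: {name} unavailable: {exc}")
    return sources


# Hypothetical environment variable names, for illustration only.
os.environ["DEMO_GA4_KEY"] = "x"
os.environ.pop("DEMO_GSC_KEY", None)
available = init_sources({"ga4": "DEMO_GA4_KEY", "gsc": "DEMO_GSC_KEY"})
print(sorted(available))
```

Callers can then check which sources are present before requesting data that depends on them.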
Methods
get_comprehensive_page_performance
Get all available data for a specific page from all sources.
performance = aggregator.get_comprehensive_page_performance(
    url="/blog/podcast-monetization",
    days=30
)
ISO timestamp of analysis
Google Analytics data (pageviews, trends)
Search Console data (clicks, impressions, keywords)
DataForSEO rankings for top keywords
identify_content_opportunities
Identify content opportunities across all data sources.
opportunities = aggregator.identify_content_opportunities(
    days=30,
    min_monthly_pageviews=100
)
days: Number of days to analyze
min_monthly_pageviews: Minimum monthly pageviews threshold
Keywords ranking 11-20 (from GSC)
Pages losing traffic (from GA4)
High impressions, low CTR pages (from GSC)
Rising queries (from GSC)
Keywords competitors rank for (from DataForSEO)
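To make two of the Search Console categories above concrete, here is a rough sketch of how rows might be sorted into "ranking 11-20" and "high impressions, low CTR" buckets. The function name, the row fields, the bucket keys, and both thresholds are assumptions chosen for illustration. The module's real criteria may differ.

```python
def classify_gsc_rows(rows, min_impressions=500, max_ctr=0.03):
    """Bucket Search Console-style rows into two opportunity categories.

    Thresholds are illustrative assumptions, not the module's values.
    """
    # Keywords sitting on page 2 (positions 11 through 20).
    quick_wins = [r for r in rows if 11 <= r["position"] <= 20]
    # Rows with plenty of impressions but a weak click-through rate.
    low_ctr = [
        r for r in rows
        if r["impressions"] >= min_impressions
        and r["clicks"] / r["impressions"] < max_ctr
    ]
    return {"quick_wins": quick_wins, "low_ctr": low_ctr}


# Made-up sample rows, for illustration only.
sample_rows = [
    {"query": "podcast hosting", "position": 12.0, "impressions": 1500, "clicks": 90},
    {"query": "podcast analytics", "position": 4.0, "impressions": 2000, "clicks": 42},
    {"query": "podcast rss feed", "position": 2.0, "impressions": 300, "clicks": 60},
]

result = classify_gsc_rows(sample_rows)
print([r["query"] for r in result["quick_wins"]])
print([r["query"] for r in result["low_ctr"]])
```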
generate_performance_report
Generate a comprehensive performance report with summary metrics, top performers, opportunities, and recommendations.
report = aggregator.generate_performance_report(days=30)
Summary metrics from all sources
Top 10 pages by pageviews
Content opportunities by category
Actionable recommendations
get_priority_queue
Get prioritized list of content tasks.
tasks = aggregator.get_priority_queue(limit=10)
limit: Number of tasks to return
Prioritized task list sorted by priority (high → medium → low)
priority: "high", "medium", or "low"
type: Task type (optimize, update, optimize_meta, create_new)
reason: Why this task is recommended
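The high → medium → low ordering could be produced along these lines. This is a sketch under the assumption that tasks are plain dicts with a `priority` key, as in the recommendation examples in this document. `PRIORITY_ORDER` and `prioritize` are hypothetical names, not part of the module.

```python
# Lower rank sorts first; unknown priorities fall to the end.
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def prioritize(tasks, limit=10):
    """Return up to `limit` tasks ordered high, then medium, then low."""
    ranked = sorted(
        tasks,
        key=lambda t: PRIORITY_ORDER.get(t["priority"], len(PRIORITY_ORDER)),
    )
    return ranked[:limit]


# Made-up tasks, for illustration only.
sample_tasks = [
    {"priority": "medium", "type": "optimize_meta"},
    {"priority": "low", "type": "create_new"},
    {"priority": "high", "type": "update"},
    {"priority": "high", "type": "optimize"},
]

top = prioritize(sample_tasks, limit=3)
print([t["type"] for t in top])
```

Because Python's `sorted` is stable, tasks with the same priority keep their original relative order.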
Recommendation Types
The aggregator generates actionable recommendations based on identified opportunities:
Quick Wins
{
'priority': 'high',
'type': 'optimize',
'action': "Optimize for 'podcast hosting'",
'reason': "Currently ranking #12 with 1,500 impressions. Small improvements could push to page 1.",
'keyword': 'podcast hosting',
'current_position': 12
}
Declining Content
{
'priority': 'high',
'type': 'update',
'action': 'Update declining article: How to Monetize Your Podcast',
'reason': 'Traffic down 35% (1,200 → 780 pageviews). Needs refresh.',
'url': '/blog/podcast-monetization',
'change_percent': -35.0
}
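The figures in the declining-content example are consistent with a simple percent-change calculation: (780 - 1,200) / 1,200 = -35%. The sketch below reproduces that arithmetic and assembles a dict of the same shape. The helper names are hypothetical, and assigning 'high' priority unconditionally is an assumption, since the module's actual threshold is not documented here.

```python
def percent_change(previous: int, current: int) -> float:
    """Percent change from previous to current, rounded to one decimal."""
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return round((current - previous) / previous * 100, 1)


def declining_recommendation(title: str, url: str, previous: int, current: int) -> dict:
    """Build a recommendation dict shaped like the example above."""
    change = percent_change(previous, current)
    return {
        "priority": "high",  # assumption: every decline is treated as high priority
        "type": "update",
        "action": f"Update declining article: {title}",
        "reason": (
            f"Traffic down {abs(change):.0f}% "
            f"({previous:,} → {current:,} pageviews). Needs refresh."
        ),
        "url": url,
        "change_percent": change,
    }


rec = declining_recommendation(
    "How to Monetize Your Podcast", "/blog/podcast-monetization", 1200, 780
)
print(rec["change_percent"])
print(rec["reason"])
```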
Low CTR
{
'priority': 'medium',
'type': 'optimize_meta',
'action': 'Improve meta elements for: /blog/podcast-analytics',
'reason': 'Getting 2,000 impressions but only 2.1% CTR. Better title/description could add 50 clicks/month.',
'url': '/blog/podcast-analytics',
'potential_clicks': 50
}
Trending Topics
{
'priority': 'medium',
'type': 'create_new',
'action': "Create content for trending topic: 'video podcast software'",
'reason': 'Search interest up 45% with 800 recent impressions. Strike while hot!',
'query': 'video podcast software',
'growth': 45.0
}
Example Usage
from data_sources.modules.data_aggregator import DataAggregator

aggregator = DataAggregator()

# Generate full report
report = aggregator.generate_performance_report(days=30)
print(f"Report Period: Last {report['period_days']} days")
print(f"Generated: {report['generated_at']}")

# Summary
if report['summary']:
    print("\n📊 SUMMARY")
    print("-" * 80)
    if 'total_pageviews' in report['summary']:
        print(f"Total Pageviews: {report['summary']['total_pageviews']:,}")
        print(f"Total Sessions: {report['summary']['total_sessions']:,}")
        print(f"Avg Engagement Rate: {report['summary']['avg_engagement_rate']:.1%}")
    if 'total_clicks' in report['summary']:
        print(f"Total Clicks (GSC): {report['summary']['total_clicks']:,}")
        print(f"Total Impressions: {report['summary']['total_impressions']:,}")
        print(f"Avg CTR: {report['summary']['avg_ctr']:.2%}")

# Top performers
if report.get('top_performers'):
    print("\n🏆 TOP 10 PERFORMERS")
    print("-" * 80)
    for i, page in enumerate(report['top_performers'][:10], 1):
        print(f"{i}. {page['title']}")
        print(f"   {page['pageviews']:,} views | {page['engagement_rate']:.1%} engagement")

# Recommendations
if report.get('recommendations'):
    print("\n✅ TOP RECOMMENDATIONS")
    print("-" * 80)
    for i, rec in enumerate(report['recommendations'][:5], 1):
        print(f"\n{i}. [{rec['priority'].upper()}] {rec['action']}")
        print(f"   {rec['reason']}")

# Priority queue for task management
print("\n📋 PRIORITY QUEUE")
print("-" * 80)
tasks = aggregator.get_priority_queue(limit=10)
for i, task in enumerate(tasks, 1):
    print(f"\n{i}. [{task['priority'].upper()}] {task['type']}")
    print(f"   {task['action']}")
    print(f"   {task['reason']}")
Source Code Reference
Location: data_sources/modules/data_aggregator.py:24
The aggregator automatically handles errors from individual data sources gracefully, continuing with available data if some sources fail.