Overview
Aiven for Metrics is built on Thanos, an open-source project that extends Prometheus with unlimited storage capabilities and global query views across multiple Prometheus instances. Store unlimited metrics for any duration with cost-effective object storage.
Why Choose Aiven for Metrics
Unlimited Retention
Store metrics data for as long as needed with scalable object storage
Prometheus Compatible
Use existing Prometheus exporters, queries (PromQL), and tools like Grafana
Global Query View
Query metrics from multiple Prometheus servers through a unified interface
Cost-Effective
Downsampling and compaction reduce storage costs while improving query performance
Key Components
Aiven for Metrics includes several Thanos components working together:
Thanos Metrics Receiver
Ingests metrics into the system:
- Accepts Prometheus remote write requests
- Real-time metrics collection
- High-throughput ingestion
- Automatic scaling
- Data validation
Thanos Metrics Query
Query interface for metrics:
- PromQL query support
- Aggregates data from multiple sources
- Real-time and historical data
- Deduplication of samples
- Compatible with Grafana
Thanos Metrics Store
Long-term storage interface:
- Interfaces with object storage
- Historical data access
- Efficient data retrieval
- Scalable storage
- Automatic data management
Thanos Metrics Compact
Storage optimization:
- Data compaction
- Downsampling for efficiency
- Reduces storage costs
- Improves query performance
- Background processing
Thanos Query Frontend
Query optimization layer:
- Caches query results
- Splits large queries
- Load distribution
- Improved performance
- Reduced latency
Getting Started
Configure Prometheus Remote Write
Point your Prometheus instances to Aiven for Metrics in two steps:
- Get the remote write URL
- Configure Prometheus to send samples to that URL
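On the Prometheus side, this step comes down to a `remote_write` entry in `prometheus.yml`. The sketch below renders such an entry from Python so that each field is explicit. `render_remote_write` is a hypothetical helper, the URL and credentials are placeholders for the values of your own service, and HTTP basic authentication with the service credentials is an assumption.

```python
import json


def render_remote_write(url: str, username: str, password: str) -> str:
    """Return a prometheus.yml fragment with one remote_write target.

    Values are quoted with json.dumps because JSON string literals are
    also valid YAML double-quoted scalars.
    """
    quote = json.dumps
    lines = [
        "remote_write:",
        f"  - url: {quote(url)}",
        "    basic_auth:",
        f"      username: {quote(username)}",
        f"      password: {quote(password)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholders only: substitute the remote write URL and credentials
    # shown for your own service.
    print(render_remote_write("https://<remote-write-url>",
                              "<service-user>",
                              "<service-password>"))
```

Paste the printed fragment into `prometheus.yml` and reload Prometheus. Keeping the password out of version control is left to whatever secret handling you already use.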
Integrate with Grafana
Connect Grafana to query metrics. To add the data source manually in Grafana instead, use:
- Type: Prometheus
- URL: https://thanos-service.aivencloud.com:443
- Auth: Basic auth with service credentials
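The manual settings above can also be written as a data source definition for Grafana's HTTP API. The sketch below only builds the JSON body and sends nothing. `grafana_datasource` is a hypothetical helper, the data source name and credentials are placeholders, and creating data sources through the API rather than the UI is an assumption about your setup.

```python
import json


def grafana_datasource(name: str, url: str, username: str, password: str) -> dict:
    """Build the JSON body for a Prometheus-type Grafana data source."""
    return {
        "name": name,
        "type": "prometheus",        # Type: Prometheus
        "access": "proxy",           # Grafana's backend makes the requests
        "url": url,                  # URL of the metrics query endpoint
        "basicAuth": True,           # Auth: basic auth
        "basicAuthUser": username,
        # Secrets go under secureJsonData so Grafana stores them encrypted.
        "secureJsonData": {"basicAuthPassword": password},
    }


if __name__ == "__main__":
    body = grafana_datasource(
        "Aiven for Metrics",                          # hypothetical name
        "https://thanos-service.aivencloud.com:443",  # URL from the list above
        "<service-user>",
        "<service-password>",
    )
    print(json.dumps(body, indent=2))
```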
Architecture and Data Flow
Query Processing
- Query Frontend receives requests
- Distributes to Thanos Query
- Query fetches from Receivers (recent data) and Store (historical data)
- Deduplicates samples from multiple sources
- Returns results with caching
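The fetch-and-deduplicate steps can be pictured with a toy model. The sketch below merges samples from a "recent" source and a "historical" source and keeps one value per timestamp. It illustrates the idea only: Thanos deduplicates per series using replica labels, and `merge_and_deduplicate` is a hypothetical name.

```python
def merge_and_deduplicate(*sources):
    """Merge (timestamp, value) samples from several sources.

    When the same timestamp appears in more than one source, the value
    from the earliest-listed source is kept.
    """
    merged = {}
    for samples in sources:
        for timestamp, value in samples:
            merged.setdefault(timestamp, value)
    return sorted(merged.items())


if __name__ == "__main__":
    historical = [(0, 1.0), (15, 2.0)]   # e.g. served from object storage
    recent = [(15, 2.0), (30, 3.0)]      # e.g. still held by the receivers
    print(merge_and_deduplicate(historical, recent))
```

The overlap at timestamp 15 appears once in the result, which is why a query spanning both recent and historical data does not return doubled samples.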
Query Examples
- Basic Queries
- Aggregations
- Time-Based Queries
- Alerts
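As a sketch of what each category might look like, the snippet below pairs one PromQL expression with each heading and builds the instant-query URL that a Prometheus-compatible HTTP API would accept. The metric and label names (`http_requests_total`, `node_load1`, `job="node"`) are illustrative assumptions and may not exist in your data; `instant_query_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# One illustrative PromQL expression per category.
EXAMPLE_QUERIES = {
    "basic": 'up{job="node"}',
    "aggregation": "sum by (job) (rate(http_requests_total[5m]))",
    "time_based": "avg_over_time(node_load1[1h])",
    "alert_condition": "up == 0",
}


def instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL for an instant query against /api/v1/query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    for category, promql in EXAMPLE_QUERIES.items():
        print(category, instant_query_url("https://<query-url>", promql))
```

The same expressions can be pasted into a Grafana panel or used as the `expr` of an alerting rule.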
Benefits of Aiven for Metrics
Centralized Monitoring
Query and analyze metrics from multiple Prometheus servers and clusters through a unified view
Unlimited Retention
Store unlimited metric data for any duration with scalable object storage
Cost-Effective
Downsampling and compaction reduce storage needs and costs while improving query performance
Simplified Operations
Pre-configured Thanos setup eliminates the complexity of managing your own metrics infrastructure
High Availability
Distributed architecture ensures metrics availability and query reliability
Grafana Compatible
Seamlessly integrate with Grafana for visualization and dashboards
Downsampling
Automatic downsampling reduces storage and improves query performance. Data is kept at three resolutions:
- Raw Data
- 5-Minute Resolution
- 1-Hour Resolution
Raw Data
Retention: Recent data
- Original resolution (15s, 30s, 1m)
- Used for recent time ranges
- Highest accuracy
- Larger storage footprint
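To make the trade-off between raw and downsampled data concrete, the sketch below groups raw samples into fixed windows and keeps a few aggregates per window. It is a simplified illustration of downsampling in general, not a description of how Thanos lays out its blocks; `downsample` and the choice of aggregates are assumptions made for the example.

```python
from collections import defaultdict


def downsample(samples, window_seconds=300):
    """Aggregate (unix_seconds, value) samples into fixed windows.

    Returns {window_start: {"count", "sum", "min", "max"}}, ordered by
    window start.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        window_start = timestamp - timestamp % window_seconds
        buckets[window_start].append(value)
    return {
        start: {
            "count": len(values),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
        }
        for start, values in sorted(buckets.items())
    }


if __name__ == "__main__":
    # Ten minutes of raw data at a 15-second scrape interval: 40 samples.
    raw = [(t, float(t)) for t in range(0, 600, 15)]
    for start, aggregates in downsample(raw).items():
        print(start, aggregates)
```

Forty raw samples collapse into two 5-minute windows, which is why long-range queries over downsampled data read far less from storage, at the cost of per-sample detail.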
Use Cases
- Multi-Cluster Monitoring
- Long-Term Storage
- Multi-Region Monitoring
- Cost Optimization
Multi-Cluster Monitoring
Monitor metrics across multiple Kubernetes clusters:
- Central metrics aggregation
- Cross-cluster queries
- Unified alerting
- Global service health
Configuration Examples
Prometheus Remote Write
Grafana Data Source
Limitations
Best Practices
Metrics Retention
- Configure appropriate retention in Prometheus
- Use external labels for multi-cluster identification
- Clean up unused metrics regularly
- Monitor ingestion rate
Query Optimization
- Use appropriate time ranges
- Leverage downsampled data for long ranges
- Use recording rules for expensive queries
- Add filters early in queries
Cost Management
- Remove unused metrics at source
- Use relabel configs to drop metrics
- Monitor storage growth
- Leverage downsampling
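Dropping metrics with relabel configs usually means a `write_relabel_configs` rule with `action: drop` that matches on `__name__`. Before changing the Prometheus configuration, it can help to preview which series names a pattern would remove. The sketch below does that with Python's `re` module. Prometheus uses RE2 syntax and anchors relabel regexes at both ends, so `fullmatch` mirrors the anchoring, but complex patterns may behave differently between the two engines. `apply_drop_rule` and the sample metric names are hypothetical.

```python
import re

# Prometheus rule this sketch previews:
#   write_relabel_configs:
#     - source_labels: [__name__]
#       regex: "go_.*"
#       action: drop


def apply_drop_rule(metric_names, drop_regex):
    """Return the metric names that would still be sent after the drop rule."""
    pattern = re.compile(drop_regex)
    return [name for name in metric_names if not pattern.fullmatch(name)]


if __name__ == "__main__":
    names = ["go_gc_duration_seconds", "http_requests_total", "go_goroutines"]
    print(apply_drop_rule(names, "go_.*"))
```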
Monitoring
Key Metrics to Track
- Ingestion Rate: Samples per second
- Query Latency: P50, P95, P99 query times
- Storage Usage: Object storage consumption
- TSDB Blocks: Number and size of blocks
- Query Cache: Hit rate and efficiency
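For the latency figures, P50, P95, and P99 are percentiles of observed query durations. The sketch below computes them with the nearest-rank method from a plain list of durations. Monitoring systems typically estimate percentiles from histogram buckets instead, so treat this as an illustration of what the numbers mean rather than how they are produced; `percentile` is a hypothetical helper.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest observed value such that at
    least p percent of all values are less than or equal to it."""
    if not values:
        raise ValueError("percentile of empty data")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    latencies_ms = list(range(1, 101))  # 100 synthetic query durations
    for p in (50, 95, 99):
        print(f"P{p}: {percentile(latencies_ms, p)} ms")
```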
Integration with Grafana
Related Services
Grafana
Visualize metrics with dashboards
Apache Kafka
Monitor Kafka metrics with Prometheus
PostgreSQL
Track database metrics over time
ClickHouse
Store metrics in ClickHouse for analysis
Resources
Prometheus Compatibility: Aiven for Metrics is fully compatible with Prometheus, allowing you to use existing exporters, queries, and tools like Grafana seamlessly.