Overview
The Invernaderos API uses TimescaleDB (an extension of PostgreSQL) to store and query time-series sensor data efficiently. TimescaleDB provides automatic partitioning (hypertables), compression, data retention, and continuous aggregates for real-time analytics.TimescaleDB Version: Compatible with PostgreSQL 16Schema:
iot (not public)Primary Table: sensor_readings (hypertable with 7-day chunks)Hypertable Setup
Database Configuration
TimescaleDB is configured as the primary datasource:TimescaleDataSourceConfig.kt
Sensor Readings Table
The primary entity for time-series data:SensorReading.kt
Hypertable Creation
TimescaleDB hypertable is created via SQL migration:Migration
Hypertable Settings Explained
Hypertable Settings Explained
- chunk_time_interval: 7 days per chunk (optimal for sensor data)
- Composite PK:
(time, sensor_id)allows multiple sensors per timestamp - Indexes: Optimized for queries by greenhouse, tenant, and sensor
- TIMESTAMPTZ: Timezone-aware timestamps (UTC)
Compression Policies
Enable Compression
TimescaleDB compression reduces storage by 90%+ for historical data:Compression Strategy
- Segmentby
- Orderby
sensor_id or greenhouse_id can decompress only relevant segments, not the entire chunk.Trade-off: More segments = slightly higher overhead, but much faster queries.Compression Results
Check compression statistics:Compression Ratio: Typical compression achieves 90-95% reduction for time-series sensor data with repetitive patterns.
Retention Policies
Automatic Data Deletion
Automatically delete data older than 2 years:Check Retention Policies
Manual Data Cleanup
For one-time cleanup:Continuous Aggregates
TimescaleDB continuous aggregates are materialized views that automatically refresh with new data.Hourly Aggregates
Pre-aggregate sensor data by hour:Daily Aggregates by Tenant
Querying Continuous Aggregates
Performance Gain: Continuous aggregates can be 50-100x faster than querying raw data for dashboard analytics.
Query Optimization
TimescaleDB-Specific Functions
TimescaleDB provides specialized functions for time-series queries:SensorReadingRepository.kt
TimescaleDB Functions Reference
TimescaleDB Functions Reference
time_bucket(interval, timestamp): Buckets timestamps into intervals (e.g., hourly, daily)FIRST(value, time): Returns the first value ordered by time (earliest)LAST(value, time): Returns the last value ordered by time (most recent)approx_percentile(percentile, value): Approximate percentile calculation (faster than exact)histogram(value, min, max, buckets): Creates histogram for value distribution
DISTINCT ON Optimization
Get the latest reading per sensor efficiently:PostgreSQL DISTINCT ON: This is a PostgreSQL-specific feature (not standard SQL). It returns the first row of each group, ordered by the specified columns.
Batch Inserts
Use batch inserts for high-throughput writes:Data Seeding Example
Realistic sensor data for development:seed_timescale_dev.sql
Monitoring & Maintenance
Check Hypertable Stats
View Chunk Information
Background Jobs
Best Practices
- Use 7-day chunks for sensor data (balances query performance and storage)
- Enable compression after 7 days to reduce storage by 90%+
- Create continuous aggregates for dashboard queries (hourly, daily)
- Use
time_bucket()for time-based grouping instead ofDATE_TRUNC() - Batch INSERT operations for high-throughput writes
- Index on
(tenant_id, time)for multi-tenant queries - Monitor chunk count (too many small chunks = poor performance)
For multi-tenant data isolation, see Multi-Tenant Architecture.For real-time data caching, see MQTT Integration.