
What is Air Quality Index (AQI)?

The Air Quality Index (AQI) is a standardized indicator that communicates how polluted the air currently is or how polluted it is forecast to become. Scales differ between countries; the categories below follow the US EPA scale, on which AQI values range from 0 to 500 and higher values indicate more air pollution and greater health concern.

Good (0-50)

Air quality is satisfactory, and air pollution poses little or no risk.

Moderate (51-100)

Air quality is acceptable. However, there may be a risk for some people, particularly those who are unusually sensitive to air pollution.

Unhealthy for Sensitive Groups (101-150)

Members of sensitive groups may experience health effects. The general public is less likely to be affected.

Unhealthy (151-200)

Some members of the general public may experience health effects; members of sensitive groups may experience more serious health effects.

Very Unhealthy (201-300)

Health alert: The risk of health effects is increased for everyone.

Hazardous (301-500)

Health warning of emergency conditions: everyone is more likely to be affected.
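
The six bands above amount to a simple lookup. A minimal sketch follows, assuming integer AQI values in the 0 to 500 range; the function name `aqi_category` is illustrative and not part of AQI Predictor.

```python
from bisect import bisect_left

# Upper bound of each AQI band and its label, as listed above.
_UPPER_BOUNDS = [50, 100, 150, 200, 300, 500]
_LABELS = [
    "Good",
    "Moderate",
    "Unhealthy for Sensitive Groups",
    "Unhealthy",
    "Very Unhealthy",
    "Hazardous",
]


def aqi_category(aqi: int) -> str:
    """Return the category label for an integer AQI value in 0..500."""
    if not 0 <= aqi <= 500:
        raise ValueError(f"AQI must be between 0 and 500, got {aqi}")
    # bisect_left finds the first band whose upper bound is >= aqi.
    return _LABELS[bisect_left(_UPPER_BOUNDS, aqi)]


if __name__ == "__main__":
    for value in (42, 100, 101, 275, 500):
        print(value, aqi_category(value))
```

Keeping the band edges in one sorted list means a change to the scale only touches the two constants.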

Why Machine Learning for AQI Prediction?

Predicting AQI values is a complex challenge that requires analyzing multiple interrelated factors:
  • AQI is determined by multiple pollutants (PM2.5, PM10, NO2, O3, SO2, CO) that interact in complex, non-linear ways. Machine learning models can often capture these relationships better than traditional statistical methods.
  • Air quality has strong temporal patterns: hourly cycles, daily rhythms, weekly trends, and seasonal variations. ML models, especially sequential architectures, are well suited to learning these time-based dependencies.
  • Pollution spreads and transforms across geographic regions. Advanced models can incorporate spatial relationships between monitoring stations and environmental features.
  • Weather conditions (wind, temperature, humidity, pressure) strongly affect pollutant dispersion and formation. ML models can learn these meteorological effects directly from data.
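
One common way to expose the hourly, weekly, and seasonal cycles mentioned above to a model is to encode timestamps as sine/cosine pairs, so that 23:00 and 00:00 end up close together. The sketch below is an illustration only; the function name, the feature keys, and the 365.25-day year are assumptions, not something AQI Predictor is documented to use.

```python
import math
from datetime import datetime


def cyclical_time_features(ts: datetime) -> dict:
    """Encode hour of day, day of week, and day of year as sin/cos pairs."""
    hour_angle = 2 * math.pi * ts.hour / 24
    dow_angle = 2 * math.pi * ts.weekday() / 7
    # tm_yday starts at 1; 365.25 is an assumed average year length.
    doy_angle = 2 * math.pi * (ts.timetuple().tm_yday - 1) / 365.25
    return {
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "dow_sin": math.sin(dow_angle),
        "dow_cos": math.cos(dow_angle),
        "doy_sin": math.sin(doy_angle),
        "doy_cos": math.cos(doy_angle),
    }


if __name__ == "__main__":
    print(cyclical_time_features(datetime(2024, 7, 15, 18)))
```

Each cycle needs both components: sine alone cannot distinguish, for example, 03:00 from 09:00.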

The AQI Predictor Approach

AQI Predictor is designed as a flexible, scalable system for training and deploying machine learning models that forecast AQI values hours to days in advance.

Core Principles

Data-Driven

Learn patterns directly from historical air quality and environmental data rather than relying solely on physical models.

Multivariate

Incorporate multiple pollutants, meteorological variables, and temporal features simultaneously.

Flexible Architecture

Support multiple model types (LSTM, GRU, Transformer, CNN-LSTM hybrids) to find the best fit for your data.

Production-Ready

Built with deployment in mind, featuring model versioning, monitoring, and scalable inference.

Prediction Workflow

Key Terminology

Familiarize yourself with these terms as they appear throughout the documentation.
| Term | Definition |
| --- | --- |
| Pollutant | A substance in the air that can be harmful to health or the environment |
| PM2.5 | Particulate Matter with diameter < 2.5 micrometers |
| PM10 | Particulate Matter with diameter < 10 micrometers |
| Time Horizon | How far into the future the model predicts (e.g., 1-hour, 24-hour) |
| Lookback Window | Amount of historical data used as input (e.g., past 24 hours) |
| Feature Vector | Numerical representation of all input variables at a given time |
| Sequence Model | ML model designed to process temporal sequences (LSTM, GRU, Transformer) |
| Inference | Using a trained model to make predictions on new data |
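
To make the relationship between feature vector, lookback window, and time horizon concrete, here is a minimal sketch that slices a time-ordered list of feature vectors into training samples. The function `make_windows` and its parameters are hypothetical and only illustrate the terms; they are not the AQI Predictor API.

```python
def make_windows(rows, lookback, horizon, target_index=0):
    """Build (input window, target) pairs from time-ordered feature vectors.

    rows         -- list of feature vectors, one per time step
    lookback     -- number of past steps in each input window
    horizon      -- how many steps after the window's last row the target lies
    target_index -- position of the value to predict inside a feature vector
    """
    if lookback < 1 or horizon < 1:
        raise ValueError("lookback and horizon must be at least 1")
    inputs, targets = [], []
    last_start = len(rows) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback  # exclusive end of the input window
        inputs.append(rows[start:end])
        targets.append(rows[end + horizon - 1][target_index])
    return inputs, targets


if __name__ == "__main__":
    # Toy data: each feature vector is [pm25, temperature].
    rows = [[float(i), 20.0 + i] for i in range(8)]
    X, y = make_windows(rows, lookback=3, horizon=2)
    print(len(X), "samples; first window:", X[0], "target:", y[0])
```

With hourly data, `lookback=24` and `horizon=1` would correspond to the "past 24 hours" and "1-hour" examples in the table.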

Understanding the Challenge

Time Horizon: 1-6 hours ahead

Short-term predictions rely heavily on recent pollutant concentrations and immediate meteorological conditions. These are generally more accurate as the air quality state has high autocorrelation over short periods.

Key Factors:
  • Recent pollutant trends
  • Current wind patterns
  • Local emission sources
  • Temperature and humidity
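
The autocorrelation claim above can be checked on any pollutant series. The sketch below computes the lag-k autocorrelation and the error of a persistence baseline (predict that the value `horizon` steps ahead equals the current value). Both helper names are illustrative, and the sample readings are made up purely for demonstration.

```python
from statistics import fmean


def autocorrelation(values, lag):
    """Lag-`lag` sample autocorrelation of a 1-D series."""
    if not 0 < lag < len(values):
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = fmean(values)
    deviations = [v - mean for v in values]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum(
        deviations[i] * deviations[i + lag] for i in range(len(values) - lag)
    )
    return numerator / denominator


def persistence_mae(values, horizon):
    """Mean absolute error of predicting values[t + horizon] with values[t]."""
    if not 0 < horizon < len(values):
        raise ValueError("horizon must be between 1 and len(values) - 1")
    errors = [
        abs(values[i + horizon] - values[i]) for i in range(len(values) - horizon)
    ]
    return fmean(errors)


if __name__ == "__main__":
    # Made-up hourly PM2.5 readings, for illustration only.
    pm25 = [12, 14, 15, 19, 24, 30, 33, 31, 26, 20, 16, 13]
    for k in (1, 3, 6):
        print(k, round(autocorrelation(pm25, k), 3), round(persistence_mae(pm25, k), 2))
```

A persistence baseline like this is a useful reference point: a short-horizon model is only worth deploying if it beats it.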
Next Steps: Explore Data Sources to understand what data you need, or dive into Model Architecture to learn about the ML approaches available.
