Yellow Taxi NYC Data Analytics
Process and analyze NYC Yellow Taxi trip data with comprehensive metrics generation and multi-format export capabilities.
Quick Start
Get up and running in minutes with your first data analysis
Installation Guide
Set up your Python environment and install dependencies
Data Model
Understand the NYC taxi trip data structure
API Reference
Explore the complete YellowTaxiData API
Key Features
Automated Import
Download and process parquet files directly from NYC TLC cloud storage
Data Cleaning
Comprehensive validation with business rules for trip duration, distance, and speed
Weekly Metrics
Per-week statistics (min, max, and mean) for trip time, distance, and amount
Monthly Metrics
Aggregated metrics by rate code type (Regular, JFK, Other) and day type
Multi-Format Export
Export results to CSV and Excel with formatted sheets
High Performance
Optimized processing with pandas and pyarrow for millions of records
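As a rough illustration of the Weekly Metrics feature, the sketch below computes the min, max, and mean of trip time, distance, and amount per week using plain pandas. It is not the YellowTaxiData API: the function name weekly_metrics is hypothetical, the column names are assumed to follow the public TLC yellow taxi schema, and the sample rows are invented.

```python
import pandas as pd


def weekly_metrics(trips: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: min, max, and mean per week (Monday to Sunday)."""
    df = trips.copy()
    # Trip time in minutes, derived from the pickup and dropoff timestamps.
    df["trip_minutes"] = (
        df["tpep_dropoff_datetime"] - df["tpep_pickup_datetime"]
    ).dt.total_seconds() / 60.0
    # Label each trip with the Monday that starts its week.
    df["week_start"] = (
        df["tpep_pickup_datetime"].dt.to_period("W-SUN").dt.start_time
    )
    columns = ["trip_minutes", "trip_distance", "total_amount"]
    return df.groupby("week_start")[columns].agg(["min", "max", "mean"])


# Invented sample rows; column names assumed from the TLC schema.
trips = pd.DataFrame(
    {
        "tpep_pickup_datetime": pd.to_datetime(
            ["2022-01-03 08:00", "2022-01-04 09:00", "2022-01-10 10:00"]
        ),
        "tpep_dropoff_datetime": pd.to_datetime(
            ["2022-01-03 08:10", "2022-01-04 09:30", "2022-01-10 10:20"]
        ),
        "trip_distance": [2.0, 6.0, 4.0],
        "total_amount": [12.0, 30.0, 20.0],
    }
)

weekly = weekly_metrics(trips)
print(weekly)
```

The result has one row per week and a two-level column index (metric, statistic). It can be written out with pandas' to_csv or to_excel; the latter needs an Excel engine such as openpyxl installed.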
What You Can Do
Import NYC Yellow Taxi Data
Automatically download trip data from the NYC TLC cloud for any date range in 2022
Clean and Validate
Apply business rules to filter out invalid trips based on duration, speed, distance, and amounts
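The two steps above can be sketched with plain pandas. This is a minimal sketch under stated assumptions, not the project's own implementation: month_urls and clean_trips are hypothetical names, the URL pattern is assumed to match where the TLC publishes its monthly parquet files, the thresholds are placeholder values, and the column names are assumed from the public TLC schema.

```python
import pandas as pd

# Assumed location of the monthly files; verify against the TLC website.
BASE_URL = "https://d37ci6vzurychx.cloudfront.net/trip-data"


def month_urls(start: str, end: str) -> list[str]:
    """Hypothetical helper: one parquet URL per month in [start, end]."""
    months = pd.period_range(start=start, end=end, freq="M")
    return [
        f"{BASE_URL}/yellow_tripdata_{m.year}-{m.month:02d}.parquet"
        for m in months
    ]


def clean_trips(
    df: pd.DataFrame,
    min_minutes: float = 1.0,    # placeholder thresholds, not the
    max_minutes: float = 180.0,  # project's actual business rules
    max_miles: float = 100.0,
    max_mph: float = 80.0,
) -> pd.DataFrame:
    """Hypothetical helper: keep trips that pass every rule."""
    minutes = (
        df["tpep_dropoff_datetime"] - df["tpep_pickup_datetime"]
    ).dt.total_seconds() / 60.0
    # Zero-duration trips give inf or NaN here; both fail the speed rule.
    mph = df["trip_distance"] / (minutes / 60.0)
    keep = (
        minutes.between(min_minutes, max_minutes)
        & df["trip_distance"].gt(0)
        & df["trip_distance"].le(max_miles)
        & mph.le(max_mph)
        & df["total_amount"].gt(0)
    )
    return df.loc[keep].reset_index(drop=True)


urls = month_urls("2022-01", "2022-03")
# Reading a real file needs network access and pyarrow:
# raw = pd.read_parquet(urls[0])

# Invented rows: valid, zero duration, negative amount, implausible speed.
raw = pd.DataFrame(
    {
        "tpep_pickup_datetime": pd.to_datetime(
            ["2022-01-03 08:00", "2022-01-03 09:00",
             "2022-01-03 10:00", "2022-01-03 11:00"]
        ),
        "tpep_dropoff_datetime": pd.to_datetime(
            ["2022-01-03 08:15", "2022-01-03 09:00",
             "2022-01-03 10:20", "2022-01-03 11:10"]
        ),
        "trip_distance": [3.0, 1.0, 4.0, 50.0],
        "total_amount": [15.0, 8.0, -5.0, 120.0],
    }
)
cleaned = clean_trips(raw)
print(f"kept {len(cleaned)} of {len(raw)} trips")
```

Keeping the thresholds as keyword arguments lets each rule be tuned without touching the filter logic.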
Use Cases
- Trip Analysis: Analyze taxi trip patterns, durations, and distances across different time periods
- Revenue Insights: Track total amounts, variation percentages, and service counts by week
- Rate Code Comparison: Compare Regular vs JFK vs Other rate codes for weekday/weekend patterns
- Data Quality: Clean and validate large datasets with customizable business rules
- Reporting: Generate formatted reports in CSV and Excel for stakeholder consumption
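To make the rate code comparison concrete, here is a hedged sketch of a monthly summary grouped by rate code type and day type. The function name monthly_by_rate_group is hypothetical, and the mapping assumes the TLC data dictionary convention that RatecodeID 1 is the standard rate and 2 is JFK; everything else is grouped as Other. The sample rows are invented.

```python
import numpy as np
import pandas as pd


def monthly_by_rate_group(trips: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: counts and totals per month, rate group, day type."""
    df = trips.copy()
    # Assumed mapping: 1 = Regular (standard rate), 2 = JFK, rest = Other.
    df["rate_group"] = np.select(
        [df["RatecodeID"] == 1, df["RatecodeID"] == 2],
        ["Regular", "JFK"],
        default="Other",
    )
    # dayofweek: Monday is 0, so Saturday and Sunday are 5 and 6.
    is_weekend = df["tpep_pickup_datetime"].dt.dayofweek >= 5
    df["day_type"] = np.where(is_weekend, "Weekend", "Weekday")
    df["month"] = df["tpep_pickup_datetime"].dt.strftime("%Y-%m")
    return (
        df.groupby(["month", "rate_group", "day_type"])
        .agg(
            services=("total_amount", "size"),
            total_distance=("trip_distance", "sum"),
            total_amount=("total_amount", "sum"),
        )
        .reset_index()
    )


# Invented sample rows; column names assumed from the TLC schema.
trips = pd.DataFrame(
    {
        "tpep_pickup_datetime": pd.to_datetime(
            ["2022-01-03 08:00", "2022-01-08 12:00",
             "2022-01-09 13:00", "2022-01-05 18:00"]
        ),
        "RatecodeID": [1, 2, 2, 5],
        "trip_distance": [2.0, 17.0, 18.0, 3.0],
        "total_amount": [12.0, 70.0, 72.0, 15.0],
    }
)

summary = monthly_by_rate_group(trips)
print(summary)
```

Each output row is one (month, rate group, day type) combination, which is a convenient shape for a CSV export or an Excel sheet.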