
Hospital Data Analysis Platform

Production-oriented analytics pipeline for hospital data with hardware-aware early warning, predictive modeling, and CPU-optimized real-time inference.

Quick Start

Get up and running with the hospital data analysis pipeline in minutes.

1. Install Dependencies

Clone the repository and install required Python packages.
git clone https://github.com/RaviTejaMedarametla/data-analysis-for-hospitals.git
cd "data-analysis-for-hospitals/Data Analysis for Hospitals/task"
pip install -r ../../requirements.txt
2. Generate Dataset Manifest

Create a manifest of your hospital data files to validate schema and track data versions.
python cli.py manifest
Example output:
{
  "files": ["general.csv", "prenatal.csv", "sports.csv"],
  "total_records": 1500,
  "schema_version": "1.0"
}
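To make the manifest idea concrete, here is a minimal sketch of how such a summary could be assembled from a directory of CSV files. It is not the project's `cli.py` implementation: `build_manifest`, the `details` list, and the `sha256` field are hypothetical names introduced for illustration, and only `files`, `total_records`, and `schema_version` come from the example output above.

```python
import csv
import hashlib
from pathlib import Path


def build_manifest(data_dir, schema_version="1.0"):
    """Summarize every CSV in data_dir (hypothetical helper, not the project's code)."""
    entries = []
    total_records = 0
    for path in sorted(Path(data_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])      # first row is assumed to be the header
            rows = sum(1 for _ in reader)  # data rows only, header excluded
        # A content hash lets a later run detect that a file has changed.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        entries.append(
            {"name": path.name, "records": rows, "columns": header, "sha256": digest}
        )
        total_records += rows
    return {
        "files": [entry["name"] for entry in entries],
        "total_records": total_records,
        "schema_version": schema_version,
        "details": entries,
    }
```

Comparing the `columns` and `sha256` values between two runs is one simple way to catch schema drift or silently replaced data files.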
3. Run the Analytics Pipeline

Execute the full pipeline including ingestion, feature engineering, model training, and deployment monitoring.
python cli.py run
This generates artifacts including trained models, performance benchmarks, hardware profiles, and monitoring summaries.
4. Run Hardware-Constrained Experiments

Evaluate early warning system performance under different memory, compute, and latency constraints.
python cli.py early-warning-experiment

Key Features

Comprehensive analytics capabilities designed for production deployment on CPU-constrained hardware.

CPU-Optimized Modeling

Custom logistic regression and risk stratification models designed for efficient CPU execution without GPU dependencies.

Real-time Anomaly Detection

Hardware-aware early warning system with configurable latency budgets and false-positive rate control.
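One common way to control the false-positive rate is to calibrate the alert threshold on scores from known-normal records, then check each scoring call against a latency budget. The sketch below assumes that approach; `threshold_for_fpr` and `score_with_budget` are hypothetical names, not functions from this platform.

```python
import math
import time


def threshold_for_fpr(normal_scores, target_fpr):
    """Return a threshold that at most target_fpr of normal_scores exceed."""
    if not normal_scores:
        raise ValueError("need at least one calibration score")
    if not 0.0 <= target_fpr < 1.0:
        raise ValueError("target_fpr must be in [0, 1)")
    ordered = sorted(normal_scores)
    # Number of calibration scores allowed to raise a (false) alert.
    allowed = min(math.floor(target_fpr * len(ordered)), len(ordered) - 1)
    # Only scores strictly above the returned value trigger an alert.
    return ordered[len(ordered) - allowed - 1]


def score_with_budget(score_fn, record, threshold, latency_budget_s):
    """Score one record and report whether the call stayed within the budget."""
    start = time.perf_counter()
    score = score_fn(record)
    elapsed = time.perf_counter() - start
    return {
        "alert": score > threshold,
        "score": score,
        "elapsed_s": elapsed,
        "within_budget": elapsed <= latency_budget_s,
    }
```

A lower `target_fpr` raises the threshold and therefore trades missed events for fewer false alarms; the right balance depends on the clinical setting.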

Streaming Inference

Online scoring with chunk-based processing to balance throughput and latency under memory constraints.
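Chunk-based scoring can be sketched with a generator that holds only one chunk in memory at a time. This is an illustrative sketch rather than the platform's code: `iter_chunks`, `stream_scores`, and the `score_batch` callback are assumed names.

```python
from itertools import islice


def iter_chunks(records, chunk_size):
    """Yield lists of up to chunk_size records from any iterable."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def stream_scores(records, score_batch, chunk_size):
    """Score records chunk by chunk, yielding results in input order."""
    for chunk in iter_chunks(records, chunk_size):
        # score_batch is any callable that maps a list of records to a list of scores.
        yield from score_batch(chunk)
```

Larger chunks amortize per-call overhead and raise throughput, while smaller chunks lower peak memory and the delay before the first score appears.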

ONNX Export

Cross-platform model serialization for deployment to diverse runtime environments.

Deterministic Reproducibility

Explicit seed management and environment controls for audit-ready, reproducible results.
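As a rough illustration of explicit seed management using only the standard library, the sketch below seeds the global generator and also offers an isolated generator per component. `set_global_seed` and `seeded_rng` are hypothetical helpers; a project that uses NumPy or another numeric library would need to seed those generators as well.

```python
import os
import random


def set_global_seed(seed):
    """Seed Python's global random generator and record the seed for child processes."""
    random.seed(seed)
    # PYTHONHASHSEED is read at interpreter startup, so this only affects
    # subprocesses launched after this call, not the current process.
    os.environ["PYTHONHASHSEED"] = str(seed)


def seeded_rng(seed):
    """Return an independent generator so components do not share random state."""
    return random.Random(seed)
```

Passing a dedicated `seeded_rng` instance into each stage avoids hidden coupling, because one stage drawing extra random numbers cannot shift the sequence seen by another.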

Hardware Profiling

Memory and compute budget modeling with automated batch size adjustment and utilization tracking.
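Automated batch size adjustment can be approximated by capping the requested batch at whatever fits in the memory budget. The sketch below assumes a fixed per-record memory cost; `fit_batch_size` and `utilization` are hypothetical names, and the platform's own budget model may be more detailed.

```python
def fit_batch_size(requested, bytes_per_record, memory_budget_bytes, min_batch=1):
    """Shrink the requested batch size until it fits within the memory budget."""
    if bytes_per_record <= 0:
        raise ValueError("bytes_per_record must be positive")
    max_fit = memory_budget_bytes // bytes_per_record
    if max_fit < min_batch:
        raise MemoryError("memory budget cannot hold even min_batch records")
    return max(min_batch, min(requested, max_fit))


def utilization(batch_size, bytes_per_record, memory_budget_bytes):
    """Fraction of the memory budget consumed by one batch."""
    return batch_size * bytes_per_record / memory_budget_bytes
```

For example, with 512 bytes per record and a 64 KiB budget, a requested batch of 1024 would be reduced to 128 records, which uses the full budget.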

Explore by Topic

Dive deeper into specific areas of the platform.

Core Concepts

Understand the pipeline architecture, stage boundaries, and hardware constraint modeling.

Data Pipeline

Learn about data ingestion, preprocessing, and feature engineering workflows.

Modeling

Train predictive models, stratify risk, detect anomalies, and evaluate performance.

Real-time Processing

Implement streaming inference and early warning systems with latency optimization.

Deployment

Deploy models for CPU inference, export to ONNX, and monitor production systems.

Operations

Configure the system, ensure reproducibility, run benchmarks, and troubleshoot issues.

Ready to get started?

Follow our quickstart guide to run your first analytics pipeline and generate production-ready insights from hospital data.
