Project Structure
Proper project organization is essential for maintainable, reproducible ML training workflows. This guide covers Python packaging, ML-specific project templates, and best practices.
Python Package Structure
The reference implementations follow standard Python packaging conventions:
```text
classic-example/
├── classic_example/          # Main package
│   ├── __init__.py
│   ├── cli.py                # Command-line interface
│   ├── config.py             # Configuration dataclasses
│   ├── data.py               # Data loading utilities
│   ├── train.py              # Training logic
│   ├── predictor.py          # Inference code
│   └── utils.py              # Helper functions
├── tests/                    # Test suite
│   ├── test_code.py
│   ├── test_data.py
│   └── test_model.py
├── conf/                     # Configuration files
│   ├── example.json
│   └── fast.json
├── Dockerfile                # Container definition
├── Makefile                  # Build automation
├── requirements.txt          # Dependencies
└── README.md
```
Classic Example Package
The classic_example package demonstrates BERT-based training:
Module Organization
cli.py — Command-line interface using Typer:

```python
import typer

from classic_example.train import train
from classic_example.data import load_sst2_data
from classic_example.utils import upload_to_registry, load_from_registry

app = typer.Typer()
app.command()(train)
app.command()(load_sst2_data)
app.command()(upload_to_registry)
app.command()(load_from_registry)

if __name__ == "__main__":
    app()
```
config.py — Configuration dataclasses:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataTrainingArguments:
    train_file: str
    validation_file: str
    max_seq_length: int = 128
    overwrite_cache: bool = False
    pad_to_max_length: bool = True
    max_train_samples: Optional[int] = None
    max_eval_samples: Optional[int] = None


@dataclass
class ModelArguments:
    model_name_or_path: str
    config_name: Optional[str] = None
    tokenizer_name: Optional[str] = None
    cache_dir: Optional[str] = None
    use_fast_tokenizer: bool = True
    use_wandb: bool = False
    save_model: bool = False
```
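Dataclasses like these are typically populated from the JSON files in conf/. A minimal sketch of that wiring, assuming a hypothetical load_config helper (the reference implementation may instead use transformers' HfArgumentParser, which offers a parse_json_file method):

```python
import json
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass
class DataTrainingArguments:  # abridged copy of the dataclass above
    train_file: str
    validation_file: str
    max_seq_length: int = 128


def load_config(path: Path) -> DataTrainingArguments:
    """Build the dataclass from a JSON config file, ignoring unknown keys."""
    raw = json.loads(Path(path).read_text())
    known = {f.name for f in fields(DataTrainingArguments)}
    return DataTrainingArguments(**{k: v for k, v in raw.items() if k in known})
```

Filtering on the dataclass's declared fields keeps the loader tolerant of extra keys in shared config files.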
data.py — Data loading utilities:

```python
from pathlib import Path

from datasets import load_dataset
from sklearn.model_selection import train_test_split


def load_sst2_data(path_to_save: Path):
    """Load SST-2 sentiment analysis dataset."""
    path_to_save.mkdir(parents=True, exist_ok=True)
    dataset = load_dataset("glue", "sst2")
    df_train, df_val = train_test_split(
        dataset["train"].to_pandas(),
        random_state=42,
    )
    df_train.to_csv(path_to_save / "train.csv", index=False)
    df_val.to_csv(path_to_save / "val.csv", index=False)
```
utils.py — Helper functions:

```python
from pathlib import Path
from typing import Dict

import numpy as np
import wandb
from sklearn.metrics import f1_score, fbeta_score
from transformers import EvalPrediction


def compute_metrics(p: EvalPrediction) -> Dict[str, float]:
    """Compute F1 and F0.5 scores."""
    preds = np.argmax(p.predictions, axis=1)
    return {
        "f1": f1_score(y_true=p.label_ids, y_pred=preds),
        "f0.5": fbeta_score(y_true=p.label_ids, y_pred=preds, beta=0.5),
    }


def upload_to_registry(model_name: str, model_path: Path):
    """Upload model artifacts to W&B registry."""
    with wandb.init() as _:
        art = wandb.Artifact(model_name, type="model")
        art.add_file(model_path / "config.json")
        art.add_file(model_path / "model.safetensors")
        art.add_file(model_path / "tokenizer.json")
        wandb.log_artifact(art)
```
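cli.py also registers a load_from_registry command whose body is not shown here. A minimal sketch of what such a download helper could look like with the W&B API (run.use_artifact and Artifact.download are real wandb calls; the exact artifact naming and signature are assumptions):

```python
from pathlib import Path


def load_from_registry(model_name: str, model_path: Path) -> None:
    """Download model artifacts from the W&B registry into model_path (sketch)."""
    import wandb  # imported lazily so the sketch can be inspected without wandb

    with wandb.init() as run:
        artifact = run.use_artifact(model_name)
        artifact.download(root=str(model_path))
```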
Generative Example Structure
The generative_example package follows a similar structure for LLM training:
```text
generative-example/
├── generative_example/
│   ├── __init__.py
│   ├── cli.py            # CLI with Typer
│   ├── config.py         # LoRA and training configs
│   ├── data.py           # Dataset preparation
│   ├── train.py          # SFT training with LoRA
│   ├── predictor.py      # Inference wrapper
│   └── utils.py
├── tests/
├── conf/
│   └── example.json      # Phi-3 training config
└── requirements.txt
```
LLM-Specific Configuration
```python
from dataclasses import dataclass


@dataclass
class ModelArguments:
    model_id: str        # HuggingFace model ID
    lora_r: int          # LoRA rank
    lora_alpha: int      # LoRA alpha
    lora_dropout: float  # LoRA dropout rate


@dataclass
class DataTrainingArguments:
    train_file: str  # JSONL training data
    test_file: str   # JSONL test data
```
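In train.py these fields would typically feed peft's LoraConfig, whose keyword names (r, lora_alpha, lora_dropout) differ slightly from the dataclass fields. A small sketch of that mapping, assuming a hypothetical lora_kwargs helper (peft itself is not imported here):

```python
from dataclasses import dataclass


@dataclass
class ModelArguments:  # copy of the dataclass above
    model_id: str
    lora_r: int
    lora_alpha: int
    lora_dropout: float


def lora_kwargs(args: ModelArguments) -> dict:
    """Rename our config fields to the keyword names peft's LoraConfig expects,
    e.g. LoraConfig(**lora_kwargs(args), task_type="CAUSAL_LM")."""
    return {
        "r": args.lora_r,
        "lora_alpha": args.lora_alpha,
        "lora_dropout": args.lora_dropout,
    }
```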
ML Project Templates
Lightning-Hydra Template: production-ready template with PyTorch Lightning and Hydra configuration
Sample Python Module: minimal example of Python project structure
Build Automation
Use Makefiles to standardize common tasks:
```makefile
build:
	docker build -f Dockerfile -t classic-example:latest .

run_dev: build
	docker run -it -v ${PWD}:/main classic-example:latest /bin/bash

format:
	ruff format classic_example/ tests/

lint:
	ruff check classic_example/ tests/

test:
	pytest --disable-warnings ./tests/

train_example:
	python classic_example/cli.py load-sst2-data ./data
	python classic_example/cli.py train ./conf/example.json
	python classic_example/cli.py upload-to-registry example_model ./results
```
Code Style
Use Ruff for fast linting and formatting:
```shell
# Format code
ruff format classic_example/ tests/

# Check for issues
ruff check classic_example/ tests/
```
Ruff is 10-100x faster than Black and Flake8 while providing equivalent functionality.
Docker Integration
Containerize training workflows for reproducibility:
```dockerfile
FROM python:3.10-slim

WORKDIR /main

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
ENV PYTHONPATH=/main

CMD ["python", "classic_example/cli.py", "train", "./conf/example.json"]
```
Best Practices
Separate Configuration from Code
Use JSON or YAML files for hyperparameters and paths. This enables:
Easy experimentation without code changes
Version control for configurations
Reproducible runs from config files
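For illustration, a conf/example.json in this scheme might look like the following (the field names follow the dataclasses shown earlier; the values are placeholders, not the repository's actual config):

```json
{
  "model_name_or_path": "bert-base-uncased",
  "train_file": "./data/train.csv",
  "validation_file": "./data/val.csv",
  "max_seq_length": 128,
  "use_wandb": false,
  "save_model": true
}
```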
Modularize Training Code
Split training logic into reusable modules:
data.py: Dataset loading and preprocessing
train.py: Training loop and checkpointing
predictor.py: Inference and evaluation
utils.py: Shared helper functions
Build a Command-Line Interface
Use Typer or Click to create user-friendly CLIs:
Type-safe argument parsing
Automatic help documentation
Easy integration with scripts and CI/CD
Write Tests
Include tests for all components:
test_code.py: Unit tests for functions
test_data.py: Data validation tests
test_model.py: Model behavior tests
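As an illustration of a data validation test in the test_data.py style, here is a self-contained sketch; validate_csv_columns and the required column names are assumptions for this example, not the repository's actual helpers (though "sentence" and "label" are the real SST-2 column names):

```python
import csv
from pathlib import Path

REQUIRED_COLUMNS = {"sentence", "label"}  # assumed SST-2 column names


def validate_csv_columns(path: Path) -> bool:
    """Return True if the CSV header contains every required column."""
    with open(path, newline="") as f:
        header = next(csv.reader(f))
    return REQUIRED_COLUMNS.issubset(header)


def test_train_csv_has_required_columns(tmp_path: Path):
    # pytest injects tmp_path; any writable directory works when called directly
    p = tmp_path / "train.csv"
    p.write_text("sentence,label\ngreat movie,1\n")
    assert validate_csv_columns(p)
```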
Resources
Python Project Structure: comprehensive guide to structuring Python projects
Deep Learning Projects: best practices for organizing ML projects
README Driven Development: write documentation before implementation
The Twelve Factors: principles for building production applications
Next Steps
Configuration Management: learn how to manage training configurations and track experiments