# PipelineConfig
Unified configuration for end-to-end training pipelines: pretrain → SFT → DPO → verifier. It combines model, training, hardware, and data configs into a single JSON-serializable structure for orchestration. Each stage has its own training config, but all stages share the same model architecture and hardware settings.

## Model architecture parameters
- Size of the vocabulary.
- Hidden dimension of the model.
- Number of transformer layers.
- Number of attention heads.
- Hidden size of the feedforward layer.
- Maximum sequence length.
- Dropout probability.
- Whether to use Rotary Position Embeddings (RoPE).
- Whether to use attention sinks.
- Number of attention sink tokens.
- Whether to use the SwiGLU activation.
- Whether to share input/output embeddings.
- Whether to use Grouped Query Attention (GQA).
- Number of groups for GQA.
- Whether to use Mixture-of-Experts (MoE).
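To give the architecture fields some intuition, a rough parameter count can be derived from a handful of them. The helper below is an illustrative sketch, not part of `PipelineConfig`; it ignores norms, biases, and MoE, and assumes SwiGLU uses three feedforward projections:

```python
def estimate_params(vocab_size, d_model, n_layers, ffn_hidden_size,
                    use_swiglu=True, tie_embeddings=True):
    """Rough transformer parameter count (ignores norms and biases)."""
    embed = vocab_size * d_model           # input embedding table
    if not tie_embeddings:
        embed += vocab_size * d_model      # separate output projection
    attn = 4 * d_model * d_model           # Q, K, V, O projections
    # SwiGLU uses three projections (gate, up, down) instead of two
    ffn = (3 if use_swiglu else 2) * d_model * ffn_hidden_size
    return embed + n_layers * (attn + ffn)

# local_smoke_config-sized model (vocab size of 32000 is an assumption)
print(estimate_params(32_000, 256, 4, 512))
```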
## Hardware configuration

- Hardware preset: `"auto"`, `"local"`, `"rtx3060"`, `"a100"`, or `"h100"`.
## Data configuration

- Data scale preset: `"small"`, `"medium"`, `"large"`, or `"xl"`.
- List of dataset names for pretraining. If `None`, uses the default from the data preset.
## Pretraining parameters

- Maximum training steps for pretraining.
- Learning rate for pretraining.
- Global batch size for pretraining.
- Micro batch size for pretraining.
- Warmup steps for pretraining.
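Each stage specifies both a global and a micro batch size; the trainer presumably accumulates gradients over `global // micro` micro-batches per optimizer step. A sketch of that relationship (an assumption about the pipeline, not its actual code):

```python
def accumulation_steps(global_batch_size, micro_batch_size):
    """Number of micro-batches accumulated per optimizer step."""
    if global_batch_size % micro_batch_size != 0:
        raise ValueError("global batch size must be a multiple of micro batch size")
    return global_batch_size // micro_batch_size

# gpu_full_config pretraining: 128 global / 32 micro
print(accumulation_steps(128, 32))  # → 4
```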
## SFT parameters

- Maximum training steps for supervised fine-tuning (SFT).
- Learning rate for SFT.
- Global batch size for SFT.
- Micro batch size for SFT.
- Dataset for SFT (used if `sft_datasets` is None).
- List of multiple SFT datasets. Overrides `sft_dataset` if provided.
## DPO parameters

- Maximum training steps for Direct Preference Optimization (DPO).
- Learning rate for DPO.
- Global batch size for DPO.
- Micro batch size for DPO.
- Beta parameter for the DPO loss.
- Dataset for DPO.
## Verifier parameters

- Maximum training steps for the verifier.
- Learning rate for the verifier.
- Global batch size for the verifier.
- Micro batch size for the verifier.
## Output and logging

- Base directory for all outputs.
- Base name for the run; stage suffixes are added automatically.
- Tokenizer to use across all stages.
- Random seed for reproducibility.
- Mixed-precision dtype: `"bf16"`, `"fp16"`, or `"fp32"`.
- Evaluate every N steps.
- Save a checkpoint every N steps.
- Log metrics every N steps.
## Methods
### get_model_config
Build a `ModernLLMConfig` from pipeline settings.

### get_hardware_config
Get the hardware config from the preset, or auto-detect.

### get_data_config
Get the data config from the preset.

### get_pretrain_config
Build a `TrainingConfig` for the pretraining stage.
### get_sft_config
Build a `TrainingConfig` for the SFT stage.

### get_dpo_config
Build a `TrainingConfig` for the DPO stage.

### get_verifier_config
Build a `TrainingConfig` for verifier training.
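The serialization methods below (`save`, `load`, `to_dict`, `from_dict`) imply a plain JSON round trip. Here is a minimal, hypothetical stand-in that demonstrates the pattern; the real class has many more fields, and the field names here are borrowed from the presets in this document:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class MiniPipelineConfig:
    # Tiny illustrative subset of the real config's fields
    d_model: int = 256
    n_layers: int = 4
    hardware_preset: str = "local"

    def to_dict(self):
        return asdict(self)

    @classmethod
    def from_dict(cls, d):
        return cls(**d)

cfg = MiniPipelineConfig(d_model=768, n_layers=12)
# Serialize to a JSON string and back, as save/load would do with a file
restored = MiniPipelineConfig.from_dict(json.loads(json.dumps(cfg.to_dict())))
assert restored == cfg
```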
### save
Save the config to a JSON file.

### load
Load a config from a JSON file.

### to_dict
Serialize the config to a dictionary.

### from_dict
Create a config from a dictionary.

## Preset configurations
### local_smoke_config
Minimal config for quick smoke testing on a local machine.

- d_model: 256
- n_layers: 4
- n_heads: 4
- ffn_hidden_size: 512
- max_seq_len: 256
- hardware_preset: “local”
- data_preset: “small”
- pretrain_max_steps: 100
- sft_max_steps: 50
- dpo_max_steps: 50
- verifier_max_steps: 50
### local_full_config
Full config for RTX 3060 training.

- d_model: 768
- n_layers: 12
- n_heads: 12
- ffn_hidden_size: 3072
- max_seq_len: 1024
- hardware_preset: “local”
- data_preset: “medium”
- pretrain_max_steps: 20000
- sft_max_steps: 5000
- dpo_max_steps: 2000
- verifier_max_steps: 3000
### gpu_smoke_config
Minimal config for GPU smoke testing. Same as `local_smoke_config` but with `hardware_preset="auto"`.
### gpu_full_config
Full config for high-end GPU training (A100/H100).

- d_model: 1024
- n_layers: 12
- n_heads: 16
- ffn_hidden_size: 4096
- max_seq_len: 1024
- use_attention_sinks: False (for Flash Attention compatibility)
- hardware_preset: “auto”
- data_preset: “large”
- pretrain_datasets: `["wikitext-103-raw-v1", "openwebtext", "wikipedia", "roneneldan/TinyStories:100000"]`
- pretrain_max_steps: 80000
- pretrain_batch_size: 128
- pretrain_micro_batch_size: 32
- sft_datasets: `["tatsu-lab/alpaca", "databricks/databricks-dolly-15k", "Open-Orca/OpenOrca:50000"]`
- sft_max_steps: 10000
- dpo_max_steps: 3000
- verifier_max_steps: 3000
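Entries such as `roneneldan/TinyStories:100000` suggest a `name:count` convention for capping how many examples are drawn from a dataset. A hypothetical parser for that convention (the actual loader's behavior may differ):

```python
def parse_dataset_spec(spec):
    """Split 'name' or 'name:count' into (name, max_examples or None)."""
    name, sep, count = spec.rpartition(":")
    if sep and count.isdigit():
        return name, int(count)
    return spec, None

print(parse_dataset_spec("roneneldan/TinyStories:100000"))  # → ('roneneldan/TinyStories', 100000)
print(parse_dataset_spec("openwebtext"))                    # → ('openwebtext', None)
```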
### get_pipeline_preset
Get a pipeline preset by name.

- `name`: one of `"local-smoke"`, `"local"`, `"gpu-smoke"`, or `"gpu"`.
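A plausible sketch of how that name-to-preset dispatch could work, using dict stand-ins for the four documented presets (the real factories return full `PipelineConfig` objects; this is illustration only):

```python
# Hypothetical stand-in factories mirroring the four presets above
def local_smoke_config():  return {"d_model": 256, "hardware_preset": "local"}
def local_full_config():   return {"d_model": 768, "hardware_preset": "local"}
def gpu_smoke_config():    return {"d_model": 256, "hardware_preset": "auto"}
def gpu_full_config():     return {"d_model": 1024, "hardware_preset": "auto"}

PRESETS = {
    "local-smoke": local_smoke_config,
    "local": local_full_config,
    "gpu-smoke": gpu_smoke_config,
    "gpu": gpu_full_config,
}

def get_pipeline_preset(name):
    try:
        return PRESETS[name]()
    except KeyError:
        raise ValueError(f"unknown preset {name!r}; expected one of {sorted(PRESETS)}")
```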