Overview
The sampling module provides a standalone script for generating text from trained GPT models. It supports loading models from checkpoints or using pretrained GPT-2 variants, with full control over generation parameters.
Basic usage
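The original usage snippet is not preserved here. As a hedged sketch: scripts of this kind typically expose their settings as module-level Python variables that can be overridden on the command line; the two model-source variables documented under "Model initialization" below might look like this (the 'out' value is an assumption):

```python
# Hypothetical defaults for the two documented model-source settings
init_from = 'resume'   # or 'gpt2', 'gpt2-medium', 'gpt2-large', 'gpt2-xl'
out_dir = 'out'        # checkpoint directory, used only when init_from == 'resume'
```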
Configuration parameters
Model initialization
Model source:
- 'resume': Load from checkpoint in out_dir
- 'gpt2', 'gpt2-medium', 'gpt2-large', 'gpt2-xl': Use pretrained GPT-2 models
Directory containing checkpoint file (only used when init_from='resume')
Generation settings
Prompt text to condition generation on. Can be:
- Direct text: "Once upon a time"
- Special token: "<|endoftext|>"
- File reference: "FILE:prompt.txt" (reads prompt from file)
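The FILE: convention can be sketched as follows (a minimal illustration; the helper name load_prompt is hypothetical, not part of the script):

```python
import os
import tempfile

def load_prompt(start: str) -> str:
    """Resolve the prompt: a 'FILE:' prefix means read the prompt text from disk."""
    if start.startswith("FILE:"):
        with open(start[len("FILE:"):], "r", encoding="utf-8") as f:
            return f.read()
    return start  # direct text or a special token like "<|endoftext|>"

# Demonstration with a temporary prompt file
path = os.path.join(tempfile.mkdtemp(), "prompt.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("Once upon a time")

print(load_prompt("Once upon a time"))   # direct text passes through unchanged
print(load_prompt("FILE:" + path))       # file contents are used instead
```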
Number of independent samples to generate
Maximum number of tokens to generate for each sample
Sampling parameters
Controls randomness in sampling:
- 1.0: No change to the probability distribution
- < 1.0: Less random, more confident predictions (sharper distribution)
- > 1.0: More random, more diverse predictions (flatter distribution)
- Approaching 0.0: Deterministic (argmax)
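The effect of temperature can be illustrated numerically (a toy sketch; the real script applies this scaling to model logits before sampling):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then normalize to probabilities."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 1.0))   # unchanged distribution
print(softmax_with_temperature(logits, 0.5))   # sharper: top token dominates
print(softmax_with_temperature(logits, 2.0))   # flatter: more diverse
```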
Restricts sampling to the top-k most likely tokens. Tokens outside the top-k have their probability set to zero. Set to None to disable.
System settings
Random seed for reproducible generation
Device to run inference on: 'cpu', 'cuda', 'cuda:0', 'cuda:1', 'mps', etc.
Data type for inference: 'float32', 'bfloat16', or 'float16'. Auto-selects bfloat16 if the GPU supports it.
Use PyTorch 2.0 compilation for faster inference (requires PyTorch >= 2.0)
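Taken together, the system settings might look like the following sketch (variable names and values are assumptions in the style of the options described above):

```python
# Hypothetical system settings; values shown are illustrative, not documented defaults
seed = 1337          # random seed for reproducible generation
device = 'cuda'      # or 'cpu', 'cuda:0', 'cuda:1', 'mps', ...
dtype = 'bfloat16'   # or 'float32' / 'float16'; bfloat16 only if the GPU supports it
compile = False      # set True to use PyTorch >= 2.0 compilation
```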
Text encoding/decoding
The script automatically handles text encoding.
From checkpoint
When loading from a trained checkpoint, it looks for meta.pkl in the dataset directory:
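For illustration, a character-level meta.pkl might store string-to-index and index-to-string mappings; the exact keys in the real file are assumptions in this sketch:

```python
import os
import pickle
import tempfile

# Build a toy character-level vocabulary and save it as meta.pkl
chars = sorted(set("hello world"))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

meta_path = os.path.join(tempfile.mkdtemp(), "meta.pkl")
with open(meta_path, "wb") as f:
    pickle.dump({"vocab_size": len(chars), "stoi": stoi, "itos": itos}, f)

# Loading side: rebuild encode/decode from the pickled mappings
with open(meta_path, "rb") as f:
    meta = pickle.load(f)
encode = lambda s: [meta["stoi"][c] for c in s]
decode = lambda ids: "".join(meta["itos"][i] for i in ids)

print(decode(encode("hello")))  # round-trips the input text
```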
From GPT-2 pretrained
When using GPT-2 variants, it uses tiktoken for GPT-2 BPE encoding:
The special token <|endoftext|> is recognized by the GPT-2 tokenizer and can be used to signal document boundaries.
Generation flow
The complete generation process:
1. Model setup
2. Prompt encoding
3. Generation loop
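The steps above can be sketched end to end with a toy next-token model (pure Python, no real GPT; sample_next mirrors the temperature and top-k parameters documented earlier):

```python
import math
import random

def sample_next(logits, temperature=1.0, top_k=None):
    """One sampling step: temperature scaling, optional top-k filter, then a draw."""
    scaled = [l / temperature for l in logits]
    if top_k is not None:
        # Zero out (via -inf logits) everything below the k-th largest value
        cutoff = sorted(scaled, reverse=True)[top_k - 1]
        scaled = [s if s >= cutoff else float("-inf") for s in scaled]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]  # exp(-inf) == 0.0
    return random.choices(range(len(weights)), weights=weights)[0]

def toy_model(context):
    """Stand-in for a GPT forward pass: strongly favors (last token + 1) mod 4."""
    logits = [0.0, 0.0, 0.0, 0.0]
    logits[(context[-1] + 1) % 4] = 5.0
    return logits

random.seed(1337)                 # 1. setup: fixed seed for reproducibility
tokens = [0]                      # 2. encoded prompt
for _ in range(8):                # 3. loop for max_new_tokens steps
    tokens.append(sample_next(toy_model(tokens), temperature=0.8, top_k=2))
print(tokens)
```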
Examples
Creative story generation
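One plausible configuration for this example (a sketch; parameter names other than init_from are assumed nanoGPT-style names):

```python
# Hypothetical settings for creative story generation
init_from = 'gpt2-xl'            # largest pretrained GPT-2 variant
start = "Once upon a time"       # assumed parameter name for the prompt
num_samples = 3                  # assumed name
max_new_tokens = 500             # assumed name
temperature = 0.9                # a flatter distribution for more variety
top_k = 200                      # assumed name
```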
Code completion
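A sketch of settings for this example (names other than init_from and out_dir are assumptions; the checkpoint directory is hypothetical):

```python
# Hypothetical settings for code completion from a code-trained checkpoint
init_from = 'resume'
out_dir = 'out-code'             # hypothetical checkpoint directory
start = "def fibonacci(n):"      # assumed parameter name for the prompt
temperature = 0.4                # sharper distribution for more literal output
top_k = 50                       # assumed name
```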
Deterministic generation
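A sketch for this example, combining the levers documented above (fixed seed, temperature approaching 0.0, top-k of 1); parameter names other than init_from are assumptions:

```python
# Hypothetical settings for (near-)deterministic generation
init_from = 'gpt2'
start = "The capital of France is"
seed = 1337                      # fixed seed for reproducibility (assumed name)
temperature = 0.01               # approaching 0.0: effectively argmax
top_k = 1                        # only the single most likely token survives
num_samples = 1
```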
Batch prompts from file
Create prompts.txt:
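For instance (hypothetical file contents, plus a sketch of pointing the script at the file via the FILE: prefix documented above):

```python
# Hypothetical: write the prompt file, then reference it with the FILE: prefix
with open("prompts.txt", "w", encoding="utf-8") as f:
    f.write("In a distant galaxy,\n")

start = "FILE:prompts.txt"   # assumed parameter name; the script reads the file's text
```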
Performance considerations
Troubleshooting
Programmatic usage
You can also use the generation functionality directly in your code.
For more control over generation, you can implement your own sampling loop using the model’s forward() method instead of generate().
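A self-contained sketch of such a custom loop, using a stub in place of a real GPT (the stub's forward() only mimics the interface; a real model would return logits from a neural network):

```python
class StubModel:
    """Stand-in for a GPT: forward(tokens) returns next-token logits."""
    vocab_size = 5

    def forward(self, tokens):
        # Toy rule: strongly prefer (last token + 1) mod vocab_size
        logits = [0.0] * self.vocab_size
        logits[(tokens[-1] + 1) % self.vocab_size] = 6.0
        return logits

def greedy_generate(model, prompt_tokens, max_new_tokens):
    """Custom loop: call forward() each step and take the argmax token."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = model.forward(tokens)
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
    return tokens

print(greedy_generate(StubModel(), [0], max_new_tokens=4))  # → [0, 1, 2, 3, 4]
```

Swapping the argmax line for temperature-scaled, top-k-filtered sampling recovers the behavior of the sampling parameters described earlier.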