
VerifierConfig

Configuration dataclass for the verifier encoder model.

Constructor

from modern_llm.models.verifier import VerifierConfig

config = VerifierConfig(
    vocab_size=50257,
    d_model=512,
    num_layers=4,
    n_heads=8,
    max_position_embeddings=512,
    num_classes=2
)

Parameters

vocab_size (int, required)
    Size of the vocabulary. Must be positive.

d_model (int, default: 512)
    Model hidden dimension. Must be divisible by n_heads.

num_layers (int, default: 4)
    Number of transformer encoder layers. Must be positive.

n_heads (int, default: 8)
    Number of attention heads. Must divide d_model evenly.

max_position_embeddings (int, default: 512)
    Maximum sequence length supported by positional embeddings.

num_classes (int, default: 2)
    Number of output classes. Must be >= 2. Default is 2 for binary classification (incorrect/correct).

dropout (float, default: 0.1)
    Dropout probability applied throughout the model.
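The divisibility constraint on d_model matters because each attention head works on a d_model / n_heads slice of the hidden dimension. A quick sanity check before constructing a config (a hypothetical pre-check for illustration; the dataclass may perform its own validation):

```python
# Hypothetical pre-construction check; values match the documented defaults.
d_model, n_heads = 512, 8

assert d_model % n_heads == 0, "d_model must divide evenly across attention heads"
head_dim = d_model // n_heads  # each head attends over a 64-dimensional slice
```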

VerifierModel

Encoder-only transformer model for verifying solution correctness. The verifier is designed for scoring generated solutions in math, reasoning, or QA tasks. It follows the approach of training lightweight judges (Cobbe et al., 2021; Lightman et al., 2023) using a transformer encoder to predict correctness logits.

Architecture

h₀ = token_embed(input_ids) + positional_embed
h_L = TransformerEncoder(h₀)
logits = classifier(pool(h_L))
The model uses:
  • Learned token and positional embeddings
  • Standard transformer encoder layers with GELU activation
  • Mean pooling over non-padded tokens
  • Linear classifier for final predictions
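The mean pooling step over non-padded tokens can be sketched as follows (a minimal stand-alone illustration, not the model's internal code; the tensor names are assumptions):

```python
import torch

def masked_mean_pool(hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # hidden: (batch, seq_len, d_model); attention_mask: (batch, seq_len), 1 = real token
    mask = attention_mask.unsqueeze(-1).float()   # (batch, seq_len, 1)
    summed = (hidden * mask).sum(dim=1)           # zero out padding, then sum over tokens
    counts = mask.sum(dim=1).clamp(min=1.0)       # number of real tokens; avoid div-by-zero
    return summed / counts                        # (batch, d_model)

hidden = torch.randn(2, 4, 8)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])
pooled = masked_mean_pool(hidden, mask)           # shape: (2, 8)
```

Pooling only over real tokens keeps padded positions from diluting the sentence representation that the classifier sees.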

Constructor

from modern_llm.models.verifier import VerifierModel, VerifierConfig

config = VerifierConfig(
    vocab_size=50257,
    d_model=512,
    num_layers=4,
    n_heads=8
)
verifier = VerifierModel(config)

Parameters

config (VerifierConfig, required)
    Verifier configuration containing all hyperparameters.

Attributes

config (VerifierConfig)
    The configuration object passed during initialization.

token_embed (nn.Embedding)
    Token embedding layer of shape (vocab_size, d_model).

position_embed (nn.Embedding)
    Positional embedding layer of shape (max_position_embeddings, d_model).

encoder (nn.TransformerEncoder)
    Stack of num_layers transformer encoder layers.

classifier (nn.Linear)
    Classification head: d_model -> num_classes.

forward

def forward(
    self,
    input_ids: Tensor,
    attention_mask: Optional[Tensor] = None,
    labels: Optional[Tensor] = None,
) -> dict[str, Tensor]
Compute verifier predictions and optional loss.

Parameters

input_ids (Tensor, required)
    Input token IDs of shape (batch, seq_len).

attention_mask (Optional[Tensor])
    Attention mask of shape (batch, seq_len) with 1 for real tokens and 0 for padding.

labels (Optional[Tensor])
    Ground truth class labels of shape (batch,) with values in range [0, num_classes). When provided, loss is computed.

Returns

logits (Tensor)
    Classification logits of shape (batch, num_classes).

loss (Optional[Tensor])
    Cross-entropy loss (scalar) when labels are provided.

score

def score(
    self,
    input_ids: Tensor,
    attention_mask: Optional[Tensor] = None
) -> Tensor
Compute probability of correctness without gradients.

Parameters

input_ids (Tensor, required)
    Input token IDs of shape (batch, seq_len).

attention_mask (Optional[Tensor])
    Attention mask of shape (batch, seq_len).

Returns

scores (Tensor)
    Correctness probabilities of shape (batch,) with values in [0, 1]. Returns P(class=1) for binary classification.
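For binary classification, scoring reduces to a softmax over the classification logits followed by selecting the probability of class 1. A minimal sketch of that computation (the logit values are made up for illustration; real logits come from the forward pass):

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, -1.0], [0.5, 1.5]])  # (batch, num_classes), made-up values
probs = F.softmax(logits, dim=-1)                 # normalize each row to probabilities
scores = probs[:, 1]                              # P(class=1) = P(correct), shape (batch,)
```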

predict

def predict(
    self,
    input_ids: Tensor,
    attention_mask: Optional[Tensor] = None
) -> Tensor
Predict class labels without gradients.

Parameters

input_ids (Tensor, required)
    Input token IDs of shape (batch, seq_len).

attention_mask (Optional[Tensor])
    Attention mask of shape (batch, seq_len).

Returns

predictions (Tensor)
    Predicted class indices of shape (batch,). For binary classification: 0=incorrect, 1=correct.
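Prediction is an argmax over the same logits that score() normalizes. A one-line sketch (logit values made up for illustration):

```python
import torch

logits = torch.tensor([[2.0, -1.0], [0.5, 1.5]])  # (batch, num_classes), made-up values
predictions = logits.argmax(dim=-1)               # (batch,) class indices: 0 or 1
```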

Example

import torch
from modern_llm.models.verifier import VerifierModel, VerifierConfig

# Initialize verifier
config = VerifierConfig(
    vocab_size=50257,
    d_model=512,
    num_layers=4,
    n_heads=8,
    dropout=0.1
)
verifier = VerifierModel(config)

# Training: compute loss
input_ids = torch.randint(0, config.vocab_size, (4, 128))
labels = torch.randint(0, 2, (4,))  # Binary labels
outputs = verifier(input_ids, labels=labels)
loss = outputs["loss"]
logits = outputs["logits"]

# Inference: get predictions
verifier.eval()
input_ids = torch.randint(0, config.vocab_size, (8, 128))
predictions = verifier.predict(input_ids)
print(predictions)  # Tensor of 0s and 1s

# Score solutions
scores = verifier.score(input_ids)
print(scores)  # Probabilities in [0, 1]

Training workflow

import torch
import torch.optim as optim
from modern_llm.models.verifier import VerifierModel, VerifierConfig

# Setup
config = VerifierConfig(vocab_size=50257)
verifier = VerifierModel(config)
optimizer = optim.AdamW(verifier.parameters(), lr=1e-4)

# Training step
verifier.train()
for batch in dataloader:
    input_ids = batch["input_ids"]
    attention_mask = batch["attention_mask"]
    labels = batch["labels"]  # 0=incorrect, 1=correct
    
    outputs = verifier(input_ids, attention_mask, labels)
    loss = outputs["loss"]
    
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Evaluation: accumulate accuracy across batches
verifier.eval()
correct = total = 0
with torch.no_grad():
    for batch in val_loader:
        predictions = verifier.predict(
            batch["input_ids"],
            batch["attention_mask"]
        )
        correct += (predictions == batch["labels"]).sum().item()
        total += batch["labels"].numel()
accuracy = correct / total

Use cases

The verifier model is useful for:
  • Ranking multiple generated solutions by correctness probability
  • Process reward modeling for reinforcement learning
  • Filtering low-quality generations before human review
  • Active learning by selecting uncertain predictions

# Rank candidate solutions
# (tokenize and solutions are defined elsewhere; all candidates must be
# padded to a common length before stacking)
candidates = [tokenize(solution) for solution in solutions]
input_ids = torch.stack(candidates)
scores = verifier.score(input_ids)
best_idx = scores.argmax().item()
best_solution = solutions[best_idx]
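Filtering low-quality generations follows the same pattern, keeping only candidates above a probability cutoff (the scores and the 0.5 threshold below are made-up illustration values):

```python
import torch

scores = torch.tensor([0.92, 0.31, 0.77, 0.08])  # as returned by verifier.score()
threshold = 0.5                                  # arbitrary cutoff for illustration
keep = scores >= threshold                       # boolean mask, shape (batch,)
kept_indices = keep.nonzero(as_tuple=True)[0]    # candidates worth passing to review
```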

Complexity

O(num_layers · (seq_len² · d_model + seq_len · d_model²)) per forward pass: the seq_len² term comes from self-attention and the d_model² term from the feed-forward sublayers. Splitting attention across n_heads changes the per-head dimensions but not the total cost.
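As a rough worked example: per layer, self-attention costs on the order of seq_len² · d_model operations and the feed-forward sublayers seq_len · d_model² (ignoring the usual 4x hidden expansion). Under the default config at maximum sequence length, the two terms happen to coincide:

```python
num_layers, seq_len, d_model = 4, 512, 512  # documented defaults, max-length input

attn_term = seq_len ** 2 * d_model   # self-attention cost per layer
ffn_term = seq_len * d_model ** 2    # feed-forward cost per layer
total = num_layers * (attn_term + ffn_term)
```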
