ThinkParser

Overview

ThinkParser extracts the content that follows the last </think> tag in a model response. It is designed for models that emit their reasoning inside <think>...</think> tags and leave those tags in the output.

Use ThinkParser only with models that always include <think>...</think> tags and whose chat template does NOT strip them. Parsing fails (parse returns an empty string) if the model omits the tags or if the chat template removes them automatically. In particular, do NOT use this parser with Qwen3 or DeepSeek-R1 models.

Class Signature

class ThinkParser(Parser):
    def __init__(self, extract_fn: Callable[[str], str] = lambda x: x)

Parameters

extract_fn: Callable[[str], str], default: lambda x: x

Optional extraction function to further process the parsed text after removing think tags. For example, you can use this to extract boxed answers from math problems.

Methods

parse

Extracts content after the last </think> tag.

def parse(self, text: str) -> str

text: str, required

The text to parse, potentially containing <think>...</think> tags

Returns: The content after the last </think> tag, or an empty string if no </think> tag is found. The result is then passed through extract_fn (the identity function by default).

Behavior:

If </think> is found: Returns everything after the last </think> tag
If </think> is NOT found: Returns an empty string (strict enforcement)
Whitespace is automatically stripped
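
The behavior listed above can be sketched as a small standalone function. This is an illustration of the documented rules, not the library's implementation: the name parse_after_think is hypothetical, and applying extract_fn to the empty string when no tag is found is an assumption.

```python
from typing import Callable


def parse_after_think(text: str, extract_fn: Callable[[str], str] = lambda x: x) -> str:
    """Standalone sketch of the documented parse rules (hypothetical helper)."""
    end_tag = "</think>"
    if end_tag not in text:
        # Strict enforcement: a missing closing tag yields an empty answer.
        # Assumption: extract_fn still runs on the empty string.
        return extract_fn("")
    # rpartition splits on the LAST occurrence of the closing tag.
    _, _, tail = text.rpartition(end_tag)
    # Surrounding whitespace is stripped before further extraction.
    return extract_fn(tail.strip())


print(parse_after_think("<think>reasoning</think>\nThe final answer is 42.\n"))
print(repr(parse_after_think("No tags at all")))
```

Splitting on the last closing tag means any text between earlier think blocks is discarded along with the reasoning itself.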

get_format_reward_func

Returns a reward function that validates the think tag format in completions.

def get_format_reward_func(self) -> Callable

Returns: A reward function that checks if each assistant message follows the correct format:

Must start with <think>
Must contain exactly one <think> tag
Must contain exactly one </think> tag
Must have non-empty content after </think>

The reward function returns the average score across all assistant messages (1.0 for well-formatted, 0.0 for poorly-formatted).
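
The four format rules and the averaging step can be sketched in plain Python. This is a hedged approximation of the checks described above, not the library's code: the function names are hypothetical, and two details are assumptions (leading whitespace is ignored before the start check, and a completion with no assistant messages scores 0.0).

```python
def think_format_score(content: str) -> float:
    """Score one assistant message: 1.0 if it follows the think format, else 0.0."""
    text = content.strip()  # assumption: surrounding whitespace is ignored
    well_formatted = (
        text.startswith("<think>")                       # must start with <think>
        and text.count("<think>") == 1                   # exactly one opening tag
        and text.count("</think>") == 1                  # exactly one closing tag
        and text.split("</think>")[-1].strip() != ""     # non-empty answer after it
    )
    return 1.0 if well_formatted else 0.0


def average_format_reward(completion: list[dict]) -> float:
    """Average the per-message score over assistant messages only."""
    contents = [m["content"] for m in completion if m["role"] == "assistant"]
    if not contents:
        return 0.0  # assumption: nothing to score means no reward
    return sum(think_format_score(c) for c in contents) / len(contents)


print(think_format_score("<think>Let me think</think>Final answer"))
print(think_format_score("<think>First</think>x<think>Second</think>y"))
```

Under these rules a response with two think blocks fails the format check even though parse would still return the text after the last </think> tag.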

Usage Examples

Basic Usage

import verifiers as vf

# Create parser
parser = vf.ThinkParser()

# Parse text with think tags
text = """<think>
Let me think about this problem.
I need to consider multiple factors.
</think>
The final answer is 42."""

result = parser.parse(text)
print(result)  # "The final answer is 42."

With Custom Extraction Function

import re
import verifiers as vf

def extract_boxed(text: str) -> str:
    r"""Extract content from \boxed{...}"""
    match = re.search(r'\\boxed\{([^}]+)\}', text)
    return match.group(1) if match else text

# Create parser with extraction
parser = vf.ThinkParser(extract_fn=extract_boxed)

text = """<think>
I need to solve this step by step.
</think>
The answer is \\boxed{42}."""

result = parser.parse(text)
print(result)  # "42"
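
The regex above stops at the first closing brace, so it cannot capture answers with nested braces such as \boxed{\frac{1}{2}}. One possible alternative, offered as a sketch rather than anything the library provides, is a brace-matching extractor. The name extract_last_boxed is hypothetical, and choosing the last \boxed{...} occurrence (instead of the first) is a design assumption; like the example above, it falls back to returning the input unchanged.

```python
def extract_last_boxed(text: str) -> str:
    """Return the content of the last \\boxed{...}, honoring nested braces."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return text  # no boxed answer: fall back to the input
    content_start = start + len(marker)
    depth = 1  # we are inside the opening brace of \boxed{
    for i in range(content_start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: return what lies between.
                return text[content_start:i]
    return text  # unbalanced braces: fall back to the input


print(extract_last_boxed(r"The answer is \boxed{\frac{1}{2}}."))
```

Because it has the Callable[[str], str] shape, a function like this could be passed as extract_fn in the same way as extract_boxed.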

Using Format Reward Function

import verifiers as vf

parser = vf.ThinkParser()
reward_func = parser.get_format_reward_func()

# Well-formatted completion
completion = [
    {"role": "assistant", "content": "<think>Let me think</think>Final answer"}
]
reward = reward_func(completion)
print(reward)  # 1.0

# Poorly-formatted completion (missing think tags)
bad_completion = [
    {"role": "assistant", "content": "Just an answer without thinking"}
]
reward = reward_func(bad_completion)
print(reward)  # 0.0

In a Rubric

import verifiers as vf

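parser = vf.ThinkParser()

def check_correctness(completion, answer, **kwargs):
    """Check if the parsed answer is correct"""
    # Parse the final answer out of the completion before comparing
    parsed = parser.parse_answer(completion)
    return 1.0 if parsed == answer else 0.0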

rubric = vf.Rubric(
    funcs=[check_correctness, parser.get_format_reward_func()],
    weights=[1.0, 0.1],  # Correctness weighted more than format
    parser=parser
)
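
To see what the weights imply, the following standalone arithmetic sketch assumes the rubric combines its functions as a weighted sum of their scores. That combination rule is an assumption here, and combine_rewards is a hypothetical helper, not part of the library.

```python
def combine_rewards(scores: list[float], weights: list[float]) -> float:
    """Weighted sum of per-function scores (assumed combination rule)."""
    return sum(s * w for s, w in zip(scores, weights))


weights = [1.0, 0.1]  # [correctness, format], as in the rubric above

# Correct answer, good format: 1.0 * 1.0 + 1.0 * 0.1
print(combine_rewards([1.0, 1.0], weights))
# Correct answer, bad format: only the correctness term contributes
print(combine_rewards([1.0, 0.0], weights))
# Wrong answer, good format: only the small format term contributes
print(combine_rewards([0.0, 1.0], weights))
```

With these weights, format compliance acts as a small bonus and cannot outweigh a wrong answer.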

Multiple Think Blocks

When multiple think blocks are present, only content after the last </think> tag is returned:

import verifiers as vf

parser = vf.ThinkParser()

text = """<think>First thought</think>
Some intermediate text.
<think>Second thought</think>
Final answer here."""

result = parser.parse(text)
print(result)  # "Final answer here."

When to Use ThinkParser

Use ThinkParser when:

Your model always generates <think>...</think> tags for reasoning
The model does NOT automatically strip these tags from responses
You want strict enforcement (fail if tags are missing)
You need to validate that responses follow the think tag format

Do NOT use ThinkParser with:

Qwen3 models (automatically parse think tags)
DeepSeek-R1 models (automatically parse think tags)
Models that may or may not include think tags (use MaybeThinkParser instead)
Non-reasoning models that never use think tags
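
One way to apply the guidance above is to inspect a handful of raw responses from your model before choosing a parser. The sketch below is illustrative only: think_tag_coverage and suggest_parser are hypothetical helpers, and the returned strings are just labels for the three cases described in this section.

```python
def think_tag_coverage(samples: list[str]) -> float:
    """Fraction of sampled responses that contain a closing </think> tag."""
    if not samples:
        return 0.0
    tagged = sum(1 for s in samples if "</think>" in s)
    return tagged / len(samples)


def suggest_parser(samples: list[str]) -> str:
    """Map tag coverage to the parser choice described in this section."""
    coverage = think_tag_coverage(samples)
    if coverage == 1.0:
        return "ThinkParser"        # tags always present in the output
    if coverage > 0.0:
        return "MaybeThinkParser"   # tags present only some of the time
    return "no think-tag parsing"   # tags stripped or never generated


samples = [
    "<think>step 1</think>Answer: 4",
    "<think>step 2</think>Answer: 7",
]
print(suggest_parser(samples))
```

A small sample cannot prove that a model always emits the tags, so treat the result as a quick sanity check rather than a guarantee.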
