Overview
ThinkParser extracts the content that follows the </think> tag in a model response. It is designed for models that wrap their reasoning in <think>...</think> tags and leave those tags in the output.
Use ThinkParser only with models that always emit <think>...</think> tags and whose chat template does NOT strip them. Parsing fails (an empty string is returned) if the model omits the tags, or if the chat template removes them before the parser sees the text. In particular, do NOT use this parser with Qwen3 or DeepSeek-R1 models.
Class Signature
class ThinkParser(Parser):
    def __init__(self, extract_fn: Callable[[str], str] = lambda x: x)
Parameters
extract_fn: Optional extraction function applied to the parsed text after the think block has been removed. For example, you can use it to extract boxed answers from math problems. Defaults to the identity function.
Methods
parse
Extracts content after the last </think> tag.
def parse(self, text: str) -> str
text: The text to parse, potentially containing <think>...</think> tags.
Returns: The content after the last </think> tag, or an empty string if no </think> tag is found. The result is then passed through extract_fn if provided.
Behavior:
- If </think> is found: Returns everything after the last </think> tag
- If </think> is NOT found: Returns an empty string (strict enforcement)
- Whitespace is automatically stripped
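The rules above can be approximated in a few lines of plain Python. This is a minimal sketch based only on the behavior described here, not the library's actual source; the function name parse_after_think is hypothetical.

```python
from typing import Callable


def parse_after_think(text: str, extract_fn: Callable[[str], str] = lambda x: x) -> str:
    """Sketch of the documented parse rules (hypothetical helper, not the library code)."""
    if "</think>" in text:
        # Keep only what follows the last closing tag, then strip whitespace
        answer = text.split("</think>")[-1].strip()
    else:
        # Strict enforcement: no closing tag means no parsed content
        answer = ""
    return extract_fn(answer)


print(parse_after_think("<think>reasoning</think>\n  42  "))  # 42
print(repr(parse_after_think("no tags here")))  # ''
```

Because the missing-tag case yields an empty string rather than the original text, downstream correctness checks will score such responses as wrong, which is the intended strictness.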
get_format_reward_func
Returns a reward function that validates the think tag format in completions.
def get_format_reward_func(self) -> Callable
Returns: A reward function that checks if each assistant message follows the correct format:
- Must start with <think>
- Must contain exactly one <think> tag
- Must contain exactly one </think> tag
- Must have non-empty content after </think>
The reward function returns the average score across all assistant messages (1.0 for well-formatted, 0.0 for poorly-formatted).
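The four format rules and the averaging step can be sketched as follows. This is an illustration under stated assumptions rather than the library's implementation: the helper names follows_format and format_reward are hypothetical, leading whitespace before <think> is assumed to be tolerated, and a completion with no assistant messages is assumed to score 0.0.

```python
def follows_format(text: str) -> float:
    """Sketch of the documented format rules for one message (hypothetical helper)."""
    ok = (
        text.strip().startswith("<think>")  # assumption: leading whitespace is ignored
        and text.count("<think>") == 1
        and text.count("</think>") == 1
        and len(text.split("</think>")[-1].strip()) > 0  # non-empty final answer
    )
    return 1.0 if ok else 0.0


def format_reward(completion: list[dict[str, str]]) -> float:
    """Average the per-message score over assistant messages only."""
    contents = [m["content"] for m in completion if m["role"] == "assistant"]
    if not contents:
        return 0.0  # assumption: nothing to score counts as poorly formatted
    return sum(follows_format(c) for c in contents) / len(contents)


completion = [
    {"role": "user", "content": "What is 6 * 7?"},
    {"role": "assistant", "content": "<think>6 * 7 = 42</think>42"},
    {"role": "assistant", "content": "42"},
]
print(format_reward(completion))  # 0.5
```

In a multi-turn completion, a single badly formatted assistant turn lowers the score proportionally instead of zeroing it.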
Usage Examples
Basic Usage
import verifiers as vf
# Create parser
parser = vf.ThinkParser()
# Parse text with think tags
text = """<think>
Let me think about this problem.
I need to consider multiple factors.
</think>
The final answer is 42."""
result = parser.parse(text)
print(result) # "The final answer is 42."
With an Extraction Function
import re
import verifiers as vf
def extract_boxed(text: str) -> str:
    r"""Extract content from \boxed{...}"""
    match = re.search(r'\\boxed\{([^}]+)\}', text)
    # Fall back to the full text when no boxed answer is present
    return match.group(1) if match else text
# Create parser with extraction
parser = vf.ThinkParser(extract_fn=extract_boxed)
text = """<think>
I need to solve this step by step.
</think>
The answer is \\boxed{42}."""
result = parser.parse(text)
print(result) # "42"
Format Reward Function
import verifiers as vf
parser = vf.ThinkParser()
reward_func = parser.get_format_reward_func()
# Well-formatted completion
completion = [
    {"role": "assistant", "content": "<think>Let me think</think>Final answer"}
]
reward = reward_func(completion)
print(reward) # 1.0
# Poorly-formatted completion (missing think tags)
bad_completion = [
    {"role": "assistant", "content": "Just an answer without thinking"}
]
reward = reward_func(bad_completion)
print(reward) # 0.0
In a Rubric
import verifiers as vf
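parser = vf.ThinkParser()
def check_correctness(completion, answer, **kwargs):
    """Check if the parsed answer is correct"""
    # Parse the final answer out of the completion before comparing.
    # Assumes the base Parser's parse_answer helper, which parses the last assistant message.
    parsed = parser.parse_answer(completion)
    return 1.0 if parsed == answer else 0.0
rubric = vf.Rubric(
    funcs=[check_correctness, parser.get_format_reward_func()],
    weights=[1.0, 0.1],  # Correctness weighted more than format
    parser=parser
)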
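To see how the two weights interact, the sketch below assumes the rubric combines the individual scores as a weighted sum. That combination rule is an assumption here, not something this page states, and the helper name combine is hypothetical.

```python
def combine(scores: list[float], weights: list[float]) -> float:
    """Weighted sum of reward scores (assumed combination rule, hypothetical helper)."""
    return sum(s * w for s, w in zip(scores, weights))


weights = [1.0, 0.1]  # [correctness, format], as in the rubric above

print(combine([1.0, 1.0], weights))  # correct and well-formatted: 1.1
print(combine([0.0, 1.0], weights))  # wrong but well-formatted: 0.1
print(combine([1.0, 0.0], weights))  # correct but badly formatted: 1.0
```

Under this assumption, the small format weight nudges the model toward the think tag format without letting a well-formatted wrong answer outscore a correct one.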
Multiple Think Blocks
When multiple think blocks are present, only content after the last </think> tag is returned:
import verifiers as vf
parser = vf.ThinkParser()
text = """<think>First thought</think>
Some intermediate text.
<think>Second thought</think>
Final answer here."""
result = parser.parse(text)
print(result) # "Final answer here."
When to Use ThinkParser
Use ThinkParser when:
- Your model always generates <think>...</think> tags for reasoning
- The model does NOT automatically strip these tags from responses
- You want strict enforcement (fail if tags are missing)
- You need to validate that responses follow the think tag format
Do NOT use ThinkParser with:
- Qwen3 models (automatically parse think tags)
- DeepSeek-R1 models (automatically parse think tags)
- Models that may or may not include think tags (use MaybeThinkParser instead)
- Non-reasoning models that never use think tags
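One way to check these conditions before committing to ThinkParser is to measure how often raw sample responses actually contain the closing tag. The sketch below is a plain-Python illustration; the helper name think_tag_coverage and the sample strings are hypothetical and not part of the library.

```python
def think_tag_coverage(responses: list[str]) -> float:
    """Fraction of responses that contain a closing </think> tag (hypothetical helper)."""
    if not responses:
        return 0.0
    return sum("</think>" in r for r in responses) / len(responses)


# Illustrative raw responses, as they arrive after the chat template is applied
samples = [
    "<think>step 1</think>Answer A",
    "<think>step 2</think>Answer B",
    "Answer C",  # tags missing or already stripped
]

coverage = think_tag_coverage(samples)
print(f"{coverage:.2f}")  # 0.67
```

Coverage below 1.0 means strict parsing would return an empty string for some responses, in which case MaybeThinkParser is likely the safer choice.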
See Also