Drako’s testing utilities let you run governed agents in automated test environments without any backend connection and without HITL checkpoints blocking execution. All operations are fully offline.
from drako import govern, test_mode

with test_mode():
    crew = govern(crew)
    result = crew.kickoff()  # HITL won't block

test_mode()

A context manager that patches DrakoClient in place so all policy evaluations, audit logs, hook executions, and chain verifications run offline. It also sets DRAKO_TEST_MODE=true in the environment for the duration of the block.
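The set-and-restore behaviour for the environment variable can be pictured with a small pure-Python sketch. This is an illustrative stand-in for what the docs describe, not Drako's actual implementation; the function name sketch_test_mode is hypothetical.

```python
import os
from contextlib import contextmanager

# Hypothetical sketch: set DRAKO_TEST_MODE on entry, restore the
# previous value (or unset it) on exit, even if an exception is raised.
@contextmanager
def sketch_test_mode():
    prev = os.environ.get("DRAKO_TEST_MODE")
    os.environ["DRAKO_TEST_MODE"] = "true"
    try:
        yield
    finally:
        if prev is None:
            os.environ.pop("DRAKO_TEST_MODE", None)
        else:
            os.environ["DRAKO_TEST_MODE"] = prev

os.environ.pop("DRAKO_TEST_MODE", None)  # start from a clean slate
with sketch_test_mode():
    assert os.environ["DRAKO_TEST_MODE"] == "true"
assert "DRAKO_TEST_MODE" not in os.environ  # restored after exit
```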

Signature

from drako import test_mode

with test_mode(
    hitl="auto-approve",
    dlp="audit",
    enforcement="audit",
):
    ...

Parameters

hitl
str | MockHITLResolver
default:"auto-approve"
Controls how HITL checkpoints are resolved inside the context:
  • "auto-approve" — all tools are immediately approved.
  • "auto-deny" — all tools are immediately rejected.
  • "skip" — HITL evaluation is bypassed entirely; all tools are allowed.
  • MockHITLResolver instance — per-tool rules (see below).
Any other string value raises ValueError.
dlp
str
default:"audit"
Data loss prevention behaviour inside the context. Accepted values: "audit" or "off".
enforcement
str
default:"audit"
Policy enforcement level inside the context. Accepted values: "audit" or "off".
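The accepted values above imply a simple validation rule: any unrecognised string raises ValueError. A hedged sketch of that rule, with a hypothetical validate_args helper standing in for Drako's internal checks:

```python
# Illustrative sketch of the documented argument validation; these names
# are hypothetical, not Drako internals.
VALID_HITL = {"auto-approve", "auto-deny", "skip"}
VALID_LEVELS = {"audit", "off"}

def validate_args(hitl="auto-approve", dlp="audit", enforcement="audit"):
    """Raise ValueError for any value the docs list as unaccepted."""
    if isinstance(hitl, str) and hitl not in VALID_HITL:
        raise ValueError(f"invalid hitl: {hitl!r}")
    if dlp not in VALID_LEVELS:
        raise ValueError(f"invalid dlp: {dlp!r}")
    if enforcement not in VALID_LEVELS:
        raise ValueError(f"invalid enforcement: {enforcement!r}")

validate_args()                   # the defaults are accepted
try:
    validate_args(hitl="always")  # any other string raises ValueError
except ValueError as e:
    print(e)
```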

How it works

test_mode() applies the following unittest.mock.patch replacements for the duration of the with block:
| Method patched | Test behaviour |
| --- | --- |
| DrakoClient.evaluate_policy | Returns an offline decision based on hitl. |
| DrakoClient.evaluate_policy_sync | Sync version of the above. |
| DrakoClient.audit_log | No-op, returns {}. |
| DrakoClient.audit_log_sync | No-op, returns {}. |
| DrakoClient.execute_hooks | No-op, returns {}. |
| DrakoClient.execute_hooks_sync | No-op, returns {}. |
| DrakoClient.verify_chain | No-op, returns {}. |
| DrakoClient.verify_chain_sync | No-op, returns {}. |
All patches are cleanly reversed when the context exits, even if an exception is raised. Environment variables set by the context are also restored.
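The patch-and-restore pattern is standard unittest.mock behaviour and can be demonstrated on a stand-in class (FakeClient is a placeholder here, not the real DrakoClient):

```python
from unittest import mock

class FakeClient:
    def audit_log(self):
        return {"sent": True}  # stand-in for a real network call

# Patch the method to a no-op for the duration of the with block, the
# same mechanism test_mode() applies to the DrakoClient methods above.
with mock.patch.object(FakeClient, "audit_log", lambda self: {}):
    assert FakeClient().audit_log() == {}          # patched: offline no-op
assert FakeClient().audit_log() == {"sent": True}  # cleanly restored on exit
```

mock.patch guarantees the original method is restored even if the block raises, which is what makes the context safe to nest inside test fixtures.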

Policy evaluation response shape

Inside test_mode(), evaluate_policy and evaluate_policy_sync return a dict with the following structure:
{
    "request_id": "<uuid>",
    "agent_id": "<agent_id>",
    "decision": "allowed" | "rejected",
    "trust_score": 0.85,
    "reasoning": ["test_mode: ..."],
    "policy_version": "test",
    "audit_hash": "0000...0000",
    "evaluated_at": <unix_timestamp>,
}
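A sketch of how such a payload could be constructed, mirroring the documented field names and values; make_offline_decision is a hypothetical helper, not part of Drako's API:

```python
import time
import uuid

# Hypothetical construction of the offline decision payload shown above.
def make_offline_decision(agent_id, approved):
    return {
        "request_id": str(uuid.uuid4()),
        "agent_id": agent_id,
        "decision": "allowed" if approved else "rejected",
        "trust_score": 0.85,
        "reasoning": ["test_mode: auto-resolved"],
        "policy_version": "test",
        "audit_hash": "0" * 64,
        "evaluated_at": time.time(),
    }

d = make_offline_decision("did:mesh:ag_abc", approved=True)
assert d["decision"] == "allowed"
```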

Examples

from drako import govern, test_mode

governed_agent = govern(agent)

with test_mode():
    result = governed_agent.run(task)  # runs offline; HITL auto-approved

MockHITLResolver

A pluggable resolver that lets you define per-tool approval rules for use inside test_mode(). Useful when you want most tools approved but specific dangerous tools denied.

Signature

from drako import MockHITLResolver

resolver = MockHITLResolver(
    default_action="approve",
    rules={
        "delete_records": "deny",
        "transfer_funds": "deny",
    },
)

Parameters

default_action
str
default:"approve"
Default decision for any tool not listed in rules. Must be "approve" or "deny". Any other value raises ValueError.
rules
dict[str, str] | None
Mapping of tool name to action ("approve" or "deny"). Tools listed here override default_action.

How it works

When test_mode(hitl=resolver) is active, every call to evaluate_policy / evaluate_policy_sync resolves through MockHITLResolver.resolve():
  1. Looks up tool_name in rules.
  2. Falls back to default_action if no rule matches.
  3. Appends the call details to resolver.call_log for later inspection.
  4. Returns {"decision": "allowed"} for "approve" and {"decision": "rejected"} for "deny".
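The four steps above can be sketched from scratch in a few lines. This SketchResolver is an illustrative reimplementation under the documented semantics, not Drako's shipped MockHITLResolver:

```python
import time

class SketchResolver:
    """From-scratch sketch of the documented resolve() behaviour."""

    def __init__(self, default_action="approve", rules=None):
        self.default_action = default_action
        self.rules = rules or {}
        self.call_log = []

    def resolve(self, tool_name, agent_id="", context=None):
        # Steps 1-2: look up a per-tool rule, fall back to the default.
        action = self.rules.get(tool_name, self.default_action)
        decision = "allowed" if action == "approve" else "rejected"
        # Step 3: record the call for later inspection.
        self.call_log.append({
            "tool_name": tool_name,
            "agent_id": agent_id,
            "decision": decision,
            "resolved_at": time.time(),
        })
        # Step 4: return the decision dict.
        return {"decision": decision, "reason": f"test_mode:{action}"}

r = SketchResolver(rules={"delete_records": "deny"})
assert r.resolve("web_search")["decision"] == "allowed"
assert r.resolve("delete_records")["decision"] == "rejected"
assert [e["tool_name"] for e in r.call_log] == ["web_search", "delete_records"]
```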

resolve() method

resolver.resolve(
    tool_name: str,
    agent_id: str = "",
    context: dict | None = None,
) -> dict
tool_name
str
required
Name of the tool being evaluated.
agent_id
str
DID or name of the acting agent. Recorded in call_log.
context
dict | None
Optional context dict (not used by the resolver itself).
Returns {"decision": "allowed" | "rejected", "reason": "test_mode:approve" | "test_mode:deny"}.

call_log

Each resolve() call appends an entry to resolver.call_log:
{
    "tool_name": "web_search",
    "agent_id": "did:mesh:ag_abc",
    "decision": "allowed",
    "resolved_at": 1711234567.123,
}
Inspect call_log in assertions to verify which tools were evaluated and in what order.

Example

from drako import govern, test_mode, MockHITLResolver

resolver = MockHITLResolver(
    default_action="approve",
    rules={
        "delete_records": "deny",
        "transfer_funds": "deny",
    },
)

with test_mode(hitl=resolver):
    crew = govern(crew)
    crew.kickoff()

# Assert that no sensitive tools were called
for entry in resolver.call_log:
    if entry["tool_name"] in ("delete_records", "transfer_funds"):
        assert entry["decision"] == "rejected", (
            f"Sensitive tool '{entry['tool_name']}' should have been denied"
        )

print(f"{len(resolver.call_log)} tool evaluations recorded.")

Helper functions

Two additional helpers are importable from drako.testing:
from drako.testing import is_test_mode, get_hitl_default
  • is_test_mode() -> bool — returns True when DRAKO_TEST_MODE is set to "true", "1", or "yes" in the environment.
  • get_hitl_default() -> str — returns the value of DRAKO_HITL_DEFAULT from the environment, or "approve" if the variable is not set.
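A hedged sketch of the documented environment-variable semantics (the real helpers live in drako.testing; these sketch_* names are stand-ins):

```python
import os

def sketch_is_test_mode() -> bool:
    # True for the documented truthy values: "true", "1", "yes".
    return os.environ.get("DRAKO_TEST_MODE", "").lower() in ("true", "1", "yes")

def sketch_get_hitl_default() -> str:
    # Falls back to "approve" when the variable is unset.
    return os.environ.get("DRAKO_HITL_DEFAULT", "approve")

os.environ["DRAKO_TEST_MODE"] = "1"
assert sketch_is_test_mode()
os.environ.pop("DRAKO_HITL_DEFAULT", None)
assert sketch_get_hitl_default() == "approve"
```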

Best practices

Without test_mode(), governed agents will attempt to reach the Drako API. In a CI environment with no DRAKO_API_KEY set, this will produce warnings and fall back to ungoverned mode. Any real HITL checkpoints will block indefinitely.
# GitHub Actions example
- name: Run tests
  env:
    DRAKO_TEST_MODE: "true"
    DRAKO_HITL_DEFAULT: "approve"
  run: pytest
Or wrap every test with test_mode():
# conftest.py
import pytest
from drako import test_mode

@pytest.fixture(autouse=True)
def drako_test_mode():
    with test_mode():
        yield
Use MockHITLResolver with default_action="deny" to verify that your agent handles blocked tools gracefully, then check call_log to confirm the right tools were evaluated.
def test_agent_handles_blocked_tool():
    resolver = MockHITLResolver(default_action="deny")

    with test_mode(hitl=resolver):
        crew = govern(build_crew())  # build_crew(): placeholder for your crew factory
        result = crew.kickoff()

    # Agent should complete with blocked message, not raise
    assert "[Drako] Action blocked" in str(result) or result is not None
    assert len(resolver.call_log) > 0
To simulate the behaviour of governance.on_backend_unreachable: block without a live backend, use hitl="auto-deny" and assert on the return value. To deny only specific tools, register per-tool "deny" rules and check the decisions recorded in call_log:
def test_dangerous_tool_is_blocked():
    resolver = MockHITLResolver(
        default_action="approve",
        rules={"exec_shell": "deny"},
    )

    with test_mode(hitl=resolver):
        crew = govern(build_crew())  # build_crew(): placeholder for your crew factory
        crew.kickoff()

    denied = [
        e for e in resolver.call_log
        if e["tool_name"] == "exec_shell"
    ]
    assert denied, "exec_shell was never evaluated"
    assert all(e["decision"] == "rejected" for e in denied)
test_mode() does not require a .drako.yaml file or an API key. You can call govern() inside a test_mode() context on any duck-typed crew/graph/chat object without configuring Drako at all.
class MinimalCrew:
    agents = []
    tasks = []
    def kickoff(self):
        return "done"

with test_mode():
    crew = govern(MinimalCrew())
    assert crew.kickoff() == "done"
Never use test_mode() outside of test code. It patches DrakoClient globally within the current process, bypassing all real governance checks.
