Drako’s testing utilities let you run governed agents in automated test environments without any backend connection and without HITL checkpoints blocking execution. All operations are fully offline.
from drako import govern, test_mode

with test_mode():
    crew = govern(crew)
    result = crew.kickoff()  # HITL won't block

test_mode()

A context manager that patches DrakoClient in place so all policy evaluations, audit logs, hook executions, and chain verifications run offline. It also sets DRAKO_TEST_MODE=true in the environment for the duration of the block.
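The set-and-restore behaviour for the environment variable can be pictured with a small pure-Python sketch. This is an illustrative stand-in for what the docs describe, not Drako's actual implementation; the function name sketch_test_mode is hypothetical.

```python
import os
from contextlib import contextmanager

# Hypothetical sketch: set DRAKO_TEST_MODE on entry, restore the
# previous value (or unset it) on exit, even if an exception is raised.
@contextmanager
def sketch_test_mode():
    prev = os.environ.get("DRAKO_TEST_MODE")
    os.environ["DRAKO_TEST_MODE"] = "true"
    try:
        yield
    finally:
        if prev is None:
            os.environ.pop("DRAKO_TEST_MODE", None)
        else:
            os.environ["DRAKO_TEST_MODE"] = prev

os.environ.pop("DRAKO_TEST_MODE", None)  # start from a clean slate
with sketch_test_mode():
    assert os.environ["DRAKO_TEST_MODE"] == "true"
assert "DRAKO_TEST_MODE" not in os.environ  # restored after exit
```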

Signature

from drako import test_mode

with test_mode(
    hitl="auto-approve",
    dlp="audit",
    enforcement="audit",
):
    ...

Parameters

hitl
str | MockHITLResolver
default:"auto-approve"
Controls how HITL checkpoints are resolved inside the context:
  • "auto-approve" — all tools are immediately approved.
  • "auto-deny" — all tools are immediately rejected.
  • "skip" — HITL evaluation is bypassed entirely; all tools are allowed.
  • MockHITLResolver instance — per-tool rules (see below).
Any other string value raises ValueError.
dlp
str
default:"audit"
Data loss prevention behaviour inside the context. Accepted values: "audit" or "off".
enforcement
str
default:"audit"
Policy enforcement level inside the context. Accepted values: "audit" or "off".
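The accepted values above imply a simple validation rule: any unrecognised string raises ValueError. A hedged sketch of that rule, with a hypothetical validate_args helper standing in for Drako's internal checks:

```python
# Illustrative sketch of the documented argument validation; these names
# are hypothetical, not Drako internals.
VALID_HITL = {"auto-approve", "auto-deny", "skip"}
VALID_LEVELS = {"audit", "off"}

def validate_args(hitl="auto-approve", dlp="audit", enforcement="audit"):
    """Raise ValueError for any value the docs list as unaccepted."""
    if isinstance(hitl, str) and hitl not in VALID_HITL:
        raise ValueError(f"invalid hitl: {hitl!r}")
    if dlp not in VALID_LEVELS:
        raise ValueError(f"invalid dlp: {dlp!r}")
    if enforcement not in VALID_LEVELS:
        raise ValueError(f"invalid enforcement: {enforcement!r}")

validate_args()                   # the defaults are accepted
try:
    validate_args(hitl="always")  # any other string raises ValueError
except ValueError as e:
    print(e)
```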

How it works

test_mode() applies the following unittest.mock.patch replacements for the duration of the with block:
| Method patched | Test behaviour |
| --- | --- |
| DrakoClient.evaluate_policy | Returns an offline decision based on hitl. |
| DrakoClient.evaluate_policy_sync | Sync version of the above. |
| DrakoClient.audit_log | No-op, returns {}. |
| DrakoClient.audit_log_sync | No-op, returns {}. |
| DrakoClient.execute_hooks | No-op, returns {}. |
| DrakoClient.execute_hooks_sync | No-op, returns {}. |
| DrakoClient.verify_chain | No-op, returns {}. |
| DrakoClient.verify_chain_sync | No-op, returns {}. |
All patches are cleanly reversed when the context exits, even if an exception is raised. Environment variables set by the context are also restored.
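The patch-and-restore pattern is standard unittest.mock behaviour and can be demonstrated on a stand-in class (FakeClient is a placeholder here, not the real DrakoClient):

```python
from unittest import mock

class FakeClient:
    def audit_log(self):
        return {"sent": True}  # stand-in for a real network call

# Patch the method to a no-op for the duration of the with block, the
# same mechanism test_mode() applies to the DrakoClient methods above.
with mock.patch.object(FakeClient, "audit_log", lambda self: {}):
    assert FakeClient().audit_log() == {}          # patched: offline no-op
assert FakeClient().audit_log() == {"sent": True}  # cleanly restored on exit
```

mock.patch guarantees the original method is restored even if the block raises, which is what makes the context safe to nest inside test fixtures.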

Policy evaluation response shape

Inside test_mode(), evaluate_policy and evaluate_policy_sync return a dict with the following structure:
{
    "request_id": "<uuid>",
    "agent_id": "<agent_id>",
    "decision": "allowed" | "rejected",
    "trust_score": 0.85,
    "reasoning": ["test_mode: ..."],
    "policy_version": "test",
    "audit_hash": "0000...0000",
    "evaluated_at": <unix_timestamp>,
}
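A sketch of how such a payload could be constructed, mirroring the documented field names and values; make_offline_decision is a hypothetical helper, not part of Drako's API:

```python
import time
import uuid

# Hypothetical construction of the offline decision payload shown above.
def make_offline_decision(agent_id, approved):
    return {
        "request_id": str(uuid.uuid4()),
        "agent_id": agent_id,
        "decision": "allowed" if approved else "rejected",
        "trust_score": 0.85,
        "reasoning": ["test_mode: auto-resolved"],
        "policy_version": "test",
        "audit_hash": "0" * 64,
        "evaluated_at": time.time(),
    }

d = make_offline_decision("did:mesh:ag_abc", approved=True)
assert d["decision"] == "allowed"
```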

Examples

from drako import govern, test_mode

governed_agent = govern(agent)

with test_mode():
    result = governed_agent.run(task)  # runs offline; HITL auto-approved

MockHITLResolver

A pluggable resolver that lets you define per-tool approval rules for use inside test_mode(). Useful when you want most tools approved but specific dangerous tools denied.

Signature

from drako import MockHITLResolver

resolver = MockHITLResolver(
    default_action="approve",
    rules={
        "delete_records": "deny",
        "transfer_funds": "deny",
    },
)

Parameters

default_action
str
default:"approve"
Default decision for any tool not listed in rules. Must be "approve" or "deny". Any other value raises ValueError.
rules
dict[str, str] | None
Mapping of tool name to action ("approve" or "deny"). Tools listed here override default_action.

How it works

When test_mode(hitl=resolver) is active, every call to evaluate_policy / evaluate_policy_sync resolves through MockHITLResolver.resolve():
  1. Looks up tool_name in rules.
  2. Falls back to default_action if no rule matches.
  3. Appends the call details to resolver.call_log for later inspection.
  4. Returns {"decision": "allowed"} for "approve" and {"decision": "rejected"} for "deny".
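The four steps above can be sketched from scratch in a few lines. This SketchResolver is an illustrative reimplementation under the documented semantics, not Drako's shipped MockHITLResolver:

```python
import time

class SketchResolver:
    """From-scratch sketch of the documented resolve() behaviour."""

    def __init__(self, default_action="approve", rules=None):
        self.default_action = default_action
        self.rules = rules or {}
        self.call_log = []

    def resolve(self, tool_name, agent_id="", context=None):
        # Steps 1-2: look up a per-tool rule, fall back to the default.
        action = self.rules.get(tool_name, self.default_action)
        decision = "allowed" if action == "approve" else "rejected"
        # Step 3: record the call for later inspection.
        self.call_log.append({
            "tool_name": tool_name,
            "agent_id": agent_id,
            "decision": decision,
            "resolved_at": time.time(),
        })
        # Step 4: return the decision dict.
        return {"decision": decision, "reason": f"test_mode:{action}"}

r = SketchResolver(rules={"delete_records": "deny"})
assert r.resolve("web_search")["decision"] == "allowed"
assert r.resolve("delete_records")["decision"] == "rejected"
assert [e["tool_name"] for e in r.call_log] == ["web_search", "delete_records"]
```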

resolve() method

resolver.resolve(
    tool_name: str,
    agent_id: str = "",
    context: dict | None = None,
) -> dict
tool_name
str
required
Name of the tool being evaluated.
agent_id
str
DID or name of the acting agent. Recorded in call_log.
context
dict | None
Optional context dict (not used by the resolver itself).
Returns {"decision": "allowed" | "rejected", "reason": "test_mode:approve" | "test_mode:deny"}.

call_log

Each resolve() call appends an entry to resolver.call_log:
{
    "tool_name": "web_search",
    "agent_id": "did:mesh:ag_abc",
    "decision": "allowed",
    "resolved_at": 1711234567.123,
}
Inspect call_log in assertions to verify which tools were evaluated and in what order.

Example

from drako import govern, test_mode, MockHITLResolver

resolver = MockHITLResolver(
    default_action="approve",
    rules={
        "delete_records": "deny",
        "transfer_funds": "deny",
    },
)

with test_mode(hitl=resolver):
    crew = govern(crew)
    crew.kickoff()

# Assert that no sensitive tools were called
for entry in resolver.call_log:
    if entry["tool_name"] in ("delete_records", "transfer_funds"):
        assert entry["decision"] == "rejected", (
            f"Sensitive tool '{entry['tool_name']}' should have been denied"
        )

print(f"{len(resolver.call_log)} tool evaluations recorded.")

Helper functions

Two additional helpers are importable from drako.testing:
from drako.testing import is_test_mode, get_hitl_default
  • is_test_mode() -> bool — returns True when DRAKO_TEST_MODE is set to "true", "1", or "yes" in the environment.
  • get_hitl_default() -> str — returns the value of DRAKO_HITL_DEFAULT from the environment, or "approve" if the variable is not set.
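A hedged sketch of the documented environment-variable semantics (the real helpers live in drako.testing; these sketch_* names are stand-ins):

```python
import os

def sketch_is_test_mode() -> bool:
    # True for the documented truthy values: "true", "1", "yes".
    return os.environ.get("DRAKO_TEST_MODE", "").lower() in ("true", "1", "yes")

def sketch_get_hitl_default() -> str:
    # Falls back to "approve" when the variable is unset.
    return os.environ.get("DRAKO_HITL_DEFAULT", "approve")

os.environ["DRAKO_TEST_MODE"] = "1"
assert sketch_is_test_mode()
os.environ.pop("DRAKO_HITL_DEFAULT", None)
assert sketch_get_hitl_default() == "approve"
```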

Best practices

Without test_mode(), governed agents will attempt to reach the Drako API. In a CI environment with no DRAKO_API_KEY set, this will produce warnings and fall back to ungoverned mode. Any real HITL checkpoints will block indefinitely.
# GitHub Actions example
- name: Run tests
  env:
    DRAKO_TEST_MODE: "true"
    DRAKO_HITL_DEFAULT: "approve"
  run: pytest
Or wrap every test with test_mode():
# conftest.py
import pytest
from drako import test_mode

@pytest.fixture(autouse=True)
def drako_test_mode():
    with test_mode():
        yield
Use MockHITLResolver with default_action="deny" to verify that your agent handles blocked tools gracefully, then check call_log to confirm the right tools were evaluated.
def test_agent_handles_blocked_tool():
    resolver = MockHITLResolver(default_action="deny")

    with test_mode(hitl=resolver):
        crew = govern(build_crew())  # build_crew(): placeholder for your crew factory
        result = crew.kickoff()

    # Agent should complete with blocked message, not raise
    assert "[Drako] Action blocked" in str(result) or result is not None
    assert len(resolver.call_log) > 0
To simulate the behaviour of governance.on_backend_unreachable: block without a live backend, use hitl="auto-deny" and assert on the return value. To deny only specific tools, register per-tool "deny" rules and check the decisions recorded in call_log:
def test_dangerous_tool_is_blocked():
    resolver = MockHITLResolver(
        default_action="approve",
        rules={"exec_shell": "deny"},
    )

    with test_mode(hitl=resolver):
        crew = govern(build_crew())  # build_crew(): placeholder for your crew factory
        crew.kickoff()

    denied = [
        e for e in resolver.call_log
        if e["tool_name"] == "exec_shell"
    ]
    assert denied, "exec_shell was never evaluated"
    assert all(e["decision"] == "rejected" for e in denied)
test_mode() does not require a .drako.yaml file or an API key. You can call govern() inside a test_mode() context on any duck-typed crew/graph/chat object without configuring Drako at all.
class MinimalCrew:
    agents = []
    tasks = []
    def kickoff(self):
        return "done"

with test_mode():
    crew = govern(MinimalCrew())
    assert crew.kickoff() == "done"
Never use test_mode() outside of test code. It patches DrakoClient globally within the current process, bypassing all real governance checks.
