## Scoring
Scores start at 100, and points are deducted per finding, with per-severity caps to prevent any one category from dominating the total.

| Severity | Points per finding | Cap |
|---|---|---|
| CRITICAL | −15 | −60 |
| HIGH | −8 | −40 |
| MEDIUM | −3 | −20 |
| LOW | −1 | −10 |
Only vulnerability findings affect the score. Recommendations are displayed in the report but do not lower your governance score.
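The deduct-and-cap rule above can be sketched as follows. This is a minimal illustration of the documented point values and caps; the floor at zero is an assumption, not documented behavior.

```python
# Minimal sketch of the scoring rule (illustrative; the zero floor is an assumption).
CAPS = {"CRITICAL": (15, 60), "HIGH": (8, 40), "MEDIUM": (3, 20), "LOW": (1, 10)}

def governance_score(findings):
    """findings: list of severity strings, e.g. ["HIGH", "LOW"]."""
    score = 100
    for severity, (points, cap) in CAPS.items():
        deduction = sum(points for f in findings if f == severity)
        score -= min(deduction, cap)  # per-severity cap stops one category dominating
    return max(score, 0)

print(governance_score(["CRITICAL"] * 5))  # 5 × 15 = 75, capped at 60 → score 40
```

Note that ten LOW findings already hit the LOW cap, so an eleventh LOW finding costs nothing further.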
## Rule categories
| Category | Rules | What it catches |
|---|---|---|
| Security | SEC-001 → SEC-011 | Hardcoded keys, prompt injection, code execution, filesystem/network access |
| Governance | GOV-001 → GOV-011 | No audit logging, no HITL, self-modifying prompts, no fallback |
| Compliance | COM-001 → COM-005 | EU AI Act Art. 9, 11, 12, 14 gaps |
| Determinism | DET-001 → DET-007 | Temperature not set, no timeout, no retry, no iteration limit, no seed |
| Vendor Concentration | VCR-001 → VCR-003 | Same vendor across model + framework + cloud + governance layers |
| Framework-Specific | FW-001 → FW-010 | CrewAI delegation risks, AutoGen code exec defaults, LangGraph state issues |
| Operational Boundaries | ODD-001 → ODD-004 | No boundary definition, unrestricted tools, no spend cap |
| Magnitude | MAG-001 → MAG-003 | No spend cap, no rate limit, unclassified data access |
| Identity | ID-001 → ID-003 | Static credentials, shared credentials, no identity |
| Multi-Agent | MULTI-001 → MULTI-004 | No topology, circular deps, no conflict protection |
| Hooks | HOOK-001 → HOOK-003 | No pre-action validation, no session-end gate |
| Versioning | CV-001 → CV-002 | No policy versioning, no audit policy reference |
| FinOps | FIN-001 → FIN-003 | No cost tracking, single model for all tasks, no cache |
| Resilience | RES-001 → RES-002 | No fallback for critical ops, no state preservation |
| A2A | A2A-001 → A2A-003 | No A2A auth, unvalidated inter-agent input |
| Best Practices | BP-001 → BP-005 | Outdated framework, no tests, too many tools |
## Category notes
### Vendor Concentration (VCR)
These rules detect when your model, framework, and governance stack all come from the same vendor — flagging audit independence risk that vendor-affiliated tools have no incentive to report. For example, using OpenAI models, the OpenAI Agents SDK, and Azure-hosted infrastructure creates a concentration risk that generic security tools won’t flag.
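A VCR-style check could be sketched as below. The affiliation map (including grouping Azure with OpenAI because Azure hosts OpenAI models) is an illustrative assumption for this sketch, not the tool's actual vendor taxonomy.

```python
# Hypothetical sketch of a vendor-concentration check. The AFFILIATION map is an
# illustrative assumption, not the scanner's real vendor taxonomy.
AFFILIATION = {
    "gpt-4o": "openai",
    "openai-agents-sdk": "openai",
    "azure": "openai",       # grouped with OpenAI for this sketch (hosting affiliation)
    "claude-3-5-sonnet": "anthropic",
    "aws-bedrock": "amazon",
}

def vendor_concentrated(*layers):
    """True when every layer (model, framework, cloud, ...) maps to one vendor group."""
    groups = {AFFILIATION.get(layer, layer) for layer in layers}
    return len(groups) == 1

print(vendor_concentrated("gpt-4o", "openai-agents-sdk", "azure"))  # True
```

Adding an independent layer, such as a different cloud provider, breaks the concentration and the check returns False.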
### Framework-Specific (FW)
These rules detect known governance gaps in the frameworks you use, including default configurations that ship in an insecure state. A framework's rules are triggered only when that framework is detected in your project.
| Framework | Rules | What’s flagged |
|---|---|---|
| CrewAI | FW-001 → FW-003 | Unsafe code execution, memory isolation, delegation risks |
| LangGraph | FW-004 → FW-005 | Unrestricted ToolNode, no checkpointing |
| AutoGen | FW-006 → FW-007 | LocalCommandLineCodeExecutor, no output validation |
| Semantic Kernel | FW-008 → FW-009 | Auto-imported plugins, no cost guard |
| PydanticAI | FW-010 | Untyped tool returns |
### Determinism (DET)
The Determinism score is calculated independently of the Governance score, using only DET-* findings. This gives engineering teams a focused view of behavioral reproducibility, separate from security posture. Common determinism findings include: LLM temperature not set, no timeout configured, no retry logic, no iteration limit, and no seed parameter.
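A configuration that addresses each of the common DET-* findings might look like the sketch below. The parameter names are generic placeholders, not any specific framework's API.

```python
# Illustrative agent/LLM configuration covering the common DET-* findings.
# Parameter names are generic placeholders, not a real framework's settings.
llm_config = {
    "temperature": 0.0,    # DET: sampling temperature explicitly pinned
    "seed": 42,            # DET: seed set for reproducible sampling
    "timeout_s": 30,       # DET: per-request timeout configured
    "max_retries": 3,      # DET: bounded retry logic
    "max_iterations": 10,  # DET: hard cap on agent loop iterations
}

def covers_det_findings(cfg):
    """True when every determinism-related key is present in the config."""
    required = {"temperature", "seed", "timeout_s", "max_retries", "max_iterations"}
    return required.issubset(cfg)

print(covers_det_findings(llm_config))  # True
```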