How Scanning Works
Yggdrasil uses a deterministic rule engine — no ML models in the enforcement loop. This makes results reproducible and audit-ready.
Trigger the scan
Click “Run Scan” after confirming your column mapping. The API creates a scan record with status `running`.
Rule execution
The engine evaluates all active rules against your dataset:
- Single-transaction rules: Evaluated per row
- Windowed rules: Grouped by account, evaluated within time windows
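The two evaluation strategies can be sketched as follows. The rule shape, field names (`account_id`, `amount`), and predicate-based interface are illustrative assumptions, not the engine's actual schema:

```python
from collections import defaultdict

def evaluate_single(predicate, rows):
    """Single-transaction rules: test every row independently."""
    return [row for row in rows if predicate(row)]

def evaluate_windowed(predicate, rows):
    """Windowed rules: group rows by account, then test each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["account_id"]].append(row)
    return [acct for acct, group in groups.items() if predicate(group)]

rows = [
    {"account_id": "A", "amount": 12_000},
    {"account_id": "A", "amount": 300},
    {"account_id": "B", "amount": 500},
]

large = evaluate_single(lambda r: r["amount"] > 10_000, rows)  # rows over $10K
busy = evaluate_windowed(lambda g: len(g) >= 2, rows)          # accounts with 2+ rows
```

In this sketch `large` picks out account A's $12,000 row, and `busy` flags account A as the only account with two or more transactions.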
Violation detection
When a record matches a rule’s conditions:
- A violation is created with evidence (matched field values)
- Confidence score is calculated (rule quality + specificity + Bayesian precision)
- Explanation is generated from deterministic templates (no LLM calls)
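The document names the three confidence inputs but not the formula. A minimal sketch, assuming a simple weighted average clamped to [0, 1] (the weights here are invented for illustration):

```python
def confidence(rule_quality: float, specificity: float, precision: float) -> float:
    """Combine the three inputs named above; weights are assumptions."""
    score = 0.4 * rule_quality + 0.3 * specificity + 0.3 * precision
    return max(0.0, min(1.0, score))  # clamp to [0, 1]

confidence(0.9, 0.8, 0.7)
```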
Results storage
All violations are persisted to the database with:
- Policy excerpt violated
- Evidence grid (field values from your data)
- Explanation (human-readable condition summary)
- Severity (CRITICAL, HIGH, MEDIUM)
Scan Status Polling
Scans run synchronously but may take a few seconds for large datasets. Poll the status endpoint until the scan leaves the `running` state. Scans typically complete in under 5 seconds for datasets up to 50K rows.
Rule Types
The engine routes rules by type:
Single-Transaction Rules
Evaluated per row. Example: “Transaction amount > $10,000”
Windowed Rules
Grouped by account, evaluated within time windows. Example: “3+ transactions within 24 hours”
- aggregation: Sum/count aggregates (e.g., total transaction value)
- velocity: Transaction frequency limits
- structuring: Splitting transactions to avoid thresholds
- dormant_reactivation: Inactive accounts suddenly active
- round_amount: Suspicious even-number patterns
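The velocity example above (“3+ transactions within 24 hours”) can be sketched as a sliding-window check over one account's timestamps; the function signature and threshold names are assumptions for illustration:

```python
from datetime import datetime, timedelta

def velocity_hit(timestamps, min_count=3, window=timedelta(hours=24)):
    """True if any sliding window of `window` length holds min_count+ transactions."""
    ts = sorted(timestamps)
    for i in range(len(ts) - min_count + 1):
        # If the i-th and (i+min_count-1)-th transactions fall inside the
        # window, everything between them does too.
        if ts[i + min_count - 1] - ts[i] <= window:
            return True
    return False

ts = [datetime(2024, 1, 1, h) for h in (0, 5, 23)]  # three within 24 hours
velocity_hit(ts)  # True
```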
Condition Evaluation
Rules use compound boolean logic:
AND Conditions
All conditions must match.
OR Conditions
Any condition must match.
Nested Logic
Combine AND/OR for complex rules.
Supported Operators
| Operator | Aliases | Description |
|---|---|---|
| >= | greater_than_or_equal, gte | Numeric comparison |
| > | greater_than, gt | Numeric comparison |
| <= | less_than_or_equal, lte | Numeric comparison |
| < | less_than, lt | Numeric comparison |
| == | equals, eq | Equality with type coercion |
| != | not_equals, neq | Inequality |
| IN | — | Set membership |
| BETWEEN | — | Range check [min, max] |
| exists | — | Field present and non-empty |
| not_exists | — | Field missing or empty |
| contains | includes | Case-insensitive substring |
| MATCH | regex | Regular expression test |
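Taken together, the compound logic and operator semantics above can be sketched as a small recursive evaluator. The rule shape (`{"and": [...]}`, `{"or": [...]}`, leaf conditions with `field`/`op`/`value`) and the exact coercion rules are assumptions, not the engine's actual schema:

```python
import re

OPS = {
    ">": lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
    "<": lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b or str(a) == str(b),    # equality with coercion
    "!=": lambda a, b: not (a == b or str(a) == str(b)),
    "IN": lambda a, b: a in b,                        # set membership
    "BETWEEN": lambda a, b: b[0] <= a <= b[1],        # inclusive [min, max]
    "exists": lambda a, b: a not in (None, ""),
    "not_exists": lambda a, b: a in (None, ""),
    "contains": lambda a, b: str(b).lower() in str(a).lower(),
    "MATCH": lambda a, b: re.search(b, str(a)) is not None,
}

def matches(cond: dict, row: dict) -> bool:
    """Recursively evaluate a compound condition against one row."""
    if "and" in cond:
        return all(matches(c, row) for c in cond["and"])
    if "or" in cond:
        return any(matches(c, row) for c in cond["or"])
    return OPS[cond["op"]](row.get(cond["field"]), cond.get("value"))

# Nested logic: large amount AND (wire channel OR round amount)
rule = {"and": [
    {"field": "amount", "op": ">", "value": 10_000},
    {"or": [
        {"field": "channel", "op": "==", "value": "wire"},
        {"field": "amount", "op": "MATCH", "value": r"0{3}$"},
    ]},
]}

matches(rule, {"amount": 12_500, "channel": "wire"})  # True
```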
Violation Capping
To prevent overwhelming results:
- Each rule is capped at 1,000 violations
- The true count is tracked separately in `violation_count`
- Compliance score uses the true count, not the capped stored count
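The cap-versus-true-count behavior can be sketched as below; the collection interface is hypothetical, but the 1,000 cap comes from this document:

```python
CAP = 1_000  # per-rule storage cap from the doc

def collect_violations(matched):
    """Store at most CAP violations, but keep counting past the cap."""
    stored, true_count = [], 0
    for v in matched:
        true_count += 1
        if len(stored) < CAP:
            stored.append(v)
    return stored, true_count  # compliance score uses true_count

stored, total = collect_violations(range(1_500))
len(stored), total  # (1000, 1500)
```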
Performance
Scan execution time:
| Dataset Size | Typical Duration |
|---|---|
| 1,000 rows | < 1 second |
| 10,000 rows | 1-2 seconds |
| 50,000 rows | 3-5 seconds |
- In-memory execution (no database queries during evaluation)
- Concurrent batch writes for violations (2,500 rows per batch)
- Metadata caching for statistical anomaly detection
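The concurrent batch-write step might look like this sketch; the writer callback and worker count are assumptions, while the 2,500-row batch size comes from this document:

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 2_500  # rows per batch, per the doc

def write_in_batches(rows, write_batch, workers=4):
    """Split rows into fixed-size batches and write them concurrently."""
    batches = [rows[i:i + BATCH_SIZE] for i in range(0, len(rows), BATCH_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(write_batch, batches))  # drain to surface any errors
    return len(batches)

write_in_batches(list(range(6_000)), lambda batch: None)  # 3 batches
```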
What Gets Stored
After the scan completes:
Scan Record
- Status: `completed`
- Violation count (true count, pre-cap)
- Compliance score (0-100)
- Score history (initial entry)
- Mapping configuration (for auditability)
Violations
- Rule ID and name
- Severity
- Record ID (from your CSV)
- Evidence (matched field values)
- Policy excerpt violated
- Deterministic explanation
- Status: `pending` (awaiting review)
PII Findings
If PII detection was run, findings are linked to the scan via `scan_id`.
Next Steps
After the scan completes:
- Review violations by severity → Violation Review
- Approve or dismiss findings to improve rule precision
- Generate AI remediation steps → Remediation
- Export the full compliance report