System Overview
The OSS forensics system consists of:
- oss-forensics-agent: Main orchestrator (not yet documented - uses investigator agents)
- oss-investigator-gh-archive-agent: Queries GH Archive via BigQuery for tamper-proof events
- oss-investigator-github-agent: Queries GitHub API and recovers “deleted” commits
- oss-investigator-wayback-agent: Recovers deleted content via Wayback Machine
- oss-investigator-local-git-agent: Analyzes cloned repos for dangling commits
- oss-investigator-ioc-extractor-agent: Extracts IOCs from vendor security reports
- oss-hypothesis-former-agent: Forms evidence-backed hypotheses
- oss-evidence-verifier-agent: Verifies all evidence against original sources
- oss-hypothesis-checker-agent: Validates hypothesis claims
- oss-report-generator-agent: Produces final forensic report
Invocation
System Architecture
Evidence Sources
The system collects forensic evidence from:
GH Archive (BigQuery)
Tamper-proof event history
- PushEvents (commits pushed)
- PullRequestEvents (PRs opened/closed/merged)
- IssuesEvents (issues opened/closed)
- CreateEvent/DeleteEvent (branches/tags created/deleted)
- WorkflowRunEvent (GitHub Actions runs)
GitHub API
Live repository state
- Commits (including “deleted” ones via direct SHA access)
- Pull requests
- Issues
- Forks
Wayback Machine
Archived web snapshots
- Deleted repository pages
- Deleted issues/PRs
- Historical file content
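As a rough illustration of how archived snapshots might be looked up, the sketch below builds request URLs for the Wayback Machine availability endpoint and the CDX index. The function names and the example repository path are illustrative assumptions, not part of the agent itself.

```python
from urllib.parse import urlencode

# Public Wayback Machine endpoints.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def availability_url(target_url, timestamp=None):
    """Build a URL asking for the snapshot closest to `timestamp` (YYYYMMDD)."""
    params = {"url": target_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def cdx_url(target_url, limit=50):
    """Build a URL listing up to `limit` captures of `target_url` as JSON."""
    params = {"url": target_url, "output": "json", "limit": str(limit)}
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical target: the page of a repository that has since been deleted.
    print(availability_url("github.com/example-owner/example-repo", "20240101"))
    print(cdx_url("github.com/example-owner/example-repo"))
```

The URLs can be fetched with any HTTP client. Keeping URL construction separate from the network call makes the lookup easy to log alongside the recovered evidence.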
Local Git Forensics
Dangling commits and reflog
- Unreachable commits (force-pushed history)
- Reflog analysis
- Author/committer mismatches
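One way to surface unreachable commits in a cloned repository is to parse the output of `git fsck`. The sketch below is a minimal example under that assumption. The helper names are hypothetical, and the agent's actual procedure may differ.

```python
import re
import subprocess

# git fsck reports lines such as "dangling commit <sha>" or "unreachable commit <sha>".
FSCK_COMMIT_LINE = re.compile(r"^(?:dangling|unreachable) commit ([0-9a-f]{40})$")


def parse_fsck_commits(fsck_output):
    """Return commit SHAs reported by git fsck, in order, without duplicates."""
    shas = []
    for line in fsck_output.splitlines():
        match = FSCK_COMMIT_LINE.match(line.strip())
        if match and match.group(1) not in shas:
            shas.append(match.group(1))
    return shas


def find_unreachable_commits(repo_path):
    """Run git fsck in `repo_path` and return unreachable commit SHAs.

    --no-reflogs makes commits referenced only by the reflog count as
    unreachable, which is where force-pushed history tends to show up.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "fsck", "--unreachable", "--no-reflogs"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_fsck_commits(result.stdout)
```

Each SHA returned can then be inspected with `git show <sha>` to compare author and committer fields.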
Vendor Security Reports
Indicators of Compromise (IOCs)
- Commit SHAs
- Repository names
- Usernames
- Email addresses
- File paths
- URLs, IPs, domains
Investigator Agents
oss-investigator-gh-archive-agent
Specialty: GH Archive BigQuery queries for tamper-proof forensic evidence
Workflow:
Construct BigQuery Queries
Based on targets (repos, actors, date ranges), build queries for relevant event types
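To make the query-construction step concrete, here is a minimal sketch that builds a query string against the public `githubarchive.day` tables for one repository, a date range, and a set of event types. The function name, the selected columns, and the validation rules are assumptions for illustration. The agent's real queries are not shown in this document.

```python
import re
from datetime import date

# Conservative validation so user-supplied targets cannot alter the SQL.
REPO_PATTERN = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")
EVENT_TYPE_PATTERN = re.compile(r"^[A-Za-z]+Event$")


def build_gharchive_query(repo, start, end, event_types):
    """Build a BigQuery SQL string for GH Archive daily tables (years 20xx)."""
    if not REPO_PATTERN.match(repo):
        raise ValueError(f"invalid repository name: {repo!r}")
    for event_type in event_types:
        if not EVENT_TYPE_PATTERN.match(event_type):
            raise ValueError(f"invalid event type: {event_type!r}")
    types_sql = ", ".join(f"'{t}'" for t in event_types)
    # Daily tables are named YYYYMMDD; with the `20*` wildcard the
    # _TABLE_SUFFIX pseudo-column holds the remaining YYMMDD part.
    return (
        "SELECT type, actor.login AS actor, repo.name AS repo, created_at, payload\n"
        "FROM `githubarchive.day.20*`\n"
        f"WHERE _TABLE_SUFFIX BETWEEN '{start:%y%m%d}' AND '{end:%y%m%d}'\n"
        f"  AND repo.name = '{repo}'\n"
        f"  AND type IN ({types_sql})\n"
        "ORDER BY created_at"
    )


if __name__ == "__main__":
    print(build_gharchive_query(
        "example-owner/example-repo",  # hypothetical target
        date(2024, 1, 1),
        date(2024, 1, 7),
        ["PushEvent", "DeleteEvent"],
    ))
```

Restricting `_TABLE_SUFFIX` to the date range of interest keeps the scan limited to the relevant daily tables, which matters for BigQuery cost.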
oss-investigator-github-agent
Specialty: ALL GitHub API operations, including commit recovery via direct SHA access
Key Capability: “Deleted” commits remain accessible via SHA even after force-push or branch deletion
Workflow:
Commits are only truly gone if the entire repo is deleted AND no public forks exist. Otherwise, they remain forensically accessible.
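Direct SHA access can be sketched with the GitHub REST endpoint for a single commit. The example below is a minimal illustration. The helper names are hypothetical, the request is unauthenticated, and rate limits and error handling are left out.

```python
import json
import re
import urllib.request

SHA_PATTERN = re.compile(r"^[0-9a-f]{40}$")


def commit_api_url(owner, repo, sha):
    """Build the REST API URL for one commit, addressed by its full SHA."""
    sha = sha.lower()
    if not SHA_PATTERN.match(sha):
        raise ValueError(f"expected a 40-character hex SHA, got {sha!r}")
    return f"https://api.github.com/repos/{owner}/{repo}/commits/{sha}"


def fetch_commit(owner, repo, sha):
    """Fetch commit metadata as a dict (performs a network request)."""
    request = urllib.request.Request(
        commit_api_url(owner, repo, sha),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical owner/repo; the SHA is the example value from the IOC table.
    print(commit_api_url(
        "example-owner", "example-repo",
        "678851bbe9776228f55e0460e66a6167ac2a1685",
    ))
```

The same SHA can be tried against each public fork when the original repository no longer exists.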
oss-investigator-wayback-agent
Specialty: Wayback Machine recovery for truly deleted content
When to Use: Content deleted from GitHub and not accessible via API
Workflow:
oss-investigator-local-git-agent
Specialty: Local git repository forensics for dangling commits
Workflow:
oss-investigator-ioc-extractor-agent
Specialty: Extract IOCs from vendor security reports
When to Run: Only when vendor report URL is provided
IOC Types Extracted:
| Type | Pattern Examples |
|---|---|
| COMMIT_SHA | 40-char hex: 678851bbe9776228f55e0460e66a6167ac2a1685 |
| REPOSITORY | owner/repo format |
| USERNAME | GitHub usernames |
| Email addresses in commits/reports | |
| FILE_PATH | src/malware.js |
| TAG_NAME | v1.0.0, stability |
| BRANCH_NAME | main, feature-x |
| URL | GitHub URLs, external URLs |
| IP_ADDRESS | IPv4/IPv6 addresses |
| DOMAIN | Domain names |
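A few of the IOC types above lend themselves to simple pattern matching. The sketch below covers three of them with regular expressions. The patterns are simplified assumptions (IPv4 only, no octet range validation) and do not represent the agent's actual extraction logic.

```python
import re

# Simplified, illustrative patterns for a subset of the IOC types.
IOC_PATTERNS = {
    "COMMIT_SHA": re.compile(r"\b[0-9a-f]{40}\b"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "URL": re.compile(r"https?://[^\s<>\"')\]]+"),
}


def extract_iocs(text):
    """Return a dict mapping IOC type to a sorted list of unique matches."""
    found = {}
    for ioc_type, pattern in IOC_PATTERNS.items():
        # Trim sentence punctuation that commonly trails a URL in prose.
        matches = {m.rstrip(".,;:") for m in pattern.findall(text)}
        if matches:
            found[ioc_type] = sorted(matches)
    return found


if __name__ == "__main__":
    report = (
        "The malicious commit 678851bbe9776228f55e0460e66a6167ac2a1685 "
        "contacted 203.0.113.7 and fetched https://example.com/payload.sh."
    )
    print(extract_iocs(report))
```

Regex matches are only candidates. Each extracted value still needs to be verified against its original source before it is cited.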
Analysis Agents
oss-hypothesis-former-agent
Role: ANALYST - reads evidence and forms hypotheses
Does NOT collect evidence - if more evidence needed, writes evidence request file
Workflow:
Assess Evidence Sufficiency
Can we answer:
- Timeline: When did events occur?
- Attribution: Who did what?
- Intent: What was the goal?
- Impact: What was affected?
Request More Evidence (If Needed)
Write evidence-request-{counter}.md. The orchestrator reads this file and spawns the appropriate investigator agent.
oss-evidence-verifier-agent
Role: VERIFIER - verifies existing evidence against original sources
Does NOT collect new evidence
Workflow:
- GH Archive evidence via BigQuery
- GitHub API-sourced evidence
- Wayback snapshots
- Local git commits
- Vendor IOCs
Output: evidence-verification-report.md with verification status
oss-hypothesis-checker-agent
Role: VALIDATOR - validates hypothesis claims against verified evidence
Does NOT collect new evidence or form hypotheses
Workflow:
Mechanical Format Check
- All claims have [EVD-XXX] citations?
- All cited evidence exists in evidence.json?
- All cited evidence is VERIFIED?
Content Validation
- Timeline chronologically consistent?
- Attribution sufficiently supported?
- No logical contradictions?
- No unsupported leaps in reasoning?
Decision
REJECT if:
- Missing citations
- Citations to non-existent evidence
- Citations to UNVERIFIED evidence
- Timeline inconsistencies
- Unsupported claims
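The citation-related rejection criteria can be checked mechanically. The sketch below is a minimal example that assumes evidence has already been loaded into a mapping from evidence ID to status string. That shape is a hypothetical stand-in, since the schema of evidence.json is not described here. It checks only the citations that are present and cannot tell whether every claim carries one.

```python
import re

# Matches citations such as [EVD-001]; the ID format is assumed from [EVD-XXX].
CITATION = re.compile(r"\[(EVD-[A-Za-z0-9]+)\]")


def check_citations(hypothesis_text, evidence_status):
    """Return a list of problems; an empty list means the citation checks pass.

    `evidence_status` maps evidence ID -> status (hypothetical structure).
    """
    cited = sorted(set(CITATION.findall(hypothesis_text)))
    problems = []
    if not cited:
        problems.append("no citations found")
    for evidence_id in cited:
        status = evidence_status.get(evidence_id)
        if status is None:
            problems.append(f"{evidence_id}: not found in evidence")
        elif status != "VERIFIED":
            problems.append(f"{evidence_id}: status is {status}, not VERIFIED")
    return problems


if __name__ == "__main__":
    text = "The actor pushed the commit [EVD-001] and deleted the branch [EVD-002]."
    print(check_citations(text, {"EVD-001": "VERIFIED", "EVD-002": "UNVERIFIED"}))
```

Timeline consistency and unsupported reasoning still require content-level review. This check only covers the format rules.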
oss-report-generator-agent
Role: REPORT GENERATOR - produces final forensic report
Does NOT investigate or validate
Workflow:
Load Confirmed Hypothesis
- hypothesis-YYY-confirmed.md
- evidence.json
- evidence-verification-report.md
Output Artifacts
Requirements
- GOOGLE_APPLICATION_CREDENTIALS: For BigQuery access to GH Archive
- Git client for local repository analysis
- Internet access for GitHub API and Wayback Machine
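Before starting an investigation it may help to confirm that the local requirements are in place. The sketch below is one possible preflight check. The function name and messages are illustrative assumptions, and it does not test network access.

```python
import os
import shutil


def missing_requirements(env=None, which=shutil.which):
    """Return a list of human-readable problems; empty means all checks passed."""
    env = os.environ if env is None else env
    missing = []
    cred_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not cred_path:
        missing.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isfile(cred_path):
        missing.append("GOOGLE_APPLICATION_CREDENTIALS points to a missing file")
    if which("git") is None:
        missing.append("git executable not found on PATH")
    return missing


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```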
Related Agents
OffSec Specialist
Offensive security operations and vulnerability research
Crash Analysis
Autonomous root-cause analysis for crashes