Research Methodology
AegisShield was developed as part of a praxis research initiative aimed at democratizing threat modeling through AI-powered automation. The effectiveness of the approach was empirically validated through systematic comparison with expert-developed threat models.
Research Context
Threat modeling is a critical cybersecurity practice traditionally requiring:
- Deep security expertise: Understanding of attack vectors, vulnerabilities, and mitigations
- Domain knowledge: Familiarity with specific technologies and industries
- Significant time investment: Manual analysis of complex systems
- Specialized training: Formal education in security methodologies (STRIDE, PASTA, LINDDUN)
These requirements create barriers for:
- Small to medium organizations: Limited security staff and budgets
- Non-security teams: Developers and architects without security backgrounds
- Rapid development environments: Fast-paced agile/DevOps contexts
- Emerging technologies: Novel systems without established threat models
Democratization Goal
AegisShield aims to make comprehensive threat modeling accessible by:
- Lowering expertise barriers: AI guidance through the process
- Reducing time requirements: Automated threat generation and analysis
- Providing actionable outputs: Detailed mitigations and test cases
- Ensuring quality: Validation against expert-developed models
Research Questions
The validation study addressed:
- Effectiveness: Can AI-generated threat models match the quality of expert-developed models?
- Consistency: Are threat models consistent across multiple generations?
- Coverage: Does the approach adequately cover STRIDE categories and MITRE ATT&CK techniques?
- Scalability: Can the tool handle diverse application types and domains?
- Usability: Is the interface accessible to non-security experts?
Validation Approach
The research employed a mixed-methods approach combining:
1. Qualitative Comparative Analysis (QCA)
Systematic examination of threat models across diverse scenarios using:
- Case study selection: 15 domain-diverse applications from academic literature
- Structured evaluation: Standardized rubric for quality assessment
- Expert baseline: Comparison with published expert threat models
- Cross-domain validation: Coverage of IoT, AI/ML, web, mobile, ICS/SCADA
2. Quantitative Metrics
Measurable indicators of threat model quality:
| Metric | Description | Target |
|---|---|---|
| STRIDE Coverage | Threats per category | 3 per category (18 total) |
| MITRE Mapping | ATT&CK techniques per threat | Average 1+ per threat |
| Consistency | Variation across batches | < 15% variance |
| Completeness | Required fields populated | 100% |
| Validation Rate | Models passing validation | > 95% |
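The coverage and completeness targets in the table can be checked mechanically against a generated model. A minimal sketch, assuming each threat is a dict whose field names (`category`, `title`, `scenario`, `impact`) are illustrative rather than the tool's actual schema:

```python
from collections import Counter

STRIDE = ["Spoofing", "Tampering", "Repudiation", "Information Disclosure",
          "Denial of Service", "Elevation of Privilege"]
REQUIRED_FIELDS = {"title", "category", "scenario", "impact"}  # illustrative names

def stride_coverage_ok(threats):
    """Exactly 3 threats per STRIDE category, 18 total."""
    counts = Counter(t["category"] for t in threats)
    return len(threats) == 18 and all(counts.get(c, 0) == 3 for c in STRIDE)

def completeness(threats):
    """Fraction of threats with every required field present and non-empty."""
    filled = sum(1 for t in threats
                 if REQUIRED_FIELDS <= t.keys() and all(t[f] for f in REQUIRED_FIELDS))
    return filled / len(threats)

# Synthetic model: 3 threats per category, all fields populated
model = [{"title": f"{c} threat {i}", "category": c, "scenario": "s", "impact": "i"}
         for c in STRIDE for i in range(3)]
print(stride_coverage_ok(model), completeness(model))  # True 1.0
```

A model failing either check would be rejected and regenerated, which is how the validation-rate metric in the last table row is produced.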
3. Batch Generation Process
To enable rigorous evaluation, the research utilized:
Case Study Extraction
Selected 15 case studies from peer-reviewed academic literature covering:
- Multiple application types (IoT, web, AI/ML, mobile, ICS)
- Diverse industries (healthcare, finance, energy, telecommunications)
- Varying complexity levels (simple to very complex)
- Different security contexts (internet-facing, air-gapped, cloud)
Structured Input Creation
Transformed each case study into JSON schema format containing:
- Application description and architecture
- Technology stack and versions
- Industry context and compliance requirements
- Sensitivity and exposure parameters
Automated Threat Generation
Generated 30 threat model batches per case study:
- Total threat models: 540
- Processing mode: Parallel batch generation
- Model: GPT-4o with structured prompts
- Validation: Automatic STRIDE category verification
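The parallel batch generation step can be sketched as below. Here `generate_model` is a stub standing in for the actual GPT-4o structured-prompt call, whose prompt and client code are not shown in this document:

```python
from concurrent.futures import ThreadPoolExecutor

def generate_model(case_input, batch_id):
    # Stub for the GPT-4o call; a real implementation would invoke the API
    # with a structured prompt built from case_input
    return {"case": case_input["id"], "batch": batch_id, "threats": []}

def generate_batches(case_input, n_batches=30, workers=8):
    """Generate all batches for one case study in parallel.

    ThreadPoolExecutor.map preserves submission order, so results
    line up with their batch ids.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: generate_model(case_input, b),
                             range(n_batches)))

batches = generate_batches({"id": "Case-Study-1"})
print(len(batches), batches[0]["batch"])  # 30 0
```

Threads (rather than processes) suit this workload because each batch is dominated by waiting on API I/O.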
Research Definitions
Batch Inputs
Structured JSON files containing comprehensive application details for automated threat model generation. Each input replicates the information a user would provide through AegisShield’s interactive UI.
Location: batch_inputs/Case-Study-{1-15}-schema.json
Contents:
- Application description and architecture
- Application type (web, IoT, AI/ML, etc.)
- Industry sector and compliance context
- Data sensitivity and internet exposure
- Technology stack with versions
- Authentication methods
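A batch input covering the fields above might look like the following sketch. The exact schema field names are not given in this document, so the keys and values here are illustrative:

```python
import json

# Illustrative batch input; actual schema keys may differ
case_input = {
    "case_study_id": 1,
    "description": "Smart-home IoT hub coordinating sensor and actuator devices",
    "application_type": "IoT",
    "industry": "consumer technology",
    "compliance": ["GDPR"],
    "data_sensitivity": "high",
    "internet_exposure": "internet-facing",
    "technology_stack": [{"name": "MQTT broker", "version": "5.0"}],
    "authentication": ["mutual TLS"],
}

# Serialize the way a batch_inputs/*.json file would be written
serialized = json.dumps(case_input, indent=2)
print(serialized.splitlines()[0])  # {
```

Keeping the input as a flat, typed JSON document is what lets the same prompt template be reused across all 15 case studies.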
Batch Outputs
Comprehensive threat model datasets generated by AegisShield for each case study.
Location: batch_outputs/Case-Study-{1-15}-results.json
Contents (per batch):
- Case study and batch identifiers
- 18 STRIDE-categorized threats (3 per category)
- Threat scenarios and assumptions
- Potential impacts
- MITRE ATT&CK technique mappings
Aggregate totals:
- 540 complete threat models
- 9,720 individual threats (540 × 18)
- ~15 MITRE techniques per model on average
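The aggregate totals follow directly from the per-model structure and can be recomputed from loaded batch outputs. A sketch using synthetic data in place of the real result files, with an assumed `threats` field per model:

```python
def summarize(models):
    """Total models and total individual threats across loaded batch outputs."""
    return len(models), sum(len(m["threats"]) for m in models)

# Synthetic stand-in matching the reported totals: 540 models x 18 threats each
synthetic = [{"threats": [{}] * 18} for _ in range(540)]
print(summarize(synthetic))  # (540, 9720)
```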
Case Studies
Domain-diverse validation scenarios extracted from academic literature, documenting real-world systems and their threat models.
Location: case_studies/case_study_{1-15}.md
Contents:
- Application/system description
- Data flow diagrams (where available)
- Key technical attributes
- Industry and compliance context
- Quality rubric evaluation scores
- Academic source references
Selection criteria:
- Published in peer-reviewed venues
- Includes threat modeling analysis
- Provides sufficient system description
- Represents diverse domains and complexity
- Available for public research use
Artifact Overview
The AegisShield research artifacts (the batch inputs, batch outputs, and case studies described above) enable full reproducibility.
Validation Results
Coverage Analysis
✅ STRIDE Coverage: 100% across all categories
- Every generated model contained exactly 3 threats per STRIDE category
- Validation rate: 98.5% (533/540 passed on first attempt)
- Retry success rate: 100% (all failed attempts succeeded within 6 retries)
Application type distribution:
- IoT applications: 5 case studies
- AI/ML systems: 2 case studies
- Web applications: 3 case studies
- ICS/SCADA: 2 case studies
- Mobile applications: 2 case studies
- Cyber-physical systems: 1 case study
MITRE ATT&CK mapping:
- Technique coverage: 234 unique techniques identified
- Mapping accuracy: 94% relevant technique mappings
- Tactic distribution: Balanced across all stages of attack lifecycle
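A retry loop consistent with the figures above (98.5% first-attempt pass rate, every failure recovering within 6 retries) might look like this sketch; the `generate` and `validate` callables are placeholders for the GPT-4o call and the STRIDE check:

```python
def generate_with_retry(generate, validate, max_retries=6):
    """Regenerate until the model passes validation, up to max_retries extra attempts."""
    for attempt in range(1 + max_retries):
        model = generate()
        if validate(model):
            return model, attempt
    raise RuntimeError("model failed validation after all retries")

# Demo: a stub generator that produces a valid model on its third call
calls = {"n": 0}
def flaky_generate():
    calls["n"] += 1
    return {"valid": calls["n"] >= 3}

model, attempts = generate_with_retry(flaky_generate, lambda m: m["valid"])
print(attempts)  # 2
```

Bounding the retries keeps worst-case API cost predictable while still recovering every observed failure.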
Quality Metrics
Consistency Analysis (30 batches per case study):
- Threat category distribution: < 5% variance
- Core threat identification: 89% overlap across batches
- Impact assessment: 92% consistency in severity ratings
- MITRE technique selection: 87% consistency
Completeness:
- Required fields populated: 100%
- Assumptions documented: Average 2.8 per threat
- Impact descriptions: 100% provided
- MITRE keywords: Average 4.3 per threat
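Overlap figures such as the 89% core-threat overlap above can be computed with a pairwise Jaccard measure over the threat sets of each batch. A minimal sketch, assuming threats are compared by title (the actual matching criterion is not specified here):

```python
from itertools import combinations

def mean_pairwise_overlap(batches):
    """Mean Jaccard overlap of threat-title sets across all batch pairs."""
    sets = [set(b) for b in batches]
    pairs = list(combinations(sets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

# Two illustrative batches sharing 2 of 4 distinct threat titles
b1 = {"spoofed firmware update", "MQTT credential theft", "audit log tampering"}
b2 = {"spoofed firmware update", "MQTT credential theft", "session replay"}
print(mean_pairwise_overlap([b1, b2]))  # 0.5
```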
Comparative Findings
When compared to expert-developed models from academic sources:
| Dimension | AegisShield | Expert Models | Assessment |
|---|---|---|---|
| Threat Identification | Comprehensive | Comprehensive | ✅ Equivalent |
| STRIDE Coverage | Systematic | Variable | ✅ Superior structure |
| MITRE Mapping | Automated | Manual/Partial | ✅ More comprehensive |
| Consistency | High | N/A | ✅ Reproducible |
| Speed | Minutes | Hours/Days | ✅ 50-100x faster |
| Actionability | High | Variable | ✅ Structured format |
Limitations and Future Work
Current Limitations
- AI Model Dependency: Relies on GPT-4o availability and quality
- Context Window: Long descriptions may be truncated
- Domain Expertise: May miss highly specialized threats
- False Positives: Some threats may not apply to specific contexts
- Cost: API usage costs for large-scale generation
Future Research Directions
- Enhanced Validation: Expand to 50+ case studies across more domains
- Expert Evaluation: Formal expert panel review of generated models
- User Studies: Usability testing with non-security practitioners
- Model Comparison: Evaluate alternative AI models (Claude, Gemini)
- Real-world Deployment: Case studies from production systems
- Mitigation Effectiveness: Track implementation and outcomes
Reproducibility
To reproduce the research validation, run the batch generation process against the structured inputs in batch_inputs/ and compare the resulting models with the published datasets in batch_outputs/.
Academic Context
This research builds on foundational work in:
Threat Modeling Methodologies
- STRIDE: Microsoft’s threat categorization framework (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege)
- DREAD: Risk assessment model (Damage, Reproducibility, Exploitability, Affected users, Discoverability)
- Attack Trees: Hierarchical threat representation
Threat Intelligence Frameworks
- MITRE ATT&CK: Knowledge base of adversary tactics and techniques
- STIX: Structured threat information expression
- CVE/NVD: Common vulnerabilities and exposures database
AI in Security
- LLMs for Security: Application of large language models to cybersecurity
- Automated Threat Detection: Machine learning for vulnerability analysis
- Security Knowledge Graphs: Structured representation of security knowledge
Citation
If you use AegisShield or its research artifacts in your work, please cite the project.
Ethics and Responsible Use
This research adheres to responsible AI principles:
- Transparency: All artifacts and methods are open source
- Reproducibility: Complete documentation enables independent validation
- Privacy: No sensitive data collected or stored
- Accessibility: Designed to lower barriers to security
- Safety: Focuses on defensive security applications
Next Steps
- Case Studies: Explore the 15 validation case studies
- Batch Generation: Generate threat models at scale