PDF Report Generation

AegisShield generates comprehensive PDF reports that compile all threat modeling artifacts into a single, professional document suitable for distribution to stakeholders, auditors, and development teams.

Overview

The PDF generation module (tabs/step7_generate_pdf.py) uses markdown2 to convert Markdown to HTML and xhtml2pdf (built on ReportLab) to render that HTML as a PDF.

20-40 Pages: Comprehensive threat model documentation
10 Sections: From executive summary to test cases
Professional Format: Styled tables, code blocks, and diagrams

Report Structure

The generated PDF includes these sections:

Executive Summary

Application name and type
Industry sector
Internet exposure
Key findings overview
Risk summary

Application Description

Detailed application description
Architecture overview
Technology stack with versions
Authentication methods
Data sensitivity classification

STRIDE Threat Model

All 18 threats (3 per category)
Threat scenarios
Assumptions (Role, Condition)
Potential impacts
MITRE ATT&CK keywords
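The "3 per category" layout can be checked before the report is rendered. The following is a minimal sketch, assuming each threat dict carries the "Threat Type" key used by convert_stride_to_html_table() and that the type strings match the six STRIDE names exactly; the helper names are hypothetical and not part of AegisShield:

```python
from collections import Counter

STRIDE_CATEGORIES = (
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information Disclosure",
    "Denial of Service",
    "Elevation of Privilege",
)

def count_threats_by_category(threat_model):
    """Return {category: count} for the six STRIDE categories (hypothetical helper)."""
    counts = Counter(threat.get("Threat Type") for threat in threat_model)
    return {category: counts.get(category, 0) for category in STRIDE_CATEGORIES}

def unexpected_categories(threat_model, expected_per_category=3):
    """List the categories whose threat count differs from the expected value."""
    counts = count_threats_by_category(threat_model)
    return [name for name, n in counts.items() if n != expected_per_category]
```

An empty list from unexpected_categories() means the model has the expected 18 threats, three in each category.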

MITRE ATT&CK Analysis

Mapped techniques with IDs
Technique descriptions
Attack pattern IDs
Links to ATT&CK framework

Vulnerability Assessment

NVD CVEs for technology stack
CVSS scores and severity
CVE descriptions
Affected components

Threat Intelligence

AlienVault OTX pulses
Industry-specific threats
Adversary groups
Malware families
Indicators of Compromise (IOCs)

Mitigation Strategies

Specific mitigations per threat
Implementation guidance
Best practices
Control recommendations

DREAD Risk Assessment

Risk scores (1-10) for each threat
Damage, Reproducibility, Exploitability, Affected Users, Discoverability
Overall risk score
Prioritized threat list
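This page does not show how the overall score is derived. In the conventional DREAD scheme it is the mean of the five factor ratings, so the sketch below works under that assumption; the function names and dictionary keys are illustrative and not taken from AegisShield:

```python
DREAD_FACTORS = (
    "Damage",
    "Reproducibility",
    "Exploitability",
    "Affected Users",
    "Discoverability",
)

def dread_score(ratings):
    """Average the five factor ratings (each expected in the range 1-10)."""
    values = [ratings[factor] for factor in DREAD_FACTORS]
    if any(not 1 <= value <= 10 for value in values):
        raise ValueError("each DREAD rating must be between 1 and 10")
    return sum(values) / len(values)

def prioritize(threats):
    """Sort (name, ratings) pairs from highest to lowest overall score."""
    return sorted(threats, key=lambda item: dread_score(item[1]), reverse=True)
```

Sorting by the averaged score is one straightforward way to produce the prioritized threat list.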

Attack Trees

Mermaid diagram visualization
Hierarchical attack paths
STRIDE category organization
Sub-threats and attack vectors

Security Test Cases

Gherkin-formatted test scenarios
Given-When-Then structure
Threat-specific validation
Ready for test framework integration

Implementation

Dependencies

From step7_generate_pdf.py:17-24

import base64
import io
import json
from datetime import datetime
import markdown2
from xhtml2pdf import pisa

xhtml2pdf requires Cairo and Pango libraries for PDF rendering. See Installation for setup.

Core Conversion Functions

convert_markdown_to_html()

Basic markdown to HTML conversion:

step7_generate_pdf.py:103-114

def convert_markdown_to_html(markdown_text):
    return markdown2.markdown(markdown_text)

convert_markdown_to_html_desc()

Enhanced conversion with extras:

step7_generate_pdf.py:116-127

def convert_markdown_to_html_desc(markdown_text):
    return markdown2.markdown(
        markdown_text, 
        extras=["fenced-code-blocks", "tables", "strike", "cuddled-lists"]
    )

convert_stride_to_html_table()

Converts STRIDE threat model to HTML table:

def convert_stride_to_html_table(threat_model_json):
    html = '<table class="threat-table">'
    html += '<tr><th>Threat Type</th><th>Scenario</th><th>Impact</th><th>Assumptions</th></tr>'
    
    for threat in threat_model_json:
        assumptions_html = '<ul>'
        for assumption in threat.get('Assumptions', []):
            assumptions_html += f'<li><strong>{assumption.get("Assumption")}</strong> '
            assumptions_html += f'(Role: {assumption.get("Role")}, '
            assumptions_html += f'Condition: {assumption.get("Condition")})</li>'
        assumptions_html += '</ul>'
        
        html += '<tr>'
        html += f'<td>{threat.get("Threat Type")}</td>'
        html += f'<td>{threat.get("Scenario")}</td>'
        html += f'<td>{threat.get("Potential Impact")}</td>'
        html += f'<td>{assumptions_html}</td>'
        html += '</tr>'
    
    html += '</table>'
    return html
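convert_stride_to_html_table() interpolates threat fields directly into markup, so a scenario containing < or & could break the table. Below is a minimal sketch of one way to guard against that with the standard library's html.escape. The helper name stride_row_html is hypothetical, and the row layout simply mirrors the function above:

```python
from html import escape

def stride_row_html(threat):
    """Build one escaped <tr> for a threat dict (hypothetical helper)."""
    items = []
    for assumption in threat.get("Assumptions", []):
        items.append(
            "<li><strong>{}</strong> (Role: {}, Condition: {})</li>".format(
                escape(str(assumption.get("Assumption", ""))),
                escape(str(assumption.get("Role", ""))),
                escape(str(assumption.get("Condition", ""))),
            )
        )
    cells = [
        escape(str(threat.get("Threat Type", ""))),
        escape(str(threat.get("Scenario", ""))),
        escape(str(threat.get("Potential Impact", ""))),
        "<ul>{}</ul>".format("".join(items)),  # list items are already escaped
    ]
    return "<tr>{}</tr>".format("".join("<td>{}</td>".format(c) for c in cells))
```

Only the text values are escaped; the list and table tags are written by the helper itself and are left as markup.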

format_gherkin_tests()

Converts Gherkin test cases to HTML and renames the code block's CSS class so the .gherkin-test style can be applied:

def format_gherkin_tests(test_cases_markdown):
    # Convert markdown to HTML
    html = markdown2.markdown(test_cases_markdown, extras=["fenced-code-blocks"])
    
    # Apply Gherkin-specific styling
    html = html.replace('<code class="gherkin">', '<code class="gherkin-test">')
    
    return html

PDF Styling

The module embeds a CSS stylesheet covering body text, headings, tables, and code blocks:

PDF styles

<style>
    body {
        font-family: Arial, sans-serif;
        line-height: 1.6;
        color: #333;
        margin: 40px;
    }
    
    h1 {
        color: #1e3a8a;
        border-bottom: 3px solid #3b82f6;
        padding-bottom: 10px;
        page-break-after: avoid;
    }
    
    h2 {
        color: #1e3a8a;
        border-bottom: 2px solid #3b82f6;
        padding-bottom: 8px;
        page-break-after: avoid;
        margin-top: 30px;
    }
    
    table {
        width: 100%;
        border-collapse: collapse;
        margin: 20px 0;
        page-break-inside: avoid;
    }
    
    table th {
        background-color: #1e3a8a;
        color: white;
        padding: 12px;
        text-align: left;
        font-weight: bold;
    }
    
    table td {
        border: 1px solid #ddd;
        padding: 10px;
        vertical-align: top;
    }
    
    table tr:nth-child(even) {
        background-color: #f9fafb;
    }
    
    code {
        background-color: #f3f4f6;
        padding: 2px 6px;
        border-radius: 3px;
        font-family: 'Courier New', monospace;
        font-size: 0.9em;
    }
    
    pre {
        background-color: #1f2937;
        color: #f9fafb;
        padding: 15px;
        border-radius: 5px;
        overflow-x: auto;
        page-break-inside: avoid;
    }
    
    .gherkin-test {
        color: #10b981;
    }
</style>

The color scheme uses AegisShield’s brand colors: primary (#1e3a8a), light (#3b82f6), and dark (#0f172a).

Generation Process

PDF generation workflow

import io
from datetime import datetime

import markdown2
import streamlit as st
from xhtml2pdf import pisa

# css_styles and the convert_* helpers are assumed to be defined elsewhere in the module

def generate_pdf():
    # 1. Collect all data from session state
    app_description = st.session_state.get('app_input', '')
    threat_model = st.session_state.get('session_threat_model_json', [])
    mitre_data = st.session_state.get('mitre_attack_markdown', '')
    mitigations = st.session_state.get('session_mitigations_markdown', '')
    dread = st.session_state.get('session_dread_assessment_markdown', '')
    test_cases = st.session_state.get('session_test_cases_markdown', '')
    attack_tree = st.session_state.get('attack_tree_code', '')
    
    # 2. Convert sections to HTML
    html_parts = []
    
    # Title page
    html_parts.append('<h1>Threat Model Report</h1>')
    html_parts.append(f'<p><strong>Generated:</strong> {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}</p>')
    
    # Application description
    html_parts.append('<h2>Application Description</h2>')
    html_parts.append(convert_markdown_to_html_desc(app_description))
    
    # Threat model
    html_parts.append('<h2>STRIDE Threat Model</h2>')
    html_parts.append(convert_stride_to_html_table(threat_model))
    
    # MITRE ATT&CK
    html_parts.append(convert_mitre_attack_to_html(mitre_data))
    
    # Mitigations
    html_parts.append('<h2>Mitigation Strategies</h2>')
    html_parts.append(convert_markdown_to_html(mitigations))
    
    # DREAD
    html_parts.append('<h2>DREAD Risk Assessment</h2>')
    html_parts.append(convert_dread_to_html_table(dread))
    
    # Attack tree
    html_parts.append('<h2>Attack Tree</h2>')
    html_parts.append(f'<pre><code>{attack_tree}</code></pre>')
    
    # Test cases
    html_parts.append('<h2>Security Test Cases</h2>')
    html_parts.append(format_gherkin_tests(test_cases))
    
    # 3. Combine HTML with CSS
    full_html = f'''
    <html>
    <head>
        <style>
            {css_styles}
        </style>
    </head>
    <body>
        {''.join(html_parts)}
    </body>
    </html>
    '''
    
    # 4. Generate PDF
    output = io.BytesIO()
    pisa_status = pisa.CreatePDF(
        full_html.encode('utf-8'),
        dest=output
    )
    
    if pisa_status.err:
        raise Exception("PDF generation failed")
    
    # 5. Return PDF bytes
    pdf_bytes = output.getvalue()
    output.close()
    
    return pdf_bytes
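Mermaid source can contain characters that are significant in HTML (for example <br/> inside node labels), and generate_pdf() places attack_tree inside <pre><code> unescaped. The sketch below shows one way to escape it first. The helper name attack_tree_block is hypothetical:

```python
from html import escape

def attack_tree_block(mermaid_code):
    """Wrap Mermaid source in <pre><code>, escaping HTML-significant characters."""
    # quote=False leaves quotation marks alone; only &, < and > are replaced.
    return "<pre><code>{}</code></pre>".format(escape(mermaid_code, quote=False))
```

The returned string could be appended to html_parts in place of the raw f-string.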

Streamlit UI

The PDF generation tab provides a button that builds the report and then renders a download link:

UI implementation

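import base64
import logging
from datetime import datetime

import streamlit as st

# Assumed: a standard library logger; the module may configure its own
logger = logging.getLogger(__name__)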

if st.button("Generate PDF Report"):
    with st.spinner("Generating PDF Report..."):
        try:
            # Generate PDF
            pdf_bytes = generate_pdf()
            
            # Encode for download
            b64_pdf = base64.b64encode(pdf_bytes).decode()
            
            # Create download link
            filename = f"threat_model_{datetime.now().strftime('%Y%m%d_%H%M%S')}.pdf"
            href = f'<a href="data:application/pdf;base64,{b64_pdf}" download="{filename}">Download PDF Report</a>'
            
            st.success("PDF generated successfully!")
            st.markdown(href, unsafe_allow_html=True)
            
        except Exception as e:
            st.error(f"Error generating PDF: {str(e)}")
            logger.error(f"PDF generation error: {str(e)}", exc_info=True)
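The link-building step can be pulled out into a small pure function, which makes it testable without running Streamlit. This is a sketch only; build_download_link is a hypothetical name, and the markup matches the snippet above:

```python
import base64

def build_download_link(pdf_bytes, filename, label="Download PDF Report"):
    """Return an <a> tag that embeds the PDF as a base64 data URI."""
    b64_pdf = base64.b64encode(pdf_bytes).decode("ascii")
    return (
        f'<a href="data:application/pdf;base64,{b64_pdf}" '
        f'download="{filename}">{label}</a>'
    )
```

Because the whole file is embedded in the page, large reports make the link heavy. Streamlit's built-in st.download_button is an alternative worth considering.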

Page Breaks

The module controls pagination with CSS page-break properties:

Page break control

# Force page break before major sections
html = f'<div style="page-break-before: always;">{html}</div>'

# Avoid breaking tables
html = f'<div style="page-break-inside: avoid;">{html}</div>'

# Keep headings with content
h2_style = 'page-break-after: avoid;'

Page breaks ensure that tables, code blocks, and related content stay together in the PDF.
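The wrappers above can be combined into one helper so each section states its pagination needs in a single call. This is a sketch; wrap_section and its parameters are hypothetical names:

```python
def wrap_section(section_html, break_before=False, keep_together=False):
    """Wrap a section in a <div> carrying the requested page-break rules."""
    rules = []
    if break_before:
        rules.append("page-break-before: always;")
    if keep_together:
        rules.append("page-break-inside: avoid;")
    if not rules:
        return section_html  # nothing requested, leave the markup untouched
    return '<div style="{}">{}</div>'.format(" ".join(rules), section_html)
```

For example, wrap_section(table_html, keep_together=True) asks the renderer not to split that table across pages.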

File Naming

Generated PDFs use timestamped filenames:

filename = f"threat_model_{datetime.now().strftime('%Y%m%d_%H%M%S')}.pdf"
# Example: threat_model_20240311_143052.pdf
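Passing the timestamp in as an argument makes the filename deterministic in tests. A small sketch follows, with report_filename as a hypothetical helper name:

```python
from datetime import datetime

def report_filename(now=None):
    """Return threat_model_<YYYYMMDD>_<HHMMSS>.pdf for the given moment."""
    now = now or datetime.now()
    return f"threat_model_{now.strftime('%Y%m%d_%H%M%S')}.pdf"
```

Calling report_filename(datetime(2024, 3, 11, 14, 30, 52)) reproduces the example name shown above.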

Troubleshooting

Cairo/Pango Not Found

Symptom: OSError: cannot load library 'libcairo.so.2'

Solution:

Ubuntu/Debian

sudo apt-get update
sudo apt-get install libcairo2-dev libpango1.0-dev
pip install --upgrade xhtml2pdf

macOS

brew install cairo pango
pip install --upgrade xhtml2pdf

Unicode Errors

Symptom: Characters not rendering correctly in PDF.

Solution: The code uses UTF-8 encoding:

pisa.CreatePDF(full_html.encode('utf-8'), dest=output)

Ensure all input data is UTF-8 encoded.
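One way to enforce that is to normalize every value to str before it is placed in the HTML. This is a sketch under the assumption that inputs may arrive as either bytes or str; to_text is a hypothetical helper:

```python
def to_text(value, encoding="utf-8"):
    """Return str; decode bytes as UTF-8, replacing undecodable bytes."""
    if isinstance(value, bytes):
        # errors="replace" substitutes U+FFFD instead of raising UnicodeDecodeError
        return value.decode(encoding, errors="replace")
    return str(value)
```

Replacement characters in the output are then a visible hint that some input was not valid UTF-8.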

Tables Broken Across Pages

Symptom: Tables split awkwardly across pages.

Solution: Add page-break-inside: avoid to table styles:

table {
    page-break-inside: avoid;
}

Large PDF File Size

Symptom: PDF files larger than expected (>10 MB).

Solution:

Remove embedded images if any
Limit attack tree complexity
Reduce number of threat model iterations

Customization

Custom Branding

Add your organization’s branding:

# Add logo to title page
logo_html = f'''
<div style="text-align: center;">
    <img src="data:image/png;base64,{logo_base64}" width="200">
    <h1>Threat Model Report</h1>
    <p><strong>{organization_name}</strong></p>
</div>
'''
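The snippet above assumes logo_base64 and organization_name already exist. The sketch below shows one way to produce them from raw image bytes. build_title_block is a hypothetical name, and the organization name is escaped as a precaution:

```python
import base64
from html import escape

def build_title_block(logo_bytes, organization_name):
    """Return title-page HTML with the logo embedded as a base64 data URI."""
    logo_base64 = base64.b64encode(logo_bytes).decode("ascii")
    return (
        '<div style="text-align: center;">'
        f'<img src="data:image/png;base64,{logo_base64}" width="200">'
        "<h1>Threat Model Report</h1>"
        f"<p><strong>{escape(organization_name)}</strong></p>"
        "</div>"
    )
```

In practice logo_bytes would come from reading a PNG file in binary mode; the file location is up to your deployment.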

Custom Sections

Add additional sections:

# Compliance section
html_parts.append('<h2>Compliance Mapping</h2>')
html_parts.append('<p>This threat model addresses the following compliance requirements:</p>')
html_parts.append('<ul>')
html_parts.append('<li><strong>NIST SP 800-53:</strong> AC-3, IA-5, SC-7, SI-4, AU-3</li>')
html_parts.append('<li><strong>GDPR:</strong> Article 32 (Security of Processing)</li>')
html_parts.append('</ul>')

Custom Styling

Modify the CSS to match your brand:

h1, h2 {
    color: #your-brand-color;
    border-bottom-color: #your-brand-color;
}

table th {
    background-color: #your-brand-color;
}

Session State Requirements

The PDF generator requires these session variables:

step6_completed (bool, required): All previous steps completed
app_input (str, required): Application description
session_threat_model_json (list, required): STRIDE threat model
mitre_attack_markdown (str, required): MITRE ATT&CK mappings
session_mitigations_markdown (str, required): Mitigation strategies
session_dread_assessment_markdown (str, required): DREAD risk assessment
session_test_cases_markdown (str, required): Security test cases
attack_tree_code (str, required): Attack tree diagram
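Since generate_pdf() reads each value with a default, a missing key silently yields an empty section. A pre-flight check can surface that early. This is a sketch that works on any dict-like object with a get() method, such as st.session_state; missing_session_keys is a hypothetical helper:

```python
REQUIRED_KEYS = (
    "step6_completed",
    "app_input",
    "session_threat_model_json",
    "mitre_attack_markdown",
    "session_mitigations_markdown",
    "session_dread_assessment_markdown",
    "session_test_cases_markdown",
    "attack_tree_code",
)

def missing_session_keys(state):
    """Return the required keys that are absent or empty in a dict-like state."""
    return [key for key in REQUIRED_KEYS if not state.get(key)]
```

If the returned list is non-empty, the UI could show a warning instead of generating an incomplete report.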

Best Practices

Complete All Steps

Ensure all 6 previous steps are completed before generating the PDF. Missing data results in incomplete reports.

Review Before Export

Review each section in the UI tabs before generating the PDF. Make any necessary corrections.

Save Incrementally

Download intermediate outputs (threat model JSON, test cases) as you go. Don’t rely solely on the final PDF.

Version Control

Include PDF reports in version control alongside code for traceability.

Related Features

7-Step Process - Complete workflow overview
Threat Modeling - STRIDE threat generation
Attack Trees - Diagram visualization
Test Cases - Gherkin test generation
