
Data Flow Diagram (DFD) Generation

Generate Yourdon-DeMarco Data Flow Diagrams using structured analysis notation.

Command

arckit dfd "<system or process>"

Description

Creates Data Flow Diagrams (DFDs) using Yourdon-DeMarco structured analysis notation. Shows how data moves through your system with processes, data stores, and external entities.

Arguments

  • System or process: Name of system or process to diagram (e.g., ‘user registration’, ‘claims processing’)

When to Use

  • Data requirements analysis
  • GDPR / UK GDPR compliance documentation
  • PII handling and data residency analysis
  • Data transformation pipelines
  • System integration planning

Required Context

  • Requirements (ARC--REQ-.md) - For DR, INT, FR
  • Data Model (ARC--DATA-.md) - For entities and relationships
  • Architecture Principles (PRIN) - For data governance
  • Architecture Diagrams (DIAG) - For component context

Interactive Configuration

Prompts for DFD level:
  1. Context Diagram (Level 0) (Recommended) - Single process, system boundary
  2. Level 1 DFD - Decompose into major sub-processes
  3. Level 2 DFD - Detailed process decomposition
  4. All Levels (0-1) - Generate both Context and Level 1

Yourdon-DeMarco Notation

| Symbol | Shape | Description |
|---|---|---|
| External Entity | Rectangle | Source/sink outside system |
| Process | Circle (bubble) | Transforms data |
| Data Store | Parallel lines | Data at rest |
| Data Flow | Named arrow | Data in motion |

DFD Levels

Level 0 (Context Diagram)

Purpose: Show the entire system as a single process with its external entities.

Example:
Customer → Payment System → Bank
Merchant → Payment System → Merchant

Level 1

Purpose: Decompose the system into major sub-processes with data stores.
  • Processes numbered: 1, 2, 3, etc.
  • Data stores numbered: D1, D2, D3, etc.

Level 2+

Purpose: Further decompose specific Level 1 processes.
  • Processes numbered: 1.1, 1.2, 1.3 (for Process 1), etc.
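
The dotted numbering scheme above encodes the decomposition hierarchy, so a process's parent and level can be read straight from its number. The following is a minimal Python sketch of that idea; the helper names (`parent_process`, `level_of`, `orphans`) are hypothetical and are not part of arckit.

```python
from typing import List, Optional


def parent_process(number: str) -> Optional[str]:
    """Return the parent process number, or None for a Level 1 process."""
    parts = number.split(".")
    if len(parts) == 1:
        return None
    return ".".join(parts[:-1])


def level_of(number: str) -> int:
    """'1' is a Level 1 process, '1.2' a Level 2 process, and so on."""
    return len(number.split("."))


def orphans(numbers: List[str]) -> List[str]:
    """List sub-processes whose parent process is missing from the set."""
    known = set(numbers)
    missing = []
    for number in numbers:
        parent = parent_process(number)
        if parent is not None and parent not in known:
            missing.append(number)
    return missing


if __name__ == "__main__":
    print(parent_process("1.2"))             # parent of sub-process 1.2
    print(orphans(["1", "2", "1.1", "3.1"]))  # 3.1 has no Process 3 above it
```

A check like `orphans` is one way to catch a Level 2 process that was numbered under a Level 1 process that does not exist.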

Output Formats

Format 1: data-flow-diagram DSL

Text format for the data-flow-diagram Python tool:
title Level 0 - Payment System

entity CUST "Customer"
entity BANK "Bank System"
process P0 "Payment System"

CUST --> P0 "Payment Request"
P0 --> CUST "Confirmation"
P0 --> BANK "Transaction"
BANK --> P0 "Response"

Render with: pip install data-flow-diagram && dfd < file.dfd
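
If the diagram elements already exist as data (for example, extracted from a requirements document), the DSL text can be assembled programmatically. The sketch below is illustrative only: `to_dsl` is a hypothetical helper, and the statement forms it emits simply mirror the listing above rather than a verified grammar for the data-flow-diagram tool.

```python
from typing import List, Tuple


def to_dsl(
    title: str,
    entities: List[Tuple[str, str]],
    processes: List[Tuple[str, str]],
    flows: List[Tuple[str, str, str]],
) -> str:
    """Build DSL text in the same shape as the Level 0 listing above."""
    lines = [f"title {title}", ""]
    # Declarations: identifier followed by a quoted display name.
    lines += [f'entity {eid} "{name}"' for eid, name in entities]
    lines += [f'process {pid} "{name}"' for pid, name in processes]
    lines.append("")
    # Flows: source --> destination, with the flow name in quotes.
    lines += [f'{src} --> {dst} "{label}"' for src, dst, label in flows]
    return "\n".join(lines)


dsl = to_dsl(
    "Level 0 - Payment System",
    entities=[("CUST", "Customer"), ("BANK", "Bank System")],
    processes=[("P0", "Payment System")],
    flows=[
        ("CUST", "P0", "Payment Request"),
        ("P0", "CUST", "Confirmation"),
        ("P0", "BANK", "Transaction"),
        ("BANK", "P0", "Response"),
    ],
)

if __name__ == "__main__":
    print(dsl)
```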

Format 2: Mermaid (Approximate)

A flowchart approximation for inline rendering (for example, in GitHub or VS Code).
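
As a rough illustration of how such an approximation could be produced, the Python sketch below maps DFD elements onto standard Mermaid flowchart shapes: a rectangle for external entities, a circle for processes, and a cylinder as a stand-in for data stores. The shape mapping and the `to_mermaid` helper are assumptions for illustration, not necessarily what arckit emits.

```python
from typing import List, Tuple

# Mermaid flowchart node syntax per DFD element kind (assumed mapping).
SHAPES = {
    "entity": '{id}["{name}"]',     # rectangle
    "process": '{id}(("{name}"))',  # circle
    "store": '{id}[("{name}")]',    # cylinder, approximating parallel lines
}


def to_mermaid(
    nodes: List[Tuple[str, str, str]],
    flows: List[Tuple[str, str, str]],
) -> str:
    """Render (id, kind, name) nodes and (src, dst, label) flows as Mermaid."""
    lines = ["flowchart LR"]
    for node_id, kind, name in nodes:
        lines.append("    " + SHAPES[kind].format(id=node_id, name=name))
    for src, dst, label in flows:
        lines.append(f'    {src} -->|"{label}"| {dst}')
    return "\n".join(lines)


mermaid = to_mermaid(
    nodes=[
        ("CUST", "entity", "Customer"),
        ("BANK", "entity", "Bank System"),
        ("P0", "process", "Payment System"),
    ],
    flows=[
        ("CUST", "P0", "Payment Request"),
        ("P0", "CUST", "Confirmation"),
        ("P0", "BANK", "Transaction"),
        ("BANK", "P0", "Response"),
    ],
)

if __name__ == "__main__":
    print(mermaid)
```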

Document Contents

Generated document includes:
  1. DFD in both formats (DSL + Mermaid)
  2. Process specifications table
  3. Data store descriptions
  4. Data dictionary (all data flows defined)
  5. Requirements traceability

Process Specification Table

| Process | Name | Inputs | Outputs | Logic | Requirements |
|---|---|---|---|---|---|
| 1 | Validate Payment | Payment Request, Customer Details | Validated Payment, Error | Validate card, verify customer | FR-001, DR-002 |

Data Store Table

| Store | Name | Contents | Access | Retention | PII |
|---|---|---|---|---|---|
| D1 | Transaction Log | Transaction ID, amount, status | R/W by P2, R by P3 | 7 years | No |
| D2 | Customer Records | Name, email, card token | R by P1, W by P2 | Account lifetime | Yes |

Validation Rules

Yourdon-DeMarco compliance:
  • ✅ Every process has at least one input AND one output
  • ✅ No “black holes” (only inputs) or “miracles” (only outputs)
  • ✅ Data stores have at least one read and one write
  • ✅ All data flows are named
  • ✅ External entities only connect to processes
  • ✅ Process numbering is consistent across levels
  • ✅ Level 1 decomposes from Level 0
  • ✅ Data flows balance across levels
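
Several of these rules are purely structural and can be checked mechanically. The following Python sketch covers a subset (black holes, miracles, data store read/write, named flows, and external entity connections); it does not cover numbering consistency or balancing across levels. The `Flow` type and `validate` function are hypothetical and do not describe arckit's internals.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Flow:
    src: str
    dst: str
    name: str


def validate(kinds: Dict[str, str], flows: List[Flow]) -> List[str]:
    """Check a DFD. `kinds` maps node id to 'entity', 'process', or 'store'."""
    errors = []
    for node, kind in kinds.items():
        has_in = any(f.dst == node for f in flows)
        has_out = any(f.src == node for f in flows)
        if kind == "process":
            if has_in and not has_out:
                errors.append(f"{node}: black hole (inputs only)")
            elif has_out and not has_in:
                errors.append(f"{node}: miracle (outputs only)")
            elif not has_in and not has_out:
                errors.append(f"{node}: process has no data flows")
        elif kind == "store":
            if not has_in:
                errors.append(f"{node}: data store is never written")
            if not has_out:
                errors.append(f"{node}: data store is never read")
    for f in flows:
        if not f.name.strip():
            errors.append(f"{f.src} -> {f.dst}: data flow is unnamed")
        ends = (kinds.get(f.src), kinds.get(f.dst))
        # An external entity may only exchange data with a process.
        if "entity" in ends and "process" not in ends:
            errors.append(f"{f.src} -> {f.dst}: external entity must connect to a process")
    return errors


if __name__ == "__main__":
    kinds = {"CUST": "entity", "P1": "process", "P2": "process"}
    flows = [
        Flow("CUST", "P1", "Payment Request"),
        Flow("P1", "P2", "Validated Payment"),
    ]
    for problem in validate(kinds, flows):
        print(problem)
```

In the usage example, P2 receives "Validated Payment" but produces nothing, so it is reported as a black hole.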

Output

Creates: projects/{project}/diagrams/ARC-{PROJECT_ID}-DFD-{NNN}-v1.0.md

Naming Conventions

  • Processes: Verb + Noun (“Validate Payment”, “Process Order”)
  • Data Stores: Plural noun (“Customer Records”, “Transaction Log”)
  • External Entities: Specific role (“Customer”, “Payment Gateway”)
  • Data Flows: Noun phrase (“Payment Details”, “Order Confirmation”)

Related Commands

  • arckit diagram - Generate C4 or deployment diagrams
  • arckit data-model - Create formal data model
  • arckit requirements - Source data requirements

Next Steps

After creating the DFD:
  1. Render with the data-flow-diagram CLI for true Yourdon-DeMarco notation
  2. Or view Mermaid version in GitHub/VS Code
  3. Validate all flows are named
  4. Check process balance (no black holes/miracles)
  5. Create Level 2 for complex processes
  6. Use for GDPR/data protection documentation
