
Fragmentation Architecture

TurkeyDPI implements packet fragmentation through a transform pipeline that operates on outgoing network data. The system is designed to be modular, configurable, and verifiable.

Core Fragmentation Engine

Fragment Transform

The primary fragmentation logic lives in the FragmentTransform struct:
// From engine/src/transform/fragment.rs:9-18
pub struct FragmentTransform {
    params: FragmentParams,
}

impl FragmentTransform {
    pub fn new(params: &FragmentParams) -> Self {
        Self {
            params: params.clone(),
        }
    }
}

Configuration Parameters

Fragmentation behavior is controlled by FragmentParams:
// Defined in engine/src/config.rs (referenced in fragment.rs)
pub struct FragmentParams {
    pub min_size: usize,           // Minimum fragment size
    pub max_size: usize,           // Maximum fragment size  
    pub split_at_offset: Option<usize>,  // Specific split point
    pub randomize: bool,           // Randomize fragment sizes
}
The split_at_offset parameter allows precise control over where data is split, enabling targeted fragmentation of protocol fields like SNI or Host headers.
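As a concrete illustration, here is a standalone sketch that re-declares the struct above and targets the middle of an 11-byte hostname (the SNI offset 152 is a made-up value for the example, not taken from the codebase):

```rust
// Standalone sketch: FragmentParams as shown above, re-declared so the
// example compiles on its own. The SNI offset (152) is hypothetical.
#[derive(Clone, Debug)]
pub struct FragmentParams {
    pub min_size: usize,
    pub max_size: usize,
    pub split_at_offset: Option<usize>,
    pub randomize: bool,
}

fn main() {
    // Split in the middle of an 11-byte hostname that starts at byte 152,
    // mirroring the sni_off + sni_len / 2 rule used by the bypass engine.
    let params = FragmentParams {
        min_size: 1,
        max_size: 16,
        split_at_offset: Some(152 + 11 / 2),
        randomize: false,
    };
    assert_eq!(params.split_at_offset, Some(157));
}
```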

Fragmentation Algorithm

Size Calculation

Fragment sizes can be fixed or pseudo-randomized:
// From engine/src/transform/fragment.rs:20-32
fn calculate_fragment_size(&self, remaining: usize) -> usize {
    if self.params.randomize {
        let range = self.params.max_size - self.params.min_size;
        if range == 0 {
            self.params.min_size
        } else {
            // Deterministic pseudo-random based on data size
            let pseudo_random = (remaining * 31337) % (range + 1);
            self.params.min_size + pseudo_random
        }
    } else {
        self.params.max_size
    }
}
The pseudo-random algorithm is deterministic (no true RNG) to ensure reproducible behavior for testing and debugging.
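The formula can be checked by hand. A minimal sketch of the same arithmetic as a free function (repo types omitted):

```rust
// Sketch of the deterministic size formula quoted above, as a free
// function so the arithmetic can be verified in isolation.
fn calculate_fragment_size(min_size: usize, max_size: usize, randomize: bool, remaining: usize) -> usize {
    if randomize {
        let range = max_size - min_size;
        if range == 0 {
            min_size
        } else {
            // Deterministic pseudo-random based on data size
            min_size + (remaining * 31337) % (range + 1)
        }
    } else {
        max_size
    }
}

fn main() {
    // min=3, max=7 → range=4; remaining=43:
    // 43 * 31337 = 1_347_491; 1_347_491 % 5 = 1 → size = 3 + 1 = 4
    assert_eq!(calculate_fragment_size(3, 7, true, 43), 4);
    // Same inputs always give the same size — reproducible, as the text notes.
    assert_eq!(
        calculate_fragment_size(3, 7, true, 43),
        calculate_fragment_size(3, 7, true, 43)
    );
    // Non-randomized mode always returns max_size.
    assert_eq!(calculate_fragment_size(3, 7, false, 43), 7);
}
```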

Data Fragmentation

The core fragmentation logic handles two modes:
// From engine/src/transform/fragment.rs:34-57
pub fn fragment_data(&self, data: &[u8]) -> Vec<BytesMut> {
    let mut fragments = Vec::new();
    let mut offset = 0;
    
    // Mode 1: Split at specific offset (for SNI/Host targeting)
    if let Some(split_at) = self.params.split_at_offset {
        if split_at > 0 && split_at < data.len() {
            let first = BytesMut::from(&data[..split_at]);
            let second = BytesMut::from(&data[split_at..]);
            fragments.push(first);
            fragments.push(second);
            return fragments;
        }
    }
    
    // Mode 2: Chunk into multiple segments
    while offset < data.len() {
        let remaining = data.len() - offset;
        let size = self.calculate_fragment_size(remaining).min(remaining);
        
        let fragment = BytesMut::from(&data[offset..offset + size]);
        fragments.push(fragment);
        offset += size;
    }

    fragments
}

Mode 1: Targeted Split

When split_at_offset is set, the transform produces exactly two fragments:
Input:  [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
                ↑ split_at_offset = 5
Output:
  Fragment 1: [0 1 2 3 4]
  Fragment 2: [5 6 7 8 9 10 11 12 13 14 15]
This is used for SNI/Host header fragmentation where we know the exact byte offset.

Mode 2: Size-Based Chunking

When no offset is specified, the data is split into chunks of max_size bytes (or randomized sizes between min_size and max_size):
Input:  [50 bytes]
max_size: 5

Output:
  Fragment 1: [5 bytes]
  Fragment 2: [5 bytes]
  Fragment 3: [5 bytes]
  ...
  Fragment 10: [5 bytes]
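Both modes can be exercised with a standalone sketch, using plain Vec<u8> in place of BytesMut so the example has no external dependencies (only the non-randomized size path is shown):

```rust
// Standalone sketch of the two fragmentation modes described above.
// Vec<u8> stands in for BytesMut; randomized sizing is omitted.
fn fragment_data(data: &[u8], split_at_offset: Option<usize>, max_size: usize) -> Vec<Vec<u8>> {
    // Mode 1: targeted split into exactly two fragments.
    if let Some(split_at) = split_at_offset {
        if split_at > 0 && split_at < data.len() {
            return vec![data[..split_at].to_vec(), data[split_at..].to_vec()];
        }
    }
    // Mode 2: fixed-size chunking.
    data.chunks(max_size).map(|c| c.to_vec()).collect()
}

fn main() {
    // Mode 1: split a 16-byte buffer at offset 5, as in the diagram above.
    let input: Vec<u8> = (0..16).collect();
    let frags = fragment_data(&input, Some(5), 5);
    assert_eq!(frags.len(), 2);
    assert_eq!(frags[0], (0..5).collect::<Vec<u8>>());

    // Mode 2: 50 bytes with max_size = 5 → 10 equal chunks.
    let frags = fragment_data(&vec![0u8; 50], None, 5);
    assert_eq!(frags.len(), 10);
}
```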

Transform Pipeline Integration

Flow Context

Fragments are emitted into a FlowContext that manages output:
// From engine/src/transform/fragment.rs:65-104
impl Transform for FragmentTransform {
    fn apply(&self, ctx: &mut FlowContext<'_>, data: &mut BytesMut) 
            -> Result<TransformResult> {
        // Skip if packet is too small
        if data.len() <= self.params.min_size {
            trace!(
                flow = ?ctx.key,
                size = data.len(),
                "packet too small to fragment"
            );
            return Ok(TransformResult::Continue);
        }

        let fragments = self.fragment_data(data);
        
        if fragments.len() <= 1 {
            return Ok(TransformResult::Continue);
        }

        debug!(
            flow = ?ctx.key,
            original_size = data.len(),
            fragments = fragments.len(),
            "fragmented packet"
        );

        // Track statistics
        ctx.state.transform_state.fragment.fragments_generated += 
            fragments.len() as u32;

        // Emit fragments
        for (i, fragment) in fragments.into_iter().enumerate() {
            if i == 0 {
                // Replace input buffer with first fragment
                data.clear();
                data.extend_from_slice(&fragment);
            } else {
                // Emit subsequent fragments to output queue
                ctx.emit(fragment);
            }
        }

        Ok(TransformResult::Fragmented)
    }
}

Key Design Decisions

Instead of creating entirely new packets, the first fragment overwrites the input buffer (data). Why:
  • Avoids an extra allocation
  • Maintains packet metadata (sequence numbers, etc.)
  • Preserves ordering naturally
Additional fragments are emitted via ctx.emit(fragment) into an output queue. Why:
  • Allows the pipeline to process the first fragment further
  • Defers transmission for timing control
  • Enables interleaving with other transforms
ctx.state.transform_state.fragment.fragments_generated counts total fragments. Purpose:
  • Debugging and monitoring
  • Performance analysis
  • User feedback (“X packets fragmented”)
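The buffer-reuse pattern can be sketched without the repo's FlowContext, using Vec<u8> for the buffer and a plain Vec for the output queue (both stand-ins, not the real types):

```rust
// Sketch of the emit pattern described above: fragment 0 is written back
// into the caller's buffer, later fragments go to an output queue.
// Vec<u8> and Vec<Vec<u8>> stand in for BytesMut and the FlowContext queue.
fn emit_fragments(data: &mut Vec<u8>, queue: &mut Vec<Vec<u8>>, fragments: Vec<Vec<u8>>) {
    for (i, fragment) in fragments.into_iter().enumerate() {
        if i == 0 {
            // Reuse the input buffer: no new allocation, metadata preserved.
            data.clear();
            data.extend_from_slice(&fragment);
        } else {
            // Defer transmission so timing control can apply later.
            queue.push(fragment);
        }
    }
}

fn main() {
    let mut data = b"hello world".to_vec();
    let mut queue = Vec::new();
    emit_fragments(&mut data, &mut queue, vec![b"hello ".to_vec(), b"world".to_vec()]);
    assert_eq!(data, b"hello ".to_vec()); // first fragment replaced the buffer
    assert_eq!(queue.len(), 1);           // the rest wait in the queue
}
```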

Bypass Engine Integration

The higher-level BypassEngine uses fragmentation for specific protocols:

TLS Fragmentation

// From engine/src/bypass.rs:165-224 (abbreviated)
fn process_tls_client_hello(&self, data: &[u8], result: &mut BypassResult) {
    if !self.config.fragment_sni {
        result.fragments.push(Bytes::copy_from_slice(data));
        return;
    }
    
    if let Some(info) = parse_client_hello(data) {
        result.hostname = info.sni_hostname.clone();
        
        // Determine split position
        let split_pos = if self.config.tls_split_pos > 0 {
            // Fixed header position
            self.config.tls_split_pos.min(data.len() - 1)
        } else if let (Some(sni_off), Some(sni_len)) = 
                  (info.sni_offset, info.sni_length) {
            // Middle of SNI hostname
            if sni_len > 2 {
                sni_off + (sni_len / 2)
            } else {
                sni_off
            }.min(data.len() - 1)
        } else {
            5.min(data.len() - 1)  // Default: after TLS header
        };
        
        // Apply segmentation if needed
        let segment_size = self.config.max_segment_size.max(1);
        
        if segment_size < split_pos {
            // Multi-segment first part
            let mut pos = 0;
            while pos < split_pos {
                let end = (pos + segment_size).min(split_pos);
                result.fragments.push(
                    Bytes::copy_from_slice(&data[pos..end])
                );
                pos = end;
            }
            // Second part as single fragment
            result.fragments.push(
                Bytes::copy_from_slice(&data[split_pos..])
            );
        } else {
            // Simple two-way split
            result.fragments.push(
                Bytes::copy_from_slice(&data[..split_pos])
            );
            result.fragments.push(
                Bytes::copy_from_slice(&data[split_pos..])
            );
        }
        
        result.modified = true;
        
        // Add timing delays if configured
        if self.config.fragment_delay_us > 0 {
            result.inter_fragment_delay = Some(
                Duration::from_micros(self.config.fragment_delay_us)
            );
        }
    }
}

Multi-Level Fragmentation

Notice the nested logic:
  1. Primary split at split_pos (SNI location or header boundary)
  2. Secondary segmentation if max_segment_size is small
Example with split_pos=50, max_segment_size=10:
Original: [100 bytes]

Step 1: Split at 50
  Part A: [0..50]  (50 bytes)
  Part B: [50..100] (50 bytes)

Step 2: Segment Part A
  Fragment 1: [0..10]   (10 bytes)
  Fragment 2: [10..20]  (10 bytes)
  Fragment 3: [20..30]  (10 bytes)
  Fragment 4: [30..40]  (10 bytes)
  Fragment 5: [40..50]  (10 bytes)

Step 3: Keep Part B whole
  Fragment 6: [50..100] (50 bytes)

Final: 6 fragments
This creates asymmetric fragmentation: heavy splitting before the SNI, minimal splitting after.
Why asymmetric? The critical data (SNI hostname) is in the first part. Once that’s fragmented, the rest can be sent efficiently.
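The boundary arithmetic above can be sketched as a standalone function that returns (start, end) ranges instead of copying bytes (a simplification of the bypass.rs loop, not the real API):

```rust
// Sketch of the two-level split: chunk everything before split_pos into
// segment_size pieces, keep everything after split_pos whole.
fn multi_level_split(len: usize, split_pos: usize, segment_size: usize) -> Vec<(usize, usize)> {
    let mut bounds = Vec::new();
    if segment_size < split_pos {
        // Multi-segment first part.
        let mut pos = 0;
        while pos < split_pos {
            let end = (pos + segment_size).min(split_pos);
            bounds.push((pos, end));
            pos = end;
        }
    } else {
        // Simple two-way split.
        bounds.push((0, split_pos));
    }
    // Second part as a single fragment.
    bounds.push((split_pos, len));
    bounds
}

fn main() {
    // 100-byte packet, split_pos = 50, max_segment_size = 10 → 6 fragments,
    // matching the worked example above.
    let bounds = multi_level_split(100, 50, 10);
    assert_eq!(bounds.len(), 6);
    assert_eq!(bounds[0], (0, 10));
    assert_eq!(bounds[5], (50, 100));
}
```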

Timing Control

Inter-Fragment Delays

The BypassResult includes timing information:
// From engine/src/bypass.rs:108-116
pub struct BypassResult {
    pub fragments: Vec<Bytes>,
    pub inter_fragment_delay: Option<Duration>,  // ← Delay between sends
    pub fake_packet: Option<Bytes>,
    pub modified: bool,
    pub protocol: DetectedProtocol,
    pub hostname: Option<String>,
}
The proxy layer (not shown in these files) respects inter_fragment_delay when sending fragments:
// Pseudocode from the proxy layer
for fragment in &result.fragments {
    send(fragment).await;

    if let Some(delay) = result.inter_fragment_delay {
        tokio::time::sleep(delay).await;  // Delay before the next fragment
    }
}

Delay Values in Presets

fragment_delay_us: 0  // No delay
Immediate transmission; sufficient when the DPI inspects packets statelessly.

Data Integrity Verification

Reassembly Testing

Every fragmentation operation is verified to be lossless:
// From engine/src/transform/fragment.rs:213-238
#[test]
fn test_fragment_preserves_all_data() {
    let params = FragmentParams {
        min_size: 3,
        max_size: 7,
        split_at_offset: None,
        randomize: false,
    };
    let transform = FragmentTransform::new(&params);

    let key = test_flow_key();
    let mut state = FlowState::new(key);
    let mut ctx = test_context(&key, &mut state);
    let original = b"The quick brown fox jumps over the lazy dog";
    let mut data = BytesMut::from(&original[..]);

    let _ = transform.apply(&mut ctx, &mut data);

    // Collect all fragments
    let mut all_data = data.to_vec();
    for packet in &ctx.output_packets {
        all_data.extend_from_slice(packet);
    }

    // Verify byte-for-byte identity
    assert_eq!(all_data.as_slice(), original);
}
Critical invariant: concat(fragments) == original_data. If this fails, TLS handshakes will be rejected by servers.

Performance Characteristics

Time Complexity

  • Targeted split (split_at_offset): O(n) - one pass to copy data
  • Size-based chunking: O(n) - one pass with multiple copies
  • Parsing TLS: O(n) - single pass through ClientHello structure
Where n = packet size (typically 200-2000 bytes for ClientHello).

Space Complexity

  • Temporary fragment buffers: O(n) total
  • No persistent state between packets
  • Zero-copy where possible (using Bytes reference counting)

Practical Performance

Typical packet processing:
  • TLS ClientHello parse: ~2-5μs
  • Fragmentation (2 fragments): ~1-2μs
  • Fragmentation (10 fragments): ~3-5μs
Total processing overhead: <10μs per connection. For context:
  • TLS handshake: 1-2 RTTs already (~20-200ms over the internet)
  • Fragmentation: <0.01ms processing
  • Delays (if used): 0.1-10ms as configured
Overall: 0.1-10ms added latency per connection, dominated by configured delays rather than processing.

Edge Cases and Safeguards

Minimum Size Protection

// From engine/src/transform/fragment.rs:67-73
if data.len() <= self.params.min_size {
    trace!(
        flow = ?ctx.key,
        size = data.len(),
        "packet too small to fragment"
    );
    return Ok(TransformResult::Continue);
}
Prevents fragmenting tiny packets that would create overhead without benefit.

Boundary Validation

// From engine/src/transform/fragment.rs:38-44
if let Some(split_at) = self.params.split_at_offset {
    if split_at > 0 && split_at < data.len() {  // ← Bounds check
        // Safe to split
    }
}
Ensures split points are within valid ranges:
  • split_at > 0: Don’t create empty first fragment
  • split_at < data.len(): Don’t overflow buffer

Parser Robustness

The TLS parser handles truncated/malformed data gracefully:
// From engine/src/tls.rs:77-86
if data.len() < 6 {
    return None;  // Too short to be valid
}

let content_type = data[pos];
if content_type != TLS_HANDSHAKE {
    return None;  // Not a handshake
}

// ... bounds checking throughout ...

if pos + 2 > data.len() {
    return Some(info);  // Return partial info instead of crashing
}
Returns None or partial ClientHelloInfo instead of panicking on bad input.
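The same defensive style can be shown in a minimal, self-contained sketch: check every length before indexing and return None rather than panic (this is an illustration of the approach, not the repo's parser):

```rust
// Minimal sketch of defensive TLS record parsing: every access is
// length-checked, and malformed input yields None instead of a panic.
const TLS_HANDSHAKE: u8 = 0x16;

/// Returns the record payload length if `data` starts with a TLS
/// handshake record header, otherwise None.
fn handshake_record_len(data: &[u8]) -> Option<u16> {
    if data.len() < 5 {
        return None; // too short for a 5-byte TLS record header
    }
    if data[0] != TLS_HANDSHAKE {
        return None; // not a handshake record
    }
    // Record length is a big-endian u16 at bytes 3..5.
    Some(u16::from_be_bytes([data[3], data[4]]))
}

fn main() {
    // Valid handshake record header (length 0x5a = 90 bytes).
    assert_eq!(handshake_record_len(&[0x16, 0x03, 0x01, 0x00, 0x5a]), Some(0x5a));
    // Wrong content type → None.
    assert_eq!(handshake_record_len(&[0x17, 0x03, 0x01, 0x00, 0x5a]), None);
    // Truncated input → None, no panic.
    assert_eq!(handshake_record_len(&[0x16]), None);
}
```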

Example: Complete Fragmentation Flow

Let’s trace a real discord.com TLS ClientHello:

Input Packet

// From engine/src/bypass.rs:301-318 (test data)
vec![
    0x16, 0x03, 0x01, 0x00, 0x5a,  // TLS record header
    0x01, 0x00, 0x00, 0x56,        // Handshake: ClientHello
    0x03, 0x03,                     // TLS version 1.2
    // ... random, session ID, ciphers ...
    0x00, 0x00, 0x00, 0x10,        // SNI extension
    0x00, 0x0e, 0x00, 0x00, 0x0b,  // SNI list, hostname type, length=11
    0x64, 0x69, 0x73, 0x63,        // "disc"
    0x6f, 0x72, 0x64, 0x2e,        // "ord."
    0x63, 0x6f, 0x6d,              // "com"
    // ...
]

Parsing

let info = parse_client_hello(&data);
// Returns:
ClientHelloInfo {
    sni_offset: Some(XX),       // Byte offset to "discord.com"
    sni_length: Some(11),       // Length of "discord.com"
    sni_hostname: Some("discord.com"),
    // ...
}

Fragmentation (Aggressive Preset)

Config: {
    tls_split_pos: 0,           // Use SNI split
    max_segment_size: 5,        // 5-byte max
}

SNI position: offset=XX, length=11
Split at: XX + 11/2 = XX + 5  (middle of "discord.com")

Output Fragments

Fragment 1: [bytes 0 to 4]          (5 bytes)
Fragment 2: [bytes 5 to 9]          (5 bytes)
  ... further 5-byte segments up to the split point XX+5 ...
Fragment N: [bytes XX+5 onward]     (variable) ← begins with "rd.com"

Transmission

for fragment in fragments {
    socket.send(fragment).await;
    tokio::time::sleep(Duration::from_micros(10000)).await;  // 10ms delay
}

Total added delay: one 10ms gap per inter-fragment boundary, e.g. five fragments → 4 * 10ms = 40ms added to the handshake
