Optimizing your patches ensures smooth real-time performance, reduces CPU usage, and minimizes audio dropouts. This guide covers practical techniques for efficient patch design.

Understanding DSP Load

Monitoring CPU Usage

Plugdata displays DSP load in the status bar. High CPU usage (>80%) can cause:
  • Audio dropouts and clicks
  • Increased latency
  • Unstable performance

Block Size Impact

Audio processing happens in blocks. Smaller blocks = lower latency but higher CPU usage:
Block Size   Latency @ 44.1 kHz   CPU Load   Use Case
64           ~1.5 ms              High       Live performance, low latency
128          ~3 ms                Medium     General use
256          ~6 ms                Low        Studio production
512          ~12 ms               Very Low   Non-real-time processing
Access via: Settings → Audio Settings → Block Size
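
The latency column follows directly from block size divided by sample rate. A minimal sketch of that arithmetic, assuming a hypothetical helper named block_latency_ms (not part of plugdata or Pure Data):

```c
#include <assert.h>

/* Duration of one audio block in milliseconds.
 * Hypothetical helper for illustration only. */
double block_latency_ms(int block_size, double sample_rate_hz)
{
    assert(block_size > 0 && sample_rate_hz > 0.0);
    return 1000.0 * (double)block_size / sample_rate_hz;
}
```

For example, block_latency_ms(64, 44100.0) is about 1.45 ms, which the table rounds to ~1.5 ms. This is the buffering delay of a single block only; driver and hardware buffers add to the total round-trip latency.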

Efficient Patch Design

1. Minimize Object Count

❌ Inefficient:
[osc~ 440]
|
[*~ 0.1]
|
[+~ 0]
|
[*~ 1]
|
[dac~]
✅ Optimized:
[osc~ 440]
|
[*~ 0.1]
|
[dac~]
Every object has overhead. Remove unnecessary processing.

2. Use Block-Level Processing

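Prefer objects that report a value once per analysis window over polling a signal at a very high control rate:
❌ High-rate polling:
[snapshot~]  <- Outputs a value on every bang (expensive if banged every millisecond)
✅ Windowed analysis:
[env~]       <- Envelope follower, outputs once per analysis window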

3. Avoid Redundant Calculations

❌ Recalculating repeatedly:
[expr 440 * 2]
|
[osc~]

[expr 440 * 2]
|
[osc~]
✅ Calculate once:
[440(
|
[* 2]
|
[t f f]  <- Compute once, distribute to both oscillators
|     |
[osc~]  [osc~]

4. Use Efficient Objects

Some objects are more optimized than others:
Instead of         Use              Reason
[expr~]            [*~], [+~]       Built-in operators are faster
[fexpr~]           Native filters   Compiled filters are optimized
Multiple [line~]   [vline~]         Handles multiple ramps efficiently
Example:
❌ Slower:
[expr~ $v1 * 0.5 + $v2 * 0.3]
✅ Faster:
[*~ 0.5]   [*~ 0.3]
|          |
[+~        ]

Subpatch Optimization

Use Subpatches for Organization

Subpatches add negligible DSP overhead and improve readability:
[pd synthesis-engine]
|
[pd effects-chain]
|
[pd output-stage]

Block Size Adjustment

Use [block~] to optimize specific subpatches:
Inside [pd synthesis-engine]:
[block~ 128]  <- Larger blocks for synthesis

[phasor~ 100]
|
[*~ 0.5]
|
[outlet~]
Inside [pd effects-chain]:
[block~ 64]   <- Smaller blocks for effects

[inlet~]
|
[lop~ 1000]
|
[outlet~]
[block~] must be a multiple or divisor of the parent block size.

Switch~ for Conditional DSP

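Disable DSP in inactive subpatches. Place [switch~] inside the subpatch (here [pd oscillator-bank]) and send it 1 or 0:
[1(   [0(     <- 1 enables, 0 disables (saves CPU)
|     |
[switch~]     <- Controls DSP for the subpatch that contains it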

Signal Processing Tips

1. Pre-compute Static Values

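❌ Converting a constant to a signal:
[loadbang]
|
[0.707(
|
[sig~]      <- Fills a signal vector every block
|
[*~]        <- Signal-by-signal multiply
✅ Constant as a creation argument:
[osc~ 440]
|
[*~ 0.707]  <- Scalar multiply; the constant is stored once at creation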
For dynamic but infrequent changes:
[r volume]     <- Receive volume changes
|
[$1 10(        <- Target value and ramp time in ms
|
[line~]        <- Smooth over 10 ms
|
[*~]           <- Multiply signal
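
For readers writing their own externals, the smoothing idea can be sketched in C. This is an illustrative linear ramp under assumed names (t_ramp, ramp_set_target, ramp_next are hypothetical), not the actual [line~] implementation:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical linear ramp: moves `current` toward `target` in equal steps. */
typedef struct {
    float current;
    float target;
    float step;
    size_t remaining;  /* Samples left until the target is reached */
} t_ramp;

void ramp_init(t_ramp *r, float value)
{
    r->current = value;
    r->target = value;
    r->step = 0.0f;
    r->remaining = 0;
}

/* Called from the control side when a new value arrives (infrequent). */
void ramp_set_target(t_ramp *r, float target, double time_ms, double sample_rate_hz)
{
    assert(time_ms >= 0.0 && sample_rate_hz > 0.0);
    size_t samples = (size_t)(time_ms * 0.001 * sample_rate_hz + 0.5);
    r->target = target;
    if (samples == 0) {  /* No ramp time: jump immediately */
        r->current = target;
        r->step = 0.0f;
        r->remaining = 0;
        return;
    }
    r->step = (target - r->current) / (float)samples;
    r->remaining = samples;
}

/* Called once per sample from the DSP side: one add, no division. */
float ramp_next(t_ramp *r)
{
    if (r->remaining > 0) {
        r->current += r->step;
        if (--r->remaining == 0)
            r->current = r->target;  /* Snap to the exact target at the end */
    }
    return r->current;
}
```

The division happens only when the target changes, so the per-sample cost stays at a single addition.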

2. Table Lookups

Use tables for complex functions:
❌ Expensive calculation:
[expr~ sin($v1 * 6.28318)]  <- Computed per sample
✅ Table lookup:
[tabosc4~ sine-table]  <- Fast table lookup with interpolation

3. Filter Optimization

Choose appropriate filter types:
Filter            CPU      Quality   Use Case
[lop~] / [hip~]   Low      Basic     Simple filtering
[vcf~]            Medium   Good      Voltage-controlled filter
[biquad~]         Medium   High      Precise EQ
[fexpr~]          High     Custom    Only when necessary

Message Domain Optimization

1. Reduce Message Rate

❌ Flooding with messages:
[metro 1]  <- 1000 messages/second
✅ Appropriate rate:
[metro 50]  <- 20 messages/second (often sufficient)

2. Use [trigger] Efficiently

Control message order and avoid fan-out issues:
[trigger b f f s]  <- Explicit, right-to-left order

3. Throttle GUI Updates

❌ Updating every message:
[metro 1]
|
[random 100]
|
[nbx]  <- GUI update every 1ms (expensive)
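✅ Throttled updates:
[metro 1]
|
[random 100]
|
[t f f]
|     |
|     [s latest]         <- Right outlet fires first
[compute]

[metro 50]  [r latest]   <- Separate, slower timer for the GUI
|           |
[f         ]             <- Right inlet stores the latest value; each bang outputs it
|
[nbx]                    <- GUI updates at 20 Hz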
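
The throttling pattern is the same in any language: keep only the latest value and emit it at a fixed, slower interval. A minimal C sketch, assuming hypothetical names (t_throttle, throttle_update, throttle_poll) and a caller-supplied clock in milliseconds:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical throttle: remembers the latest value, emits at most once per interval. */
typedef struct {
    double interval_ms;
    double last_emit_ms;
    float latest;
    bool dirty;  /* True when a value arrived since the last emit */
} t_throttle;

void throttle_init(t_throttle *t, double interval_ms)
{
    assert(interval_ms > 0.0);
    t->interval_ms = interval_ms;
    t->last_emit_ms = -interval_ms;  /* Allow the very first poll to emit */
    t->latest = 0.0f;
    t->dirty = false;
}

/* Fast path: called for every incoming value, does no GUI work. */
void throttle_update(t_throttle *t, float value)
{
    t->latest = value;
    t->dirty = true;
}

/* Slow path: called by a timer; returns true when the GUI should redraw. */
bool throttle_poll(t_throttle *t, double now_ms, float *out)
{
    if (!t->dirty || now_ms - t->last_emit_ms < t->interval_ms)
        return false;
    *out = t->latest;
    t->dirty = false;
    t->last_emit_ms = now_ms;
    return true;
}
```

With a 50 ms interval this caps redraws at 20 Hz no matter how often throttle_update is called, and intermediate values are simply dropped.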

Profiling with Perfetto

For advanced optimization, use Perfetto profiling:
Step 1: Enable Perfetto
Build plugdata with profiling enabled:
cmake .. -DENABLE_PERFETTO=1 -DCMAKE_BUILD_TYPE=Release
cmake --build .
From CMakeLists.txt:37, 145-147, 329, 417-420, 840-854

Step 2: Instrument Your Code
Add trace points to C++ code:
void myProcessingFunction() {
    TRACE_COMPONENT();  // Traces this function
    
    // Your processing code
}
Or use automated instrumentation:
python3 Resources/Scripts/add_perfetto_tracepoints.py Source/**/*.cpp
Requires LLVM >= 19.1.3 and compatible libclang.

Step 3: Capture Trace
  • Run plugdata
  • Load and run your patch
  • Capture trace file
  • Open at ui.perfetto.dev

Step 4: Analyze Results
Look for:
  • Long-running functions
  • Unexpected CPU spikes
  • Blocking operations
  • Memory allocations in DSP

Memory Management

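1. Avoid Memory Allocation in DSP

❌ Allocating in perform routine:
t_int *perform(t_int *w) {
    float *buffer = malloc(1024 * sizeof(float));  // BAD: malloc may block the audio thread
    // ...
    free(buffer);
    return w + 2;  // Next DSP chain entry (offset depends on the dsp_add() argument count)
}

✅ Pre-allocate in constructor:
void *new(void) {
    // Simplified: a real Pd external creates the object with pd_new()
    t_myobj *x = malloc(sizeof(t_myobj));
    x->buffer = malloc(1024 * sizeof(float));  // Allocate once, outside the audio thread
    return x;
}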

2. Use Fixed-Size Buffers

#define MAX_BUFFER 4096

typedef struct {
    float buffer[MAX_BUFFER];  // Static allocation
} t_myobj;
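
As an illustration of how a fixed-size buffer is used without any allocation during processing, here is a minimal circular delay line. The struct and function names (t_delay, delay_init, delay_process) are hypothetical and only loosely modelled on the struct above:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BUFFER 4096

/* Hypothetical delay line: the buffer lives inside the object, so no malloc is needed. */
typedef struct {
    float buffer[MAX_BUFFER];
    size_t write_index;
} t_delay;

/* Clear the buffer once at creation time. */
void delay_init(t_delay *d)
{
    for (size_t i = 0; i < MAX_BUFFER; i++)
        d->buffer[i] = 0.0f;
    d->write_index = 0;
}

/* Process n samples; the output is the input delayed by delay_samples. */
void delay_process(t_delay *d, const float *in, float *out, size_t n, size_t delay_samples)
{
    assert(delay_samples < MAX_BUFFER);
    for (size_t i = 0; i < n; i++) {
        d->buffer[d->write_index] = in[i];
        /* Adding MAX_BUFFER before subtracting keeps the index non-negative. */
        size_t read_index = (d->write_index + MAX_BUFFER - delay_samples) % MAX_BUFFER;
        out[i] = d->buffer[read_index];
        d->write_index = (d->write_index + 1) % MAX_BUFFER;
    }
}
```

The maximum delay is bounded by MAX_BUFFER at compile time. That is the price of never allocating on the audio thread.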
    

Audio Thread Priority

Understanding Real-Time Priority

Plugdata runs audio in a high-priority thread. Keep audio callbacks fast:
  • ⏱️ Target: < 50% of block time
  • ⚠️ Warning: > 80% may cause dropouts
  • Critical: > 100% = guaranteed dropouts
Don’t Block Audio Thread

❌ Never do in DSP:
  • File I/O
  • Network operations
  • GUI updates
  • Memory allocation
  • Mutex locks (if avoidable)
  • sleep() or delays
✅ Do instead:
  • Use lock-free queues for communication
  • Pre-load data in non-real-time thread
  • Defer non-critical work
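
One common shape for such a lock-free queue is a single-producer, single-consumer ring buffer built on C11 atomics. The sketch below is a generic illustration, not plugdata's own queue; QUEUE_CAPACITY and all names are assumptions:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 256  /* Hypothetical size; must be a power of two */
_Static_assert((QUEUE_CAPACITY & (QUEUE_CAPACITY - 1)) == 0, "capacity must be a power of two");

/* Single-producer, single-consumer queue: exactly one thread pushes, one pops. */
typedef struct {
    float items[QUEUE_CAPACITY];
    atomic_size_t head;  /* Advanced only by the consumer */
    atomic_size_t tail;  /* Advanced only by the producer */
} spsc_queue;

void spsc_init(spsc_queue *q)
{
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Producer side. Returns false instead of waiting when the queue is full. */
bool spsc_push(spsc_queue *q, float value)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == QUEUE_CAPACITY)
        return false;
    q->items[tail & (QUEUE_CAPACITY - 1)] = value;
    /* Release makes the written item visible before the new tail is observed. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false instead of waiting when the queue is empty. */
bool spsc_pop(spsc_queue *q, float *out)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail)
        return false;
    *out = q->items[head & (QUEUE_CAPACITY - 1)];
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}
```

Neither side ever blocks: a full or empty queue is reported to the caller, which can retry on the next block. The design is only safe with exactly one pushing thread and one popping thread.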

Platform-Specific Optimization

macOS

Metal Rendering: Enabled by default for better graphics performance (CMakeLists.txt:53, 376-382). Disable if needed:
cmake .. -DNANOVG_METAL_IMPLEMENTATION=OFF

Windows

Use ASIO for lowest latency (automatically included, CMakeLists.txt:320, 385).

Linux

JACK provides better performance than ALSA for real-time audio:
  • Lower latency
  • Better CPU scheduling
  • Inter-application routing
Enabled by default (CMakeLists.txt:308, 313).

Benchmark Example

Create test patches to measure performance:
test-oscillators.pd
# Test 1: Single oscillator
[osc~ 440]
|
[*~ 0.1]
|
[dac~]

# Test 2: 100 oscillators
[clone voice 100]  <- 100 copies of the abstraction voice.pd, outputs summed
|
[*~ 0.001]
|
[dac~]

voice.pd
[osc~ 440]
|
[outlet~]

Monitor CPU load to compare approaches.

Quick Optimization Checklist

  • Remove unnecessary objects
  • Use efficient alternatives ([*] vs [expr])
  • Avoid redundant calculations
  • Use appropriate block sizes
  • Throttle GUI updates
  • Pre-compute static values
  • Use [switch~] for inactive sections
  • Profile with Perfetto (if needed)
  • Check DSP load in status bar
  • Test on target hardware

Common Bottlenecks

1. Too Many [expr~] Objects

Problem: Each [expr~] is interpreted at runtime.
Solution: Replace with native objects or write a C external.

2. Excessive GUI Objects

Problem: VU meters, scopes updating every block.
Solution: Update at a lower rate (20-30 Hz).

3. Large FFT Operations

Problem: FFT size too large for block size.
Solution: Use power-of-2 sizes, increase block size in subpatch.

4. Many Small Subpatches

Problem: Overhead from subpatch management.
Solution: Consolidate when possible, but balance with readability.

Resources

  • Perfetto profiling: CMakeLists.txt:37, Resources/Scripts/add_perfetto_tracepoints.py
  • Block size settings: CMakeLists.txt:31, 113-117
  • Audio backend config: CMakeLists.txt:308-321
  • DSP optimization guide: Pure Data Documentation
