
Overview

The Vitis backend enables deployment of neural networks on AMD/Xilinx FPGAs using the Vitis HLS compiler. It is the recommended backend for new AMD/Xilinx FPGA projects, replacing the discontinued Vivado HLS compiler.

When to Use Vitis Backend

  • New AMD/Xilinx FPGA projects: Modern toolchain with active development
  • Latest FPGA devices: Support for Versal, UltraScale+, and newer devices
  • Advanced features: Access to latest hls4ml optimizations and features
  • IP integration: Generate IP cores for Vivado designs
The Vitis backend inherits most functionality from the Vivado backend but includes additional validation passes and optimizations for the Vitis HLS toolchain.

Installation and Setup

Prerequisites

  • Vitis HLS 2020.1 or later (ensure vitis-run is on PATH)
  • AMD/Xilinx Vitis development tools
  • Python 3.8 or higher
  • hls4ml library installed

Environment Setup

# Verify Vitis is available
command -v vitis-run

# Source Vitis settings (adjust path for your installation)
source /tools/Xilinx/Vitis/<version>/settings64.sh
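The same availability check can be made from Python before calling hls4ml. The following is a minimal sketch using only the standard library; the helper name `find_vitis_run` is hypothetical and not part of hls4ml.

```python
import shutil


def find_vitis_run(executable="vitis-run"):
    """Return the full path of the Vitis launcher, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_vitis_run()
    if path is None:
        print("vitis-run not found; source the Vitis settings64.sh script first")
    else:
        print(f"Using {path}")
```

Failing early this way gives a clearer message than a build that aborts partway through.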

Configuration

Basic Configuration

Create a model configuration for the Vitis backend:
import hls4ml

config = hls4ml.utils.config_from_keras_model(
    model,
    granularity='name',
    backend='Vitis'
)

# Convert model
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='my_vitis_project',
    backend='Vitis',
    part='xcvu13p-flga2577-2-e',
    clock_period=5,
    io_type='io_parallel'
)

Configuration Options

The Vitis backend supports the following configuration parameters:
  • part (string, default: "xcvu13p-flga2577-2-e"): FPGA part number to target (UltraScale+, Versal, etc.)
  • clock_period (int, default: 5): Clock period in nanoseconds (5 ns = 200 MHz)
  • clock_uncertainty (string, default: "27%"): Clock uncertainty as a percentage of the clock period (higher than the Vivado backend default of 12.5%)
  • io_type (string, default: "io_parallel"): I/O implementation type:
      • io_parallel: Parallel processing of all data
      • io_stream: Streaming dataflow architecture
  • namespace (string, default: None): Optional C++ namespace for generated code
  • write_weights_txt (bool, default: true): Write weights to .txt files for faster compilation
  • write_tar (bool, default: false): Compress output directory into a .tar.gz file
  • write_emulation_constants (bool, default: false): Write constants to defines.h for emulation support
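One way to keep these options in a single place is a small dictionary of the documented defaults that is merged with per-project overrides. This is an illustrative sketch: `VITIS_DEFAULTS` and `vitis_options` are hypothetical names, not hls4ml APIs, and whether every key is accepted as a converter keyword in your hls4ml version is not checked here.

```python
# Defaults copied from the option list above (hypothetical helper, not an hls4ml API).
VITIS_DEFAULTS = {
    "part": "xcvu13p-flga2577-2-e",
    "clock_period": 5,
    "clock_uncertainty": "27%",
    "io_type": "io_parallel",
    "namespace": None,
    "write_weights_txt": True,
    "write_tar": False,
    "write_emulation_constants": False,
}


def vitis_options(**overrides):
    """Merge overrides into the documented defaults, rejecting unknown keys."""
    unknown = set(overrides) - set(VITIS_DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown Vitis option(s): {sorted(unknown)}")
    io_type = overrides.get("io_type", VITIS_DEFAULTS["io_type"])
    if io_type not in ("io_parallel", "io_stream"):
        raise ValueError("io_type must be 'io_parallel' or 'io_stream'")
    return {**VITIS_DEFAULTS, **overrides}


# Example: a streaming design at 250 MHz, everything else left at its default.
options = vitis_options(clock_period=4, io_type="io_stream")
```

Rejecting unknown keys catches typos such as `clock_perod` before they are silently ignored.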

Vitis-Specific Validation

The Vitis backend runs validation passes automatically. They check:
  • Conv implementation compatibility
  • Resource strategy constraints
  • Resource_Unrolled strategy validation
  • Bidirectional layer merge mode
  • Bidirectional I/O type compatibility
  • Standard C++ types usage
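To make the idea of a validation pass concrete, the sketch below pre-checks a configuration dictionary for one rule stated in the Troubleshooting section of this page: Resource_Unrolled should only be used with io_stream. It is not the backend's implementation; `check_unrolled_strategy` is a hypothetical helper that operates on a plain dict laid out like an hls4ml config with a 'LayerName' section.

```python
def check_unrolled_strategy(config, io_type):
    """Return names of layers that request Resource_Unrolled outside io_stream."""
    offenders = []
    for name, layer_cfg in config.get("LayerName", {}).items():
        # Compare case-insensitively so 'resource_unrolled' is also caught.
        strategy = str(layer_cfg.get("Strategy", "")).lower()
        if strategy == "resource_unrolled" and io_type != "io_stream":
            offenders.append(name)
    return offenders


example = {
    "LayerName": {
        "dense_1": {"Strategy": "Resource_Unrolled"},
        "dense_2": {"Strategy": "Latency"},
    }
}
print(check_unrolled_strategy(example, "io_parallel"))
```

Running a check like this before conversion surfaces the offending layer names in one place.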

Layer Configuration

Dense Layers

# 'dense_layer' is the layer's name in the Keras model (granularity='name')
config['LayerName']['dense_layer'].update({
    'ReuseFactor': 16,
    'Strategy': 'Resource',  # 'Latency', 'Resource', 'Resource_Unrolled'
    'Precision': 'ap_fixed<16,6>',
    'accum_t': 'ap_fixed<24,12>'
})

Convolutional Layers

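# 'conv2d_layer' is the layer's name in the Keras model (granularity='name')
config['LayerName']['conv2d_layer'].update({
    'ReuseFactor': 8,
    'Strategy': 'Resource',
    'ParallelizationFactor': 4,
    'ConvImplementation': 'LineBuffer',  # 'Encoded' is subject to the Conv implementation check
    'Precision': 'ap_fixed<16,6>'
})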

Recurrent Layers

# 'lstm_layer' is the layer's name in the Keras model (granularity='name')
config['LayerName']['lstm_layer'].update({
    'ReuseFactor': 1,
    'RecurrentReuseFactor': 1,
    'Strategy': 'Resource',
    'static': True,  # Static loop implementation
    'table_size': 1024,
    'table_t': 'ap_fixed<18,8>'
})
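Per-layer settings live under the 'LayerName' section of the configuration when it is created with granularity='name'. The sketch below shows one way to update a single layer without overwriting its other keys. It works on a plain dictionary so it runs without hls4ml; `set_layer_options` and the layer names are hypothetical.

```python
def set_layer_options(config, layer_name, **options):
    """Update one layer's entry under config['LayerName'], creating it if absent."""
    layer_cfg = config.setdefault("LayerName", {}).setdefault(layer_name, {})
    layer_cfg.update(options)
    return layer_cfg


# Stand-in for the dict returned by hls4ml.utils.config_from_keras_model.
config = {
    "Model": {"Strategy": "Latency", "ReuseFactor": 1},
    "LayerName": {"dense_layer": {"Precision": "ap_fixed<16,6>"}},
}

set_layer_options(config, "dense_layer", ReuseFactor=16, Strategy="Resource")
set_layer_options(config, "conv2d_layer", ReuseFactor=8, ParallelizationFactor=4)
```

Updating in place keeps keys that the configuration generator already filled in, such as the layer's precision.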

Build Process

Synthesis Commands

# Compile the model
hls_model.compile()

# Build with Vitis HLS
report = hls_model.build(
    reset=False,       # Reset project
    csim=True,         # C simulation
    synth=True,        # HLS synthesis
    cosim=False,       # RTL co-simulation
    validation=False,  # Validation
    export=False,      # Export IP
    vsynth=False,      # Vivado synthesis
    fifo_opt=False     # FIFO optimization
)

Build Options

| Option | Description | Default |
|---|---|---|
| reset | Reset project before building | False |
| csim | Run C simulation testbench | True |
| synth | Run HLS synthesis | True |
| cosim | Run RTL co-simulation | False |
| validation | Run validation tests | False |
| export | Export IP for Vivado integration | False |
| vsynth | Run Vivado synthesis | False |
| fifo_opt | Optimize FIFO depths | False |
| log_to_stdout | Print build logs to stdout | True |

Build Script

Vitis uses a different build flow than Vivado:
cd my_vitis_project
vitis-run --tcl build_prj.tcl --mode hls
The build process generates logs:
  • build_stdout.log - Standard output
  • build_stderr.log - Error messages
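When a build fails, the last lines of these logs are usually the first thing to inspect. A minimal sketch for reading them from Python follows; `tail_log` is a hypothetical helper and assumes the logs sit at the top level of the project directory.

```python
from pathlib import Path


def tail_log(project_dir, name="build_stderr.log", lines=20):
    """Return the last `lines` lines of a build log, or [] if the file is absent."""
    log_path = Path(project_dir) / name
    if not log_path.is_file():
        return []
    # Tool logs can contain stray bytes; replace them instead of raising.
    text = log_path.read_text(errors="replace")
    return text.splitlines()[-lines:]


if __name__ == "__main__":
    for line in tail_log("my_vitis_project"):
        print(line)
```

Pass `name="build_stdout.log"` to read the standard-output log instead.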

Example Project Structure

my_vitis_project/
├── firmware/
│   ├── myproject.cpp          # Top-level implementation
│   ├── myproject.h            # Header declarations
│   ├── parameters.h           # Network parameters
│   ├── defines.h              # Macro definitions
│   ├── weights/               # Weight files
│   └── nnet_utils/            # Utility functions
├── tb_data/
│   ├── tb_input_features.dat
│   └── tb_output_predictions.dat
├── myproject_test.cpp         # Testbench
├── build_prj.tcl              # Vitis HLS script
├── build_opt.tcl              # Build options
├── build_stdout.log           # Build output
├── build_stderr.log           # Build errors
└── proj_myproject/            # Vitis HLS project
    ├── solution1/
    │   ├── syn/
    │   │   └── report/        # Synthesis reports
    │   ├── impl/              # Implementation
    │   └── .autopilot/        # Vitis metadata
    └── vitis_hls.log
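A quick way to confirm that code generation produced the layout above is to check for a few of the listed files. The sketch below is illustrative: `missing_project_files` is a hypothetical helper, and the file names assume the default project name `myproject` shown in the tree.

```python
from pathlib import Path

# A subset of the files shown in the tree above (default project name assumed).
EXPECTED = [
    "firmware/myproject.cpp",
    "firmware/myproject.h",
    "firmware/parameters.h",
    "firmware/defines.h",
    "build_prj.tcl",
]


def missing_project_files(project_dir, expected=EXPECTED):
    """Return the expected relative paths that do not exist under project_dir."""
    root = Path(project_dir)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_project_files("my_vitis_project")
    print("All expected files present" if not missing else f"Missing: {missing}")
```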

Advanced Features

Stitched Design Flow

For partitioned large models:
# Build individual graph components
for graph in model.graphs:
    graph_report = graph.build(synth=True, export=True)

# Stitch together into complete design
stitched_report = model.build_stitched_design(
    stitch_design=True,
    sim_stitched_design=True,
    export_stitched_design=True,
    simulation_input_data=X_test
)

Streaming Architecture

Optimize for throughput with streaming:
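config['Model']['Strategy'] = 'Resource'

# io_type is a converter argument, not a key under config['Model']
hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, backend='Vitis', io_type='io_stream')

# Build and optimize FIFOs
report = hls_model.build(fifo_opt=True)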

Custom Pragmas

Vitis supports advanced HLS pragmas. hls4ml applies them automatically based on the chosen strategy:
  • DATAFLOW for streaming designs
  • PIPELINE for latency optimization
  • ARRAY_PARTITION for parallel access

Performance Optimization

Resource vs Latency Tradeoff

# Pick one of the following; 'dense_layer' stands for a layer name in your model

# Low latency, high resources
config['Model']['Strategy'] = 'Latency'
config['LayerName']['dense_layer']['ReuseFactor'] = 1

# Balanced
config['Model']['Strategy'] = 'Resource'
config['LayerName']['dense_layer']['ReuseFactor'] = 8

# Low resources, higher latency
config['Model']['Strategy'] = 'Resource'
config['LayerName']['dense_layer']['ReuseFactor'] = 64
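The effect of the reuse factor can be estimated with simple arithmetic. The sketch below assumes the usual reading of the reuse factor as the number of times each multiplier is reused, so a dense layer with n_in × n_out weights needs roughly n_in × n_out / reuse_factor multipliers. It ignores how the tools map narrow multiplications to LUTs or pack them into DSPs, and `dense_multipliers` is a hypothetical helper.

```python
import math


def dense_multipliers(n_in, n_out, reuse_factor):
    """Approximate multiplier count for a dense layer at a given reuse factor."""
    if reuse_factor < 1:
        raise ValueError("reuse_factor must be >= 1")
    return math.ceil(n_in * n_out / reuse_factor)


# A 64x64 dense layer at the three reuse factors used above.
for rf in (1, 8, 64):
    print(f"ReuseFactor={rf}: ~{dense_multipliers(64, 64, rf)} multipliers")
```

For the 64×64 layer this gives about 4096, 512, and 64 multipliers, which is why a higher reuse factor trades latency for area.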

Precision Tuning

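# Use hls4ml profiling to guide the choice of precision.
# Tracing requires 'Trace': True on the layers of interest before conversion.
y_hls, trace = hls_model.trace(X_test)

# Compare weight and activation ranges against the configured precision,
# then adjust 'Precision' per layer based on the observed ranges
hls4ml.model.profiling.numerical(model=model, hls_model=hls_model, X=X_test)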

Clock Frequency Optimization

# Start conservative
clock_period = 10  # 100 MHz

# Gradually reduce for higher performance
clock_period = 5   # 200 MHz
clock_period = 4   # 250 MHz

# Monitor timing reports and adjust
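The relation between clock period, frequency, and clock uncertainty is plain arithmetic. The sketch below assumes the uncertainty is a percentage of the period that the HLS scheduler subtracts from its timing budget, so a 5 ns period with 27% uncertainty leaves about 3.65 ns for logic. The function names are hypothetical.

```python
def frequency_mhz(clock_period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / clock_period_ns


def scheduling_budget_ns(clock_period_ns, uncertainty="27%"):
    """Period left for logic after subtracting the uncertainty margin."""
    fraction = float(uncertainty.rstrip("%")) / 100.0
    return clock_period_ns * (1.0 - fraction)


for period in (10, 5, 4):
    print(
        f"{period} ns -> {frequency_mhz(period):.0f} MHz, "
        f"budget {scheduling_budget_ns(period):.2f} ns"
    )
```

A larger uncertainty makes HLS schedule more conservatively, which tends to help timing closure after place and route at the cost of extra pipeline stages.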

Performance Characteristics

Typical Resource Usage

Small MLP (3 layers, 64 neurons):
  • LUTs: 5K-15K
  • FFs: 3K-10K
  • DSPs: 10-30
  • BRAM: 5-20
CNN (5 conv layers + 2 dense):
  • LUTs: 50K-200K
  • FFs: 30K-150K
  • DSPs: 100-500
  • BRAM: 50-300

Latency Patterns

io_parallel:
Latency = Σ(layer_latency × reuse_factor)
II = 1 (for pipelined designs)
io_stream:
Throughput = 1 / max(layer_II)
End-to-end latency = Σ(layer_latency)
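The io_stream formulas above can be turned into a small worked example. The per-layer latencies and initiation intervals below are made-up numbers for illustration, and `stream_performance` is a hypothetical helper that applies Throughput = 1 / max(layer_II) and latency = Σ(layer_latency) at a given clock period.

```python
def stream_performance(layer_latencies, layer_iis, clock_period_ns):
    """Estimate io_stream latency and throughput from per-layer cycle counts."""
    total_cycles = sum(layer_latencies)   # end-to-end latency in cycles
    bottleneck_ii = max(layer_iis)        # slowest layer limits throughput
    return {
        "latency_cycles": total_cycles,
        "latency_ns": total_cycles * clock_period_ns,
        "inferences_per_second": 1e9 / (bottleneck_ii * clock_period_ns),
    }


# Three layers with illustrative cycle counts, at a 5 ns clock.
result = stream_performance(
    layer_latencies=[12, 40, 8],
    layer_iis=[1, 16, 1],
    clock_period_ns=5,
)
print(result)
```

With these numbers the latency is 60 cycles (300 ns), and the layer with II = 16 caps throughput at 12.5 million inferences per second.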

Achievable Frequencies

  • Versal devices: 300-500 MHz
  • UltraScale+: 200-350 MHz
  • Older devices: 150-250 MHz

Differences from Vivado Backend

| Feature | Vivado | Vitis |
|---|---|---|
| Compiler | vivado_hls | vitis-run |
| Clock Uncertainty | 12.5% | 27% |
| Validation Passes | Basic | Enhanced |
| Build Command | TCL args | Separate config file |
| Log Management | Single log | stdout/stderr logs |
| FIFO Optimization | Supported | Enhanced support |
| Emulation Constants | Not supported | Supported |

Troubleshooting

If vitis-run is not found, check the installation and environment:
# Check Vitis installation
which vitis-run

# Source Vitis environment (adjust path for your installation)
source /tools/Xilinx/Vitis/2023.1/settings64.sh

# Verify version
vitis-run --version
If conversion fails validation, note that the Vitis backend validates strictly:
  • Check that Resource_Unrolled strategy is only used with io_stream
  • Verify Bidirectional layers use compatible merge modes
  • Ensure no unsupported C++ types in layer configurations
If the design fails timing:
  • Increase clock period (reduce frequency)
  • Increase clock uncertainty tolerance
  • Increase reuse factors
  • Reduce precision where possible
  • Review synthesis reports for critical paths
If the design exceeds the device's resources:
  • Use io_stream for large models
  • Increase reuse factors
  • Enable BRAM for weights
  • Use the Resource strategy instead of Latency
  • Consider model pruning/quantization

Example: Complete Workflow

import hls4ml
from tensorflow import keras
import numpy as np

# Load model
model = keras.models.load_model('my_model.h5')

# Create configuration
config = hls4ml.utils.config_from_keras_model(model, granularity='name', backend='Vitis')
config['Model']['Strategy'] = 'Resource'
config['Model']['ReuseFactor'] = 16

# Convert to Vitis HLS
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='vitis_prj',
    backend='Vitis',
    part='xcvu9p-flga2104-2L-e',
    clock_period=5,
    io_type='io_parallel'
)

# Compile the C++ emulation and compare predictions
hls_model.compile()
X_test = np.random.rand(100, 784)  # placeholder data; match your model's input shape
y_keras = model.predict(X_test)
y_hls = hls_model.predict(X_test)

# Build HLS project
report = hls_model.build(
    csim=True,
    synth=True,
    cosim=False,
    export=True
)

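# Synthesis results are nested under 'CSynthesisReport'; key names may vary by hls4ml version
csynth = report.get('CSynthesisReport', {})
print(f"Latency: {csynth.get('BestLatency')} - {csynth.get('WorstLatency')} cycles")
print(f"II: {csynth.get('IntervalMin')} - {csynth.get('IntervalMax')}")
print(f"Resources: LUT={csynth.get('LUT')}, FF={csynth.get('FF')}, DSP={csynth.get('DSP')}, BRAM={csynth.get('BRAM_18K')}")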
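The workflow computes both `y_keras` and `y_hls` but does not compare them. One simple comparison for a classifier is the fraction of samples where both outputs pick the same top class. The sketch below is pure Python and accepts any sequence of rows (including NumPy arrays); `agreement` is a hypothetical helper, and the sample values are made up.

```python
def argmax(row):
    """Index of the largest value in a row of scores."""
    values = list(row)
    return max(range(len(values)), key=values.__getitem__)


def agreement(reference, candidate):
    """Fraction of samples whose top-scoring class matches in both outputs."""
    pairs = list(zip(reference, candidate))
    if not pairs:
        raise ValueError("no samples to compare")
    same = sum(argmax(r) == argmax(c) for r, c in pairs)
    return same / len(pairs)


# Illustrative scores for two samples and two classes.
keras_scores = [[0.1, 0.9], [0.8, 0.2]]
hls_scores = [[0.2, 0.7], [0.3, 0.6]]
print(agreement(keras_scores, hls_scores))
```

In the workflow above this would be called as `agreement(y_keras, y_hls)`; a low value usually points to insufficient fixed-point precision.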
