WebAssembly (WASM) enables near-native performance on the web by compiling C, C++, or Rust to efficient bytecode. Learn how to build, optimize, and integrate WASM modules into JavaScript applications.
Compile C/C++/Rust to WASM
Compile native code to WebAssembly using Emscripten, wasm-pack, or other toolchains.
Compiling C/C++ with Emscripten
Install Emscripten toolchain
Set up the Emscripten compiler infrastructure:

# Clone and install Emscripten
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
./emsdk install latest
./emsdk activate latest
source ./emsdk_env.sh
Write and compile C/C++ code
Create a C function and compile it to WASM:

// math_utils.c
#include <emscripten.h>

EMSCRIPTEN_KEEPALIVE
double calculate_intensive(double* data, int length) {
    double result = 0.0;
    for (int i = 0; i < length; i++) {
        result += data[i] * data[i];
    }
    return result;
}
Compile with optimization:

emcc math_utils.c -O3 -s WASM=1 -s EXPORTED_FUNCTIONS='["_calculate_intensive"]' \
  -s EXPORTED_RUNTIME_METHODS='["cwrap","ccall"]' -o math_utils.js
The -O3 flag enables aggressive optimizations; use -Os to optimize for code size, or -O2 for a balance of speed and size.
Load and use in JavaScript
Import and call the WASM module from JavaScript:

import createModule from './math_utils.js';
const Module = await createModule();
const calculate = Module.cwrap('calculate_intensive', 'number', ['number', 'number']);
// Allocate memory and copy data
const data = new Float64Array([1, 2, 3, 4, 5]);
const ptr = Module._malloc(data.length * 8);
Module.HEAPF64.set(data, ptr / 8);
const result = calculate(ptr, data.length); // sum of squares: 1+4+9+16+25 = 55
Module._free(ptr); // release memory allocated with _malloc
Compiling Rust with wasm-pack
Set up Rust and wasm-pack
Install the Rust toolchain and wasm-pack:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
cargo install wasm-pack
Create a Rust WASM project
cargo new --lib wasm-image-processing
cd wasm-image-processing
Update Cargo.toml:

[lib]
crate-type = ["cdylib"]

[dependencies]
wasm-bindgen = "0.2"
Implement image processing in src/lib.rs:

use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub fn apply_filter(data: &mut [u8], width: u32, height: u32) {
    for y in 1..height - 1 {
        for x in 1..width - 1 {
            let idx = ((y * width + x) * 4) as usize;
            // Example filter: average the RGB channels into grayscale.
            // Swap in your convolution kernel here as needed.
            let gray = ((data[idx] as u32 + data[idx + 1] as u32 + data[idx + 2] as u32) / 3) as u8;
            data[idx] = gray;
            data[idx + 1] = gray;
            data[idx + 2] = gray;
        }
    }
}
Build and integrate
wasm-pack build --target web
Use in JavaScript:

import init, { apply_filter } from './pkg/wasm_image_processing.js';
await init();
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
apply_filter(imageData.data, canvas.width, canvas.height);
ctx.putImageData(imageData, 0, 0);
Rust’s wasm-bindgen provides safer memory management than raw Emscripten, but adds slight overhead. Choose based on your safety vs. performance requirements.
Linear memory management
WebAssembly uses a contiguous linear memory space that must be carefully managed.
Memory model
- WASM memory is a resizable ArrayBuffer
- Accessed as typed arrays (HEAP8, HEAP32, HEAPF64)
- Growth is in 64KB pages (WebAssembly.Memory.grow())
- No automatic garbage collection
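These properties are easy to observe from plain JavaScript, no toolchain required; a minimal sketch:

```javascript
// A WebAssembly.Memory starts as a resizable ArrayBuffer sized in 64 KiB pages.
const memory = new WebAssembly.Memory({ initial: 2, maximum: 10 });

console.log(memory.buffer.byteLength); // 2 pages * 64 KiB = 131072 bytes

// The same buffer can be viewed through different typed arrays,
// mirroring Emscripten's HEAP8 / HEAP32 / HEAPF64 views.
const heap8 = new Uint8Array(memory.buffer);
const heapF64 = new Float64Array(memory.buffer);
heapF64[0] = 1.5; // visible through every view of the same buffer
console.log(heap8.length === heapF64.length * 8); // true
```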
Manual memory management
Allocate memory
// From JavaScript side
const ptr = Module._malloc(1024); // Allocate 1KB
const buffer = new Uint8Array(Module.HEAPU8.buffer, ptr, 1024);
Copy data between JS and WASM
// JS -> WASM
const data = new Float32Array([1, 2, 3, 4]);
const ptr = Module._malloc(data.length * 4);
Module.HEAPF32.set(data, ptr / 4);
// WASM -> JS
const result = new Float32Array(
  Module.HEAPF32.buffer,
  resultPtr,
  resultLength
);
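The same copy pattern can be exercised against a standalone WebAssembly.Memory, which is handy for testing interop code without an Emscripten build. The byte offset below is chosen by hand; a real module's allocator would supply it:

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const HEAPF32 = new Float32Array(memory.buffer);

// JS -> WASM: write four floats starting at byte offset 16
const data = new Float32Array([1, 2, 3, 4]);
const ptr = 16; // hand-picked offset for illustration
HEAPF32.set(data, ptr / 4); // set() takes an element index, hence / 4

// WASM -> JS: view the same region without copying
const result = new Float32Array(memory.buffer, ptr, data.length);
console.log(Array.from(result)); // [1, 2, 3, 4]
```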
Free memory
Forgetting to free WASM memory causes leaks. WASM memory is not managed by JavaScript’s garbage collector.
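One way to make the free reliable is a small try/finally wrapper that pairs every allocation with a guaranteed release. `withWasmBuffer` is a hypothetical helper name; it assumes only the standard Emscripten exports `_malloc`, `_free`, and `HEAPU8`:

```javascript
// Hypothetical helper: allocate, run a callback, and always free,
// even if the callback throws.
function withWasmBuffer(Module, bytes, fn) {
  const ptr = Module._malloc(bytes.length);
  try {
    Module.HEAPU8.set(bytes, ptr); // copy input into WASM memory
    return fn(ptr, bytes.length);  // hand the pointer to the caller
  } finally {
    Module._free(ptr); // runs on success and on error
  }
}
```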
Growing memory
// Initialize with 1MB (16 pages)
const memory = new WebAssembly.Memory({ initial: 16 });
// Grow by 10 pages (640KB)
const previousPages = memory.grow(10);
Memory growth invalidates existing TypedArray views. Recreate views after growing memory.
JS ↔ WASM interop optimization
Minimize overhead when crossing the JavaScript/WebAssembly boundary.
Minimize boundary crossings
Batch operations
Instead of calling WASM for each operation, batch work:

// Bad: Multiple boundary crossings
for (let i = 0; i < 1000; i++) {
  result[i] = Module.process(data[i]);
}

// Good: Single boundary crossing
const dataPtr = copyToWASM(data);
const resultPtr = Module.processBatch(dataPtr, data.length);
const result = copyFromWASM(resultPtr, data.length);
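The cost of a crossing can be felt with a minimal hand-assembled module; the bytes below encode a module exporting an i32 `add` function, so each `exports.add()` call is one JS-to-WASM boundary crossing:

```javascript
// Minimal WASM binary: (module (func (export "add") (param i32 i32) (result i32)
//                         local.get 0 local.get 1 i32.add))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32,i32)->i32
  0x03, 0x02, 0x01, 0x00,                               // function section
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body
]);
const { exports } = new WebAssembly.Instance(new WebAssembly.Module(bytes));

console.log(exports.add(2, 3)); // 5

// One million crossings: this loop is dominated by call overhead,
// which is exactly what batching avoids.
let sum = 0;
for (let i = 0; i < 1_000_000; i++) sum = exports.add(sum, 1);
console.log(sum); // 1000000
```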
Avoid string passing
String encoding/decoding is expensive. Use numeric codes or pass binary data:

// Bad: String passing
Module.processText("hello world");
// Good: Pass as bytes
const encoder = new TextEncoder();
const bytes = encoder.encode("hello world");
const ptr = copyToWASM(bytes);
Module.processBytes(ptr, bytes.length);
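When a string must cross the boundary, TextEncoder.encodeInto can write UTF-8 directly into a preallocated view of WASM memory, skipping the intermediate array. A sketch, with a raw WebAssembly.Memory standing in for a real module's heap:

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const encoder = new TextEncoder();

const ptr = 0; // offset chosen by hand; a real module's allocator would supply it
const view = new Uint8Array(memory.buffer, ptr, 64);

// Encode straight into WASM memory; `written` is the number of bytes produced.
const { written } = encoder.encodeInto('hello world', view);
console.log(written); // 11
```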
Use shared memory for large data
Keep data in WASM memory and pass pointers:

// Allocate once in WASM
const bufferPtr = Module._malloc(BUFFER_SIZE);
// Reuse for multiple operations
Module.fillBuffer(bufferPtr, sourceData);
Module.processBuffer(bufferPtr);
Module.transformBuffer(bufferPtr);
// Read result
const result = new Uint8Array(
  Module.HEAPU8.buffer,
  bufferPtr,
  BUFFER_SIZE
);
Optimize data transfer
- Use TypedArrays instead of regular arrays
- Align data to memory boundaries (4-byte, 8-byte)
- Preallocate buffers and reuse them
- Consider SharedArrayBuffer for worker threads
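The alignment rule is enforced by TypedArrays themselves, which makes it easy to verify: a view's byte offset must be a multiple of the element size, so a misaligned 8-byte view cannot even be constructed.

```javascript
const buffer = new ArrayBuffer(64);

const aligned = new Float64Array(buffer, 8, 2); // offset 8 is 8-byte aligned: OK

let misalignedError = null;
try {
  new Float64Array(buffer, 4, 2); // offset 4 is not a multiple of 8
} catch (e) {
  misalignedError = e; // RangeError
}
console.log(aligned.length, misalignedError instanceof RangeError); // 2 true
```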
The overhead of a JS/WASM call is typically 5-10 ns; copying data is far more expensive because it is bound by memory bandwidth (on the order of 1-2 GB/s), so large transfers dominate total cost.
SIMD and threads
Leverage parallel processing capabilities for maximum performance.
SIMD (Single Instruction, Multiple Data)
Process multiple data elements in a single instruction:
#include <wasm_simd128.h>
void add_vectors_simd(float* a, float* b, float* result, int length) {
    int i;
    // Process four floats per iteration using 128-bit vectors
    for (i = 0; i <= length - 4; i += 4) {
        v128_t va = wasm_v128_load(&a[i]);
        v128_t vb = wasm_v128_load(&b[i]);
        v128_t vresult = wasm_f32x4_add(va, vb);
        wasm_v128_store(&result[i], vresult);
    }
    // Handle remaining elements
    for (; i < length; i++) {
        result[i] = a[i] + b[i];
    }
}
Enable SIMD compilation
emcc -msimd128 -O3 code.c -o code.wasm
Verify SIMD support
// Detection works by validating a tiny module that uses a SIMD instruction.
// The wasm-feature-detect library packages the test bytes for you:
import { simd } from 'wasm-feature-detect';

if (!(await simd())) {
  console.warn('SIMD not supported, falling back to scalar');
}
Benchmark performance gains
SIMD typically provides 2-4x speedup for vectorizable operations.
Threads and SharedArrayBuffer
Parallelize computation across multiple cores:
#include <pthread.h>
#include <emscripten/threading.h>
void* worker_function(void* arg) {
    // Process chunk of data
    return NULL;
}

void parallel_process(float* data, int length) {
    pthread_t threads[4];
    for (int i = 0; i < 4; i++) {
        pthread_create(&threads[i], NULL, worker_function, &data[i * length / 4]);
    }
    for (int i = 0; i < 4; i++) {
        pthread_join(threads[i], NULL);
    }
}
Compile with threading:
emcc -pthread -s USE_PTHREADS=1 -s PTHREAD_POOL_SIZE=4 code.c -o code.js
WebAssembly threads require SharedArrayBuffer, which has strict cross-origin isolation requirements (COOP and COEP headers).
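A sketch of how a Node server might attach those headers; `withIsolationHeaders` is a hypothetical helper, and the header values are the standard COOP/COEP settings for cross-origin isolation:

```javascript
// Wrap a request handler so every response carries the headers
// required before the browser will expose SharedArrayBuffer.
function withIsolationHeaders(handler) {
  return (req, res) => {
    res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
    res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
    handler(req, res);
  };
}
```

Pass the wrapped handler to `http.createServer` (or your framework's equivalent) so the isolation headers are set on every response.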
Project: Build WASM module for computationally intensive task
Choose a compute-intensive task
Select a task that benefits from native performance:
- Image processing (filters, transformations)
- Physics simulation (collision detection, particle systems)
- Cryptography (hashing, encryption)
- Data compression/decompression
Implement in C/C++ or Rust
Write optimized native code with SIMD where applicable.
Optimize JS/WASM boundary
Minimize data copying and boundary crossings. Batch operations and use shared memory.
Benchmark vs pure JavaScript
Measure performance improvement:

console.time('JS Implementation');
jsImplementation(data);
console.timeEnd('JS Implementation');
console.time('WASM Implementation');
wasmImplementation(data);
console.timeEnd('WASM Implementation');
Target a 5-10x speedup for CPU-bound operations.
Add threading for multi-core utilization
Parallelize work across available CPU cores for additional speedup.
WebAssembly provides the most benefit for CPU-intensive computations. For I/O-bound or DOM-heavy operations, the overhead may outweigh the benefits.