## Overview

All benchmarks compare bun_nltk's Zig native implementation against Python NLTK on identical datasets and workloads.

## Core Operations (64MB Dataset)

Benchmarks use `bench/datasets/synthetic.txt`, a 64MB synthetic text corpus.
| Workload | Zig/Bun median sec | Python sec | Faster side | Speedup | Percent faster |
|---|---|---|---|---|---|
| Token + unique + ngram + unique ngram (bench:compare) | 2.767 | 10.071 | Zig native | 3.64x | 263.93% |
| Top-K PMI collocations (bench:compare:collocations) | 2.090 | 23.945 | Zig native | 11.46x | 1045.90% |
| Porter stemming (bench:compare:porter) | 11.942 | 120.101 | Zig native | 10.06x | 905.70% |
| WASM token/ngram path (bench:compare:wasm) | 4.150 | 13.241 | Zig WASM | 3.19x | 219.06% |
| Native vs Python in wasm suite (bench:compare:wasm) | 1.719 | 13.241 | Zig native | 7.70x | 670.48% |
| Sentence tokenizer subset (bench:compare:sentence) | 1.680 | 16.580 | Zig/Bun subset | 9.87x | 886.70% |
| Perceptron POS tagger (bench:compare:tagger) | 19.880 | 82.849 | Zig native | 4.17x | 316.75% |
| Streaming FreqDist + ConditionalFreqDist (bench:compare:freqdist) | 3.206 | 20.971 | Zig native | 6.54x | 554.17% |
### Key Takeaways

**Collocation Detection - 11.46x speedup**

- PMI-based bigram collocation scoring shows the largest performance gain
- Windowed bigram statistics computed in native Zig with minimal allocations
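The PMI statistic behind the collocation scorer can be sketched in plain Python. This is illustrative only, not the library's API; the native Zig path computes the same statistic over windowed bigrams with minimal allocations.

```python
import math
from collections import Counter

def pmi_bigrams(tokens, top_k=5):
    """Score adjacent bigrams by pointwise mutual information:
    PMI(a, b) = log2( p(a, b) / (p(a) * p(b)) )."""
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scored = []
    for (a, b), count in bigrams.items():
        p_ab = count / (n - 1)                 # bigram probability
        p_a, p_b = unigrams[a] / n, unigrams[b] / n
        scored.append(((a, b), math.log2(p_ab / (p_a * p_b))))
    # Highest-PMI pairs first
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]
```

Pairs that co-occur more often than their individual frequencies predict score highest, which is why recurring phrases surface above chance-level neighbors.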

**Porter Stemming - 10.06x speedup**

- Porter stemmer implementation benefits from ASCII fast paths
- Native string manipulation avoids Python interpreter overhead

**Sentence Tokenization - 9.87x speedup**

- Punkt-compatible subset with abbreviation learning
- Native implementation with orthographic heuristics

**Frequency Distributions - 6.54x speedup**

- Streaming `FreqDist` and `ConditionalFreqDist` builders
- Native hash tables with collision-free token ID mapping
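Streaming frequency counting reduces to incrementing counters in one pass over the token stream. A minimal Python sketch of the two builders; the native build replaces these dict updates with hash tables keyed by interned token IDs.

```python
from collections import Counter, defaultdict

def build_freqdists(token_stream):
    """Single pass producing a FreqDist analogue (token counts) and a
    ConditionalFreqDist analogue conditioned on the preceding token."""
    freq = Counter()
    cond = defaultdict(Counter)
    prev = None
    for tok in token_stream:
        freq[tok] += 1
        if prev is not None:
            cond[prev][tok] += 1   # count tok given the previous token
        prev = tok
    return freq, cond
```

Because both tables are updated in the same pass, the input never needs to be materialized or re-scanned.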

**POS Tagging - 4.17x speedup**

- Averaged perceptron tagger with native inference
- Batch prediction with feature vector precomputation
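Perceptron tag inference reduces to summing per-feature weights for each candidate tag and taking the argmax. A minimal sketch with hypothetical features and weights (not the trained model's actual values):

```python
# Hypothetical (feature, tag) weights for illustration only
weights = {
    ("word=run", "VB"): 1.2,
    ("word=run", "NN"): 0.8,
    ("prev=to", "VB"): 2.0,
    ("prev=the", "NN"): 1.5,
}
tags = ["NN", "VB"]

def predict(features):
    """Score each tag as the sum of its feature weights, return the argmax."""
    scores = {t: 0.0 for t in tags}
    for f in features:
        for t in tags:
            scores[t] += weights.get((f, t), 0.0)
    return max(tags, key=lambda t: scores[t])

predict(["word=run", "prev=to"])   # "VB" wins: 1.2 + 2.0 = 3.2 vs 0.8
```

Batch prediction precomputes the feature vectors for a whole sentence so the inner weight-summing loop runs without per-token allocation.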

**Core Tokenization - 3.64x speedup**

- Combined token counting, unique tokens, n-grams, and unique n-grams
- SIMD fast path for ASCII token counting (x86_64)
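The four statistics of the combined workload can be expressed in a few lines of Python; this sketch materializes the n-gram list for clarity, whereas the native path streams the input.

```python
def ngram_stats(tokens, n=2):
    """Token count, unique tokens, n-grams, and unique n-grams in one pass."""
    grams = list(zip(*(tokens[i:] for i in range(n))))
    return {
        "tokens": len(tokens),
        "unique_tokens": len(set(tokens)),
        "ngrams": len(grams),
        "unique_ngrams": len(set(grams)),
    }
```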
## Extended Workloads (8MB Dataset)

Specialized benchmarks use an 8MB gate dataset for more complex operations.

| Workload | Zig/Bun median sec | Python sec | Faster side | Speedup | Percent faster |
|---|---|---|---|---|---|
| Punkt tokenizer default path (bench:compare:punkt) | 0.0848 | 1.3463 | Zig native | 15.87x | 1487.19% |
| N-gram LM (Kneser-Ney) score+perplexity (bench:compare:lm) | 0.1324 | 2.8661 | Zig/Bun | 21.64x | 2064.19% |
| Regexp chunk parser (bench:compare:chunk) | 0.0024 | 1.5511 | Zig/Bun | 643.08x | 64208.28% |
| WordNet lookup + morphy workload (bench:compare:wordnet) | 0.0009 | 0.0835 | Zig/Bun | 91.55x | 9054.67% |
| CFG chart parser subset (bench:compare:parser) | 0.0088 | 0.3292 | Zig/Bun | 37.51x | 3651.05% |
| Naive Bayes text classifier (bench:compare:classifier) | 0.0081 | 0.0112 | Zig/Bun | 1.38x | 38.40% |
| PCFG Viterbi chart parser (bench:compare:pcfg) | 0.0191 | 0.4153 | Zig/Bun | 21.80x | 2080.00% |
| MaxEnt text classifier (bench:compare:maxent) | 0.0244 | 0.1824 | Zig/Bun | 7.46x | 646.00% |
| Sparse linear logits hot loop (bench:compare:linear) | 0.0024 | 2.0001 | Zig native | 840.54x | 83954.04% |
| Decision tree text classifier (bench:compare:decision-tree) | 0.0725 | 0.5720 | Zig/Bun | 7.89x | 688.55% |
| Earley parser workload (bench:compare:earley) | 0.1149 | 4.6483 | Zig/Bun | 40.47x | 3947.07% |
### Key Takeaways

**Sparse Linear Scoring - 840.54x speedup**

- Native Zig hot loop for sparse matrix operations
- Critical for training linear models (Logistic, SVM)
- Minimal allocations with pre-flattened sparse batches
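The hot loop amounts to a sparse dot product per class. A Python sketch for one sample; the names and dense weight rows are illustrative, not the library's API. The native version runs this over pre-flattened index/value arrays so no per-feature objects are allocated.

```python
def sparse_logits(indices, values, weights, bias):
    """Logits for one sample stored sparsely as parallel index/value arrays.
    weights[c][i] is the weight of feature i for class c."""
    return [
        bias[c] + sum(row[i] * v for i, v in zip(indices, values))
        for c, row in enumerate(weights)
    ]
```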

**Regexp Chunking - 643.08x speedup**

- Regexp-based IOB chunk tagging
- Native compiled grammar matching
- Native/WASM chunk IOB hot loop

**WordNet Lookup - 91.55x speedup**

- Synset lookups with packed binary format
- Native morphy inflection recovery
- Relation traversal (hypernyms, hyponyms, antonyms)
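Relation traversal is a transitive closure over the relation graph. A sketch with a toy hypernym table (the synset names here are hypothetical); the native build walks the same closure over its packed binary format.

```python
# Toy hypernym edges for illustration only
hypernyms = {
    "dog.n.01": ["canine.n.02"],
    "canine.n.02": ["carnivore.n.01"],
    "carnivore.n.01": ["animal.n.01"],
    "animal.n.01": [],
}

def hypernym_closure(synset):
    """Collect all transitive hypernyms of a synset, nearest first."""
    seen, stack = [], [synset]
    while stack:
        for parent in hypernyms.get(stack.pop(), []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen
```

The same traversal shape serves hyponyms and antonyms by swapping in the corresponding edge table.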

**Earley Parsing - 40.47x speedup**

- Recognition and parsing for arbitrary CFG grammars
- Non-CNF grammar support
- Chart-based parsing with native data structures

**CFG Chart Parsing - 37.51x speedup**

- Bottom-up chart parsing
- Native production rule matching
- Parse tree reconstruction

**PCFG Viterbi Parsing - 21.80x speedup**

- Probabilistic context-free grammar
- Viterbi algorithm for best parse
- Native probability computations
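A Viterbi chart parse keeps, for each span and nonterminal, only the highest-probability derivation. A sketch over a tiny hypothetical CNF grammar; this is the algorithm in outline, not the library's implementation.

```python
import math

# Toy PCFG in Chomsky normal form: (lhs, rhs) -> probability (hypothetical)
rules = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.5,
    ("NP", ("fish",)): 0.5,
    ("VP", ("V", "NP")): 1.0,
    ("V", ("eats",)): 1.0,
}

def viterbi_parse(words):
    """Return (log-prob, backpointer) of the best S derivation, or None."""
    n = len(words)
    best = {(i, j): {} for i in range(n) for j in range(i + 1, n + 1)}
    # Fill in terminal cells
    for i, w in enumerate(words):
        for (lhs, rhs), p in rules.items():
            if rhs == (w,):
                best[(i, i + 1)][lhs] = (math.log(p), w)
    # Combine adjacent spans, keeping only the max-probability entry
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (lhs, rhs), p in rules.items():
                    if len(rhs) != 2:
                        continue
                    left, right = rhs
                    if left in best[(i, k)] and right in best[(k, j)]:
                        lp = (math.log(p) + best[(i, k)][left][0]
                              + best[(k, j)][right][0])
                        if lhs not in best[(i, j)] or lp > best[(i, j)][lhs][0]:
                            best[(i, j)][lhs] = (lp, (left, right, k))
    return best[(0, n)].get("S")

res = viterbi_parse(["she", "eats", "fish"])
```

Working in log space keeps long derivations numerically stable, which is also why the native probability computations operate on log probabilities.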

**N-gram Language Model - 21.64x speedup**

- Kneser-Ney interpolated smoothing
- Native ID-based evaluation hot loop
- Batch scoring and perplexity computation
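Once the smoothed model has assigned a log probability to each token, perplexity is just the exponentiated negative average. A sketch of that evaluation step (the Kneser-Ney scoring itself lives in the native path):

```python
import math

def perplexity(log2_probs):
    """Perplexity from per-token log2 probabilities: 2 ** (-mean log2 p).
    Lower is better; a uniform model over k outcomes scores exactly k."""
    return 2 ** (-sum(log2_probs) / len(log2_probs))
```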

**Punkt Sentence Tokenizer - 15.87x speedup**

- Full trainable Punkt model
- Native sentence splitting fast path
- Abbreviation and collocation handling

**Decision Tree Classifier - 7.89x speedup**

- Text classification with decision trees
- Native tree traversal and splitting
- N-gram feature extraction

**MaxEnt Classifier - 7.46x speedup**

- Maximum entropy text classification
- Iterative parameter estimation
- Native sparse feature scoring

**Naive Bayes Classifier - 1.38x speedup**

- Probabilistic text classification
- Laplace smoothing
- Modest speedup due to simpler algorithm
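Multinomial Naive Bayes with Laplace smoothing is simple enough that the interpreter overhead being eliminated is small, which explains the modest gain. A minimal sketch of the algorithm (not the library's API):

```python
import math
from collections import Counter

def train_nb(docs):
    """docs: list of (tokens, label). Returns class counts, per-class word
    counts, and the vocabulary."""
    labels = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in labels}
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return labels, word_counts, vocab

def classify(model, tokens):
    labels, word_counts, vocab = model
    total_docs = sum(labels.values())
    best_label, best_lp = None, -math.inf
    for label, ndocs in labels.items():
        total = sum(word_counts[label].values())
        lp = math.log(ndocs / total_docs)          # class prior
        for tok in tokens:
            # Laplace smoothing: every (word, class) pair gets pseudo-count 1
            lp += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best_label, best_lp = label, lp
    return best_label

model = train_nb([(["good", "great"], "pos"), (["bad", "awful"], "neg")])
```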
## SIMD Fast Path Comparison

Comparison of SIMD-accelerated paths vs the scalar baseline:

- `countTokensAscii`: 1.22x speedup (SIMD vs scalar)
- Normalization (no stopwords): 2.73x speedup (fast path vs standard)
- x86_64 vectorized token counting
- Scalar fallback for other architectures
- Automatic runtime detection
## Running Benchmarks
### Single Workload
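The workload names in the tables above are script entries; assuming the `bun run <script>` convention used elsewhere in this project, a single workload runs as:

```shell
# Combined token/ngram comparison against Python NLTK
bun run bench:compare

# Or one specialized workload, e.g. Porter stemming
bun run bench:compare:porter
```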
### Extended Workloads
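The extended workloads use the same script naming (again assuming the `bun run <script>` convention), with the 8MB gate dataset:

```shell
# Examples from the extended table above
bun run bench:compare:punkt
bun run bench:compare:lm
bun run bench:compare:earley
```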
## Performance Notes

- **Sentence Tokenizer**: This is a Punkt-compatible subset, not full Punkt parity on arbitrary corpora. The full Punkt tokenizer with trainable models shows a 15.87x speedup in `bench:compare:punkt`.
- **WordNet**: The full WordNet corpus is not bundled by default; a mini WordNet dataset is included. The full corpus can be packed from upstream with `bun run wordnet:pack:official`.
- **SIMD**: Token counting uses an x86_64 SIMD fast path with a scalar fallback. Run `bench:compare:simd` to measure SIMD impact on your hardware.

## Next Steps
- **WASM Performance**: Compare WASM vs native vs Python performance
- **Benchmark Overview**: Learn about benchmark methodology