Overview
bun_nltk provides a fast Naive Bayes text classifier optimized with native code for high-performance text categorization tasks.
Quick Start
Basic Training and Classification
Training Options
Smoothing Parameter
- 1.0: Laplace (add-one) smoothing
- 0.5: Lidstone smoothing
- 0.01: Minimal smoothing for large datasets
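To make the effect of the smoothing value concrete, here is a minimal sketch of additive (Lidstone) smoothing for one word in one class. It uses the textbook formula; whether bun_nltk's native code computes exactly this is an assumption, and `smoothedProbability` is a hypothetical helper, not a library function.

```typescript
// Hypothetical helper, not part of bun_nltk: additive (Lidstone) smoothing
// for a single class-conditional word probability.
function smoothedProbability(
  wordCount: number, // times the word appeared in documents of this class
  totalCount: number, // total word tokens seen for this class
  vocabSize: number, // number of distinct words across all classes
  smoothing: number, // 1.0 = Laplace, 0.5 = Lidstone, 0.01 = minimal
): number {
  return (wordCount + smoothing) / (totalCount + smoothing * vocabSize);
}

// An unseen word (count 0) in a class with 100 tokens and a 50-word vocabulary:
const laplace = smoothedProbability(0, 100, 50, 1.0); // 1 / 150
const lidstone = smoothedProbability(0, 100, 50, 0.5); // 0.5 / 125
const minimal = smoothedProbability(0, 100, 50, 0.01); // 0.01 / 100.5
```

Smaller values give unseen words less probability mass, so the observed counts dominate. That tends to suit large training sets where most relevant words have already been seen.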
Incremental Training
Classification Methods
Single Label Prediction
Probability Scores
predict(text) returns a score for every known label.
Get Available Labels
Model Evaluation
Test Set Evaluation
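Accuracy on a held-out test set is the fraction of examples whose predicted label matches the true label. The sketch below computes it by hand, assuming examples carry `label` and `text` fields. The actual `NaiveBayesExample` shape and the value returned by `evaluate` are not shown in this document, so `LabeledText` and `accuracy` are hypothetical names.

```typescript
// Assumed example shape; the real NaiveBayesExample type may differ.
type LabeledText = { label: string; text: string };

// Hypothetical helper: fraction of examples the classifier labels correctly.
// `classify` is any function mapping a text to a predicted label.
function accuracy(
  examples: LabeledText[],
  classify: (text: string) => string,
): number {
  if (examples.length === 0) return 0; // avoid dividing by zero
  let correct = 0;
  for (const example of examples) {
    if (classify(example.text) === example.label) correct++;
  }
  return correct / examples.length;
}
```

In practice the `classify` argument could be a trained classifier's `classify` method wrapped in an arrow function.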
Cross-Validation Example
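A k-fold cross-validation loop can be sketched independently of the classifier. The helpers below are hypothetical (`kFoldSplits` and `crossValidate` are not bun_nltk functions). The per-fold scoring is left to a caller-supplied callback because the return shape of `evaluate` is not documented here.

```typescript
// Hypothetical helper: assign item i to fold (i % k), then yield one
// train/test split per fold.
function kFoldSplits<T>(items: T[], k: number): { train: T[]; test: T[] }[] {
  if (!Number.isInteger(k) || k < 2 || k > items.length) {
    throw new RangeError("k must be an integer between 2 and items.length");
  }
  const splits: { train: T[]; test: T[] }[] = [];
  for (let fold = 0; fold < k; fold++) {
    const test = items.filter((_, i) => i % k === fold);
    const train = items.filter((_, i) => i % k !== fold);
    splits.push({ train, test });
  }
  return splits;
}

// Hypothetical helper: average a caller-supplied score over all folds.
function crossValidate<T>(
  items: T[],
  k: number,
  scoreFold: (train: T[], test: T[]) => number,
): number {
  const splits = kFoldSplits(items, k);
  const total = splits.reduce(
    (sum, split) => sum + scoreFold(split.train, split.test),
    0,
  );
  return total / splits.length;
}
```

Inside `scoreFold`, one would typically train a fresh classifier on `train` and score it on `test`. If the examples are ordered by label, shuffle them first so every fold contains every class.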
Practical Examples
Sentiment Analysis
Spam Detection
Topic Classification
Model Persistence
Serialize Model
Deserialize Model
Serialization Format
Working with Corpus Data
Train from Corpus
Performance Optimization
The classifier uses native code for:
- Tokenization (ASCII-optimized)
- Probability calculation
- Batch scoring
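The three steps listed above can be pictured with a plain TypeScript sketch. This is illustrative only: it is not bun_nltk's native implementation. The tokenization rules, the `ClassStats` layout, and the names `tokenizeAscii`, `logScore`, and `classifyBatch` are all assumptions made for the example.

```typescript
// Step 1 (assumed rules): runs of ASCII letters and digits become lowercase
// tokens; every other character is treated as a separator.
function tokenizeAscii(text: string): string[] {
  const tokens: string[] = [];
  let current = "";
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    const isDigit = code >= 48 && code <= 57;
    const isUpper = code >= 65 && code <= 90;
    const isLower = code >= 97 && code <= 122;
    if (isDigit || isLower) {
      current += text[i];
    } else if (isUpper) {
      current += String.fromCharCode(code + 32); // ASCII lowercase
    } else if (current.length > 0) {
      tokens.push(current);
      current = "";
    }
  }
  if (current.length > 0) tokens.push(current);
  return tokens;
}

// Per-class statistics gathered during training (hypothetical layout).
type ClassStats = {
  logPrior: number; // log P(class)
  wordCounts: Map<string, number>; // token -> count within the class
  totalTokens: number; // sum of all counts for the class
};

// Step 2: log P(class) + sum of log P(token | class), additively smoothed.
function logScore(
  tokens: string[],
  stats: ClassStats,
  vocabSize: number,
  smoothing: number,
): number {
  const denominator = stats.totalTokens + smoothing * vocabSize;
  let score = stats.logPrior;
  for (const token of tokens) {
    const count = stats.wordCounts.get(token) ?? 0;
    score += Math.log((count + smoothing) / denominator);
  }
  return score;
}

// Step 3: batch scoring. Each text is tokenized once, then scored per class.
function classifyBatch(
  texts: string[],
  model: Map<string, ClassStats>,
  vocabSize: number,
  smoothing = 1.0,
): string[] {
  return texts.map((text) => {
    const tokens = tokenizeAscii(text);
    let best = "";
    let bestScore = -Infinity;
    for (const [label, stats] of model) {
      const score = logScore(tokens, stats, vocabSize, smoothing);
      if (score > bestScore) {
        bestScore = score;
        best = label;
      }
    }
    return best;
  });
}
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, which is the usual reason Naive Bayes implementations sum logarithms instead.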
Batch Classification
Type Definitions
API Reference
trainNaiveBayesTextClassifier(examples, options?)
Creates and trains a classifier.
Parameters:
- examples: NaiveBayesExample[] - Training data
- options.smoothing?: number - Smoothing parameter (default: 1.0)
Returns: NaiveBayesTextClassifier
NaiveBayesTextClassifier Methods
- train(examples) - Add training examples
- classify(text) - Predict single label
- predict(text) - Get all label scores
- labels() - Get all known labels
- evaluate(examples) - Test accuracy
- toJSON() - Serialize model
loadNaiveBayesTextClassifier(payload)
Deserializes a saved model.
Parameters:
- payload: NaiveBayesSerialized - Serialized model
Returns: NaiveBayesTextClassifier