countTokensAscii
Count the total number of tokens in ASCII text using a SIMD-accelerated native implementation.
Parameter: the ASCII text to tokenize.
Returns: the total number of tokens in the text.
tokenizeAsciiNative
Tokenize ASCII text into an array of lowercase tokens using the native implementation.
Parameter: the ASCII text to tokenize.
Returns: an array of lowercase tokens extracted from the text.
Notes
- Tokens are automatically converted to lowercase
- Uses SIMD vectorization for high performance
- Optimized for ASCII text; may not handle Unicode correctly
- Punctuation is typically filtered out during tokenization
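
The behavior described in these notes can be illustrated with a small pure-TypeScript reference. This is only a sketch under stated assumptions, not the native SIMD code: it assumes a token is a maximal run of ASCII letters or digits, which the documentation above does not confirm. The names `isAsciiAlnum`, `tokenizeAsciiSketch`, and `countTokensAsciiSketch` are hypothetical and exist only for this illustration.

```typescript
// Hypothetical reference sketch; NOT the library's native SIMD implementation.
// Assumption: a token is a maximal run of ASCII letters or digits.
function isAsciiAlnum(code: number): boolean {
  return (
    (code >= 48 && code <= 57) || // 0-9
    (code >= 65 && code <= 90) || // A-Z
    (code >= 97 && code <= 122) // a-z
  );
}

function tokenizeAsciiSketch(text: string): string[] {
  const tokens: string[] = [];
  let start = -1; // index where the current token began, or -1 if none is open

  // Iterate one position past the end so a trailing token is flushed.
  for (let i = 0; i <= text.length; i++) {
    const inToken = i < text.length && isAsciiAlnum(text.charCodeAt(i));
    if (inToken && start < 0) {
      start = i; // a new token begins here
    } else if (!inToken && start >= 0) {
      // The token ended at i; store it in lowercase.
      tokens.push(text.slice(start, i).toLowerCase());
      start = -1;
    }
  }
  return tokens;
}

function countTokensAsciiSketch(text: string): number {
  return tokenizeAsciiSketch(text).length;
}

console.log(tokenizeAsciiSketch("Hello, World! 42")); // [ 'hello', 'world', '42' ]
console.log(countTokensAsciiSketch("Hello, World! 42")); // 3
```

Under this assumed rule, any non-ASCII character acts as a separator, so `"café"` yields the single token `"caf"`. That is one concrete way Unicode input can be mishandled; the native functions may behave differently.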