Synsets

Overview

Synsets are the core building blocks of WordNet: sets of cognitive synonyms that each represent a single concept. Each synset contains lemmas (word forms), a definition (gloss), usage examples, and relationships to other synsets.

WordNetSynset Type

type WordNetSynset = {
  id: string;              // Unique synset identifier (e.g., "dog.n.01")
  pos: WordNetPos;         // Part of speech: "n" | "v" | "a" | "r"
  lemmas: string[];        // Word forms in this synset (normalized)
  gloss: string;           // Definition text
  examples: string[];      // Usage examples
  hypernyms: string[];     // More general synset IDs
  hyponyms: string[];      // More specific synset IDs
  similarTo: string[];     // Similar synset IDs (mainly for adjectives)
  antonyms: string[];      // Opposite synset IDs
};

type WordNetPos = "n" | "v" | "a" | "r";
// n = noun, v = verb, a = adjective, r = adverb
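
Because WordNetPos is a closed union of four string literals, untrusted input (for example a query-string value) can be narrowed to it with a small type guard before it is passed as a filter. The sketch below is illustrative only: `POS_LABELS` and `isWordNetPos` are hypothetical helpers, not part of bun_nltk, and the type is redeclared locally so the snippet stands alone.

```typescript
// Local copy of the type shown above so the sketch is self-contained.
type WordNetPos = "n" | "v" | "a" | "r";

// Hypothetical lookup table: a human-readable label for each POS code.
const POS_LABELS: Record<WordNetPos, string> = {
  n: "noun",
  v: "verb",
  a: "adjective",
  r: "adverb",
};

// Hypothetical type guard: narrows an arbitrary string to WordNetPos.
// hasOwnProperty avoids false positives on inherited keys like "toString".
function isWordNetPos(value: string): value is WordNetPos {
  return Object.prototype.hasOwnProperty.call(POS_LABELS, value);
}

const input: string = "v";
if (isWordNetPos(input)) {
  console.log(POS_LABELS[input]); // "verb"
}
```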

Methods

synset

Looks up a synset by its unique identifier.

import { loadWordNetMini } from 'bun_nltk';

const wordnet = loadWordNetMini();

const synset = wordnet.synset('dog.n.01');
if (synset) {
  console.log(synset.gloss);       // "a member of the genus Canis..."
  console.log(synset.lemmas);      // ["dog", "domestic_dog", "canis_familiaris"]
  console.log(synset.pos);         // "n"
}

// Returns null if not found
const missing = wordnet.synset('nonexistent.n.99');
console.log(missing);              // null

Parameters:

id: string - Synset identifier in format word.pos.number

Returns:

WordNetSynset | null - Synset object if found, null otherwise

Details:

Performs an average-case O(1) lookup using an internal Map
Case-sensitive ID matching
Does not normalize or validate the ID format
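
Since the lookup neither validates nor normalizes the ID, callers that accept IDs from outside may want to check the word.pos.number shape first. The following is a minimal sketch under that assumption: `parseSynsetId` and `toyById` are hypothetical names introduced here, and the Map only stands in for the internal index to show that keys are matched exactly.

```typescript
type WordNetPos = "n" | "v" | "a" | "r";

interface ParsedSynsetId {
  word: string;
  pos: WordNetPos;
  sense: number;
}

// Hypothetical helper: checks the word.pos.number shape and splits it.
// The greedy first group lets the word part itself contain dots.
function parseSynsetId(id: string): ParsedSynsetId | null {
  const match = /^(.+)\.([nvar])\.(\d+)$/.exec(id);
  if (!match) return null;
  const word = match[1];
  const pos = match[2];
  const sense = match[3];
  if (word === undefined || pos === undefined || sense === undefined) return null;
  return { word, pos: pos as WordNetPos, sense: Number(sense) };
}

// Toy stand-in for the internal ID index; keys are matched exactly.
const toyById = new Map<string, string>([["dog.n.01", "toy gloss for dog"]]);

console.log(parseSynsetId("dog.n.01")?.sense); // 1
console.log(parseSynsetId("dog"));             // null
console.log(toyById.get("dog.n.01"));          // "toy gloss for dog"
console.log(toyById.get("Dog.N.01"));          // undefined: Map keys are case-sensitive
```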

synsets

Finds all synsets containing a given word, optionally filtered by part of speech.

import { loadWordNetMini } from 'bun_nltk';

const wordnet = loadWordNetMini();

// Get all synsets for "run" (noun, verb, etc.)
const allRun = wordnet.synsets('run');
console.log(allRun.length);        // Multiple senses across POS

// Get only verb synsets for "run"
const runVerbs = wordnet.synsets('run', 'v');
console.log(runVerbs[0]?.gloss);   // "move fast by using one's feet..."

// Get only noun synsets
const runNouns = wordnet.synsets('run', 'n');
console.log(runNouns[0]?.gloss);   // "a score in baseball..."

// Handles inflected forms via morphy
const running = wordnet.synsets('running', 'v');
console.log(running.length > 0);   // true (finds "run" base form)

Parameters:

word: string - Word to look up (can be inflected)
pos?: WordNetPos - Optional part of speech filter: "n", "v", "a", or "r"

Returns:

WordNetSynset[] - Array of matching synsets (empty if none found)

Details:

Automatically applies morphological analysis via morphy() (see Morphy)
Normalizes word to lowercase and replaces spaces with underscores
Without POS filter, returns synsets across all parts of speech
Returns empty array if word not found or has no synsets for specified POS
Performs O(1) lemma lookup after normalization
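
The steps listed above (normalize, resolve a base form, look up the lemma, filter by POS) can be pictured with a toy model. This is a sketch under stated assumptions, not the library's code: `toyLemmaIndex`, `toyBaseForms`, and `findSynsets` are hypothetical, the two-entry table is only a stand-in for morphy(), and the order in which bun_nltk actually tries the raw word versus the base form is not specified here.

```typescript
type WordNetPos = "n" | "v" | "a" | "r";

interface MiniSynset {
  id: string;
  pos: WordNetPos;
}

// Toy data; the IDs are illustrative, not taken from a real database.
const runV: MiniSynset = { id: "run.v.01", pos: "v" };
const runN: MiniSynset = { id: "run.n.01", pos: "n" };
const hotDog: MiniSynset = { id: "hot_dog.n.01", pos: "n" };

// Toy lemma index keyed by normalized lemma.
const toyLemmaIndex = new Map<string, MiniSynset[]>([
  ["run", [runV, runN]],
  ["hot_dog", [hotDog]],
]);

// Toy stand-in for morphy(): maps a few inflected forms to a base form.
const toyBaseForms = new Map<string, string>([
  ["running", "run"],
  ["ran", "run"],
]);

function findSynsets(word: string, pos?: WordNetPos): MiniSynset[] {
  // 1. Normalize: lowercase, spaces to underscores.
  const key = word.toLowerCase().replace(/ /g, "_");
  // 2. Use the word as-is if indexed, otherwise fall back to a base form.
  const lemma = toyLemmaIndex.has(key) ? key : (toyBaseForms.get(key) ?? key);
  // 3. Map lookup, then 4. optional POS filter.
  const hits = toyLemmaIndex.get(lemma) ?? [];
  return pos ? hits.filter((s) => s.pos === pos) : [...hits];
}

console.log(findSynsets("running", "v").map((s) => s.id)); // ["run.v.01"]
console.log(findSynsets("Hot Dog").map((s) => s.id));      // ["hot_dog.n.01"]
console.log(findSynsets("run", "a").length);               // 0
```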

lemmas

Returns all lemmas (word forms) in the database, optionally filtered by part of speech.

import { loadWordNetMini } from 'bun_nltk';

const wordnet = loadWordNetMini();

// Get all lemmas (returns thousands of words)
const allLemmas = wordnet.lemmas();
console.log(allLemmas.length);     // e.g., 10000+

// Get only noun lemmas
const nouns = wordnet.lemmas('n');
console.log(nouns.slice(0, 3));    // ["aardvark", "abacus", "abandon"]

// Get only verb lemmas
const verbs = wordnet.lemmas('v');
console.log(verbs.slice(0, 3));    // ["abandon", "abbreviate", "abdicate"]

// Get only adjective lemmas
const adjectives = wordnet.lemmas('a');

// Get only adverb lemmas
const adverbs = wordnet.lemmas('r');

Parameters:

pos?: WordNetPos - Optional part of speech filter: "n", "v", "a", or "r"

Returns:

string[] - Alphabetically sorted array of lemmas

Details:

Returns normalized forms (lowercase, underscores for spaces)
Without POS filter, includes all lemmas across all parts of speech
Results are automatically sorted alphabetically
Each lemma appears only once in the result, even if it belongs to multiple synsets
Performance: O(n) where n is the number of unique lemmas
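
Because the returned array is sorted, prefix queries (for example, autocomplete) can locate their starting point with a binary search instead of scanning every lemma. The sketch below assumes the array is ordered by plain string comparison (the default `Array.prototype.sort` order); if the library sorts with a locale-aware comparator, the same comparator would have to be used here. `lowerBound` and `lemmasWithPrefix` are hypothetical helpers, and the sample array is made up for the demonstration.

```typescript
// Returns the first index whose entry is >= target in code-unit order.
function lowerBound(sorted: readonly string[], target: string): number {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if ((sorted[mid] as string) < target) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Collects the contiguous run of entries that start with the prefix.
function lemmasWithPrefix(sorted: readonly string[], prefix: string): string[] {
  const start = lowerBound(sorted, prefix);
  let end = start;
  while (end < sorted.length && (sorted[end] as string).startsWith(prefix)) {
    end++;
  }
  return sorted.slice(start, end);
}

// Made-up sample, already in code-unit order.
const sampleLemmas = ["abacus", "abandon", "abbreviate", "dog", "domestic_dog"];

console.log(lemmasWithPrefix(sampleLemmas, "aba")); // ["abacus", "abandon"]
console.log(lemmasWithPrefix(sampleLemmas, "do"));  // ["dog", "domestic_dog"]
console.log(lemmasWithPrefix(sampleLemmas, "z"));   // []
```

With a real database the first argument would be the array returned by wordnet.lemmas(), provided the ordering assumption above holds.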

WordNet Class

The WordNet class maintains two internal indexes for efficient lookups:

export class WordNet {
  private readonly byId = new Map<string, WordNetSynset>();
  private readonly lemmaIndex = new Map<string, WordNetSynset[]>();

  // ... methods
}

Internal Structure:

byId - Maps synset IDs to synset objects (for synset() method)
lemmaIndex - Maps normalized lemmas to arrays of synsets (for synsets() method)

Normalization: All lemmas are normalized during construction:

Converted to lowercase
Spaces replaced with underscores
Example: "New York" → "new_york"
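
A minimal sketch of how two such indexes could be populated, assuming the normalization rules just listed. `normalizeLemma`, `buildIndexes`, and the `MiniSynset` shape are hypothetical and reduced for illustration, and the sample IDs are toy data; the real constructor may differ in detail.

```typescript
// Reduced synset shape: only the fields the indexes need.
interface MiniSynset {
  id: string;
  lemmas: string[];
}

// Lowercase, then replace each space with an underscore.
function normalizeLemma(lemma: string): string {
  return lemma.toLowerCase().replace(/ /g, "_");
}

// Builds an ID index and a lemma index in a single pass.
function buildIndexes(synsets: MiniSynset[]) {
  const byId = new Map<string, MiniSynset>();
  const lemmaIndex = new Map<string, MiniSynset[]>();
  for (const synset of synsets) {
    byId.set(synset.id, synset);
    for (const raw of synset.lemmas) {
      const key = normalizeLemma(raw);
      const bucket = lemmaIndex.get(key);
      if (bucket) {
        bucket.push(synset);
      } else {
        lemmaIndex.set(key, [synset]);
      }
    }
  }
  return { byId, lemmaIndex };
}

// Toy data: two synsets that share the lemma "New York".
const { byId, lemmaIndex } = buildIndexes([
  { id: "new_york.n.01", lemmas: ["New York", "Greater New York"] },
  { id: "new_york.n.02", lemmas: ["New York", "Empire State"] },
]);

console.log(normalizeLemma("New York"));          // "new_york"
console.log(lemmaIndex.get("new_york")?.length);  // 2
console.log(byId.has("new_york.n.02"));           // true
```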

Usage Examples

Exploring Word Senses

import { loadWordNetMini } from 'bun_nltk';

const wordnet = loadWordNetMini();

// Find all senses of "bank"
const senses = wordnet.synsets('bank', 'n');

senses.forEach((synset, index) => {
  console.log(`Sense ${index + 1}: ${synset.gloss}`);
  console.log(`  Lemmas: ${synset.lemmas.join(', ')}`);
  if (synset.examples.length > 0) {
    console.log(`  Example: ${synset.examples[0]}`);
  }
});

// Output:
// Sense 1: a financial institution...
//   Lemmas: bank, banking_company
// Sense 2: sloping land beside a body of water
//   Lemmas: bank, riverbank
// ...

Checking Word Existence

import { loadWordNetMini, type WordNetPos } from 'bun_nltk';

const wordnet = loadWordNetMini();

function isInWordNet(word: string, pos?: WordNetPos): boolean {
  return wordnet.synsets(word, pos).length > 0;
}

console.log(isInWordNet('dog'));           // true
console.log(isInWordNet('xyzabc'));        // false
console.log(isInWordNet('run', 'v'));      // true
console.log(isInWordNet('run', 'a'));      // false (run is not an adjective)

Iterating Through Vocabulary

import { loadWordNetMini } from 'bun_nltk';

const wordnet = loadWordNetMini();

// Count lemmas per POS
const stats = {
  nouns: wordnet.lemmas('n').length,
  verbs: wordnet.lemmas('v').length,
  adjectives: wordnet.lemmas('a').length,
  adverbs: wordnet.lemmas('r').length,
};

console.log('WordNet Statistics:', stats);

// Find polysemous words (words with multiple senses)
const polysemous = wordnet.lemmas('n')
  .filter(lemma => wordnet.synsets(lemma, 'n').length > 5)
  .slice(0, 10);

console.log('Highly polysemous nouns:', polysemous);

Working with Synset Objects

import { loadWordNetMini, type WordNetSynset } from 'bun_nltk';

const wordnet = loadWordNetMini();

function describeSynset(synset: WordNetSynset): string {
  const parts = [
    `ID: ${synset.id}`,
    `POS: ${synset.pos}`,
    `Lemmas: ${synset.lemmas.join(', ')}`,
    `Definition: ${synset.gloss}`,
  ];
  
  if (synset.examples.length > 0) {
    parts.push(`Examples: ${synset.examples.join('; ')}`);
  }
  
  return parts.join('\n');
}

const dogSynset = wordnet.synsets('dog', 'n')[0];
if (dogSynset) {
  console.log(describeSynset(dogSynset));
}

Related

Loading - Load WordNet databases
Relations - Navigate semantic relationships
Morphy - Morphological analysis and lemmatization
