Skip to main content

Overview

The HashSet class provides a set data structure that stores unique elements. It is implemented using the HashTable class internally and provides O(1) average-case performance for insertions, deletions, and membership tests.

Constructor

capacity
int
default:"10"
The initial capacity of the underlying hash table
from dsa.hashset import HashSet

# Create a hash set with default capacity
hs = HashSet()

# Create a hash set with custom capacity
hs = HashSet(capacity=50)

Methods

add

Add an item to the set. If the item already exists, this operation has no effect.
add(item)
item
any
The item to add to the set
hs = HashSet()
hs.add("apple")
hs.add("banana")
hs.add("apple")  # No effect, already exists

remove

Remove an item from the set.
remove(item)
item
any
The item to remove from the set
Raises: KeyError if the item is not found in the set
hs = HashSet()
hs.add("apple")
hs.remove("apple")  # Removes the item

try:
    hs.remove("missing")
except KeyError as e:
    print(f"Item not found: {e}")

contains

Check if an item is in the set.
contains(item) -> bool
item
any
The item to check for
return
bool
True if the item is in the set, False otherwise
hs = HashSet()
hs.add("apple")
print(hs.contains("apple"))  # True
print(hs.contains("banana"))  # False

to_list

Convert the hash set to a list.
to_list() -> list
return
list
A list containing all items in the set
hs = HashSet()
hs.add("apple")
hs.add("banana")
hs.add("cherry")

items = hs.to_list()
print(items)  # ['apple', 'banana', 'cherry'] (order may vary)

Class Methods

from_list

Construct a hash set from an iterable.
@classmethod
from_list(iterable=None) -> HashSet
iterable
iterable
default:"None"
An optional iterable of initial elements
return
HashSet
A new HashSet instance containing the elements from the iterable
# Create from list
hs = HashSet.from_list(["apple", "banana", "cherry"])

# Create from tuple
hs = HashSet.from_list(("x", "y", "z"))

# Create from generator
hs = HashSet.from_list(x for x in range(10))

# Create empty set
hs = HashSet.from_list()

Special Methods

__len__

Get the number of items in the set.
hs = HashSet()
hs.add("apple")
hs.add("banana")
print(len(hs))  # 2

__contains__

Check if an item is in the set using the in operator.
hs = HashSet()
hs.add("apple")
print("apple" in hs)  # True
print("banana" in hs)  # False

__iter__

Iterate over the items in the set.
hs = HashSet()
hs.add("apple")
hs.add("banana")
hs.add("cherry")

for item in hs:
    print(item)

__eq__

Check if two sets are equal.
hs1 = HashSet()
hs1.add("apple")
hs1.add("banana")

hs2 = HashSet()
hs2.add("banana")
hs2.add("apple")

print(hs1 == hs2)  # True

__repr__

Get a string representation of the set.
hs = HashSet()
hs.add("apple")
hs.add("banana")
print(hs)  # HashSet(['apple', 'banana'])

Complete Example

from dsa.hashset import HashSet

# Create a hash set
hs = HashSet()

# Add items
hs.add("apple")
hs.add("banana")
hs.add("cherry")
hs.add("apple")  # Duplicate, no effect

# Check membership
if "apple" in hs:
    print("Apple is in the set")

# Remove item
hs.remove("banana")

# Get size
print(f"Set size: {len(hs)}")  # 2

# Iterate over items
for item in hs:
    print(item)

# Convert to list
items = hs.to_list()
print(items)

Performance

  • Average Case:
    • Add: O(1)
    • Remove: O(1)
    • Contains: O(1)
  • Worst Case:
    • Add: O(n)
    • Remove: O(n)
    • Contains: O(n)

Notes

  • The HashSet internally uses a HashTable to store items
  • All items are stored as keys in the underlying hashtable with True as the value
  • The set does not maintain insertion order
  • Items must be hashable (the HashTable converts them to strings for hashing)
  • The capacity parameter controls the initial size of the underlying hashtable
  • When creating from an iterable, the capacity is automatically set to len(iterable) * 2 for better performance

Build docs developers (and LLMs) love