Overview
The LeetCodeRetriever class provides fast semantic search over LeetCode solutions using FAISS HNSW indexing and SentenceTransformers embeddings. It supports similarity search and metadata-based filtering by company, difficulty, and topics.
Class Definition
Solution Dataclass
Represents a LeetCode solution with associated metadata.
The problem title as it appears on LeetCode
Complete solution text including approach explanation, code implementation, and complexity analysis
Problem difficulty level: "Easy", "Medium", or "Hard"
Comma-separated list of algorithmic topics and data structures
Comma-separated list of companies known to ask this problem
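As a rough illustration of the shape described above, the dataclass might look like the sketch below. The field names `title`, `solution`, and `difficulty` are assumptions made for this example (only `topics` and `companies` are named elsewhere in this reference), and the sample values are invented.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    # Field names are assumed for illustration; the reference only
    # describes what each field holds.
    title: str        # problem title as it appears on LeetCode
    solution: str     # approach, code, and complexity analysis
    difficulty: str   # "Easy", "Medium", or "Hard"
    topics: str       # comma-separated topics and data structures
    companies: str    # comma-separated companies


# Invented sample record, not taken from the real metadata file.
example = Solution(
    title="Two Sum",
    solution="Approach: one-pass hash map lookup ...",
    difficulty="Easy",
    topics="Array, Hash Table",
    companies="Amazon, Google",
)

# Comma-separated fields can be split into clean lists when needed.
topic_list = [t.strip() for t in example.topics.split(",")]
```

Because topics and companies are stored as comma-separated strings rather than lists, callers that need exact matching would have to split them first, as the last line does.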
Constructor Parameters
Path to the FAISS HNSW index file. Defaults to leetcode_hnsw2.index in the component directory.
Path to the pickled metadata file containing Solution objects. Defaults to leetcode_metadata2.pkl in the component directory.
SentenceTransformer model name for encoding queries. Must match the model used to build the index.
HNSW search parameter controlling speed/accuracy trade-off. Higher values improve accuracy but slow down search. Typical range: 16-128.
- Loads the SentenceTransformer encoder
- Reads the FAISS HNSW index from disk
- Validates that the index is HNSW type
- Sets HNSW search parameters
- Loads solution metadata from pickle file
- Logs successful initialization
ValueError if the index is not an HNSW index
Exception if the metadata file cannot be loaded
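The validation and parameter-setting steps listed above could be sketched roughly as follows. This is not the class's actual code: the helper name `configure_hnsw` is hypothetical, and detecting HNSW through the presence of an `hnsw` attribute is an assumption (the real implementation may use a type check instead). FAISS HNSW indexes do expose an `hnsw` struct with an `efSearch` field; plain stand-in objects are used here so the sketch runs without FAISS installed.

```python
from types import SimpleNamespace


def configure_hnsw(index, ef_search):
    """Check that `index` looks like an HNSW index, then set its search breadth.

    Hypothetical helper: duck-typing on `hnsw` is an assumption of this sketch.
    """
    hnsw = getattr(index, "hnsw", None)
    if hnsw is None:
        # Mirrors the documented ValueError for non-HNSW indexes.
        raise ValueError("Loaded index is not an HNSW index")
    # Higher efSearch explores more candidates: better recall, slower search.
    hnsw.efSearch = ef_search
    return index


# Stand-ins for an index read from disk (e.g. via faiss.read_index).
fake_hnsw_index = SimpleNamespace(hnsw=SimpleNamespace(efSearch=16))
fake_flat_index = SimpleNamespace()

configure_hnsw(fake_hnsw_index, ef_search=64)
```

Failing fast on the wrong index type keeps a later `efSearch` assignment from silently doing nothing on an index that has no such setting.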
Methods
search
Search for semantically similar solutions using vector similarity.
Natural language query describing the problem or concept
Number of top results to return
If True, returns tuples of (Solution, score). If False, returns only Solution objects.
List of search results. Format depends on return_scores: if True, a list of (Solution, score) tuples, where score is L2 distance (lower is better); if False, a list of Solution objects only.
- Encodes query using SentenceTransformer
- Searches FAISS HNSW index for k nearest neighbors
- Returns solutions ordered by similarity (ascending L2 distance)
- Lower scores indicate higher similarity
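To make the ordering and the two return formats concrete, here is a toy sketch that swaps the FAISS HNSW lookup for a brute-force scan over hand-written 2-D vectors. The function name `toy_search`, the labels, and the vectors are all invented for illustration; real queries are encoded by the SentenceTransformer model and searched approximately.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def toy_search(query_vec, items, k=3, return_scores=False):
    """Brute-force stand-in for the HNSW lookup.

    `items` is a list of (label, vector) pairs; results are ordered by
    ascending L2 distance, so the closest item comes first.
    """
    scored = sorted(
        ((label, l2_distance(query_vec, vec)) for label, vec in items),
        key=lambda pair: pair[1],
    )
    top = scored[:k]
    return top if return_scores else [label for label, _ in top]


items = [
    ("two-pointers", [0.0, 1.0]),
    ("binary-search", [1.0, 0.0]),
    ("sliding-window", [0.2, 0.9]),
]

with_scores = toy_search([0.0, 1.0], items, k=2, return_scores=True)
labels_only = toy_search([0.0, 1.0], items, k=2)
```

An exact match has distance 0 and sorts first, which is why a lower score means a closer result.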
filter_by_metadata
Filter solutions based on company, difficulty, and topic metadata.
List of company names to filter by. Matches if any company is found in solution’s companies field (case-insensitive).
Difficulty level to filter by: "Easy", "Medium", or "Hard" (case-insensitive)
List of topics to filter by. Matches if any topic is found in solution’s topics field (case-insensitive).
List of Solution objects matching all specified criteria. Returns all solutions if no filters specified.
- Applies filters sequentially (companies → difficulty → topics)
- All filters use case-insensitive matching
- Filters are cumulative (AND logic)
- Within each filter type, matching uses OR logic (e.g., any company matches)
- Returns original solution list if no filters provided
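The filtering rules above (AND across filter types, OR within each type, case-insensitive throughout) can be sketched as below. This is an illustration rather than the class's own method: it works on plain dictionaries standing in for Solution objects, the dictionary keys and sample records are assumptions, and "found in" is read as a substring match.

```python
def filter_by_metadata(solutions, companies=None, difficulty=None, topics=None):
    """Apply company, difficulty, and topic filters in sequence."""
    results = solutions

    if companies:
        wanted = [c.lower() for c in companies]
        # OR within the filter: any requested company may match.
        results = [
            s for s in results
            if any(c in s["companies"].lower() for c in wanted)
        ]

    if difficulty:
        results = [
            s for s in results
            if s["difficulty"].lower() == difficulty.lower()
        ]

    if topics:
        wanted = [t.lower() for t in topics]
        results = [
            s for s in results
            if any(t in s["topics"].lower() for t in wanted)
        ]

    # With no filters supplied, the original list is returned unchanged.
    return results


# Invented sample records for demonstration only.
solutions = [
    {"title": "Two Sum", "difficulty": "Easy",
     "topics": "Array, Hash Table", "companies": "Amazon, Google"},
    {"title": "LRU Cache", "difficulty": "Medium",
     "topics": "Hash Table, Linked List, Design", "companies": "Amazon, Meta"},
    {"title": "Median of Two Sorted Arrays", "difficulty": "Hard",
     "topics": "Array, Binary Search", "companies": "Google"},
]

medium_amazon = filter_by_metadata(
    solutions, companies=["amazon", "Meta"], difficulty="MEDIUM"
)
design_or_search = filter_by_metadata(solutions, topics=["design", "binary search"])
```

Note that substring matching means a short filter value can match inside a longer name; splitting the comma-separated field first would be the stricter alternative.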
_load_metadata
Internal method to load solution metadata from a pickle file.
Path to the pickled metadata file
List of Solution objects loaded from the file
Exception on file loading or unpickling errors
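A minimal sketch of this load-and-re-raise behavior, assuming the method logs the failure before propagating it (the logging call and the function name `load_metadata` are assumptions). Plain dictionaries stand in for Solution objects so the example is self-contained.

```python
import logging
import os
import pickle
import tempfile

logger = logging.getLogger("retriever_sketch")


def load_metadata(metadata_path):
    """Load a pickled list of records, re-raising any failure after logging it."""
    try:
        with open(metadata_path, "rb") as f:
            return pickle.load(f)
    except Exception:
        # Covers missing files as well as unpickling errors.
        logger.exception("Failed to load metadata from %s", metadata_path)
        raise


# Round trip through a temporary file with stand-in records.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "metadata.pkl")
    with open(path, "wb") as f:
        pickle.dump([{"title": "Two Sum", "difficulty": "Easy"}], f)
    loaded = load_metadata(path)

# A missing file surfaces as an exception to the caller.
try:
    load_metadata(os.path.join(tempfile.gettempdir(), "does_not_exist_12345.pkl"))
    missing_file_raised = False
except Exception:
    missing_file_raised = True
```

Since unpickling can execute arbitrary code, the metadata file should only ever come from a trusted source.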
Attributes
SentenceTransformer model for encoding text queries into embeddings
FAISS HNSW index for fast approximate nearest neighbor search
Complete list of Solution objects with metadata
Usage Examples
Performance Characteristics
Search Complexity
Time: O(log n) average
Space: O(k) for results
HNSW enables sub-linear search time
Filter Complexity
Time: O(n) linear scan
Space: O(m) filtered results
Iterates through all solutions
ef_search Parameter Tuning
| ef_search | Speed | Accuracy | Use Case |
|---|---|---|---|
| 16 | Fastest | Lower | Real-time autocomplete |
| 32 | Fast | Good | Default for most cases |
| 64 | Moderate | Better | High-quality results |
| 128 | Slower | Best | Maximum accuracy needed |
Error Handling
Integration Example
See Also
- RAGEngine - Uses retriever for context retrieval
- PromptTemplates - Formats retrieved solutions