Overview

Arcana provides three search modes, each optimized for different query types:

  • Semantic: vector similarity using embeddings. Best for conceptual queries.
  • Full-text: PostgreSQL text search. Best for exact terms and names.
  • Hybrid: combines both using Reciprocal Rank Fusion. Best overall quality.

Semantic Search (Vector Similarity)

How it works: Converts your query to an embedding, then finds chunks with the closest embeddings using cosine similarity.
# Default search mode
{:ok, results} = Arcana.search(
  "What is functional programming?",
  repo: MyApp.Repo,
  mode: :semantic,  # default, can be omitted
  limit: 10
)

Implementation Details

From lib/arcana/search.ex:228-246:
defp do_search(:semantic, query, params) do
  # 1. Embed the query
  case Embedder.embed(Arcana.Config.embedder(), query, intent: :query) do
    {:ok, query_embedding} ->
      # 2. Search vector store with cosine similarity
      results = VectorStore.search(
        params.collection, 
        query_embedding,
        limit: params.limit,
        threshold: params.threshold
      )
      
      {:ok, transform_results(results)}
      
    {:error, reason} ->
      {:error, {:embedding_failed, reason}}
  end
end
PostgreSQL query (from lib/arcana/vector_store/pgvector.ex:97-122):
SELECT 
  c.id,
  c.text,
  c.chunk_index,
  c.document_id,
  1 - (c.embedding <=> $1) AS score
FROM arcana_chunks c
JOIN arcana_documents d ON c.document_id = d.id
WHERE 1 - (c.embedding <=> $1) > $2  -- threshold filter
ORDER BY c.embedding <=> $1          -- closest vectors first
LIMIT $3
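In the query above, pgvector's `<=>` operator returns cosine distance, and `1 - (c.embedding <=> $1)` converts it into a similarity score in the 0-1 range for typical embeddings. A minimal Elixir sketch of the underlying arithmetic (illustrative only, not Arcana's code):

```elixir
# Cosine similarity between two embedding vectors: dot product over the
# product of the vector norms. This is what `1 - (a <=> b)` yields in pgvector.
defmodule Cosine do
  def similarity(a, b) do
    dot = a |> Enum.zip(b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
    norm = fn v -> :math.sqrt(Enum.sum(Enum.map(v, &(&1 * &1)))) end
    dot / (norm.(a) * norm.(b))
  end
end
```

A similarity of 1.0 means the vectors point in the same direction; 0.0 means they are orthogonal (unrelated in embedding space).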
Best for: Understanding, explanations, how-to queries
# These queries benefit from semantic understanding
queries = [
  "How does garbage collection work?",
  "Explain the actor model",
  "Benefits of immutability",
  "Difference between processes and threads"
]

Enum.each(queries, fn query ->
  {:ok, results} = Arcana.search(query, 
    repo: MyApp.Repo,
    mode: :semantic
  )
end)
Why it works: Semantic search understands that “actor model” relates to “concurrent processes”, “message passing”, and “supervision trees” even if those exact words aren’t in the query.
Best for: Queries with common synonyms
# Semantic search matches synonyms automatically
{:ok, results} = Arcana.search(
  "ML algorithms",  # query uses "ML"
  repo: MyApp.Repo
)

# Returns chunks containing:
# ✅ "machine learning models"
# ✅ "neural networks"
# ✅ "deep learning approaches"
# Even though none contain "ML algorithms" exactly

Limitations

Semantic search may miss:
  • Exact technical terms (“PostgreSQL”, “OAuth2.0”)
  • Proper names (“John Smith”, “Arcana”)
  • Code symbols (function names, variables)
  • Acronyms that aren’t common (“HNSW”, “RRF”)
Use hybrid search for these cases.

Full-Text Search (PostgreSQL)

How it works: Uses PostgreSQL’s built-in tsvector and tsquery for keyword matching, with linguistic features like stemming and stop-word removal.
{:ok, results} = Arcana.search(
  "PostgreSQL pgvector",
  repo: MyApp.Repo,
  mode: :fulltext,
  limit: 10
)

Implementation Details

From lib/arcana/search.ex:249-261:
defp do_search(:fulltext, query, params) do
  # No embedding needed - direct text search
  results = VectorStore.search_text(
    params.collection,
    query,
    limit: params.limit,
    source_id: params.source_id,
    repo: params.repo
  )
  
  {:ok, transform_results(results)}
end
PostgreSQL query (from lib/arcana/vector_store/pgvector.ex:139-181):
SELECT 
  c.id,
  c.text,
  c.chunk_index,
  c.document_id,
  ts_rank(to_tsvector('english', c.text), 
          plainto_tsquery('english', $1)) AS score
FROM arcana_chunks c
JOIN arcana_documents d ON c.document_id = d.id
WHERE to_tsvector('english', c.text) @@ 
      plainto_tsquery('english', $1)
ORDER BY score DESC
LIMIT $2

PostgreSQL Text Search Features

Automatically matches word variations:
-- Query: "running" is stemmed to "run", so these rows all match:
SELECT to_tsvector('english', 'run')     @@ plainto_tsquery('english', 'running');  -- true
SELECT to_tsvector('english', 'running') @@ plainto_tsquery('english', 'running');  -- true
SELECT to_tsvector('english', 'runs')    @@ plainto_tsquery('english', 'running');  -- true
“run”, “running”, and “runs” all reduce to the same stem: “run”. (Derived forms such as “runner” stem differently and do not match.)
Best for: Technical terms, product names, acronyms
# These need exact matching
queries = [
  "PostgreSQL 16",
  "OAuth2.0 flow",
  "HNSW index",
  "Arcana.search/2"
]

Enum.each(queries, fn query ->
  {:ok, results} = Arcana.search(query,
    repo: MyApp.Repo,
    mode: :fulltext
  )
end)
Best for: People, companies, products
{:ok, results} = Arcana.search(
  "José Valim",  # Person's name
  repo: MyApp.Repo,
  mode: :fulltext
)

# Semantic search might match "Elixir creator" or "language author"
# Full-text search finds the exact name
Best for: Function names, module paths
{:ok, results} = Arcana.search(
  "Arcana.Embedder.Local.embed/2",
  repo: MyApp.Repo,
  mode: :fulltext
)

# Finds exact function documentation

Limitations

Full-text search misses:
  • Semantic relationships (“car” won’t match “automobile”)
  • Conceptual queries (“how does X work?”)
  • Paraphrased content
Use hybrid search for best results.

Hybrid Search (Reciprocal Rank Fusion)

How it works: Runs both semantic and full-text searches, then combines results using Reciprocal Rank Fusion (RRF) for better overall quality.
{:ok, results} = Arcana.search(
  "PostgreSQL pgvector setup",
  repo: MyApp.Repo,
  mode: :hybrid,
  limit: 10
)

Two Implementation Approaches

Arcana uses different strategies depending on the vector store backend:
Pgvector (recommended): computes both semantic and full-text scores in a single database query. From lib/arcana/vector_store/pgvector.ex:206-317:
WITH base_scores AS (
  -- Compute both semantic and fulltext scores
  SELECT
    c.id, c.text, c.chunk_index, c.document_id,
    1 - (c.embedding <=> $1) AS semantic_score,
    COALESCE(
      ts_rank(to_tsvector('english', c.text), 
              plainto_tsquery('english', $2)), 
      0
    ) AS fulltext_score
  FROM arcana_chunks c
  JOIN arcana_documents d ON c.document_id = d.id
),
score_bounds AS (
  -- Normalize fulltext scores to 0-1 range
  SELECT MIN(fulltext_score) AS min_ft,
         MAX(fulltext_score) AS max_ft
  FROM base_scores
),
normalized AS (
  SELECT bs.*,
    CASE
      WHEN sb.max_ft = sb.min_ft THEN 0
      ELSE (bs.fulltext_score - sb.min_ft) / 
           (sb.max_ft - sb.min_ft)
    END AS fulltext_normalized
  FROM base_scores bs, score_bounds sb
)
SELECT *,
  ($5 * semantic_score + $6 * fulltext_normalized) AS hybrid_score
FROM normalized
WHERE ($5 * semantic_score + $6 * fulltext_normalized) > $7
ORDER BY hybrid_score DESC
LIMIT $8
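The CTE pipeline above min-max normalizes the full-text scores into the 0-1 range (so they are comparable with cosine similarity) before combining them with the configured weights. The same arithmetic, sketched as a toy Elixir mirror of the query (illustrative only; the weights and tuple shape here are assumptions, not Arcana's API):

```elixir
# Mirror of the SQL above: min-max normalize fulltext scores, then weight-combine.
# `rows` are {id, semantic_score, fulltext_score} tuples.
defmodule HybridScore do
  def combine(rows, semantic_weight \\ 0.7, fulltext_weight \\ 0.3) do
    {min_ft, max_ft} =
      rows |> Enum.map(fn {_id, _sem, ft} -> ft end) |> Enum.min_max()

    rows
    |> Enum.map(fn {id, sem, ft} ->
      # Same guard as the CASE expression in the SQL: avoid division by zero
      norm = if max_ft == min_ft, do: 0, else: (ft - min_ft) / (max_ft - min_ft)
      {id, semantic_weight * sem + fulltext_weight * norm}
    end)
    |> Enum.sort_by(fn {_id, score} -> -score end)
  end
end
```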
Configuration:
{:ok, results} = Arcana.search(
  "query",
  repo: MyApp.Repo,
  mode: :hybrid,
  semantic_weight: 0.7,   # 70% semantic
  fulltext_weight: 0.3    # 30% fulltext
)

Why RRF Works

Reciprocal Rank Fusion combines rankings from multiple sources:
Query: "PostgreSQL pgvector setup"

Semantic results (by similarity):      Full-text results (by ts_rank):
1. "Setup pgvector extension"   0.89  1. "PostgreSQL pgvector install"  0.95
2. "Installing vector search"   0.82  2. "Setup pgvector extension"     0.88
3. "PostgreSQL configuration"   0.75  3. "Vector database setup"        0.71
4. "Database setup guide"       0.68  4. "PostgreSQL best practices"    0.65

RRF scores (k=60):
- "Setup pgvector extension": 1/(60+1) + 1/(60+2) = 0.0325  ✅ #1
- "PostgreSQL pgvector install": 0 + 1/(60+1) = 0.0164     #2
- "Installing vector search": 1/(60+2) + 0 = 0.0161        #3
- "PostgreSQL configuration": 1/(60+3) + 0 = 0.0159        #4
- "Vector database setup": 0 + 1/(60+3) = 0.0159          #5
Benefits:
  • Items ranking well in both lists get boosted
  • Reduces impact of outliers from either method
  • No need to normalize scores (rank-based)
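The fusion in the example above can be reproduced in a few lines of Elixir. This is a minimal sketch of textbook RRF with k = 60, not Arcana's internal implementation:

```elixir
# Reciprocal Rank Fusion: each ranked list contributes 1/(k + rank) per
# document; documents appearing in multiple lists accumulate score.
defmodule RRF do
  def fuse(rankings, k \\ 60) do
    rankings
    |> Enum.flat_map(fn ranked ->
      ranked
      |> Enum.with_index(1)
      |> Enum.map(fn {doc, rank} -> {doc, 1 / (k + rank)} end)
    end)
    |> Enum.group_by(&elem(&1, 0), &elem(&1, 1))
    |> Enum.map(fn {doc, scores} -> {doc, Enum.sum(scores)} end)
    |> Enum.sort_by(fn {_doc, score} -> -score end)
  end
end
```

Feeding in the two ranked lists from the example yields "Setup pgvector extension" on top, because it is the only document ranked highly by both methods.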
Best for: Queries with both concepts and specific terms
# These benefit from both semantic and keyword matching
queries = [
  "PostgreSQL pgvector performance tuning",
  # semantic: "performance tuning"
  # fulltext: "PostgreSQL pgvector"
  
  "How to configure OAuth2.0 authentication?",
  # semantic: "how to configure", "authentication"
  # fulltext: "OAuth2.0"
  
  "Elixir GenServer process lifecycle",
  # semantic: "process lifecycle"
  # fulltext: "Elixir GenServer"
]

Enum.each(queries, fn query ->
  {:ok, results} = Arcana.search(query,
    repo: MyApp.Repo,
    mode: :hybrid
  )
end)
Best for: User-generated queries where you can’t predict the type
# Production search endpoint
def search(conn, %{"q" => query}) do
  # Don't know if query is conceptual or keyword-based
  # Use hybrid for best overall results
  {:ok, results} = Arcana.search(query,
    repo: MyApp.Repo,
    mode: :hybrid
  )
  
  json(conn, %{results: results})
end
Best for: Critical applications where quality matters most
# Legal document search - can't miss relevant results
{:ok, results} = Arcana.search(
  "contract termination clause",
  repo: MyApp.Repo,
  mode: :hybrid,
  limit: 20,
  threshold: 0.3  # Lower threshold for recall
)

Adjusting Hybrid Weights

Note: weight tuning is only available with the pgvector backend; the RRF strategy uses fixed rank-based scoring.
# Favor semantic understanding (conceptual queries)
{:ok, results} = Arcana.search(
  "explain machine learning",
  repo: MyApp.Repo,
  mode: :hybrid,
  semantic_weight: 0.8,  # 80% semantic
  fulltext_weight: 0.2   # 20% fulltext
)

# Favor keyword matching (technical queries)
{:ok, results} = Arcana.search(
  "Arcana.Embedder.Local.embed/2",
  repo: MyApp.Repo,
  mode: :hybrid,
  semantic_weight: 0.3,  # 30% semantic
  fulltext_weight: 0.7   # 70% fulltext
)

# Balanced (default)
{:ok, results} = Arcana.search(
  "query",
  repo: MyApp.Repo,
  mode: :hybrid,
  semantic_weight: 0.5,
  fulltext_weight: 0.5
)

Performance Characteristics

Benchmarks (10K chunks, pgvector with IVFFlat index):
Mode               Latency (p50)  Latency (p99)  Notes
Semantic           ~15ms          ~35ms          Single index scan
Full-text          ~8ms           ~20ms          GIN index lookup
Hybrid (pgvector)  ~25ms          ~50ms          Single query, both indexes
Hybrid (RRF)       ~30ms          ~60ms          Two separate queries
Key factors:
  • Vector dimensions (384 faster than 1536)
  • Index type (IVFFlat vs HNSW)
  • Result limit (higher = slower)

Real-World Examples

Choosing the Right Mode

1. Analyze Query Patterns

Look at your typical queries:
# Conceptual → Semantic
"How does X work?"
"What is the difference between X and Y?"
"Benefits of X"

# Exact terms → Full-text
"Arcana.search/2"
"PostgreSQL 16"
"OAuth2.0"

# Mixed → Hybrid
"How to configure OAuth2.0?"
"PostgreSQL performance tuning"
"Elixir GenServer examples"
2. Measure with Evaluation

Use Arcana’s built-in evaluation tools:
# Create test cases
test_cases = [
  %{query: "What is OTP?", expected_chunk_ids: [...]},
  %{query: "GenServer.call/3", expected_chunk_ids: [...]},
  # ... more cases
]

# Test each mode
[:semantic, :fulltext, :hybrid]
|> Enum.each(fn mode ->
  metrics = Arcana.Evaluation.run(test_cases, mode: mode)
  IO.inspect({mode, metrics})
end)

# Compare MRR, Recall, Precision
See Evaluation Guide for details.
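MRR, one of the metrics compared above, is simply the mean of 1/rank of the first relevant result per query. A standalone sketch for a quick sanity check (the module and input shape here are illustrative, not Arcana's evaluation API):

```elixir
# Mean Reciprocal Rank: for each query, take 1/rank of the first relevant
# result (0 if none was retrieved), then average across all queries.
defmodule MRR do
  # runs: list of {ranked_result_ids, expected_ids} tuples
  def score(runs) do
    reciprocal_ranks =
      Enum.map(runs, fn {result_ids, expected_ids} ->
        case Enum.find_index(result_ids, &(&1 in expected_ids)) do
          nil -> 0.0
          idx -> 1.0 / (idx + 1)
        end
      end)

    Enum.sum(reciprocal_ranks) / length(reciprocal_ranks)
  end
end
```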
3. Consider Performance

High traffic? → Use full-text or semantic (lowest latency)
Quality critical? → Use hybrid (best results)
Exact matching needed? → Use full-text
4. Start with Hybrid

When in doubt, use hybrid mode:
# Production default
config :my_app,
  search_mode: :hybrid,
  semantic_weight: 0.6,
  fulltext_weight: 0.4

# Then optimize per use case
def search(query, opts) do
  mode = Keyword.get(opts, :mode, :hybrid)
  Arcana.search(query, mode: mode, repo: MyApp.Repo)
end

Best Practices

Use Thresholds

Filter low-quality results:
{:ok, results} = Arcana.search(query,
  repo: MyApp.Repo,
  threshold: 0.7  # Only scores above 0.7
)

Limit Results

Return only what you need:
# Simple questions
limit: 3

# Complex questions
limit: 10

# Avoid
limit: 50  # Too much context

Test Both Indexes

Ensure indexes are created:
-- Vector index (pgvector)
CREATE INDEX ON arcana_chunks 
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

-- Fulltext index
CREATE INDEX ON arcana_chunks 
USING gin (to_tsvector('english', text));

Monitor Performance

Use telemetry to track search latency:
:telemetry.attach(
  "search-metrics",
  [:arcana, :search, :stop],
  &handle_search_metrics/4,
  nil
)

# Example handler; assumes the standard telemetry span :duration measurement
def handle_search_metrics(_event, %{duration: duration}, _meta, _config) do
  IO.puts("search took #{System.convert_time_unit(duration, :native, :millisecond)}ms")
end

Next Steps

RAG Pipeline

Understand the complete RAG workflow

Embeddings

Learn how semantic search works with vector embeddings

Evaluation

Measure and compare search quality metrics

Re-ranking

Improve search results with re-ranking
