Arcana emits telemetry events for all operations, giving you visibility into performance, errors, and usage patterns.

Quick Start

For immediate visibility, attach the built-in logger:
lib/my_app/application.ex
def start(_type, _args) do
  # Attach telemetry logger before starting supervision tree
  Arcana.Telemetry.Logger.attach()

  children = [
    MyApp.Repo,
    Arcana.TaskSupervisor,
    Arcana.Embedder.Local
  ]

  Supervisor.start_link(children, strategy: :one_for_one)
end
This logs all Arcana operations with timing:
[info] [Arcana] search completed in 42ms (15 results)
[info] [Arcana] llm.complete completed in 1.23s [openai:gpt-4o-mini] ok (156 chars)
[info] [Arcana] agent.gate completed in 180ms (skip_retrieval: false)
[info] [Arcana] agent.expand completed in 2.15s ("machine learning ML models...")
[info] [Arcana] agent.search completed in 156ms (25 chunks)
[info] [Arcana] agent.rerank completed in 312ms (10/25 kept)
[info] [Arcana] agent.answer completed in 3.25s
[info] [Arcana] ask completed in 6.12s
With graph: true enabled:
[info] [Arcana] graph.ner completed in 45ms (3 entities)
[info] [Arcana] graph.relationship_extraction completed in 1.2s (2 relationships)
[info] [Arcana] graph.build completed in 1.5s (5 entities, 3 relationships)
[info] [Arcana] graph.search completed in 28ms (8 graph results, 10 combined)

Logger Options

Arcana.Telemetry.Logger.attach(
  level: :debug,           # Log level (default: :info)
  handler_id: "my-logger"  # Custom handler ID
)

# To stop logging
Arcana.Telemetry.Logger.detach()

Event Reference

All events use :telemetry.span/3, which emits :start, :stop, and :exception variants automatically.
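As a sketch of what that contract looks like in practice, the handler below subscribes to all three variants of the search span. The module and handler ID are illustrative; the event names come from the tables that follow, and the handler shape is the standard :telemetry contract:

```elixir
defmodule MyApp.SearchSpanHandler do
  # Subscribe one function to all three span variants for search.
  def attach do
    :telemetry.attach_many(
      "search-span-demo",
      [
        [:arcana, :search, :start],
        [:arcana, :search, :stop],
        [:arcana, :search, :exception]
      ],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  # :start carries system_time; :stop and :exception carry duration (native units)
  def handle_event([:arcana, :search, :start], _measurements, _metadata, _config), do: :ok

  def handle_event([:arcana, :search, :stop], %{duration: duration}, _metadata, _config) do
    ms = System.convert_time_unit(duration, :native, :millisecond)
    IO.puts("search took #{ms}ms")
  end

  def handle_event([:arcana, :search, :exception], _measurements, %{kind: kind}, _config) do
    IO.puts("search failed: #{inspect(kind)}")
  end
end
```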

Core Events

Event | Measurements | Metadata
[:arcana, :ingest, :*] | system_time, duration | text, repo, collection, document, chunk_count
[:arcana, :search, :*] | system_time, duration | query, repo, mode, limit, results, result_count
[:arcana, :ask, :*] | system_time, duration | question, repo, answer, context_count
[:arcana, :embed, :*] | system_time, duration | text, dimensions
[:arcana, :llm, :complete, :*] | system_time, duration | model, prompt_length, success, response_length, error

Agent Pipeline Events

Each step in the Agent pipeline emits its own events:
Event | Metadata
[:arcana, :agent, :gate, :*] | question, skip_retrieval
[:arcana, :agent, :rewrite, :*] | question, rewritten_query
[:arcana, :agent, :select, :*] | selected (collections)
[:arcana, :agent, :expand, :*] | question, expanded_query
[:arcana, :agent, :decompose, :*] | question, sub_question_count
[:arcana, :agent, :search, :*] | question, total_chunks
[:arcana, :agent, :reason, :*] | question, iterations
[:arcana, :agent, :rerank, :*] | question, chunks_before, chunks_after
[:arcana, :agent, :answer, :*] | question, context_chunk_count
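Because every step follows the same naming scheme, one handler can log a timing breakdown for the whole pipeline. A sketch (module and handler ID are illustrative):

```elixir
defmodule MyApp.AgentTimings do
  require Logger

  @steps ~w(gate rewrite select expand decompose search reason rerank answer)a

  # Attach one handler to the :stop event of every pipeline step.
  def attach do
    events = for step <- @steps, do: [:arcana, :agent, step, :stop]
    :telemetry.attach_many("agent-timings", events, &__MODULE__.handle_event/4, nil)
  end

  def handle_event([:arcana, :agent, step, :stop], %{duration: duration}, _metadata, _config) do
    ms = System.convert_time_unit(duration, :native, :millisecond)
    Logger.info("agent.#{step}: #{ms}ms")
  end
end
```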

GraphRAG Events

When using graph: true, these events track knowledge graph operations:
Event | Metadata
[:arcana, :graph, :build, :*] | chunk_count, collection, entity_count, relationship_count
[:arcana, :graph, :search, :*] | query, entity_count, graph_result_count, combined_count
[:arcana, :graph, :ner, :*] | text, entity_count
[:arcana, :graph, :relationship_extraction, :*] | text, relationship_count
[:arcana, :graph, :community_detection, :*] | entity_count, community_count
[:arcana, :graph, :community_summary, :*] | entity_count, summary_length

VectorStore and GraphStore Events

Storage layer events for vector and graph operations:
Event | Metadata
[:arcana, :vector_store, :store, :*] | collection, id, backend
[:arcana, :vector_store, :search, :*] | collection, limit, backend, result_count
[:arcana, :vector_store, :search_text, :*] | collection, query, limit, backend, result_count
[:arcana, :vector_store, :delete, :*] | collection, id, backend
[:arcana, :vector_store, :clear, :*] | collection, backend
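The backend metadata key makes it easy to compare storage backends. A handler clause like this one (a sketch, meant to sit in a module like the Custom Handlers example below) logs per-backend search latency:

```elixir
# Log vector search latency tagged by backend, using the metadata
# keys from the table above. Assumes the surrounding module has
# `require Logger`.
def handle_event([:arcana, :vector_store, :search, :stop], measurements, metadata, _config) do
  ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

  Logger.info(
    "vector_store.search [#{inspect(metadata.backend)}] #{ms}ms (#{metadata.result_count} results)"
  )
end
```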

Exception Events

All :exception events include:
  • kind - The exception type (:error, :exit, :throw)
  • reason - The exception or error term
  • stacktrace - Full stacktrace

Custom Handlers

For more control, attach your own handlers:
lib/my_app/arcana_metrics.ex
defmodule MyApp.ArcanaMetrics do
  require Logger

  def setup do
    events = [
      [:arcana, :ingest, :stop],
      [:arcana, :search, :stop],
      [:arcana, :ask, :stop],
      [:arcana, :llm, :complete, :stop],
      # Agent pipeline
      [:arcana, :agent, :rerank, :stop],
      [:arcana, :agent, :answer, :stop],
      # GraphRAG
      [:arcana, :graph, :build, :stop],
      [:arcana, :graph, :search, :stop]
    ]

    :telemetry.attach_many("my-arcana-metrics", events, &handle_event/4, nil)
  end

  def handle_event([:arcana, :search, :stop], measurements, metadata, _config) do
    duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

    Logger.info("Search: #{metadata.result_count} results in #{duration_ms}ms",
      query: metadata.query,
      mode: metadata.mode
    )
  end

  def handle_event([:arcana, :llm, :complete, :stop], measurements, metadata, _config) do
    duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

    if metadata.success do
      Logger.debug("LLM call to #{metadata.model}: #{duration_ms}ms")
    else
      Logger.warning("LLM call failed: #{metadata.error}")
    end
  end

  # ... handle other events
end
Call MyApp.ArcanaMetrics.setup() in your application startup.
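For example, next to the logger attachment from the Quick Start (a sketch of the same application module):

```elixir
def start(_type, _args) do
  # Attach custom handlers before the supervision tree starts
  MyApp.ArcanaMetrics.setup()

  children = [
    MyApp.Repo,
    Arcana.TaskSupervisor,
    Arcana.Embedder.Local
  ]

  Supervisor.start_link(children, strategy: :one_for_one)
end
```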

Phoenix LiveDashboard Integration

Add Arcana metrics to your LiveDashboard:
lib/my_app/telemetry.ex
defmodule MyApp.Telemetry do
  import Telemetry.Metrics

  def metrics do
    [
      # Core operations
      summary("arcana.search.stop.duration",
        unit: {:native, :millisecond},
        tags: [:mode]
      ),
      # result_count is metadata, not a measurement, so extract it explicitly
      sum("arcana.search.stop.result_count",
        measurement: fn _measurements, metadata -> metadata.result_count end
      ),
      
      # LLM calls
      summary("arcana.llm.complete.stop.duration",
        unit: {:native, :millisecond},
        tags: [:model]
      ),
      
      # Agent pipeline
      summary("arcana.agent.rerank.stop.duration",
        unit: {:native, :millisecond}
      ),
      # chunks_after lives in metadata, not measurements
      last_value("arcana.agent.rerank.stop.chunks_after",
        measurement: fn _measurements, metadata -> metadata.chunks_after end
      ),
      
      # GraphRAG
      summary("arcana.graph.build.stop.duration",
        unit: {:native, :millisecond}
      ),
      last_value("arcana.graph.build.stop.entity_count",
        measurement: fn _measurements, metadata -> metadata.entity_count end
      )
    ]
  end
end
Configure LiveDashboard:
lib/my_app_web/router.ex
live_dashboard "/dashboard",
  metrics: MyApp.Telemetry

Prometheus Integration

For production monitoring with Prometheus, use prom_ex:
mix.exs
{:prom_ex, "~> 1.9"}
lib/my_app/prom_ex/arcana_plugin.ex
defmodule MyApp.PromEx.ArcanaPlugin do
  use PromEx.Plugin

  @impl true
  def event_metrics(_opts) do
    Event.build(
      :arcana_event_metrics,
      [
        distribution(
          [:arcana, :search, :duration, :milliseconds],
          event_name: [:arcana, :search, :stop],
          measurement: :duration,
          unit: {:native, :millisecond},
          tags: [:mode],
          tag_values: fn metadata -> %{mode: metadata[:mode] || :semantic} end
        ),
        distribution(
          [:arcana, :llm, :complete, :duration, :milliseconds],
          event_name: [:arcana, :llm, :complete, :stop],
          measurement: :duration,
          unit: {:native, :millisecond},
          tags: [:model, :success],
          tag_values: fn metadata ->
            %{model: metadata[:model] || "unknown", success: metadata[:success]}
          end
        ),
        # A counter ignores its measurement, so use sum to total chunks
        sum(
          [:arcana, :ingest, :chunks, :total],
          event_name: [:arcana, :ingest, :stop],
          measurement: fn _measurements, metadata -> metadata[:chunk_count] || 0 end
        )
      ]
    )
  end
end
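Then register the plugin in your PromEx module so its metrics are scraped (a sketch following the standard prom_ex setup; MyApp.PromEx is an assumed module name):

```elixir
defmodule MyApp.PromEx do
  use PromEx, otp_app: :my_app

  @impl true
  def plugins do
    [
      # Built-in BEAM VM metrics
      PromEx.Plugins.Beam,
      # The Arcana plugin defined above
      MyApp.PromEx.ArcanaPlugin
    ]
  end
end
```

Remember to add MyApp.PromEx to your supervision tree, as prom_ex's installation docs describe.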

Debugging Performance Issues

Identify Slow Operations

The built-in logger makes it easy to spot bottlenecks:
[info] [Arcana] embed completed in 45ms (384 dims)
[info] [Arcana] search completed in 12ms (10 results)
[info] [Arcana] llm.complete completed in 3.2s [openai:gpt-4o] ok (1892 chars)
[info] [Arcana] ask completed in 3.3s
In this example, the LLM call dominates total time (3.2s of 3.3s).

Track Agent Pipeline Steps

For agentic RAG, each pipeline step is instrumented:
[info] [Arcana] agent.rerank completed in 890ms (8/25 kept)
[info] [Arcana] agent.reason completed in 850ms (1 iteration)
If reranking takes too long, consider:
  • Reducing chunks before reranking (lower search limit)
  • Using a faster reranking model
  • Implementing a custom reranker
If reason/2 is taking too long:
  • Lower max_iterations (default: 2)
  • Improve initial search quality with query expansion

Monitor LLM Costs

Track prompt sizes to estimate API costs:
def handle_event([:arcana, :llm, :complete, :stop], measurements, metadata, _config) do
  # Rough token estimate (4 chars per token)
  prompt_tokens = div(metadata.prompt_length || 0, 4)
  response_tokens = div(metadata.response_length || 0, 4)

  Logger.info("LLM usage",
    model: metadata.model,
    prompt_tokens: prompt_tokens,
    response_tokens: response_tokens,
    success: metadata.success
  )
end
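The same character-based estimate can feed a rough cost calculation. A sketch; the module name is hypothetical and the per-million-token rates are illustrative placeholders, not real prices:

```elixir
defmodule MyApp.LLMCost do
  # {input_usd_per_million_tokens, output_usd_per_million_tokens}
  # Placeholder rates; substitute your provider's actual pricing.
  @rates %{"gpt-4o-mini" => {0.15, 0.60}}
  @default_rates {1.0, 2.0}

  # Estimate cost in USD from character counts, using the rough
  # 4-characters-per-token heuristic above.
  def estimate(model, prompt_chars, response_chars) do
    {in_rate, out_rate} = Map.get(@rates, model, @default_rates)
    prompt_tokens = div(prompt_chars, 4)
    response_tokens = div(response_chars, 4)
    (prompt_tokens * in_rate + response_tokens * out_rate) / 1_000_000
  end
end
```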

Error Tracking

Handle exceptions to send to your error tracking service:
def setup do
  exception_events = [
    [:arcana, :ingest, :exception],
    [:arcana, :search, :exception],
    [:arcana, :ask, :exception],
    [:arcana, :llm, :complete, :exception]
  ]

  :telemetry.attach_many("arcana-errors", exception_events, &handle_exception/4, nil)
end

def handle_exception(event, _measurements, metadata, _config) do
  # Send to Sentry, Honeybadger, etc.
  Sentry.capture_message("Arcana error",
    extra: %{
      event: inspect(event),
      kind: metadata.kind,
      reason: inspect(metadata.reason)
    }
  )
end

Best Practices

Start with built-in logger

Zero-config logging helps you understand what’s happening

Focus on LLM latency

LLM calls are usually the bottleneck; track them closely

Monitor reranking

Watch the kept/original ratio when using Agent.rerank/2

Tag by collection

Identify slow document sets by tagging metrics with collection names
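For example, the vector store events carry a collection metadata key, so a Telemetry.Metrics entry like this one (a sketch to drop into the metrics/0 list from the LiveDashboard section) breaks search latency down per collection:

```elixir
# Per-collection vector search latency; :collection is read from event metadata
summary("arcana.vector_store.search.stop.duration",
  unit: {:native, :millisecond},
  tags: [:collection]
)
```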

Set up alerts

Alert on LLM failures and unusually slow operations

Log in production

Keep at least :info level logging for debugging issues

Next Steps

Evaluation

Measure retrieval quality alongside performance

Dashboard

View operations and results in the web UI

Agentic RAG

Monitor complex pipeline steps

GraphRAG

Track entity extraction and graph building
