The Semantic Conventions define the keys and values which describe commonly observed concepts, protocols, and operations used by applications. These conventions are used to populate the attributes of spans and span events.

Span Kinds

The openinference.span.kind attribute is required for all OpenInference spans and identifies the type of operation being traced. The span kind provides a hint to the tracing backend as to how the trace should be assembled. Valid values include:
| Span Kind Value | Description |
|---|---|
| LLM | A span that represents a call to a Large Language Model (LLM). For example, an LLM span could be used to represent a call to OpenAI or Llama for chat completions or text generation. |
| EMBEDDING | A span that represents a call to an LLM or embedding service for generating embeddings. For example, an Embedding span could be used to represent a call to OpenAI to get an ada embedding for retrieval. |
| CHAIN | A span that represents a starting point or a link between different LLM application steps. For example, a Chain span could be used to represent the beginning of a request to an LLM application or the glue code that passes context from a retriever to an LLM call. |
| RETRIEVER | A span that represents a data retrieval step. For example, a Retriever span could be used to represent a call to a vector store or a database to fetch documents or information. |
| RERANKER | A span that represents the reranking of a set of input documents. For example, a cross-encoder may be used to compute the input documents' relevance scores with respect to a user query, and the top K documents with the highest scores are then returned by the Reranker. |
| TOOL | A span that represents a call to an external tool such as a calculator, weather API, or any function execution that is invoked by an LLM or agent. |
| AGENT | A span that encompasses calls to LLMs and Tools. An agent describes a reasoning block that acts on tools using the guidance of an LLM. |
| GUARDRAIL | A span that represents a call to a component that protects against undesirable inputs or outputs, such as jailbreak prompts, by modifying or rejecting an LLM's input or response. For example, a Guardrail span could represent checking an LLM's output for inappropriate language via a custom or external guardrail library and amending the response to remove it. |
| EVALUATOR | A span that represents a call to a function or process that evaluates the language model's outputs, such as assessing the relevance, correctness, or helpfulness of its answers. |
| PROMPT | A span that represents the rendering of a prompt template. For example, a Prompt span could be used to represent the rendering of a template with variables. |

Reserved Attributes

The following attributes are reserved and MUST be supported by all OpenInference Tracing SDKs:
List-valued attributes (marked with † below) are flattened using zero-based indexed prefixes (e.g., llm.input_messages.0.message.role).
| Attribute | Type | Example | Description |
|---|---|---|---|
| document.content | String | "This is a sample document content." | The content of a retrieved document |
| document.id | String/Integer | "1234" or 1 | Unique identifier for a document |
| document.metadata | JSON String | "{'author': 'John Doe', 'date': '2023-09-09'}" | Metadata associated with a document |
| document.score | Float | 0.98 | Score representing the relevance of a document |
| embedding.embeddings | List of objects† | [{"embedding.vector": [...], "embedding.text": "hello"}] | List of embedding objects including text and vector data |
| embedding.invocation_parameters | JSON String | "{\"model\": \"text-embedding-3-small\", \"encoding_format\": \"float\"}" | Parameters used during the invocation of an embedding model or API (excluding input) |
| embedding.model_name | String | "BERT-base" | Name of the embedding model used |
| embedding.text | String | "hello world" | The text represented in the embedding |
| embedding.vector | List of floats | [0.123, 0.456, ...] | The embedding vector consisting of a list of floats |
| exception.escaped | Boolean | true | Indicator if the exception has escaped the span's scope |
| exception.message | String | "Null value encountered" | Detailed message describing the exception |
| exception.stacktrace | String | "at app.main(app.java:16)" | The stack trace of the exception |
| exception.type | String | "NullPointerException" | The type of exception that was thrown |
| image.url | String | "https://sample-link-to-image.jpg" | The link to the image or its base64 encoding |
| input.mime_type | String | "text/plain" or "application/json" | MIME type representing the format of input.value |
| input.value | String | "{'query': 'What is the weather today?'}" | The input value to an operation |
| llm.prompts | List of objects† | [{"prompt.text": "def fib(n):..."}] | Prompts provided to a completions API |
| llm.choices | List of objects† | [{"completion.text": " + fib(n-3)..."}] | Text choices returned from a completions API |
| llm.function_call | JSON String | "{function_name: 'add', args: [1, 2]}" | Object recording details of a function call in models or APIs |
| llm.input_messages | List of objects† | [{"message.role": "user", "message.content": "hello"}] | List of messages sent to the LLM in a chat API request |
| llm.invocation_parameters | JSON String | "{model_name: 'gpt-3', temperature: 0.7}" | Parameters used during the invocation of an LLM or API |
| llm.provider | String | openai, azure | The hosting provider of the LLM (e.g., azure) |
| llm.system | String | anthropic, openai | The AI product as identified by the client or server instrumentation |
| llm.model_name | String | "gpt-3.5-turbo" | The name of the language model being utilized |
| llm.output_messages | List of objects† | [{"message.role": "assistant", "message.content": "hello"}] | List of messages received from the LLM in a chat API response |
| llm.prompt_template.template | String | "Weather forecast for {city} on {date}" | Template used to generate prompts (e.g., using {variable} placeholder syntax) |
| llm.prompt_template.variables | JSON String | { context: "<context from retrieval>", subject: "math" } | JSON of key-value pairs applied to the prompt template |
| llm.prompt_template.version | String | "v1.0" | The version of the prompt template |
| llm.token_count.completion | Integer | 15 | The number of tokens in the completion |
| llm.token_count.completion_details.reasoning | Integer | 10 | The number of tokens used for model reasoning |
| llm.token_count.completion_details.audio | Integer | 10 | The number of audio tokens generated by the model |
| llm.token_count.prompt | Integer | 10 | The number of tokens in the prompt |
| llm.token_count.prompt_details.cache_read | Integer | 5 | The number of prompt tokens successfully retrieved from cache (cache hits) |
| llm.token_count.prompt_details.cache_write | Integer | 0 | The number of prompt tokens not found in cache that were written to cache (cache misses) |
| llm.token_count.prompt_details.audio | Integer | 10 | The number of audio tokens present in the prompt |
| llm.token_count.total | Integer | 20 | Total number of tokens, including prompt and completion |
| llm.cost.prompt | Float | 0.0021 | Total cost of all input tokens sent to the LLM in USD |
| llm.cost.completion | Float | 0.0045 | Total cost of all output tokens generated by the LLM in USD |
| llm.cost.total | Float | 0.0066 | Total cost of the LLM call in USD (prompt + completion) |
| llm.tools | List of objects† | [{"tool.json_schema": "{...}"}] | List of tools that are advertised to the LLM |
| message.content | String | "What's the weather today?" | The content of a message in a chat |
| message.contents | List of objects† | [{"message_content.type": "text", "message_content.text": "Hello"}, ...] | The message contents to the LLM, an array of message_content objects |
| message.function_call_arguments_json | JSON String | "{ 'x': 2 }" | The arguments to the function call in JSON |
| message.function_call_name | String | "multiply" or "subtract" | The name of the function being called |
| message.name | String | "multiply" | The name of the function or tool that produced a tool/function role message |
| message.tool_call_id | String | "call_62136355" | Tool call result identifier corresponding to tool_call.id |
| message.role | String | "user" or "system" | Role of the entity in a message (e.g., user, system) |
| message.tool_calls | List of objects† | [{"tool_call.function.name": "get_current_weather"}] | List of tool calls generated by the LLM |
| metadata | JSON String | "{'author': 'John Doe', 'date': '2023-09-09'}" | Metadata associated with a span |
| openinference.span.kind | String | "LLM", "EMBEDDING", "CHAIN", etc. | Required for all OpenInference spans; identifies the type of operation |
| output.mime_type | String | "text/plain" or "application/json" | MIME type representing the format of output.value |
| output.value | String | "Hello, World!" | The output value of an operation |
| reranker.input_documents | List of objects† | [{"document.id": "1", "document.score": 0.9}] | List of documents as input to the reranker |
| reranker.model_name | String | "cross-encoder/ms-marco-MiniLM-L-12-v2" | Model name of the reranker |
| reranker.output_documents | List of objects† | [{"document.id": "1", "document.score": 0.9}] | List of documents output by the reranker |
| reranker.query | String | "How to format timestamp?" | Query parameter of the reranker |
| reranker.top_k | Integer | 3 | Top K parameter of the reranker |
| retrieval.documents | List of objects† | [{"document.id": "1", "document.score": 0.9}] | List of retrieved documents |
| session.id | String | "26bcd3d2-cad2-443d-a23c-625e47f3324a" | Unique identifier for a session |
| tag.tags | List of strings | ["shopping", "travel"] | List of tags to give the span a category |
| tool.description | String | "An API to get weather data." | Description of the tool's purpose and functionality |
| tool.json_schema | JSON String | "{'type': 'function', 'function': {'name': 'get_weather'}}" | The JSON schema of a tool input |
| tool.name | String | "WeatherAPI" | The name of the tool being utilized |
| tool.id | String | "call_62136355" | The identifier for the result of the tool call |
| tool.parameters | JSON String | "{ 'a': 'int' }" | The parameters definition for invoking the tool |
| user.id | String | "9328ae73-7141-4f45-a044-8e06192aa465" | Unique identifier for a user |

Well-Known Values

LLM System

llm.system has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description |
|---|---|
| anthropic | Anthropic |
| openai | OpenAI |
| vertexai | Vertex AI |
| cohere | Cohere |
| mistralai | Mistral AI |
| xai | xAI (Grok) |
| deepseek | DeepSeek |
| amazon | Amazon Bedrock native |
| meta | Meta (Llama) |
| ai21 | AI21 Labs |

LLM Provider

llm.provider has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description |
|---|---|
| anthropic | Anthropic |
| openai | OpenAI |
| cohere | Cohere |
| mistralai | Mistral AI |
| azure | Azure |
| google | Google (Vertex) |
| aws | AWS Bedrock |
| xai | xAI |
| deepseek | DeepSeek |

Token Count Details

llm.token_count.prompt_details.cache_read and llm.token_count.prompt_details.cache_write provide granular token count information for cache operations, enabling detailed API usage tracking and cost analysis.
  • cache_read represents the number of prompt tokens successfully retrieved from cache (cache hits). For OpenAI, this corresponds to the usage.prompt_tokens_details.cached_tokens field in completion API responses. For Anthropic, when using a cache_control block, this maps to the cache_read_input_tokens field in Messages API responses.
  • cache_write represents the number of prompt tokens not found in cache (cache misses) that were subsequently written to cache. This metric is specific to Anthropic and corresponds to the cache_creation_input_tokens field in their Messages API responses.
The extended token count attributes follow the naming pattern:
  • llm.token_count.prompt_details.* for prompt-related token counts
  • llm.token_count.completion_details.* for completion-related token counts
These attributes enable:
  • Tracking of multimodal token usage (audio tokens)
  • Monitoring reasoning token consumption for models with chain-of-thought capabilities
  • Cache efficiency analysis for cost optimization
  • Detailed billing reconciliation
All token count attributes store integer token counts; cost attributes store floating-point values in USD.
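
As a sketch, the token count attributes above can be populated from an OpenAI-style usage payload. The shape of the `usage` dict here is illustrative and varies by provider and API version:

```python
# Illustrative usage payload in the OpenAI chat completions shape.
usage = {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25,
    "prompt_tokens_details": {"cached_tokens": 5},
    "completion_tokens_details": {"reasoning_tokens": 10},
}

# Map the payload onto the OpenInference token count attributes.
attributes = {
    "llm.token_count.prompt": usage["prompt_tokens"],
    "llm.token_count.completion": usage["completion_tokens"],
    "llm.token_count.total": usage["total_tokens"],
    "llm.token_count.prompt_details.cache_read": usage["prompt_tokens_details"]["cached_tokens"],
    "llm.token_count.completion_details.reasoning": usage["completion_tokens_details"]["reasoning_tokens"],
}
```

Each entry would then be set on the span, e.g. via span.set_attribute(key, value).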

System and Model Identification

The llm.system attribute identifies the AI product/vendor, while llm.model_name contains the specific model identifier:
  • llm.system should use well-known values when applicable (e.g., “openai”, “anthropic”, “cohere”)
  • llm.model_name should contain the actual model name returned by the API (e.g., “gpt-4-0613”, “claude-3-opus-20240229”)
  • The llm.provider attribute can be used to identify the hosting provider when different from the system (e.g., “azure” for Azure-hosted OpenAI)
For embedding operations (openinference.span.kind: "EMBEDDING"):
  • llm.system and llm.provider are not used
  • Use embedding.model_name to identify the embedding model (e.g., “text-embedding-3-small”, “text-embedding-ada-002”)
  • See the Embedding Spans specification for the rationale
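
As an illustration of the distinction, a chat completion served by an Azure-hosted OpenAI deployment might carry the following attributes (the model name here is an example value, not normative):

```python
# Illustrative attributes for an Azure-hosted OpenAI chat completion:
# llm.system names the AI product, llm.provider names the host, and
# llm.model_name is whatever model identifier the API returned.
attributes = {
    "openinference.span.kind": "LLM",
    "llm.system": "openai",
    "llm.provider": "azure",
    "llm.model_name": "gpt-4-0613",
}
```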

Attribute Naming Conventions

Indexed Attribute Prefixes

When dealing with lists of structured data, OpenInference uses indexed prefixes to create flattened attribute names. The general pattern is:
<prefix>.<index>.<suffix>
Where:
  • <prefix> is the base attribute name (e.g., llm.input_messages, llm.tools)
  • <index> is a zero-based integer index
  • <suffix> is the nested attribute path
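
For instance, the three components compose into a single flattened key:

```python
# Composing a flattened attribute name from prefix, index, and suffix.
prefix, index, suffix = "llm.input_messages", 0, "message.role"
key = f"{prefix}.{index}.{suffix}"
# key == "llm.input_messages.0.message.role"
```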

Common Flattened Attribute Patterns

LLM Input/Output Messages

  • llm.input_messages.<index>.message.role - Role of the message (e.g., “user”, “assistant”, “system”)
  • llm.input_messages.<index>.message.content - Text content of the message
  • llm.output_messages.<index>.message.role - Role of the output message
  • llm.output_messages.<index>.message.content - Text content of the output message

Completions API (Legacy Text Completion)

For the legacy completions API (non-chat):
  • llm.prompts.<index>.prompt.text - Input prompt(s) provided to the completions API
  • llm.choices.<index>.completion.text - Text choice(s) returned from the completions API

Message Content Arrays (Multimodal)

For messages containing multiple content items (text, images, audio):
  • llm.input_messages.<messageIndex>.message.contents.<contentIndex>.message_content.text - Text content item
  • llm.input_messages.<messageIndex>.message.contents.<contentIndex>.message_content.type - Content type (“text”, “image”, “audio”)
  • llm.input_messages.<messageIndex>.message.contents.<contentIndex>.message_content.image.image.url - Image URL or base64 data
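
A minimal sketch of these keys for a user message with one text item and one image item (content values and the image URL are placeholders):

```python
# Flattened attributes for a multimodal user message: content item 0
# is text, content item 1 is an image. Values are illustrative.
attributes = {
    "llm.input_messages.0.message.role": "user",
    "llm.input_messages.0.message.contents.0.message_content.type": "text",
    "llm.input_messages.0.message.contents.0.message_content.text": "What is in this image?",
    "llm.input_messages.0.message.contents.1.message_content.type": "image",
    "llm.input_messages.0.message.contents.1.message_content.image.image.url": "https://sample-link-to-image.jpg",
}
```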

Tool Calls in Output Messages

  • llm.output_messages.<messageIndex>.message.tool_calls.<toolCallIndex>.tool_call.id - Unique identifier for the tool call
  • llm.output_messages.<messageIndex>.message.tool_calls.<toolCallIndex>.tool_call.function.name - Name of the function being called
  • llm.output_messages.<messageIndex>.message.tool_calls.<toolCallIndex>.tool_call.function.arguments - JSON string of function arguments
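
As a sketch, a single tool call in the first output message flattens to the following attributes (the id and argument values are illustrative):

```python
import json

# Flattened attributes for an assistant message carrying one tool call:
# message index 0, tool call index 0. Arguments are a JSON string.
attributes = {
    "llm.output_messages.0.message.role": "assistant",
    "llm.output_messages.0.message.tool_calls.0.tool_call.id": "call_62136355",
    "llm.output_messages.0.message.tool_calls.0.tool_call.function.name": "get_current_weather",
    "llm.output_messages.0.message.tool_calls.0.tool_call.function.arguments": json.dumps({"city": "Paris"}),
}
```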

Available Tools

  • llm.tools.<index>.tool.json_schema - Complete JSON schema of the tool

Implementation Examples

messages = [
    {"message.role": "user", "message.content": "hello"},
    {"message.role": "assistant", "message.content": "hi"}
]

for i, obj in enumerate(messages):
    for key, value in obj.items():
        span.set_attribute(f"llm.input_messages.{i}.{key}", value)
If the objects are further nested, flattening should continue until the attribute values are either simple values (bool, str, bytes, int, float) or simple lists (List[bool], List[str], List[bytes], List[int], List[float]).
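
The recursion described above can be sketched as follows; the flatten helper is hypothetical, not part of any SDK:

```python
# Recursively flatten nested dicts and lists of dicts into
# dotted, zero-indexed attribute names, stopping at simple
# values and simple lists.
def flatten(prefix, value):
    if isinstance(value, dict):
        for key, nested in value.items():
            yield from flatten(f"{prefix}.{key}", nested)
    elif isinstance(value, list) and value and isinstance(value[0], dict):
        for index, item in enumerate(value):
            yield from flatten(f"{prefix}.{index}", item)
    else:
        yield prefix, value  # simple value or simple list

message = {
    "message.role": "user",
    "message.contents": [
        {"message_content.type": "text", "message_content.text": "Hello"},
    ],
}
flat = dict(flatten("llm.input_messages.0", message))
# flat["llm.input_messages.0.message.contents.0.message_content.text"] == "Hello"
```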
