Overview

The LlmResponse class represents the response from an LLM generation request. It contains the generated content, usage metadata, grounding information, error details, and streaming state.

Class Definition

class LlmResponse {
  id?: string;
  text?: string;
  content?: Content;
  groundingMetadata?: GroundingMetadata;
  partial?: boolean;
  turnComplete?: boolean;
  errorCode?: string;
  errorMessage?: string;
  interrupted?: boolean;
  customMetadata?: Record<string, any>;
  cacheMetadata?: CacheMetadata;
  usageMetadata?: GenerateContentResponseUsageMetadata;
  candidateIndex?: number;
  finishReason?: string;
  error?: Error;

  constructor(data?: Partial<LlmResponse>);
  static create(generateContentResponse: GenerateContentResponse): LlmResponse;
  static fromError(error: unknown, options?: { errorCode?: string; model?: string }): LlmResponse;
}

Properties

id
string
Unique identifier for the response.
text
string
Plain text content of the response. Convenient accessor for simple text responses.
console.log(response.text); // "Hello, how can I help?"
content
Content
Structured content with role and parts. Contains the full response including text, function calls, and other parts.
{
  role: "model",
  parts: [
    { text: "The weather is sunny" },
    { functionCall: { name: "get_weather", args: {...} } }
  ]
}
groundingMetadata
GroundingMetadata
Metadata about grounding sources when using grounded generation. Contains citations and search results.
partial
boolean
Indicates if this is a partial response in a streaming context. true means more chunks are expected.
turnComplete
boolean
Indicates if the conversation turn is complete. Used in bidirectional streaming.
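For example, a streaming consumer can use these two flags together; this is a minimal sketch, assuming an llm and request set up as in the streaming example later on this page.
for await (const response of llm.generateContentAsync(request, true)) {
  if (response.partial) {
    // Incremental chunk: more content is expected for this turn
    process.stdout.write(response.text ?? "");
  }
  if (response.turnComplete) {
    // The model has finished this conversation turn
    console.log("\n[turn complete]");
  }
}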
errorCode
string
Error code if the response encountered an error. Common values:
  • "STOP" - Normal completion
  • "MAX_TOKENS" - Hit token limit
  • "SAFETY" - Blocked by safety filters
  • "RECITATION" - Blocked due to recitation
  • "OTHER" - Other error
  • "UNKNOWN_ERROR" - Unknown error
errorMessage
string
Human-readable error message providing details about any error that occurred.
interrupted
boolean
Indicates if the generation was interrupted (e.g., by user or timeout).
customMetadata
Record<string, any>
Custom metadata that implementations can attach to responses.
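For example, an integration might attach tracing fields before logging the response (the field names here are illustrative):
response.customMetadata = { traceId: "trace_123", latencyMs: 840 };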
cacheMetadata
CacheMetadata
Metadata about context caching for this request/response.
usageMetadata
GenerateContentResponseUsageMetadata
Token usage information for billing and monitoring.
console.log(`Used ${response.usageMetadata?.totalTokenCount} tokens`);
candidateIndex
number
Index of the candidate response when multiple candidates are requested.
finishReason
string
Reason why generation finished:
  • "STOP" - Natural completion
  • "MAX_TOKENS" - Reached token limit
  • "SAFETY" - Safety filter triggered
  • "RECITATION" - Recitation detected
  • "OTHER" - Other reason
error
Error
Original Error object if the response resulted from an exception.

Constructor

data
Partial<LlmResponse>
Optional initialization data; any subset of the properties listed above can be provided.
const response = new LlmResponse({
  text: "Hello!",
  content: {
    role: "model",
    parts: [{ text: "Hello!" }],
  },
  finishReason: "STOP",
});

Static Methods

create()

Creates an LlmResponse from a standard GenerateContentResponse object. Handles various response scenarios including successful generations, errors, and prompt feedback.
generateContentResponse
GenerateContentResponse
required
The raw response from the LLM API.
Returns: LlmResponse
const llmResponse = LlmResponse.create(apiResponse);
Behavior:
  • Extracts candidate content and metadata when available
  • Maps finish reasons to error codes when generation failed
  • Captures prompt feedback for safety/policy violations
  • Always preserves usage metadata
  • Returns error response for unknown failures
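As a sketch, a model wrapper can normalize a raw @google/genai response this way; the GoogleGenAI call follows that SDK's API, and the example assumes LlmResponse is exported from @iqai/adk alongside BaseLlm and LlmRequest.
import { GoogleGenAI } from "@google/genai";
import { LlmResponse } from "@iqai/adk";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const apiResponse = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: "Summarize today's weather report.",
});

// Convert the raw SDK response into the normalized LlmResponse shape
const llmResponse = LlmResponse.create(apiResponse);
console.log(llmResponse.text, llmResponse.finishReason);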

fromError()

Creates an error LlmResponse from an exception, useful for consistent error handling across different LLM providers.
error
unknown
required
The error or exception that occurred.
options
object
Optional error configuration.
Returns: LlmResponse
try {
  const response = await callLlm();
} catch (error) {
  const errorResponse = LlmResponse.fromError(error, {
    errorCode: "API_ERROR",
    model: "gpt-4",
  });
  console.error(errorResponse.errorMessage);
}
Response Structure:
  • Sets errorCode and errorMessage
  • Creates error content with error text
  • Sets finishReason to "STOP"
  • Preserves original Error object
  • Formats message as: "LLM call failed for model {model}: {errorMessage}"

Usage Examples

Handling Streaming Responses

import { BaseLlm, LlmRequest } from "@iqai/adk";

const llm = new MyLlm("gpt-4"); // MyLlm is a custom BaseLlm subclass
const request = new LlmRequest({ /* ... */ });

let fullText = "";
for await (const response of llm.generateContentAsync(request, true)) {
  if (response.text) {
    fullText += response.text;
    process.stdout.write(response.text);
  }

  if (response.errorCode) {
    console.error(`Error: ${response.errorMessage}`);
    break;
  }

  // Check if we've hit token limit
  if (response.finishReason === "MAX_TOKENS") {
    console.warn("Response truncated due to token limit");
  }
}

Monitoring Token Usage

const response = await getFirstResponse(llm, request); // helper that returns the first LlmResponse from the stream

if (response.usageMetadata) {
  const { promptTokenCount, candidatesTokenCount, totalTokenCount } =
    response.usageMetadata;

  console.log(`Input tokens: ${promptTokenCount}`);
  console.log(`Output tokens: ${candidatesTokenCount}`);
  console.log(`Total tokens: ${totalTokenCount}`);

  // Calculate cost (example rates)
  const inputCost = (promptTokenCount / 1000000) * 0.50;
  const outputCost = (candidatesTokenCount / 1000000) * 1.50;
  console.log(`Estimated cost: $${(inputCost + outputCost).toFixed(4)}`);
}

Handling Function Calls

for await (const response of llm.generateContentAsync(request)) {
  if (response.content?.parts) {
    for (const part of response.content.parts) {
      if (part.functionCall) {
        console.log(`Calling function: ${part.functionCall.name}`);
        console.log(`Arguments:`, part.functionCall.args);

        // Execute the function
        const tool = request.toolsDict[part.functionCall.name];
        const result = await tool.execute(part.functionCall.args);
        // Send result back to LLM...
      } else if (part.text) {
        console.log(`Text response: ${part.text}`);
      }
    }
  }
}

Error Handling Patterns

const response = await getResponse(llm, request); // helper that returns a single LlmResponse

// Check for errors
if (response.errorCode && response.errorCode !== "STOP") {
  switch (response.errorCode) {
    case "SAFETY":
      console.error("Response blocked by safety filters");
      break;
    case "MAX_TOKENS":
      console.warn("Response truncated - increase maxOutputTokens");
      break;
    case "RECITATION":
      console.error("Response blocked due to recitation");
      break;
    default:
      console.error(`Error: ${response.errorMessage}`);
  }
  return;
}

// Process successful response
if (response.text) {
  console.log(response.text);
}

Working with Grounding

for await (const response of llm.generateContentAsync(request)) {
  if (response.groundingMetadata) {
    console.log("Response includes grounded information:");

    // Access search results and citations
    const metadata = response.groundingMetadata;
    if (metadata.searchEntryPoint) {
      console.log(`Search query: ${metadata.searchEntryPoint.renderedContent}`);
    }

    if (metadata.groundingChunks) {
      console.log(`Sources: ${metadata.groundingChunks.length}`);
    }
  }

  console.log(response.text);
}
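Inside the same loop, the individual sources can be listed from the grounding chunks; this assumes the @google/genai GroundingChunk shape, where web-backed chunks expose a title and uri.
if (response.groundingMetadata?.groundingChunks) {
  for (const chunk of response.groundingMetadata.groundingChunks) {
    // Each web-backed chunk carries the source title and URI
    if (chunk.web) {
      console.log(`- ${chunk.web.title}: ${chunk.web.uri}`);
    }
  }
}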

Response Aggregation for Streaming

When using streaming, aggregate partial responses:
function aggregateStreamingResponses(responses: LlmResponse[]): LlmResponse {
  const aggregated = new LlmResponse({
    content: { role: "model", parts: [] },
    text: "",
  });

  for (const response of responses) {
    // Merge text
    if (response.text) {
      aggregated.text = (aggregated.text || "") + response.text;
    }

    // Merge content parts
    if (response.content?.parts) {
      aggregated.content?.parts?.push(...response.content.parts);
    }

    // Use latest metadata
    if (response.usageMetadata) {
      aggregated.usageMetadata = response.usageMetadata;
    }

    if (response.finishReason) {
      aggregated.finishReason = response.finishReason;
    }
  }

  return aggregated;
}
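For example, the chunks from a streaming call (as in the streaming example above) can be collected and then merged into a single response:
const chunks: LlmResponse[] = [];
for await (const response of llm.generateContentAsync(request, true)) {
  chunks.push(response);
}

const finalResponse = aggregateStreamingResponses(chunks);
console.log(finalResponse.text);
console.log(`Total tokens: ${finalResponse.usageMetadata?.totalTokenCount}`);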

Related

  • BaseLlm - Base class that generates LlmResponse
  • LlmRequest - Request configuration that produces responses
  • CacheMetadata - Cache metadata attached to responses

Type Imports

import type {
  Content,
  GenerateContentResponseUsageMetadata,
  GroundingMetadata,
} from "@google/genai";
import { CacheMetadata } from "@adk/models";

Source Reference

See implementation: /packages/adk/src/models/llm-response.ts
