## Overview

The `LlmResponse` class represents the response from an LLM generation request. It contains the generated content, usage metadata, grounding information, error details, and streaming state.
## Class Definition

```typescript
class LlmResponse {
  id?: string;
  text?: string;
  content?: Content;
  groundingMetadata?: GroundingMetadata;
  partial?: boolean;
  turnComplete?: boolean;
  errorCode?: string;
  errorMessage?: string;
  interrupted?: boolean;
  customMetadata?: Record<string, any>;
  cacheMetadata?: CacheMetadata;
  usageMetadata?: GenerateContentResponseUsageMetadata;
  candidateIndex?: number;
  finishReason?: string;
  error?: Error;

  constructor(data?: Partial<LlmResponse>);

  static create(generateContentResponse: GenerateContentResponse): LlmResponse;
  static fromError(error: unknown, options?: { errorCode?: string; model?: string }): LlmResponse;
}
```
## Properties

### id

`string` (optional)

Unique identifier for the response.

### text

`string` (optional)

Plain text content of the response. A convenient accessor for simple text responses.

```typescript
console.log(response.text); // "Hello, how can I help?"
```

### content

`Content` (optional)

Structured content with role and parts. Contains the full response, including text, function calls, and other parts.

```typescript
{
  role: "model",
  parts: [
    { text: "The weather is sunny" },
    { functionCall: { name: "get_weather", args: { ... } } }
  ]
}
```

### groundingMetadata

`GroundingMetadata` (optional)

Metadata about grounding sources when using grounded generation. Contains citations and search results.

### partial

`boolean` (optional)

Indicates if this is a partial response in a streaming context. `true` means more chunks are expected.

### turnComplete

`boolean` (optional)

Indicates if the conversation turn is complete. Used in bidirectional streaming.
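Together, `partial` and `turnComplete` let a consumer distinguish intermediate chunks from the end of a turn. A minimal sketch, assuming a hypothetical `stream` of `LlmResponse` chunks from a bidirectional session:

```typescript
// `stream` is a hypothetical AsyncIterable<LlmResponse> from a bidi session.
for await (const chunk of stream) {
  if (chunk.partial) {
    // Intermediate chunk: render incrementally; more chunks are expected.
    process.stdout.write(chunk.text ?? "");
  }
  if (chunk.turnComplete) {
    // The model has finished its turn; safe to send the next user message.
    break;
  }
}
```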
### errorCode

`string` (optional)

Error code if the response encountered an error. Common values:

- `"STOP"` - Normal completion
- `"MAX_TOKENS"` - Hit token limit
- `"SAFETY"` - Blocked by safety filters
- `"RECITATION"` - Blocked due to recitation
- `"OTHER"` - Other error
- `"UNKNOWN_ERROR"` - Unknown error

### errorMessage

`string` (optional)

Human-readable error message providing details about any error that occurred.

### interrupted

`boolean` (optional)

Indicates if the generation was interrupted (e.g., by the user or a timeout).

### customMetadata

`Record<string, any>` (optional)

Custom metadata that implementations can attach to responses.
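For example, a caller might record bookkeeping details on a response before passing it downstream; the field names here are illustrative, not part of the API:

```typescript
// Attach illustrative fields; customMetadata is a free-form record.
response.customMetadata = {
  ...response.customMetadata,
  provider: "openai", // hypothetical field
  latencyMs: 152, // hypothetical field
};
```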
### cacheMetadata

`CacheMetadata` (optional)

Metadata about context caching for this request/response.

### usageMetadata

`GenerateContentResponseUsageMetadata` (optional)

Token usage information for billing and monitoring:

- `promptTokenCount` - Number of tokens in the input prompt
- `candidatesTokenCount` - Number of tokens in the generated response
- `totalTokenCount` - Total tokens (prompt + candidates)
- `cachedContentTokenCount` - Number of tokens loaded from cache

```typescript
console.log(`Used ${response.usageMetadata?.totalTokenCount} tokens`);
```

### candidateIndex

`number` (optional)

Index of the candidate response when multiple candidates are requested.

### finishReason

`string` (optional)

Reason why generation finished:

- `"STOP"` - Natural completion
- `"MAX_TOKENS"` - Reached token limit
- `"SAFETY"` - Safety filter triggered
- `"RECITATION"` - Recitation detected
- `"OTHER"` - Other reason

### error

`Error` (optional)

Original `Error` object if the response resulted from an exception.
## Constructor

The constructor accepts optional initialization data for any of the properties above.

```typescript
const response = new LlmResponse({
  text: "Hello!",
  content: {
    role: "model",
    parts: [{ text: "Hello!" }],
  },
  finishReason: "STOP",
});
```
## Static Methods

### create()

Creates an `LlmResponse` from a standard `GenerateContentResponse` object. Handles various response scenarios, including successful generations, errors, and prompt feedback.

Parameters:

- `generateContentResponse` (`GenerateContentResponse`, required) - The raw response from the LLM API.

Returns: `LlmResponse`

```typescript
const llmResponse = LlmResponse.create(apiResponse);
```

Behavior:

- Extracts candidate content and metadata when available
- Maps finish reasons to error codes when generation failed
- Captures prompt feedback for safety/policy violations
- Always preserves usage metadata
- Returns an error response for unknown failures
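For instance, a custom provider could funnel every outcome through these factories so downstream code always receives a normalized `LlmResponse`. A sketch, where `callRawApi` is a hypothetical transport helper:

```typescript
// Normalize raw API output via create(), falling back to fromError()
// when the underlying call throws.
async function generate(request: LlmRequest): Promise<LlmResponse> {
  try {
    const raw = await callRawApi(request); // hypothetical helper
    return LlmResponse.create(raw);
  } catch (error) {
    return LlmResponse.fromError(error, { model: "gpt-4" });
  }
}
```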
### fromError()

Creates an error `LlmResponse` from an exception. This is useful for consistent error handling across different LLM providers.

Parameters:

- `error` (`unknown`, required) - The error or exception that occurred.
- `options` (optional) - Error configuration:
  - `errorCode` - Custom error code. Defaults to `"UNKNOWN_ERROR"`.
  - `model` - Model name for error message context.

Returns: `LlmResponse`

```typescript
try {
  const response = await callLlm();
} catch (error) {
  const errorResponse = LlmResponse.fromError(error, {
    errorCode: "API_ERROR",
    model: "gpt-4",
  });
  console.error(errorResponse.errorMessage);
}
```

Response Structure:

- Sets `errorCode` and `errorMessage`
- Creates error content containing the error text
- Sets `finishReason` to `"STOP"`
- Preserves the original `Error` object
- Formats the message as `"LLM call failed for model {model}: {errorMessage}"`
## Usage Examples

### Handling Streaming Responses

```typescript
import { BaseLlm, LlmRequest } from "@iqai/adk";

// MyLlm is a user-defined subclass of BaseLlm.
const llm = new MyLlm("gpt-4");
const request = new LlmRequest({ /* ... */ });

let fullText = "";
for await (const response of llm.generateContentAsync(request, true)) {
  if (response.text) {
    fullText += response.text;
    process.stdout.write(response.text);
  }

  if (response.errorCode) {
    console.error(`Error: ${response.errorMessage}`);
    break;
  }

  // Check if we've hit the token limit
  if (response.finishReason === "MAX_TOKENS") {
    console.warn("Response truncated due to token limit");
  }
}
```
### Monitoring Token Usage

```typescript
const response = await getFirstResponse(llm, request);

if (response.usageMetadata) {
  const { promptTokenCount, candidatesTokenCount, totalTokenCount } =
    response.usageMetadata;

  console.log(`Input tokens: ${promptTokenCount}`);
  console.log(`Output tokens: ${candidatesTokenCount}`);
  console.log(`Total tokens: ${totalTokenCount}`);

  // Estimate cost (example rates per million tokens); the counts are
  // optional, so default them to 0 to avoid NaN.
  const inputCost = ((promptTokenCount ?? 0) / 1_000_000) * 0.5;
  const outputCost = ((candidatesTokenCount ?? 0) / 1_000_000) * 1.5;
  console.log(`Estimated cost: $${(inputCost + outputCost).toFixed(4)}`);
}
```
### Handling Function Calls

```typescript
for await (const response of llm.generateContentAsync(request)) {
  if (response.content?.parts) {
    for (const part of response.content.parts) {
      if (part.functionCall) {
        console.log(`Calling function: ${part.functionCall.name}`);
        console.log("Arguments:", part.functionCall.args);

        // Execute the function
        const tool = request.toolsDict[part.functionCall.name];
        const result = await tool.execute(part.functionCall.args);
        // Send the result back to the LLM...
      } else if (part.text) {
        console.log(`Text response: ${part.text}`);
      }
    }
  }
}
```
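The example above stops at executing the tool. One way to continue the loop, sketched under the assumption that the request history is a `Content[]` named `contents` on the request (the exact field may differ in your setup):

```typescript
// Append the tool result as a functionResponse part, then re-invoke the model
// with the updated history.
request.contents.push({
  role: "user",
  parts: [
    {
      functionResponse: {
        name: part.functionCall.name,
        response: { result },
      },
    },
  ],
});
// Next: for await (const followUp of llm.generateContentAsync(request)) { ... }
```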
### Error Handling Patterns

```typescript
async function handleResponse(llm: BaseLlm, request: LlmRequest) {
  const response = await getResponse(llm, request);

  // Check for errors
  if (response.errorCode && response.errorCode !== "STOP") {
    switch (response.errorCode) {
      case "SAFETY":
        console.error("Response blocked by safety filters");
        break;
      case "MAX_TOKENS":
        console.warn("Response truncated - increase maxOutputTokens");
        break;
      case "RECITATION":
        console.error("Response blocked due to recitation");
        break;
      default:
        console.error(`Error: ${response.errorMessage}`);
    }
    return;
  }

  // Process the successful response
  if (response.text) {
    console.log(response.text);
  }
}
```
### Working with Grounding

```typescript
for await (const response of llm.generateContentAsync(request)) {
  if (response.groundingMetadata) {
    console.log("Response includes grounded information:");

    // Access search results and citations
    const metadata = response.groundingMetadata;
    if (metadata.searchEntryPoint) {
      console.log(`Search query: ${metadata.searchEntryPoint.renderedContent}`);
    }
    if (metadata.groundingChunks) {
      console.log(`Sources: ${metadata.groundingChunks.length}`);
    }
  }

  console.log(response.text);
}
```
### Response Aggregation for Streaming

When using streaming, aggregate partial responses:

```typescript
function aggregateStreamingResponses(responses: LlmResponse[]): LlmResponse {
  const aggregated = new LlmResponse({
    content: { role: "model", parts: [] },
    text: "",
  });

  for (const response of responses) {
    // Merge text
    if (response.text) {
      aggregated.text = (aggregated.text || "") + response.text;
    }

    // Merge content parts (both content and parts are initialized above)
    if (response.content?.parts) {
      aggregated.content!.parts!.push(...response.content.parts);
    }

    // Use the latest metadata
    if (response.usageMetadata) {
      aggregated.usageMetadata = response.usageMetadata;
    }
    if (response.finishReason) {
      aggregated.finishReason = response.finishReason;
    }
  }

  return aggregated;
}
```
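Driving the helper with a streaming call might look like this, reusing the `llm` and `request` from the earlier examples:

```typescript
// Collect every streamed chunk, then fold them into a single response.
const chunks: LlmResponse[] = [];
for await (const response of llm.generateContentAsync(request, true)) {
  chunks.push(response);
}

const finalResponse = aggregateStreamingResponses(chunks);
console.log(finalResponse.text);
```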
## Related Documentation

- `BaseLlm` - Base class that generates `LlmResponse` objects
- `LlmRequest` - Request configuration that produces responses
- `CacheMetadata` - Cache metadata attached to responses
## Type Imports

```typescript
import type {
  Content,
  GenerateContentResponseUsageMetadata,
  GroundingMetadata,
} from "@google/genai";

import { CacheMetadata } from "@adk/models";
```
## Source Reference

See the implementation at `/packages/adk/src/models/llm-response.ts`.