Gemini CLI is an AI agent that brings Google’s Gemini models directly into your terminal. Understanding how it works will help you use it more effectively and customize it to your needs.

Architecture Overview

Gemini CLI is built with two primary packages that work together:

CLI Package

Provides the user interface, handles user input, and manages the interactive terminal session

Core Package

Manages API communication, tool orchestration, and conversation state

How a Request Flows Through the System

When you send a prompt to Gemini CLI, it goes through several stages:
1. User Input

You enter a request in the terminal. This can include natural language prompts, special syntax like @file.txt for file references, or !command for shell commands.
2. Context Assembly

The core package constructs the full prompt by combining:
  • Your user input
  • Conversation history
  • Tool definitions
  • Context from GEMINI.md files
  • System instructions
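The assembly above can be sketched as a simple concatenation. The function and parameter names below are hypothetical; the real core package is more involved:

```python
def assemble_prompt(user_input, history, tool_defs, context_files, system_instructions):
    """Combine the pieces the core package sends to the model (simplified sketch)."""
    parts = [
        system_instructions,          # system-level guidance
        "\n".join(context_files),     # concatenated GEMINI.md content
        "\n".join(tool_defs),         # tool definitions
        "\n".join(history),           # prior conversation turns
        user_input,                   # the new request
    ]
    return "\n\n".join(p for p in parts if p)

prompt = assemble_prompt("List TODOs", ["user: hi", "model: hello"],
                         ["tool: read_file"], ["Project uses TypeScript"],
                         "You are a coding assistant")
```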
3. API Communication

The constructed prompt is sent to the Gemini API with the selected model and generation settings.
4. Tool Request Analysis

The model analyzes your request and determines if tools are needed to fulfill it. If so, it requests specific tools with parameters.
5. Tool Validation & Confirmation

The CLI validates tool parameters and checks security policies. For sensitive operations (file writes, shell commands), you’re prompted for approval.
6. Tool Execution

Approved tools execute and return results. These results are sent back to the model for further processing.
7. Response Generation

The model uses tool results to generate a final, grounded answer that’s displayed in your terminal.
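Steps 3 through 7 form a loop: the model may request tools several times before producing a final answer. A minimal sketch of that loop, using a stub model and stub tools rather than the real implementation:

```python
def run_turn(model, tools, prompt, confirm):
    """Drive one conversational turn, executing tool calls until the model answers."""
    response = model(prompt)
    while response.get("tool"):                      # model requested a tool
        name, args = response["tool"], response["args"]
        if not confirm(name, args):                  # user approval for sensitive ops
            result = "Tool call denied by user."
        else:
            result = tools[name](**args)             # execute the approved tool
        response = model(f"{prompt}\n[tool {name} -> {result}]")
    return response["text"]

# Stub model: first requests a tool, then answers once it sees the tool result.
def model(p):
    if "[tool" not in p:
        return {"tool": "read_file", "args": {"path": "notes.txt"}}
    return {"text": "The file says: hello"}

tools = {"read_file": lambda path: "hello"}
answer = run_turn(model, tools, "Summarize notes.txt", confirm=lambda n, a: True)
```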

Core Components

Prompt Engineering

The core package constructs effective prompts for the Gemini model by:
  • Incorporating conversation history for context continuity
  • Including tool definitions so the model knows what actions it can perform
  • Adding instructional context from hierarchical GEMINI.md files
  • Applying system-level instructions that guide the model’s behavior

Tool Management & Orchestration

The tool system is the bridge between the AI model and your local environment:
  1. Tool Registration: Available tools (file operations, shell commands, web access) are registered with the core
  2. Request Interpretation: When the model requests a tool, the core interprets the request and validates parameters
  3. Execution: The core executes the tool with security safeguards
  4. Result Handling: Tool outputs are formatted and returned to the model
See the Tools documentation for detailed information about available tools.
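The registration, validation, and result-handling stages above can be sketched together. This is an illustrative shape only; the actual registry and validation logic in the core package differ:

```python
class ToolRegistry:
    """Minimal sketch of tool registration and parameter validation."""
    def __init__(self):
        self._tools = {}

    def register(self, name, fn, required_params):
        self._tools[name] = (fn, set(required_params))

    def execute(self, name, params):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        fn, required = self._tools[name]
        missing = required - params.keys()
        if missing:                                  # request interpretation / validation
            raise ValueError(f"missing parameters: {sorted(missing)}")
        return str(fn(**params))                     # result formatted for the model

registry = ToolRegistry()
registry.register("read_file", lambda path: f"<contents of {path}>", ["path"])
out = registry.execute("read_file", {"path": "README.md"})
```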

Session and State Management

The core maintains conversation state including:
  • Conversation history: Previous messages and responses
  • Tool execution results: Outcomes of previous tool calls
  • Context files: Loaded GEMINI.md content
  • Session metadata: Token usage, model selection, and settings
The CLI footer displays the number of loaded context files, giving you a visual indicator of active instructional context.

Chat History Compression

Long conversations can exceed model token limits. To handle this, Gemini CLI automatically compresses conversation history as it approaches those limits. Compression is designed to preserve the information conveyed while reducing token count, so you can maintain extended sessions without losing important context.
You can check current token usage with the /stats model command to see your session’s usage and quota information.
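A simplified illustration of the trigger logic. In the real CLI the summary is produced by the model itself; here it is a stub, and all names are hypothetical:

```python
def compress_history(history, token_limit, summarize, estimate_tokens, keep_recent=2):
    """Replace older turns with a summary when the history nears the token limit."""
    if estimate_tokens(history) <= token_limit:
        return history                               # under budget: leave untouched
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent                 # compressed prefix + recent turns

history = ["turn1 " * 50, "turn2 " * 50, "turn3", "turn4"]
compressed = compress_history(
    history,
    token_limit=40,
    summarize=lambda turns: f"[summary of {len(turns)} turns]",
    estimate_tokens=lambda turns: sum(len(t.split()) for t in turns),
)
```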

Model Fallback & Routing

Gemini CLI includes intelligent model routing to ensure uninterrupted service:

Automatic Fallback

If your selected model encounters issues (rate limiting, quota exhaustion, server errors):
  1. The CLI detects the failure
  2. You’re prompted to switch to a fallback model (unless configured for silent fallback)
  3. If approved, the CLI uses the fallback model for the current turn or remainder of the session
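The fallback flow above reduces to a try/catch around the model call. A sketch under the assumption that failures surface as exceptions (stub error handling, hypothetical names):

```python
def call_with_fallback(call, primary, fallback, ask_user):
    """Try the primary model; on quota/rate errors, optionally switch to a fallback."""
    try:
        return call(primary), primary
    except RuntimeError:                 # stand-in for rate-limit / quota / server errors
        if ask_user(primary, fallback):  # skipped when silent fallback is configured
            return call(fallback), fallback
        raise

def flaky_call(model):
    if model == "gemini-2.5-pro":
        raise RuntimeError("429: quota exceeded")
    return f"answer from {model}"

text, used = call_with_fallback(flaky_call, "gemini-2.5-pro", "gemini-2.5-flash",
                                ask_user=lambda p, f: True)
```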

Model Selection Precedence

The model used is determined in this order:
  1. --model flag: Specified when launching the CLI
  2. GEMINI_MODEL environment variable: Set in your shell environment
  3. model.name in settings.json: Configured in your settings file
  4. Default model: auto (lets the system choose the best model)
Internal utility calls (like prompt completion) use a silent fallback chain from gemini-2.5-flash-lite → gemini-2.5-flash → gemini-2.5-pro without changing your configured model.
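The precedence order reduces to a first-match lookup. A sketch (the function name is illustrative; the setting keys match the documented ones):

```python
def resolve_model(cli_flag=None, env=None, settings=None, default="auto"):
    """Pick the model by documented precedence: flag > env var > settings > default."""
    env = env or {}
    settings = settings or {}
    return (cli_flag
            or env.get("GEMINI_MODEL")
            or settings.get("model", {}).get("name")
            or default)

# The environment variable outranks settings.json when no flag is passed:
model = resolve_model(env={"GEMINI_MODEL": "gemini-2.5-flash"},
                      settings={"model": {"name": "gemini-2.5-pro"}})
```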

File Discovery Service

The file discovery service helps the model find relevant files in your project. It’s used by:
  • The @ command for including file contents
  • Tools that need to access files
  • Pattern-based file searches with glob patterns
The service respects .gitignore and .geminiignore patterns to exclude unwanted files.
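Conceptually, discovery filters candidate paths against ignore patterns. The sketch below uses glob-style matching as a rough stand-in; real .gitignore/.geminiignore semantics (negation, anchoring, directory-only rules) are more nuanced:

```python
import fnmatch

def discover_files(paths, ignore_patterns):
    """Filter candidate paths against glob-style ignore patterns (simplified)."""
    def ignored(path):
        return any(fnmatch.fnmatch(path, pat) or fnmatch.fnmatch(path, f"{pat}/*")
                   for pat in ignore_patterns)
    return [p for p in paths if not ignored(p)]

files = discover_files(
    ["src/app.ts", "node_modules/lib/index.js", "dist/bundle.js", "GEMINI.md"],
    ignore_patterns=["node_modules", "dist"],
)
```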

Memory Discovery Service

The memory discovery service finds and loads GEMINI.md context files in a hierarchical manner:
  1. Global context: ~/.gemini/GEMINI.md (applies to all projects)
  2. Workspace context: GEMINI.md files in configured workspace directories
  3. Just-in-time context: When tools access a file or directory, GEMINI.md files in that path are automatically loaded
All discovered context files are concatenated and sent to the model with every prompt.
Use /memory show to view the full concatenated context being sent to the model, or /memory refresh to reload all context files.
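The hierarchical load-and-concatenate behavior can be sketched as follows. This is a simplification: the real service also loads files just-in-time as tools touch new paths, and the function names here are hypothetical:

```python
import tempfile
from pathlib import Path

def load_context(global_dir, workspace_dirs, filename="GEMINI.md"):
    """Concatenate context files in hierarchy order: global first, then workspaces."""
    chunks = []
    for d in [global_dir, *workspace_dirs]:
        path = Path(d) / filename
        if path.exists():
            chunks.append(path.read_text())
    return "\n\n".join(chunks)

with tempfile.TemporaryDirectory() as root:
    g, w = Path(root, "global"), Path(root, "ws")
    g.mkdir(); w.mkdir()
    (g / "GEMINI.md").write_text("Global rules")
    (w / "GEMINI.md").write_text("Workspace rules")
    combined = load_context(g, [w])
```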

Security Considerations

Security is built into the core architecture:

API Key Management

The core securely handles your GEMINI_API_KEY or authentication credentials, ensuring they’re never exposed to the model or logged inappropriately.

Tool Execution Safety

  • User confirmation: File modifications and shell commands require manual approval
  • Sandboxing: Tools can run in containerized environments to isolate changes
  • Trusted folders: Configure which directories allow system tool execution
  • Policy engine: Fine-grained control over tool execution permissions
Always review confirmation prompts carefully before approving tool execution, especially for file writes and shell commands.
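The confirmation and policy layers above boil down to a per-tool decision before execution. A toy sketch; the policy map and rule names here are invented for illustration and do not reflect the CLI's actual policy configuration:

```python
def needs_confirmation(tool_name, policy):
    """Illustrative policy check: default to asking unless a rule says 'allow'."""
    return policy.get(tool_name, "confirm") != "allow"

# Hypothetical policy map; real policy configuration is richer than this.
policy = {"read_file": "allow", "write_file": "confirm", "run_shell_command": "confirm"}
safe = needs_confirmation("read_file", policy)           # reads pass through
risky = needs_confirmation("run_shell_command", policy)  # shell commands ask the user
```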

Subagents (Experimental)

Gemini CLI can delegate complex tasks to specialized subagents:
  • Focused context: Each subagent has its own system prompt and persona
  • Specialized tools: Restricted or domain-specific toolsets
  • Independent context: Each subagent runs its own conversation loop, saving tokens in the main session

Built-in Subagents

Codebase Investigator

Analyzes and reverse-engineers codebases, and untangles complex dependencies

CLI Help Agent

Expert knowledge about Gemini CLI commands, configuration, and documentation

Browser Agent

Automates web browser tasks using the accessibility tree (requires Chrome 144+)

Generalist Agent

Routes tasks to appropriate specialized subagents
Subagents are experimental. Enable custom subagents with "experimental": { "enableAgents": true } in your settings.json.

Citations

When Gemini detects that it is reciting text from a source, it automatically appends citations to the output. Citations are:
  • Shown before edit confirmations
  • Displayed at the end of the model’s turn
  • Deduplicated and sorted alphabetically
You can disable citations with the ui.showCitations setting if needed.

Next Steps

Available Models

Learn about model selection and when to use Pro, Flash, or Auto modes

Tool System

Explore the tools available for file operations, shell commands, and more

Context Management

Master GEMINI.md files for persistent project-specific instructions

Configuration

Customize Gemini CLI settings and behavior
