The `agents` section controls how Grip's AI agents behave, which models they use, and how they manage context and memory.
## Agent Defaults
Default parameters applied to every agent run unless overridden by profiles or CLI flags.

### Model Selection

Default LLM model, in `provider/model` format. Examples:

- `openrouter/anthropic/claude-sonnet-4`
- `anthropic/claude-sonnet-4-20250514`
- `openai/gpt-4o`
- `deepseek/deepseek-chat`
Explicit provider name to override prefix-based detection. Useful when model names are ambiguous (e.g., `openai/gpt-oss-120b` on OpenRouter). Options: `openrouter`, `anthropic`, `openai`, `deepseek`, `groq`, `gemini`, etc.

Agent execution engine:

- `claude_sdk` - Primary engine using Claude's Agent SDK (Claude models only)
- `litellm` - Fallback engine supporting any model via LiteLLM
Claude model to use when `engine=claude_sdk`. Options:

- `claude-opus-4-6`
- `claude-sonnet-4-6`
- `claude-haiku-4-5-20251001`
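As a sketch, assuming a YAML configuration file (the file format and the `provider` key name are assumptions; `agents.defaults.model` and the `engine` values do appear on this page), model selection might look like:

```yaml
agents:
  defaults:
    # provider/model format; the prefix normally selects the provider
    model: openrouter/anthropic/claude-sonnet-4
    # assumed key: force a provider when the model prefix is ambiguous
    provider: openrouter
    # claude_sdk (Claude models only) or litellm (any model)
    engine: litellm
```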
### Generation Parameters
Maximum tokens the LLM can generate per response.
Sampling temperature for LLM responses.
- Lower (0.0-0.5): More deterministic, focused
- Medium (0.5-1.0): Balanced creativity
- Higher (1.0-2.0): More creative, varied
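The two generation parameters above might be set like this (a sketch assuming YAML; the `max_tokens` and `temperature` key names are assumptions, since the page does not show the literal keys):

```yaml
agents:
  defaults:
    max_tokens: 4096   # assumed key: cap on tokens generated per response
    temperature: 0.3   # assumed key: low value for deterministic, focused output
```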
### Execution Control
Maximum LLM-tool round-trips before the agent stops.
- `0` = unlimited (default)
- Set to a positive number to limit agent autonomy
When `true`, tools simulate execution without writing files or running commands. Useful for testing agent behavior safely.
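For example (assuming YAML; the `max_tool_iterations` and `dry_run` names are taken from the Best Practices section of this page):

```yaml
agents:
  defaults:
    max_tool_iterations: 25  # stop after 25 LLM-tool round-trips; 0 = unlimited
    dry_run: true            # simulate tool execution without side effects
```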
### Memory and Context
Number of recent messages to include in LLM context. Larger values provide more context but consume more tokens.
Automatically consolidate old messages when the session exceeds 2x the memory window. Summarizes older messages to reduce token usage while preserving key information.
LLM model for summarization/consolidation.

- Empty string = use the main model
- Set to a cheaper model (e.g., `openrouter/google/gemini-flash-2.0`) to save tokens
When `true`, the agent reflects on failed tool calls before proceeding. Improves reliability but adds extra LLM calls for error recovery.
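Putting the memory settings together (a YAML sketch; `memory_window`, `auto_consolidate`, and `consolidation_model` are named elsewhere on this page, while the reflection key name is not shown and is omitted here):

```yaml
agents:
  defaults:
    memory_window: 100        # recent messages kept in LLM context
    auto_consolidate: true    # summarize once the session exceeds 2x the window
    consolidation_model: openrouter/google/gemini-flash-2.0  # cheap summarizer
```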
### Caching and Rate Limiting
Cache LLM responses for identical queries to save tokens and latency.
Time-to-live for cached responses in seconds (default: 1 hour).
Maximum total tokens (prompt + completion) per day.

- `0` = unlimited
- Set a limit to control costs
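A possible YAML sketch of these settings (`max_daily_tokens` appears in the Best Practices section; the `cache_responses` and `cache_ttl` key names are assumptions):

```yaml
agents:
  defaults:
    cache_responses: true    # assumed key: cache LLM responses for identical queries
    cache_ttl: 3600          # assumed key: cache time-to-live in seconds (1 hour)
    max_daily_tokens: 500000 # daily prompt+completion budget; 0 = unlimited
```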
### Workspace
Root workspace directory for agent files, sessions, and memory.
SDK permission mode for file operations:

- `acceptEdits` - Auto-accept file edits (default)
- `bypassPermissions` - Skip all permission checks
- `default` - Prompt for each operation
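A YAML sketch of the workspace settings (the `workspace` and `permission_mode` key names are assumptions; the permission values are listed above):

```yaml
agents:
  defaults:
    workspace: ~/grip-workspace   # assumed key: root for agent files, sessions, memory
    permission_mode: acceptEdits  # assumed key; acceptEdits | bypassPermissions | default
```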
## Model Tiers (Cost-Aware Routing)
Automatic model routing based on prompt complexity: cheaper models handle simple tasks, and powerful models handle complex ones. A boolean option enables this routing.
Model for simple queries (greetings, lookups, regex). Example: `openrouter/google/gemini-flash-2.0`

Model for moderate tasks (code changes, explanations). Leave empty to use `agents.defaults.model`.

Model for complex tasks (architecture, refactors, debugging). Example: `anthropic/claude-opus-4-6`

### Example Configuration
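A sketch of what a tiered configuration might look like, assuming YAML; the `tiers`, `enabled`, `simple`, `moderate`, and `complex` key names are hypothetical, while the model strings come from the examples above:

```yaml
agents:
  defaults:
    model: openrouter/anthropic/claude-sonnet-4
  tiers:
    enabled: true
    simple: openrouter/google/gemini-flash-2.0  # greetings, lookups, regex
    moderate: ""                                # empty = use agents.defaults.model
    complex: anthropic/claude-opus-4-6          # architecture, refactors, debugging
```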
## Agent Profiles
Named profiles with custom models, tool subsets, and system prompts. Profiles let you create specialized agents for specific tasks.

Model override for this profile. Empty = inherit from defaults.
Max tokens override. 0 = inherit from defaults.
Temperature override. -1.0 = inherit from defaults.
Max iterations override. 0 = inherit from defaults.
Tool names this profile can use. Empty = all tools. Supports wildcards: `["read", "write", "mcp__*"]`

Tool names explicitly blocked for this profile.

Workspace-relative path to a custom identity file. Example: `agents/researcher.md`

### Example Profiles
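A profile sketch, assuming YAML; `tools_denied` appears in the Best Practices section, while the `profiles`, `tools`, and `identity` key names and the `researcher` profile name are hypothetical. The sentinel values (`0`, `-1.0`, empty) follow the inheritance rules described above:

```yaml
agents:
  profiles:
    researcher:                       # hypothetical profile name
      model: openai/gpt-4o            # override; empty = inherit from defaults
      max_tokens: 0                   # 0 = inherit from defaults
      temperature: -1.0               # -1.0 = inherit from defaults
      tools: ["read", "mcp__*"]       # allowed tools; wildcards supported
      tools_denied: ["write"]         # explicitly blocked tools
      identity: agents/researcher.md  # workspace-relative identity file
```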
### Using Profiles
## CLI Overrides

All agent settings can be overridden via CLI flags.

## Best Practices
### Choosing Models

- Development/Testing: Use faster, cheaper models like `gemini-flash` or `gpt-4o-mini`
- Production: Use `claude-sonnet-4` for balanced performance
- Complex Tasks: Use `claude-opus-4` for architecture and refactoring
- Cost Optimization: Enable model tiers to route automatically
### Memory Management

- Default `memory_window=50` works for most conversations
- Increase to 100-200 for complex, long-running tasks
- Enable `auto_consolidate` to prevent context overflow
- Use a cheap `consolidation_model` to reduce costs
### Safety Controls

- Set `max_tool_iterations` for untrusted environments
- Use `dry_run=true` to test agent behavior
- Create profiles with `tools_denied` for restricted agents
- Monitor `max_daily_tokens` to control costs