Overview
Grip AI is designed for safe self-hosting with built-in security features. This guide covers production deployment best practices, security considerations, and common configurations.
Security Architecture
Grip implements multiple layers of security:
1. Non-Root User
The official Dockerfile creates and runs as a non-root user by default:
# User: grip (UID 1000, GID 1000)
USER grip
Running as a non-root user makes privilege escalation harder and limits the blast radius if the container is compromised.
Never run Grip as root in production. The container is designed to run as UID 1000.
2. Directory Trust Model
Grip restricts file access by default:
- Workspace First: Agent can always access its workspace directory
- Explicit Consent: External directories require explicit trust via /trust <path>
- Persistent Decisions: Trust settings are saved in workspace/state/trusted_dirs.json
Configure trust mode:
# Prompt before accessing new directories (default, safest)
GRIP_TOOLS__TRUST_MODE="prompt"
# Allow any directory the OS user can access (not recommended for production)
GRIP_TOOLS__TRUST_MODE="trust_all"
# Restrict to workspace only (most restrictive)
GRIP_TOOLS__TRUST_MODE="workspace_only"
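The three trust modes can be pictured as a small decision function. The sketch below is illustrative only and is not Grip's implementation: the function name `check_access` and the return values "allow", "ask", and "deny" are hypothetical, and the only names taken from this guide are the three mode strings.

```python
from pathlib import Path


def _is_inside(path: Path, root: Path) -> bool:
    """True if path is root itself or located somewhere below root."""
    return path == root or root in path.parents


def check_access(target, workspace, trusted_dirs, trust_mode="prompt"):
    """Return "allow", "ask", or "deny" for a requested path.

    Hypothetical helper: the mode names come from GRIP_TOOLS__TRUST_MODE,
    everything else is an assumption made for illustration.
    """
    # Resolve first so "../" segments cannot escape a trusted root.
    resolved = Path(target).resolve()

    # Workspace First: the workspace is always accessible.
    if _is_inside(resolved, Path(workspace).resolve()):
        return "allow"
    if trust_mode == "trust_all":
        return "allow"
    if trust_mode == "workspace_only":
        return "deny"

    # "prompt" mode: previously trusted directories pass, the rest need consent.
    for trusted in trusted_dirs:
        if _is_inside(resolved, Path(trusted).resolve()):
            return "allow"
    return "ask"
```

Resolving the path before comparing it is the important design choice here: a request such as workspace/../secrets is judged by where it actually points, not by its prefix.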
3. Shell Command Deny-List
Every shell command is scanned against 50+ dangerous patterns before execution:
- Destructive commands: rm -rf /, mkfs, dd if=/dev/zero
- System control: shutdown, reboot, systemctl poweroff
- Credential exfiltration: cat ~/.ssh/id_rsa, cat .env
- Remote code injection: curl | bash, wget -O - | sh
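This blocks well-known destructive or malicious commands before they run. A pattern deny-list cannot catch every variant, so treat it as one layer alongside the others described here.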
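A deny-list scan of this kind can be sketched with a handful of regular expressions. The patterns below are assumptions written to cover only the examples listed in this section. They are not Grip's actual 50+ patterns, and `scan_command` is a hypothetical name.

```python
import re

# Illustrative patterns only; Grip's real deny-list is larger and may differ.
DENY_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),                # rm -rf / (but not rm -rf /tmp/x)
    re.compile(r"\bmkfs\b"),                            # filesystem creation
    re.compile(r"\bdd\s+if=/dev/zero\b"),               # disk overwrite
    re.compile(r"\b(shutdown|reboot)\b"),               # system control
    re.compile(r"\bsystemctl\s+poweroff\b"),
    re.compile(r"\bcat\s+\S*\.ssh/id_rsa\b"),           # private key read
    re.compile(r"\bcat\s+\S*\.env\b"),                  # env file read
    re.compile(r"\b(curl|wget)\b[^|]*\|\s*(ba)?sh\b"),  # download piped into a shell
]


def scan_command(command: str):
    """Return the first matching pattern string, or None if nothing matches."""
    for pattern in DENY_PATTERNS:
        if pattern.search(command):
            return pattern.pattern
    return None


if __name__ == "__main__":
    for cmd in ["ls -la", "rm -rf /", "curl https://example.com/install.sh | bash"]:
        verdict = "blocked" if scan_command(cmd) else "allowed"
        print(f"{verdict}: {cmd}")
```

Returning the matched pattern rather than a plain boolean makes it easier to explain to the user why a command was refused.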
4. Shield Policy (Runtime Threat Feed)
The agent’s system prompt includes a SHIELD.md policy that evaluates actions against active threats:
- Scopes: prompt, skill.install, tool.call, network.egress, secrets.read, mcp
- Actions: block, require_approval, log
- Confidence threshold: >= 0.85 for enforcement
Shield policy is stored at workspace/SHIELD.md and can be customized per deployment.
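As a rough mental model, the threshold can be read as a gate on whether a matched threat's action is enforced. The sketch below is not the SHIELD.md format or Grip's evaluation logic. `ThreatMatch` and `effective_action` are hypothetical names, and downgrading low-confidence matches to "log" is an assumption, since this guide only states the 0.85 threshold.

```python
from dataclasses import dataclass

SCOPES = {"prompt", "skill.install", "tool.call", "network.egress", "secrets.read", "mcp"}
ACTIONS = {"block", "require_approval", "log"}
ENFORCEMENT_THRESHOLD = 0.85


@dataclass(frozen=True)
class ThreatMatch:
    """Hypothetical record of an action matching a threat entry."""
    scope: str
    action: str
    confidence: float


def effective_action(match: ThreatMatch) -> str:
    """Return the action to apply for a match, given its confidence."""
    if match.scope not in SCOPES:
        raise ValueError(f"unknown scope: {match.scope}")
    if match.action not in ACTIONS:
        raise ValueError(f"unknown action: {match.action}")
    if match.confidence >= ENFORCEMENT_THRESHOLD:
        return match.action
    # Assumption: below the threshold the match is only recorded.
    return "log"
```

For example, `effective_action(ThreatMatch("tool.call", "block", 0.9))` yields "block", while the same match at confidence 0.5 yields "log" under the stated assumption.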
5. Credential Scrubbing
Tool outputs are automatically redacted before storage:
- sk-... API keys (OpenAI/Anthropic)
- ghp_... GitHub tokens
- xoxb-... Slack tokens
- Bearer <token> headers
- password=... parameters
This prevents credential leakage in logs and session history.
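Redaction along these lines can be sketched as a list of regular-expression substitutions. The exact patterns, minimum lengths, and the `[REDACTED]` marker below are assumptions chosen for illustration. Grip's real rules may differ, and `scrub` is a hypothetical name.

```python
import re

# (pattern, replacement) pairs; illustrative, not Grip's actual rules.
# The Bearer rule runs first so a bearer token that itself starts with
# "sk-" is replaced once rather than twice.
REDACTION_RULES = [
    (re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+"), "Bearer [REDACTED]"),
    (re.compile(r"\bsk-[A-Za-z0-9_-]{8,}"), "sk-[REDACTED]"),
    (re.compile(r"\bghp_[A-Za-z0-9]{8,}"), "ghp_[REDACTED]"),
    (re.compile(r"\bxoxb-[A-Za-z0-9-]{8,}"), "xoxb-[REDACTED]"),
    (re.compile(r"\b(password=)[^\s&]+", re.IGNORECASE), r"\1[REDACTED]"),
]


def scrub(text: str) -> str:
    """Apply every redaction rule to text and return the result."""
    for pattern, replacement in REDACTION_RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(scrub("curl -H 'Authorization: Bearer abc.def.ghi' https://example.com"))
    print(scrub("login?user=alice&password=hunter2"))
```

Keeping the prefix (sk-, ghp_, xoxb-) in the output preserves the information that a credential of that type was present without exposing its value.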
6. API Security
The REST API includes multiple security layers:
# Bearer token authentication
GRIP_GATEWAY__API__AUTH_TOKEN="grip_your_secret_token"
# Rate limiting (per-IP and per-token)
GRIP_GATEWAY__API__RATE_LIMIT_PER_MINUTE=60
GRIP_GATEWAY__API__RATE_LIMIT_PER_MINUTE_PER_IP=30
# Request size limit (1MB default)
GRIP_GATEWAY__API__MAX_REQUEST_BODY_BYTES=1048576
# Disable direct tool execution (disabled by default)
GRIP_GATEWAY__API__ENABLE_TOOL_EXECUTE=false
Security headers are automatically set:
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Content-Security-Policy: default-src 'self'
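This guide does not describe how the per-token and per-IP limits are enforced internally. A sliding-window counter is one common approach, sketched below under that assumption. `SlidingWindowLimiter` and `allow_request` are hypothetical names, and the limits 60 and 30 mirror the environment variables above.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within the last `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted hits

    def allow(self, key: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


token_limiter = SlidingWindowLimiter(limit=60)  # RATE_LIMIT_PER_MINUTE
ip_limiter = SlidingWindowLimiter(limit=30)     # RATE_LIMIT_PER_MINUTE_PER_IP


def allow_request(token: str, ip: str, now: float = None) -> bool:
    """A request passes only if both the IP and the token are under their limit.

    Simplification: an IP hit is still counted when the token check fails.
    """
    return ip_limiter.allow(ip, now) and token_limiter.allow(token, now)
```

With these values, a single IP is cut off after 30 requests in a minute even if its token still has budget left, which limits the damage from one leaked token used from one host.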
Production Deployment
Recommended Docker Configuration
version: '3.8'
services:
  grip:
    image: grip:latest
    container_name: grip-production
    restart: unless-stopped
    user: "1000:1000"
    # Network
    ports:
      - "127.0.0.1:18800:18800"
    # Environment
    environment:
      # Engine
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - GRIP_AGENTS__DEFAULTS__ENGINE=claude_sdk
      - GRIP_AGENTS__DEFAULTS__SDK_MODEL=claude-sonnet-4-6
      # Channels
      - GRIP_CHANNELS__TELEGRAM__ENABLED=true
      - GRIP_CHANNELS__TELEGRAM__TOKEN=${TELEGRAM_BOT_TOKEN}
      - GRIP_CHANNELS__TELEGRAM__ALLOW_FROM=${TELEGRAM_ALLOWED_USERS}
      # Gateway
      - GRIP_GATEWAY__HOST=0.0.0.0
      - GRIP_GATEWAY__PORT=18800
      - GRIP_GATEWAY__API__AUTH_TOKEN=${API_AUTH_TOKEN}
      - GRIP_GATEWAY__API__RATE_LIMIT_PER_MINUTE=60
      - GRIP_GATEWAY__API__RATE_LIMIT_PER_MINUTE_PER_IP=30
      - GRIP_GATEWAY__API__ENABLE_TOOL_EXECUTE=false
      # Security
      - GRIP_TOOLS__TRUST_MODE=prompt
      - GRIP_TOOLS__SHELL_TIMEOUT=60
    # Volumes
    volumes:
      - grip-data:/home/grip/.grip
    # Resource limits
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 2G
        reservations:
          cpus: '1.0'
          memory: 1G
    # Health check
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:18800/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    # Logging
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
volumes:
  grip-data:
    driver: local
Network Configuration
Local Access Only
Bind to localhost for local-only access:
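# Only accessible from the host machine (when running Grip directly on the host)
GRIP_GATEWAY__HOST="127.0.0.1"
# In Docker, keep GRIP_GATEWAY__HOST=0.0.0.0 inside the container and
# publish the port on the host's loopback interface instead:
ports:
  - "127.0.0.1:18800:18800"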
External Access with Reverse Proxy
For external access, use a reverse proxy (nginx, Caddy, Traefik) with HTTPS:
# Rate limiting zone (limit_req_zone is only valid in the http context, outside server blocks)
limit_req_zone $binary_remote_addr zone=grip_limit:10m rate=10r/s;
server {
    listen 443 ssl http2;
    server_name grip.example.com;
    ssl_certificate /etc/letsencrypt/live/grip.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/grip.example.com/privkey.pem;
    # Security headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "DENY" always;
    # Rate limiting
    limit_req zone=grip_limit burst=20 nodelay;
    location / {
        proxy_pass http://127.0.0.1:18800;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # WebSocket upgrade headers
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        # Long-lived streams (WebSocket, SSE): no response buffering, long read timeout
        proxy_buffering off;
        proxy_read_timeout 86400;
    }
}
Never expose Grip directly to the internet without a reverse proxy. Always use HTTPS and rate limiting.
Resource Management
Memory Limits
Grip’s memory usage scales with session count and context size:
- Minimum: 512MB (basic operation)
- Recommended: 1-2GB (production with caching)
- Large deployments: 4GB+ (many concurrent sessions)
deploy:
  resources:
    limits:
      memory: 2G
    reservations:
      memory: 1G
CPU Limits
CPU usage spikes during:
- LLM API calls (minimal, mostly network I/O)
- Message consolidation (LLM-based summarization)
- Tool execution (especially shell commands)
deploy:
  resources:
    limits:
      cpus: '2.0'
    reservations:
      cpus: '0.5'
Disk Space
Monitor these directories:
| Path | Typical Size | Notes |
|---|---|---|
| ~/.grip/sessions/ | 10-100MB | Session history (JSON files) |
| ~/.grip/workspace/ | Varies | Agent workspace files |
| ~/.grip/logs/ | 50-200MB | Application logs (if enabled) |
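One simple way to monitor these directories is to sum file sizes periodically. The sketch below uses only the standard library. `dir_size_bytes` is a hypothetical helper, and the paths assume the default ~/.grip layout shown in the table.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Return the total size of regular files under root (0 if root is missing)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not follow or count symlinks
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


if __name__ == "__main__":
    for sub in ("sessions", "workspace", "logs"):
        path = os.path.expanduser(os.path.join("~", ".grip", sub))
        print(f"{path}: {dir_size_bytes(path) / 1_000_000:.1f} MB")
```

Running this from cron and alerting when a directory exceeds its typical range is usually enough for a single-node deployment.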
Implement log rotation:
logging:
  driver: "json-file"
  options:
    max-size: "10m"
    max-file: "3"
Monitoring & Observability
Health Checks
Grip provides two health endpoints:
# Public health check (no auth, for load balancers)
curl http://localhost:18800/health
# Response: {"status": "ok"}
# Authenticated health check (with version and uptime)
curl -H "Authorization: Bearer grip_your_token" \
http://localhost:18800/api/v1/health
# Response: {"status": "ok", "version": "0.1.0", "uptime": 3600}
Metrics
Query runtime metrics:
curl -H "Authorization: Bearer grip_your_token" \
http://localhost:18800/api/v1/metrics
Returns:
- Request counts
- Token usage
- Session counts
- Tool execution stats
- Error rates
Logging
Grip logs to stdout/stderr by default, so the standard Docker log commands apply:
# View logs
docker logs grip-production
# Follow logs
docker logs -f grip-production
# Filter by level
docker logs grip-production 2>&1 | grep ERROR
OpenTelemetry (Optional)
Enable tracing for observability:
# Install with observability extra
uv sync --extra observe
# Configure OTEL endpoint
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
Backup & Recovery
What to Back Up
# Configuration
~/.grip/config.json
# Session data
~/.grip/sessions/
# Workspace files
~/.grip/workspace/
# Trust decisions
~/.grip/workspace/state/trusted_dirs.json
# Memory & history
~/.grip/workspace/MEMORY.md
~/.grip/workspace/HISTORY.md
Backup Script
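#!/bin/bash
set -e
BACKUP_ROOT="/backups/grip"
STAMP="$(date +%Y%m%d-%H%M%S)"
BACKUP_DIR="$BACKUP_ROOT/$STAMP"
mkdir -p "$BACKUP_DIR"
# Stop container gracefully; the trap restarts it even if the copy fails
docker stop grip-production
trap 'docker start grip-production' EXIT
# Copy data, preserving ownership and permissions
# (Compose may prefix the volume name with the project name; adjust the path if so)
sudo cp -a /var/lib/docker/volumes/grip-data/_data "$BACKUP_DIR/"
# Start container as soon as the copy is done
docker start grip-production
trap - EXIT
# Compress backup with paths relative to $BACKUP_ROOT (archive root is $STAMP/_data)
sudo tar -czf "$BACKUP_DIR.tar.gz" -C "$BACKUP_ROOT" "$STAMP"
sudo rm -rf "$BACKUP_DIR"
echo "Backup complete: $BACKUP_DIR.tar.gz"
Restore
# Stop container
docker stop grip-production
# Extract backup (creates 20260228-120000/_data in the current directory)
sudo tar -xzf /backups/grip/20260228-120000.tar.gz
# Restore volume
sudo rm -rf /var/lib/docker/volumes/grip-data/_data
sudo cp -a 20260228-120000/_data /var/lib/docker/volumes/grip-data/_data
# Fix permissions
sudo chown -R 1000:1000 /var/lib/docker/volumes/grip-data/_data
# Start container
docker start grip-production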
Updates
Update Strategy
1. Pull the latest image.
2. Stop the current container:
docker stop grip-production
3. Back up data (see Backup & Recovery above).
4. Start the new container.
5. Verify health:
docker logs grip-production
curl http://localhost:18800/health
Rolling Updates
For zero-downtime updates, run multiple instances behind a load balancer:
services:
  grip-1:
    # ... config ...
  grip-2:
    # ... config ...
  nginx:
    image: nginx:alpine
    ports:
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - grip-1
      - grip-2
Update one instance at a time:
docker-compose up -d --no-deps grip-1
# Wait for health check
docker-compose up -d --no-deps grip-2
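The "wait for health check" step between the two commands can be automated with a small polling loop. The sketch below is one possible approach, not part of Grip: `is_healthy` and `wait_until_healthy` are hypothetical helpers, the `{"status": "ok"}` body is the public /health response shown earlier, and the URL you poll for each instance depends on how its port is published.

```python
import json
import time
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if url answers with a JSON body whose status is "ok"."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection and HTTP errors; ValueError covers bad JSON.
        return False
    return isinstance(body, dict) and body.get("status") == "ok"


def wait_until_healthy(check, attempts: int = 10, delay: float = 3.0, sleep=time.sleep) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    # Assumed URL; adjust to the instance that was just restarted.
    ok = wait_until_healthy(lambda: is_healthy("http://localhost:18800/health"))
    raise SystemExit(0 if ok else 1)
```

Because the script exits non-zero when the instance never becomes healthy, it can sit between the two docker-compose commands in a shell script and stop the rollout before the second instance is touched.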
Troubleshooting
Container Won’t Start
# Check logs
docker logs grip-production
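# Verify environment variables (docker inspect works even when the container is stopped)
docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' grip-production | grep GRIP_
# Validate config (requires a running container)
docker exec grip-production grip config show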
Permission Errors
# Fix volume permissions
sudo chown -R 1000:1000 ~/.grip
# Or in volume
sudo chown -R 1000:1000 /var/lib/docker/volumes/grip-data/_data
API Not Responding
# Check if service is listening
docker exec grip-production netstat -tlnp | grep 18800
# Test health endpoint
docker exec grip-production curl http://localhost:18800/health
# Check firewall
sudo ufw status
sudo iptables -L
High Memory Usage
# Check session count
curl -H "Authorization: Bearer grip_your_token" \
http://localhost:18800/api/v1/sessions | jq 'length'
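# Clear all stored sessions (irreversible; back up first).
# sh -c makes the * expand inside the container rather than on the host.
docker exec grip-production sh -c 'rm -rf /home/grip/.grip/sessions/*'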
# Restart container
docker restart grip-production
Security Checklist
Review this checklist before deploying to production:
Next Steps