System overview
The agent package (internal/agent/) orchestrates intelligent context gathering through several specialized subsystems.
Investigation flow
When you run clanker ask "what lambda functions are failing?", the agent follows this execution path:
Semantic analysis
The semantic.Analyzer performs keyword-based intent classification without external NLP calls. It extracts:
- Primary intent (troubleshoot, monitor, analyze)
- Urgency level (critical, high, medium, low)
- Target services (lambda, ecs, s3, etc.)
- Time frame (recent, last_hour, last_day)
- Data types (logs, metrics, status)
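As a rough illustration of keyword-based classification, the sketch below matches a lowercased query against small keyword tables. The Intent struct, the keyword lists, and the Classify function are hypothetical and are not taken from semantic.Analyzer; only the category names come from the list above.

```go
package main

import (
	"fmt"
	"strings"
)

// Intent is a hypothetical result type; the real analyzer output may differ.
type Intent struct {
	Primary  string
	Urgency  string
	Services []string
}

// keywordRule maps a label to the keywords that trigger it.
type keywordRule struct {
	label    string
	keywords []string
}

// Illustrative keyword tables, not the ones shipped with clanker.
var (
	intentRules = []keywordRule{
		{"troubleshoot", []string{"failing", "error", "broken", "crash"}},
		{"monitor", []string{"status", "health", "watch"}},
		{"analyze", []string{"why", "trend", "compare"}},
	}
	urgencyRules = []keywordRule{
		{"critical", []string{"outage", "down"}},
		{"high", []string{"failing", "urgent"}},
	}
	serviceNames = []string{"lambda", "ecs", "s3"}
)

// firstMatch returns the label of the first rule with a keyword found in q,
// or fallback when no rule matches.
func firstMatch(q string, rules []keywordRule, fallback string) string {
	for _, r := range rules {
		for _, kw := range r.keywords {
			if strings.Contains(q, kw) {
				return r.label
			}
		}
	}
	return fallback
}

// Classify uses substring matching only; no external NLP service is called.
func Classify(query string) Intent {
	q := strings.ToLower(query)
	in := Intent{
		Primary: firstMatch(q, intentRules, "analyze"),
		Urgency: firstMatch(q, urgencyRules, "medium"),
	}
	for _, s := range serviceNames {
		if strings.Contains(q, s) {
			in.Services = append(in.Services, s)
		}
	}
	return in
}

func main() {
	in := Classify("what lambda functions are failing?")
	fmt.Printf("%s %s %v\n", in.Primary, in.Urgency, in.Services)
	// prints: troubleshoot high [lambda]
}
```

Plain substring matching keeps classification fast and offline, at the cost of occasional false positives when a keyword appears inside an unrelated word.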
See internal/agent/semantic/analyzer.go:27
Decision tree traversal
The decision tree maps semantic intent to concrete agent types and execution parameters. Nodes are evaluated depth-first. Matching conditions spawn their configured agent types.
See internal/agent/decisiontree/tree.go:31
Dependency scheduling
The DependencyScheduler groups agents by execution order:
- Order 1: Independent collectors (log, metrics, k8s)
- Order 2: Infrastructure agents requiring basic data (infrastructure, deployment)
- Order 3: Analysis agents requiring enriched data (security, queue)
- Order 4+: Higher-order insights (cost, availability)
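The traversal and ordering steps can be sketched together. This is a minimal sketch, assuming a hypothetical Node type with a Match predicate and assuming that children of a non-matching node are skipped; the real tree and scheduler may behave differently. The order table reuses the agent names from the list above.

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a hypothetical decision-tree node, not the type used in clanker.
type Node struct {
	Match    func(services map[string]bool) bool // condition for this node
	Agents   []string                            // agent types to spawn on match
	Children []*Node
}

// Collect walks the tree depth-first and appends the agent types of every
// matching node. Assumption: children of a non-matching node are skipped.
func Collect(n *Node, services map[string]bool, out *[]string) {
	if n == nil || !n.Match(services) {
		return
	}
	*out = append(*out, n.Agents...)
	for _, c := range n.Children {
		Collect(c, services, out)
	}
}

// executionOrder mirrors the order groups listed above.
// Agent types missing from the table fall into order 0.
var executionOrder = map[string]int{
	"log": 1, "metrics": 1, "k8s": 1,
	"infrastructure": 2, "deployment": 2,
	"security": 3, "queue": 3,
	"cost": 4, "availability": 4,
}

// GroupByOrder buckets agent types by execution order, dropping duplicates,
// and returns the orders sorted ascending so groups can run one after another.
func GroupByOrder(agents []string) (map[int][]string, []int) {
	groups := make(map[int][]string)
	seen := make(map[string]bool)
	for _, a := range agents {
		if seen[a] {
			continue
		}
		seen[a] = true
		o := executionOrder[a]
		groups[o] = append(groups[o], a)
	}
	orders := make([]int, 0, len(groups))
	for o := range groups {
		orders = append(orders, o)
	}
	sort.Ints(orders)
	return groups, orders
}

// exampleTree builds a tiny illustrative tree.
func exampleTree() *Node {
	always := func(map[string]bool) bool { return true }
	has := func(s string) func(map[string]bool) bool {
		return func(m map[string]bool) bool { return m[s] }
	}
	return &Node{
		Match:  always,
		Agents: []string{"log"},
		Children: []*Node{
			{Match: has("lambda"), Agents: []string{"metrics", "infrastructure"}},
			{Match: has("k8s"), Agents: []string{"k8s"}},
		},
	}
}

func main() {
	var agents []string
	Collect(exampleTree(), map[string]bool{"lambda": true}, &agents)
	groups, orders := GroupByOrder(agents)
	for _, o := range orders {
		fmt.Printf("order %d: %v\n", o, groups[o])
	}
	// prints:
	// order 1: [log metrics]
	// order 2: [infrastructure]
}
```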
See internal/agent/coordinator/agent_types.go:5
Parallel execution
Within each order group, agents run concurrently. Each agent:
- Copies the main context
- Executes AWS operations (CLI calls or SDK methods)
- Stores results in its local Results map
- Publishes promised data to the SharedDataBus
See internal/agent/coordinator/coordinator.go:63
Result aggregation
The coordinator merges all successful agent outputs. Metadata includes execution counts, timestamps, and the decision path.
See internal/agent/coordinator/coordinator.go:138
Context building
The final context string merges:
- Semantic analysis summary
- All parallel agent results (grouped by agent type)
- Service-specific log analysis
- Error patterns and metrics
- Agent reasoning chain (chain of thought)
See internal/agent/agent.go:306
Core components
Agent orchestrator
The Agent type in agent.go wires everything together:
- Run semantic analysis
- Traverse decision tree via coordinator
- Spawn parallel agents or fall back to sequential planner
- Build final context for LLM
See internal/agent/agent.go:42
Coordinator
The Coordinator drives dependency-tree-based parallel execution:
- Analyze(query string): traverse the decision tree
- SpawnAgents(ctx, applicable): launch agents by dependency order
- WaitForCompletion(ctx, timeout): block until all agents finish
- AggregateResults(): merge successful outputs
- Stats(): execution metrics
See internal/agent/coordinator/coordinator.go:34
Shared data bus
The SharedDataBus stores dependency data produced by agents. Each agent publishes the keys named in its ProvidedData. Downstream agents check RequiredData before executing.
See internal/agent/coordinator/state.go:10
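A bus of this kind can be sketched as a map guarded by a read-write mutex. The type layout, the Store, Has, and Ready methods, and the AgentSpec struct are assumptions made for illustration; only the names SharedDataBus, ProvidedData, and RequiredData come from the text.

```go
package main

import (
	"fmt"
	"sync"
)

// SharedDataBus here is an illustrative stand-in, not the clanker type.
type SharedDataBus struct {
	mu   sync.RWMutex
	data map[string]any
}

func NewSharedDataBus() *SharedDataBus {
	return &SharedDataBus{data: make(map[string]any)}
}

// Store publishes a value under key.
func (b *SharedDataBus) Store(key string, v any) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.data[key] = v
}

// Has reports whether key has been published.
func (b *SharedDataBus) Has(key string) bool {
	b.mu.RLock()
	defer b.mu.RUnlock()
	_, ok := b.data[key]
	return ok
}

// AgentSpec is a hypothetical description of an agent's data contract.
type AgentSpec struct {
	Type         string
	RequiredData []string
	ProvidedData []string
}

// Ready reports whether every key in spec.RequiredData is on the bus.
func (b *SharedDataBus) Ready(spec AgentSpec) bool {
	for _, key := range spec.RequiredData {
		if !b.Has(key) {
			return false
		}
	}
	return true
}

func main() {
	bus := NewSharedDataBus()
	security := AgentSpec{
		Type:         "security",
		RequiredData: []string{"logs", "service_config"},
		ProvidedData: []string{"security_status"},
	}

	fmt.Println(bus.Ready(security)) // false: nothing published yet
	bus.Store("logs", []string{"AccessDenied"})
	fmt.Println(bus.Ready(security)) // false: service_config still missing
	bus.Store("service_config", map[string]string{"runtime": "go"})
	fmt.Println(bus.Ready(security)) // true
}
```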
Agent registry
The AgentRegistry tracks running agents and maintains counters:
- Register(agent): add agent and increment total
- MarkCompleted() / MarkFailed(): update counters
- Agents(): snapshot of all agents
- Stats(): execution summary
See internal/agent/coordinator/state.go:58
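The registry's behaviour can be approximated with a mutex-protected slice and three counters. The method names follow the list above, but the signatures, the Agent and Stats types, and the field layout are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// Agent is a placeholder for whatever the registry actually tracks.
type Agent struct {
	ID string
}

// Stats is a hypothetical execution summary.
type Stats struct {
	Total, Completed, Failed int
}

// AgentRegistry is an illustrative sketch, safe for concurrent use.
type AgentRegistry struct {
	mu     sync.Mutex
	agents []*Agent
	stats  Stats
}

// Register adds an agent and increments the total.
func (r *AgentRegistry) Register(a *Agent) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.agents = append(r.agents, a)
	r.stats.Total++
}

// MarkCompleted increments the completed counter.
func (r *AgentRegistry) MarkCompleted() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.stats.Completed++
}

// MarkFailed increments the failed counter.
func (r *AgentRegistry) MarkFailed() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.stats.Failed++
}

// Agents returns a copy of the slice so callers cannot mutate registry state.
func (r *AgentRegistry) Agents() []*Agent {
	r.mu.Lock()
	defer r.mu.Unlock()
	return append([]*Agent(nil), r.agents...)
}

// Stats returns a copy of the counters.
func (r *AgentRegistry) Stats() Stats {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.stats
}

func main() {
	reg := &AgentRegistry{}
	reg.Register(&Agent{ID: "log-1"})
	reg.Register(&Agent{ID: "metrics-1"})
	reg.MarkCompleted()
	reg.MarkFailed()
	fmt.Printf("%+v\n", reg.Stats()) // prints: {Total:2 Completed:1 Failed:1}
}
```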
Agent types
Clanker includes these built-in specialist agents:
Log agent
Execution order: 1 (independent)
Provides: logs, error_patterns, log_metrics
Operations:
- Discover relevant log groups
- Sample recent log entries
- Filter error patterns
- Extract log stream metadata
See internal/agent/coordinator/agent_types.go:28
Metrics agent
Execution order: 1 (independent)
Provides: metrics, performance_data, thresholds
Operations:
- Query CloudWatch metrics
- Check alarm states
- Aggregate performance data
See internal/agent/coordinator/agent_types.go:36
Infrastructure agent
Execution order: 2
Provides: service_config, deployment_status, resource_health
Operations:
- List EC2, ECS, Lambda resources
- Describe service configurations
- Check deployment status
See internal/agent/coordinator/agent_types.go:44
Security agent
Execution order: 3
Requires: logs, service_config
Provides: security_status, access_patterns, vulnerabilities
Operations:
- Analyze IAM policies
- Check security group rules
- Audit access logs
See internal/agent/coordinator/agent_types.go:52
Cost agent
Execution order: 4
Requires: metrics, resource_health
Provides: cost_analysis, usage_patterns, optimization_suggestions
Operations:
- Query Cost Explorer
- Analyze resource utilization
- Generate optimization recommendations
See internal/agent/coordinator/agent_types.go:61
K8s agent
Execution order: 1 (independent)
Provides: k8s_resources, k8s_health
Operations:
- List pods, deployments, services
- Check resource status
- Gather cluster metrics
See internal/agent/coordinator/agent_types.go:20
Sequential fallback
When the decision tree returns no applicable nodes, the agent falls back to a traditional sequential approach:
- Calls an LLM decision function to determine the next action
- Executes the chosen action (gather logs, metrics, etc.)
- Repeats until complete or maxSteps is reached
See internal/agent/planner.go:13
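The loop above amounts to a bounded decide-and-execute cycle. In this sketch the Action type, the Done sentinel, and RunSequential are hypothetical, and the decision function is a scripted stand-in for the LLM call.

```go
package main

import "fmt"

// Action names the next step chosen by the decision function.
type Action string

// Done is a hypothetical sentinel meaning the investigation is complete.
const Done Action = "done"

// RunSequential asks decide for the next action, executes it, and repeats
// until decide returns Done or maxSteps actions have been executed.
func RunSequential(decide func(history []string) Action, execute func(Action) string, maxSteps int) []string {
	var history []string
	for step := 0; step < maxSteps; step++ {
		action := decide(history)
		if action == Done {
			break
		}
		history = append(history, execute(action))
	}
	return history
}

// scriptedDecide stands in for the LLM: logs first, then metrics, then stop.
func scriptedDecide(history []string) Action {
	switch len(history) {
	case 0:
		return "gather_logs"
	case 1:
		return "gather_metrics"
	default:
		return Done
	}
}

// fakeExecute records the action instead of calling AWS.
func fakeExecute(a Action) string {
	return "ran " + string(a)
}

func main() {
	fmt.Println(RunSequential(scriptedDecide, fakeExecute, 5))
	// prints: [ran gather_logs ran gather_metrics]
	fmt.Println(RunSequential(scriptedDecide, fakeExecute, 1))
	// prints: [ran gather_logs]
}
```

The second call shows the step cap cutting the loop short even though the decision function would have continued.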
Extending the system
Keep shared structs in internal/agent/model/ to avoid circular imports. Run gofmt after edits and ensure go build ./... stays green.
Performance considerations
Parallelism
Agents in the same execution order run concurrently, reducing total investigation time. Use --agent-trace to see lifecycle logs.
Timeouts
Each agent type has a WaitTimeout (typically 5-8 seconds). The coordinator waits up to 15 seconds for all agents to complete.
Dependency checks
Agents only execute when their dependencies are satisfied on the data bus. This prevents wasted work and ensures data consistency.
Graceful degradation
Failed agents don’t block the pipeline. The coordinator aggregates whatever data is available and proceeds with partial results.
Debugging agent execution
Enable detailed agent tracing with --agent-trace. The trace covers:
- Decision tree matches and priorities
- Execution order groups
- Agent start/completion events
- Dependency satisfaction checks
- Final aggregation stats
Tracing can also be set in .clanker.yaml. See internal/agent/coordinator/coordinator.go:17 for the trace flag check.
Related resources
- Debugging: Debug flags and trace output
- Backend API: Credential storage and multi-machine sync
- Custom profiles: AI provider configuration
- Ask command: Natural language queries