Overview
The AWX capacity system determines how many jobs can run on an instance based on available memory and CPU resources. Capacity management ensures efficient resource utilization while preventing system overload.

Capacity Fundamentals
Capacity is calculated based on:
- Memory capacity (mem_capacity): Available system memory
- CPU capacity (cpu_capacity): Available CPU cores
- Forks: Number of simultaneous connections Ansible maintains
How Capacity Works
- Each instance has a calculated capacity based on hardware resources
- Jobs consume capacity based on their “impact” (primarily fork count)
- The task manager assigns jobs to instances with sufficient capacity
- When capacity is exhausted, jobs wait until resources free up
Capacity is not a zero-sum system. If only one instance is available, AWX allows jobs to run even if they exceed capacity, ensuring jobs don’t become permanently blocked.
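The assignment rule above can be sketched in a few lines; the tie-breaking policy and field names here are illustrative assumptions, not AWX's actual scheduler code:

```python
# Sketch of the assignment rule: run on an instance with enough free
# capacity; if none fits but only one instance exists, run anyway so the
# job is never permanently blocked.

def pick_instance(instances, job_impact):
    """Return the instance to run the job on, or None (the job waits)."""
    fits = [i for i in instances if i["capacity"] - i["consumed"] >= job_impact]
    if fits:
        # Illustrative tie-break: most remaining capacity wins.
        return max(fits, key=lambda i: i["capacity"] - i["consumed"])
    if len(instances) == 1:
        # Not zero-sum: a lone instance accepts the job even over capacity.
        return instances[0]
    return None
```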
Capacity Algorithms
Memory-Relative Capacity (Default)
Calculates capacity based on available memory, allowing CPU overcommit:
- Reserves 2GB for AWX services
- Default: 100MB per fork (SYSTEM_TASK_FORKS_MEM)
- Best for I/O-bound workloads
- Protects against out-of-memory conditions
CPU-Relative Capacity
Calculates capacity based on CPU cores:
- Default: 4 forks per core (SYSTEM_TASK_FORKS_CPU)
- Best for CPU-bound workloads
- Reduces contention for compute resources
Capacity Adjustment
Balance between memory and CPU capacity using capacity_adjustment:
- 0.0: Use minimum (most conservative)
- 0.5: 50/50 balance
- 1.0: Use maximum (most aggressive)
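Putting the two algorithms and the adjustment together, a minimal sketch (assuming a linear interpolation between minimum and maximum, consistent with the 0.0/0.5/1.0 semantics above; AWX's exact rounding may differ):

```python
# Sketch of instance capacity calculation using the documented defaults:
# 2GB reserved for AWX services, 100MB per fork (SYSTEM_TASK_FORKS_MEM),
# 4 forks per core (SYSTEM_TASK_FORKS_CPU).

def mem_capacity(total_mem_mb, mb_per_fork=100, reserve_mb=2048):
    """Memory-relative capacity: forks that fit after the 2GB reservation."""
    return max((total_mem_mb - reserve_mb) // mb_per_fork, 1)

def cpu_capacity(cpu_count, forks_per_cpu=4):
    """CPU-relative capacity: a fixed fork count per core."""
    return cpu_count * forks_per_cpu

def capacity(total_mem_mb, cpu_count, adjustment=1.0):
    """0.0 = minimum (conservative), 1.0 = maximum (aggressive)."""
    caps = (mem_capacity(total_mem_mb), cpu_capacity(cpu_count))
    lo, hi = min(caps), max(caps)
    return int(lo + adjustment * (hi - lo))

# A 4 vCPU / 8GB instance: CPU capacity 16, memory capacity 61.
print(capacity(8192, 4, adjustment=0.0))  # 16
print(capacity(8192, 4, adjustment=0.5))  # 38
print(capacity(8192, 4, adjustment=1.0))  # 61
```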
Job Impact
Job Types and Impact
Jobs have two impact types:

Control Impact
- Fixed: AWX_CONTROL_NODE_TASK_IMPACT (default: 1)
- Applied to: The instance controlling the job
Execution Impact
Variable: Based on job type

| Job Type | Execution Impact | Formula |
|---|---|---|
| Job Templates | forks + 1 | min(forks, host_count) + 1 |
| Ad-hoc Commands | forks + 1 | min(forks, host_count) + 1 |
| Project Updates | 1 | Fixed |
| Inventory Updates | 1 | Fixed |
| System Jobs | 5 | Fixed |
The +1 accounts for the Ansible parent process that coordinates execution.

Impact Examples
Example 1: Hybrid Node (Control + Execution)
Settings: AWX_CONTROL_NODE_TASK_IMPACT=1, forks=5, hosts=3

Example 2: Container Group Job
Settings: AWX_CONTROL_NODE_TASK_IMPACT=1

Example 3: Project Update
Settings: AWX_CONTROL_NODE_TASK_IMPACT=1
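The three examples work out as follows under the table's formulas. The helper function is a sketch of that arithmetic, not AWX's internal code:

```python
# Sketch of the impact formulas from the table; names are illustrative.

def execution_impact(job_type, forks=0, host_count=0):
    if job_type in ("job_template", "ad_hoc"):
        return min(forks, host_count) + 1  # +1 for the Ansible parent process
    if job_type in ("project_update", "inventory_update"):
        return 1
    if job_type == "system_job":
        return 5
    raise ValueError(f"unknown job type: {job_type}")

CONTROL_IMPACT = 1  # AWX_CONTROL_NODE_TASK_IMPACT default

# Example 1: a hybrid node carries both control and execution impact.
hybrid_total = CONTROL_IMPACT + execution_impact("job_template", forks=5, host_count=3)

# Example 2: container group job; only control impact lands on the control
# plane (execution happens in the pod).
control_plane_total = CONTROL_IMPACT

# Example 3: project update (fixed execution impact of 1).
project_total = CONTROL_IMPACT + execution_impact("project_update")

print(hybrid_total, control_plane_total, project_total)  # 5 1 2
```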
Control Node Task Impact
The AWX_CONTROL_NODE_TASK_IMPACT setting controls how much capacity controlling a job consumes.
When to Adjust
Increase (AWX_CONTROL_NODE_TASK_IMPACT = 2 or higher):
- Control plane CPU/memory usage is high
- Many concurrent container group jobs
- Job event processing is slow
- Need to throttle concurrent jobs
Decrease (AWX_CONTROL_NODE_TASK_IMPACT = 0.5 or lower):
- Control plane is underutilized
- Most jobs run on execution nodes
- Want more concurrent job control
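The throttling effect is simple arithmetic: with a fixed control-plane capacity, each controlled job consumes the configured impact, so the concurrency ceiling is capacity divided by impact. A sketch (the function name is mine):

```python
# With a fixed control-plane capacity, each job a node controls consumes
# AWX_CONTROL_NODE_TASK_IMPACT capacity, so the concurrency ceiling is
# capacity / impact. Illustrative arithmetic, not AWX code.

def max_controlled_jobs(control_capacity, task_impact):
    return int(control_capacity / task_impact)

print(max_controlled_jobs(100, 1))    # 100
print(max_controlled_jobs(100, 2))    # 50  (raising impact throttles jobs)
print(max_controlled_jobs(100, 0.5))  # 200 (lowering it allows more)
```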
Instance Groups
Instance Group Capacity
Instance groups aggregate capacity from member instances. Configure group-wide limits:

max_concurrent_jobs
Maximum concurrent jobs across the entire group.

max_forks
Maximum total forks across the entire group.

Container Group Capacity Planning
Calculate max_concurrent_jobs
Based on pod resource requests.

Calculate max_forks
Based on Ansible memory usage (100MB per fork). For example, max_forks=81 allows:
- 81 jobs with 1 fork each, OR
- 40 jobs with 2 forks each, OR
- 2 jobs with 40 forks each
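The arithmetic behind these limits can be sketched as follows; the 8100MB allocatable figure and the 1GB pod request are example assumptions, not recommendations:

```python
# Sizing arithmetic for a container group, per the guidelines above.

def max_concurrent_jobs(allocatable_mem_mb, pod_request_mb):
    """Each automation pod holds its memory request for the job's lifetime."""
    return allocatable_mem_mb // pod_request_mb

def max_forks(allocatable_mem_mb, mb_per_fork=100):
    """100MB per fork, per the guideline above."""
    return allocatable_mem_mb // mb_per_fork

print(max_concurrent_jobs(8100, 1024))  # 7
print(max_forks(8100))                  # 81
```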
Capacity Monitoring
Check Instance Capacity
Monitor Running Jobs
Instance Group Status
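A short script can pull these numbers from the REST API. The /api/v2/instances/ endpoint and its capacity and consumed_capacity fields exist in AWX's API, though the summary format here is my own; running jobs and group totals can be fetched the same way from /api/v2/jobs/?status=running and /api/v2/instance_groups/:

```python
# Summarize instance capacity from the AWX REST API. Auth and error
# handling are omitted; adapt the base URL and token to your deployment.
import json
import urllib.request

def fetch_instances(base_url, token):
    req = urllib.request.Request(
        f"{base_url}/api/v2/instances/",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"]

def summarize(instances):
    """One line per instance: consumed/total and remaining capacity."""
    return [
        f'{i["hostname"]}: {i["consumed_capacity"]}/{i["capacity"]} used, '
        f'{i["capacity"] - i["consumed_capacity"]} free'
        for i in instances
    ]

# Works on a saved API response just as well as a live one:
sample = [{"hostname": "awx-1", "capacity": 61, "consumed_capacity": 12}]
print(summarize(sample))  # ['awx-1: 12/61 used, 49 free']
```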
Capacity Optimization
Recommendations by Workload
I/O-Bound Workloads
(Network operations, cloud APIs, service calls): favor memory-relative capacity (capacity_adjustment near 1.0)

CPU-Bound Workloads
(Computation, template rendering, encryption): favor CPU-relative capacity (capacity_adjustment near 0.0)

Mixed Workloads
Balance memory and CPU capacity (capacity_adjustment around 0.5)
Instance Sizing Guidelines
| Instance Type | vCPU | Memory | Est. Capacity | Workload |
|---|---|---|---|---|
| Small | 2 | 4 GB | ~8-20 forks | Dev/test |
| Medium | 4 | 8 GB | ~24-60 forks | Light production |
| Large | 8 | 16 GB | ~56-140 forks | Production |
| XLarge | 16 | 32 GB | ~120-300 forks | Heavy production |
Dedicated Instance Groups
Create dedicated groups for specific workloads.

Troubleshooting
Jobs Stuck in Pending
Cause: Insufficient capacity

Solutions:
- Add more instances to the group
- Increase instance capacity (add memory/CPU)
- Adjust capacity_adjustment to 1.0
- Reduce job fork counts
- Add fallback instance groups
Capacity Calculations Seem Wrong
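A quick check is to recompute the expected value from the documented defaults (2GB reserve, 100MB per fork, 4 forks per core) and compare it with what the API reports; a mismatch usually means capacity_adjustment or a SYSTEM_TASK_FORKS_* setting was changed from its default:

```python
# Recompute expected capacity from hardware specs using the documented
# defaults, then compare with the value the API reports for the instance.

def expected_capacity(mem_mb, cpus, adjustment=1.0):
    mem_cap = (mem_mb - 2048) // 100  # memory-relative
    cpu_cap = cpus * 4                # CPU-relative
    lo, hi = min(mem_cap, cpu_cap), max(mem_cap, cpu_cap)
    return int(lo + adjustment * (hi - lo))

# A 2 vCPU / 4GB instance should report between 8 and 20:
print(expected_capacity(4096, 2, adjustment=0.0))  # 8
print(expected_capacity(4096, 2, adjustment=1.0))  # 20
```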
High Memory Usage
Symptoms: Jobs failing with OOM, system slowness

Solutions:
- Increase SYSTEM_TASK_FORKS_MEM (more conservative; fewer forks fit per instance)
- Use CPU capacity (capacity_adjustment=0.0)
- Reduce JOB_EVENT_WORKERS
- Limit concurrent jobs on instance group
- Add more memory to instances
High CPU Usage
Symptoms: Slow job execution, high load average

Solutions:
- Reduce SYSTEM_TASK_FORKS_CPU
- Use CPU capacity (capacity_adjustment=0.0)
- Add more CPU cores
- Reduce job forks in templates
- Use dedicated execution nodes
Best Practices
- Monitor continuously: Track capacity metrics and adjust as needed
- Start conservative: Begin with lower capacity and increase gradually
- Separate workloads: Use dedicated instance groups for different job types
- Test under load: Simulate production load before going live
- Document changes: Keep records of capacity adjustments and their effects
- Plan for growth: Size instances with 20-30% headroom
- Use execution nodes: Offload job execution from control plane
- Throttle container groups: Set appropriate max_concurrent_jobs limits
Related Resources
- Configuration - Configure capacity settings
- Metrics - Monitor capacity usage
- Clustering - Instance groups and clustering