Linux Performance Optimization

Introduction to Performance Optimization

Linux performance optimization is a systematic approach to identifying bottlenecks and improving system efficiency. Performance issues can stem from CPU, memory, disk I/O, or network resources.

Performance Fundamentals

Performance optimization involves:

Selecting metrics to evaluate system performance
Setting performance goals for applications and systems
Performing baseline tests to establish benchmarks
Analyzing performance to locate bottlenecks
Optimizing system and application configurations
Monitoring and alerting for ongoing performance issues

Key Performance Indicators

Two core metrics from the application's perspective:

Throughput - How many requests the system can handle per unit of time
Latency - How long the system takes to respond to a request

From the system resource perspective:

Utilization - Percentage of resource capacity used
Saturation - Degree of resource overload
Errors - Number of error events

Understanding Average Load

What is Average Load?

Average load is the average number of processes in the runnable or uninterruptible state over a period of time. It counts processes, so it is related to CPU utilization but is not the same measurement.

Runnable state (R): Process using CPU or waiting for CPU
Uninterruptible state (D): Process in critical kernel operations (usually I/O)

# Check system load
uptime
02:34:03 up 2 days, 20:14, 1 user, load average: 0.63, 0.83, 0.88

The three numbers represent average load over:

Last 1 minute: 0.63
Last 5 minutes: 0.83
Last 15 minutes: 0.88
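The same three averages are exposed in /proc/loadavg as the first three fields. A minimal sketch of reading them in a script; the helper name parse_loadavg is hypothetical, and the sample line reuses the values from the uptime output above (the last two fields are made up for illustration).

```shell
# parse_loadavg: print the 1-, 5- and 15-minute averages from one
# /proc/loadavg-style line read on stdin (hypothetical helper).
parse_loadavg() {
    awk '{ printf "1min=%s 5min=%s 15min=%s\n", $1, $2, $3 }'
}

# Example line with the load values shown above.
echo "0.63 0.83 0.88 1/234 5678" | parse_loadavg
# 1min=0.63 5min=0.83 15min=0.88
```

On a live system the input would be the file itself: parse_loadavg < /proc/loadavg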

Interpreting Load Average

# Check logical CPU count ("model name" lines are not present on every architecture)
grep -c '^processor' /proc/cpuinfo

Ideal load: Load average equals CPU count (100% utilization)

Load interpretation (for a 2-CPU system):

Load = 2.0: Perfect utilization (100%)
Load = 1.0: 50% utilization
Load = 4.0: System overloaded (200%)

Rule of thumb: Investigate when load exceeds 70% of CPU count
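The interpretation above comes down to dividing the load by the CPU count. A small sketch of that arithmetic, assuming awk is available; load_pct and needs_attention are hypothetical helper names, and the 70% cutoff is the rule of thumb stated above.

```shell
# load_pct LOAD CPUS: print the load as a whole-number percentage of CPU count.
load_pct() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.0f\n", load / cpus * 100 }'
}

# needs_attention LOAD CPUS: succeed when load exceeds 70% of the CPU count.
needs_attention() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { exit !(load > 0.7 * cpus) }'
}

load_pct 4.0 2    # 200
load_pct 1.0 2    # 50
needs_attention 1.5 2 && echo "investigate"
```

The last line prints "investigate" because 1.5 is above 1.4, which is 70% of 2 CPUs.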

Load Trends

Stable load: All three values similar (0.63, 0.60, 0.65)
Decreasing load: 1-min < 15-min (0.63, 0.83, 1.20)
Increasing load: 1-min > 15-min (1.50, 0.83, 0.60)
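The trend check compares the 1-minute and 15-minute values. A sketch under one assumption: the text does not say how close "similar" is, so the default tolerance of 0.1 here is an arbitrary choice, and load_trend is a hypothetical helper name.

```shell
# load_trend ONE_MIN FIFTEEN_MIN [TOLERANCE]
# Prints increasing, decreasing, or stable. TOLERANCE defaults to 0.1,
# an assumed cutoff that is not defined by the text.
load_trend() {
    awk -v one="$1" -v fifteen="$2" -v tol="${3:-0.1}" 'BEGIN {
        diff = one - fifteen
        if (diff > tol)       print "increasing"
        else if (diff < -tol) print "decreasing"
        else                  print "stable"
    }'
}

load_trend 0.63 0.65   # stable
load_trend 0.63 1.20   # decreasing
load_trend 1.50 0.60   # increasing
```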

CPU Performance Analysis

CPU Context Switching

Context switching occurs when the CPU switches from one task to another, requiring:

Saving current task’s state (registers, program counter)
Loading next task’s state
Jumping to new execution point

Types of context switches:

Process context switch - Switching between different processes
Thread context switch - Switching between threads
Interrupt context switch - Handling hardware interrupts

Monitoring Context Switches

# System-wide context switches
vmstat 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 7005360  91564 818900    0    0     0     0   25   33  0  0 100  0  0

Key columns:

cs - Context switches per second
in - Interrupts per second
r - Runnable processes (run queue length)
b - Blocked processes (uninterruptible sleep)
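For scripting, the four columns above can be pulled out of vmstat output by header name rather than by position. A minimal sketch, assuming the two-line header layout shown above; vmstat_cols is a hypothetical helper name.

```shell
# vmstat_cols: read vmstat output on stdin and print the r, b, in and cs
# values of each sample row, locating the columns by header name.
vmstat_cols() {
    awk '
        $1 == "r" && $2 == "b" {              # the column-name header line
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("cs" in col) && $1 ~ /^[0-9]+$/ {    # numeric sample rows only
            printf "r=%s b=%s in=%s cs=%s\n",
                $(col["r"]), $(col["b"]), $(col["in"]), $(col["cs"])
        }
    '
}
```

Typical use would be: vmstat 5 3 | vmstat_cols. Fed the sample output above, it prints r=0 b=0 in=25 cs=33.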

# Per-process context switches
pidstat -w 5
Linux 4.15.0 (ubuntu) _x86_64_ (2 CPU)
08:18:26  UID   PID   cswch/s nvcswch/s  Command
08:18:31    0     1      0.20      0.00  systemd
08:18:31    0     8      5.40      0.00  rcu_sched

cswch/s - Voluntary context switches (task blocked while waiting for a resource such as I/O)
nvcswch/s - Non-voluntary context switches (task preempted, for example when its time slice expired)
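The kernel also exposes cumulative totals of both counters per process in /proc/PID/status, in the voluntary_ctxt_switches and nonvoluntary_ctxt_switches fields. A sketch of reading them; ctxt_switches is a hypothetical helper name.

```shell
# ctxt_switches: read /proc/PID/status-style text on stdin and print the
# cumulative voluntary and non-voluntary context switch counts.
ctxt_switches() {
    awk -F':[ \t]*' '
        $1 == "voluntary_ctxt_switches"    { v = $2 }
        $1 == "nonvoluntary_ctxt_switches" { n = $2 }
        END { printf "voluntary=%s nonvoluntary=%s\n", v, n }
    '
}

# Counters for the current shell. These are totals since the process
# started, not per-second rates like the pidstat columns.
if [ -r "/proc/$$/status" ]; then
    ctxt_switches < "/proc/$$/status"
fi
```

Sampling the totals twice and dividing the difference by the interval gives a rate comparable to what pidstat -w reports.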

CPU Usage Scenarios

Scenario 1: CPU-Intensive Process

# Simulate CPU stress
stress --cpu 1 --timeout 600

# Monitor load
watch -d uptime

# Check CPU usage
mpstat -P ALL 5

# Find culprit process
pidstat -u 5 1

Symptoms: High CPU usage, load average increases, low iowait

Scenario 2: I/O-Intensive Process

# Simulate I/O stress (1 worker repeatedly calling sync())
stress -i 1 --timeout 600

# Monitor
mpstat -P ALL 5

Symptoms: High iowait, moderate CPU usage, increased load average

Scenario 3: Too Many Processes

# Simulate with 8 processes on 2 CPUs
stress -c 8 --timeout 600

# Check process wait times
pidstat -u 5 1

Symptoms: High %wait values, severe CPU contention, very high load

Performance Monitoring Tools

Essential Tools

top - Interactive Process Viewer

top

Key information:

CPU usage by process
Memory usage
Load average
Process states

Useful commands in top:

P - Sort by CPU usage
M - Sort by memory usage
k - Kill process
1 - Show individual CPU cores

htop - Enhanced Process Viewer

htop

Features:

Color-coded interface
Mouse support
Process tree view
Easy sorting and filtering

vmstat - Virtual Memory Statistics

# Update every 5 seconds
vmstat 5

# Show 10 samples
vmstat 5 10

Monitors:

Process statistics
Memory usage
Swap activity
I/O statistics
CPU usage

iostat - I/O Statistics

# CPU and device statistics
iostat -x 5

# Specific device
iostat -x sda 5

Key metrics:

%util - Device utilization
await - Average time for an I/O request to be served, including time spent in the queue (ms)
r/s, w/s - Read/write requests per second
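A quick way to spot busy devices is to filter iostat -x output on %util. This is a sketch, not a robust parser: busy_devices is a hypothetical helper, and it finds the %util column by header name because the column position differs between sysstat versions.

```shell
# busy_devices THRESHOLD: read `iostat -x` output on stdin and print the
# name and %util of every device at or above THRESHOLD percent.
busy_devices() {
    awk -v limit="$1" '
        /^Device/ {                      # header line: find the %util column
            util = 0
            for (i = 1; i <= NF; i++) if ($i == "%util") util = i
            next
        }
        util && NF >= util && $util + 0 >= limit + 0 { print $1, $util }
    '
}
```

Typical use would be: iostat -x 5 2 | busy_devices 80. The first report covers the time since boot, so later reports are the ones that reflect current activity.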

mpstat - Multi-Processor Statistics

# All CPUs, 5-second intervals
mpstat -P ALL 5

Metrics:

%usr - User space CPU usage
%sys - Kernel space CPU usage
%iowait - Waiting for I/O
%idle - Idle CPU
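These percentages are derived from the tick counters on the aggregate cpu line of /proc/stat (user, nice, system, idle, iowait, irq, softirq, steal), sampled twice. A simplified sketch of that calculation; cpu_pct is a hypothetical helper and it only approximates mpstat, which reports more categories.

```shell
# cpu_pct "BEFORE" "AFTER": each argument is an aggregate "cpu" line from
# /proc/stat. Prints usr, sys, iowait and idle as percentages of all ticks
# (fields 2-9: user nice system idle iowait irq softirq steal) in between.
cpu_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= NF; i++) a[i] = $i; next }
        {
            for (i = 2; i <= 9; i++) { d[i] = $i - a[i]; total += d[i] }
            if (total == 0) { print "no ticks elapsed"; exit 1 }
            printf "usr=%.1f sys=%.1f iowait=%.1f idle=%.1f\n",
                100 * d[2] / total, 100 * d[4] / total,
                100 * d[6] / total, 100 * d[5] / total
        }
    '
}

# Made-up sample: 1000 ticks elapsed, 100 user, 50 system, 700 idle, 150 iowait.
cpu_pct "cpu 100 0 50 800 50 0 0 0 0 0" "cpu 200 0 100 1500 200 0 0 0 0 0"
# usr=10.0 sys=5.0 iowait=15.0 idle=70.0
```

On a live system the two arguments would come from head -n 1 /proc/stat, taken a few seconds apart.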

pidstat - Process Statistics

# CPU usage
pidstat -u 5

# Memory usage
pidstat -r 5

# I/O statistics
pidstat -d 5

# Context switches
pidstat -w 5

# Threads
pidstat -t 5

Advanced Monitoring

sar - System Activity Reporter

# CPU usage
sar -u 5 10

# Memory usage
sar -r 5 10

# I/O statistics
sar -b 5 10

# Network statistics
sar -n DEV 5 10

glances - All-in-One Monitor

glances

Shows comprehensive system information:

CPU, memory, disk, network
Process list
Sensors and temperatures
Docker containers

Memory Performance

Memory Analysis

# Memory overview
free -h
              total        used        free      shared  buff/cache   available
Mem:           7.7G        2.1G        3.2G        123M        2.4G        5.2G
Swap:          2.0G          0B        2.0G

Key metrics:

used - Memory used by applications
free - Completely unused memory
buff/cache - Buffer and cache memory (reclaimable)
available - Estimate of memory available for starting new applications without swapping (includes reclaimable cache)
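The available figure is usually the one to alert on, since free alone ignores reclaimable cache. A sketch that expresses it as a percentage using the MemTotal and MemAvailable fields of /proc/meminfo; mem_available_pct is a hypothetical helper name.

```shell
# mem_available_pct: read /proc/meminfo-style text on stdin and print
# MemAvailable as a whole-number percentage of MemTotal.
mem_available_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%.0f\n", 100 * avail / total
            else { print "MemTotal not found"; exit 1 }
        }
    '
}

# Made-up sample values in kB.
printf 'MemTotal: 8000000 kB\nMemFree: 3000000 kB\nMemAvailable: 5200000 kB\n' | mem_available_pct
# 65
```

On a live system: mem_available_pct < /proc/meminfo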

Memory Monitoring

# Process memory usage
ps aux --sort=-%mem | head -10

# Detailed memory map
pmap -x PID

# Memory by process
top -o %MEM

Disk I/O Performance

Disk Space Analysis

# Filesystem usage
df -h

# Directory sizes
du -sh /var/log/*

# Find large files (stay on one filesystem, hide permission errors)
find / -xdev -type f -size +100M -exec ls -lh {} + 2>/dev/null

I/O Performance

# I/O statistics
iostat -x 5

# Per-process I/O
iotop

# Show I/O activity
pidstat -d 5

Key metrics:

tps - Transfers (I/O requests issued to the device) per second
kB/s - Kilobytes read/written per second
await - Average time for an I/O request to be served, including time spent in the queue (ms)
%util - Device utilization percentage

Network Performance

Network Monitoring

# Network interfaces statistics
ip -s link

# Active connections
netstat -antp

# Socket statistics
ss -s

# Monitor traffic
iftop
nload

# Bandwidth usage by process
nethogs

Network Testing

# Test latency
ping -c 10 example.com

# Trace route
traceroute example.com

# DNS lookup
dig example.com
nslookup example.com

# Port connectivity
telnet example.com 80
nc -zv example.com 80

Performance Tuning

CPU Optimization

Reduce context switches

Decrease number of threads
Optimize I/O operations
Use asynchronous I/O

Process priority

# Run with lower priority
nice -n 10 command

# Change running process priority
renice -n 5 -p PID

CPU affinity

# Bind process to specific CPUs
taskset -c 0,1 command

# Move running process
taskset -cp 0,1 PID

Memory Optimization

Clear caches (use with caution)

# Run as root; sync first so dirty pages are written back and can be dropped
sync

# Clear page cache
echo 1 > /proc/sys/vm/drop_caches

# Clear dentries and inodes
echo 2 > /proc/sys/vm/drop_caches

# Clear all
echo 3 > /proc/sys/vm/drop_caches

Swap management

# Check swap usage
swapon --show

# Adjust swappiness (0-100, or up to 200 on Linux 5.8 and later; lower = less swapping)
sysctl vm.swappiness=10

# Make permanent
echo "vm.swappiness=10" >> /etc/sysctl.conf

I/O Optimization

I/O scheduler

# Check current scheduler
cat /sys/block/sda/queue/scheduler

# Change scheduler (use a name listed by the command above, e.g. mq-deadline on multi-queue kernels)
echo deadline > /sys/block/sda/queue/scheduler

Read-ahead optimization

# Check current value
blockdev --getra /dev/sda

# Increase read-ahead (value is in 512-byte sectors, so 2048 = 1 MiB)
blockdev --setra 2048 /dev/sda
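blockdev reports and sets read-ahead in 512-byte sectors, which is easy to misread as kilobytes. A small sketch of the conversion; ra_to_kib is a hypothetical helper name.

```shell
# ra_to_kib SECTORS: convert a blockdev read-ahead value
# (512-byte sectors) to KiB.
ra_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

ra_to_kib 256    # 128
ra_to_kib 2048   # 1024, i.e. 1 MiB
```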

Troubleshooting Workflow

Step 1: Identify Symptoms

# Quick overview
uptime
top
free -h
df -h

Step 2: Narrow Down

# Is it CPU?
mpstat -P ALL 5

# Is it memory?
vmstat 5

# Is it disk?
iostat -x 5

# Is it network?
sar -n DEV 5

Step 3: Identify Process

# Find CPU hog
top -o %CPU
pidstat -u 5

# Find memory hog
top -o %MEM
pidstat -r 5

# Find I/O hog
iotop
pidstat -d 5

Step 4: Deep Dive

# Process details (select by PID directly instead of grepping the full list)
ps -fp PID
lsof -p PID
cat /proc/PID/status

# System calls
strace -p PID

# Library calls
ltrace -p PID

Best Practices

Establish baselines - Know your normal performance metrics
Monitor trends - Use time-series data to spot problems early
Test changes - Always benchmark before and after optimizations
Document everything - Keep records of changes and their effects
Automate monitoring - Set up alerts for critical thresholds
Start simple - Use basic tools before moving to advanced ones
Fix bottlenecks - Optimize the slowest component first
Measure impact - Verify that optimizations actually help

Common Performance Issues

High Load Average

Causes:

CPU-intensive processes
I/O bottlenecks
Too many concurrent processes
Insufficient resources

Investigation:

uptime
mpstat -P ALL 5
pidstat -u 5
iostat -x 5

High Memory Usage

Causes:

Memory leaks
Insufficient memory
Large caches
Too many processes

Investigation:

free -h
top -o %MEM
pidstat -r 5

Slow Disk I/O

Causes:

Disk saturation
Wrong I/O scheduler
Insufficient IOPS
Filesystem issues

Investigation:

iostat -x 5
iotop
lsof | grep deleted

Network Bottlenecks

Causes:

Bandwidth saturation
High latency
Packet loss
DNS issues

Investigation:

ping -c 100 target
mtr target
iftop
netstat -s
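To track packet loss over time, the percentage can be extracted from the ping summary line. A sketch that assumes the iputils summary format ("N packets transmitted, M received, X% packet loss"); other ping implementations may word it differently, and ping_loss is a hypothetical helper name.

```shell
# ping_loss: read ping output on stdin and print the packet loss
# percentage from the summary line (iputils format assumed).
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Made-up summary line for illustration.
echo "100 packets transmitted, 97 received, 3% packet loss, time 99123ms" | ping_loss
# 3
```

Typical use would be: ping -c 100 target | ping_loss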

Performance Analysis Checklist

Conclusion

Performance optimization is an iterative process:

Measure - Gather performance metrics
Analyze - Identify bottlenecks
Optimize - Make targeted improvements
Verify - Confirm improvements
Repeat - Continue optimizing

Remember: Premature optimization is the root of all evil. Always measure before optimizing, and focus on real bottlenecks, not theoretical ones.
