Administrative Tools
yb-admin
The yb-admin utility is the primary command-line tool for administering YugabyteDB clusters. It sends administrative RPCs to the yb-master and yb-tserver servers to perform operations.
Basic Syntax:
- --master_addresses: Comma-separated list of YB-Master hosts and ports (default: localhost:7100)
- --init_master_addrs: Single YB-Master address used to discover the rest of the quorum
- --timeout_ms: RPC timeout in milliseconds (default: 60000)
- --certs_dir_name: Directory with certificates for secure connections
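A minimal sketch of the general form, assuming a three-master quorum at placeholder addresses:

```shell
# General form: yb-admin --master_addresses <addresses> <command> [command_args]
yb-admin \
    --master_addresses node1:7100,node2:7100,node3:7100 \
    --timeout_ms 60000 \
    list_all_tablet_servers
```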
yb-ctl
Local cluster control utility for development and testing environments.
Cluster Management
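For example, a local development cluster can be created and managed like this (yb-ctl runs everything on a single host):

```shell
yb-ctl --rf 3 create   # create a local cluster with replication factor 3
yb-ctl status          # show the nodes and their web UI ports
yb-ctl stop            # stop all local nodes
yb-ctl start           # start them again
yb-ctl destroy         # tear down the cluster and delete its data
```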
List Cluster Resources
List all tablet servers with the list_all_tablet_servers command.
Tablet Management
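Tablets for a table can be listed with the list_tablets command; a sketch with placeholder addresses and a YSQL table name:

```shell
# The keyspace is <api>.<database> for YSQL, e.g. ysql.yugabyte
yb-admin \
    --master_addresses node1:7100,node2:7100,node3:7100 \
    list_tablets ysql.yugabyte my_table
```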
The list_tablets command reports each tablet's UUID, partition range, and current leader.
Master Configuration Changes
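Master quorum membership is changed with the change_master_config command; a sketch with placeholder addresses:

```shell
# Add a new master at node4:7100 to the quorum
yb-admin \
    --master_addresses node1:7100,node2:7100,node3:7100 \
    change_master_config ADD_SERVER node4 7100
```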
Add a new master with change_master_config ADD_SERVER <ip> <port>; REMOVE_SERVER removes one.
Tablet Server Management
Change a tablet's Raft configuration with the change_config command. How a returning node catches up depends on how long it was down:
- < 15 minutes: Node catches up through RPC calls
- > 15 minutes: Node goes through remote bootstrap
- For permanent removal: Use REMOVE_SERVER, then ADD_SERVER for replacement
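For a permanent replacement, the REMOVE_SERVER/ADD_SERVER sequence above can be sketched as follows (tablet and peer UUIDs are placeholders; exact arguments can vary by version):

```shell
MASTERS="node1:7100,node2:7100,node3:7100"   # placeholder master addresses
# Remove the failed peer from the tablet's Raft group...
yb-admin --master_addresses "$MASTERS" change_config <tablet_id> REMOVE_SERVER <old_peer_uuid>
# ...then add the replacement peer
yb-admin --master_addresses "$MASTERS" change_config <tablet_id> ADD_SERVER <new_peer_uuid>
```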
Table Operations
Table Information
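Tables can be enumerated with the list_tables command; a sketch with placeholder addresses:

```shell
yb-admin \
    --master_addresses node1:7100,node2:7100,node3:7100 \
    list_tables
```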
List all tables with list_tables.
Tablet Splitting
YugabyteDB automatically splits tablets based on size thresholds. Key configuration:
- tablet_split_size_threshold_bytes: Controls when tablets are split (propagated via YB-Master heartbeats)
The splitting process:
- YB-Master monitors tablet sizes via heartbeats
- Tablets exceeding threshold are marked for splitting
- YB-Master registers two new post-split tablets
- Split operation executes on leader tablet-peer
- Old tablet remains available until all replicas complete split
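The threshold is a yb-master flag set at startup; a sketch, assuming a 10 GiB threshold and placeholder addresses and paths:

```shell
# Set the split threshold (10 GiB here) when starting yb-master;
# the other flags shown are the usual required ones with placeholder values
yb-master \
    --master_addresses node1:7100,node2:7100,node3:7100 \
    --fs_data_dirs /mnt/d0 \
    --tablet_split_size_threshold_bytes=10737418240
```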
Diagnostic Tools
yb-ts-cli
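yb-ts-cli talks to one server at a time over its RPC port (9100 for tablet servers); typical calls with a placeholder address:

```shell
yb-ts-cli --server_address=127.0.0.1:9100 status        # server status and version
yb-ts-cli --server_address=127.0.0.1:9100 list_tablets  # tablets hosted on this server
```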
Tablet server diagnostic utility for inspecting and managing an individual tablet server.
Check for Failed Tablets
Use the helper script (yb-check-failed-tablets.sh) to identify failed tablets:
- Lists all tablet servers
- Checks tablet state on each server
- Identifies tablets in FAILED state
- Provides tombstone commands for cleanup
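A hypothetical sketch of that flow (the output parsing is an assumption and will vary by version; review anything it prints before running it):

```shell
MASTERS="node1:7100,node2:7100,node3:7100"   # placeholder master addresses
# Walk every tablet server and look for tablets reporting a FAILED state
yb-admin --master_addresses "$MASTERS" list_all_tablet_servers |
awk 'NR > 1 {print $2}' |                    # assumed column: host:port of each tserver
while read -r ts; do
  yb-ts-cli --server_address="$ts" list_tablets |
  grep FAILED |
  while read -r tablet _rest; do
    # Print the cleanup command for operator review rather than executing it
    echo "yb-ts-cli --server_address=$ts delete_tablet $tablet cleanup_failed_tablet"
  done
done
```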
File Locations
Standard Paths (YugabyteDB Anywhere installations)
Software and binaries:
Administrative Scripts
Log Cleanup
Manage disk space used by logs:
- --logs_disk_percent_max: Max percentage of disk for logs (default: 10%)
- --postgres_max_log_size: Max size for postgres logs in MB (default: 100MB)
- --cores_disk_percent_max: Max percentage for core dumps (default: 10%)
- --logs_purge_threshold: Threshold in GB before purging (default: 10GB)
- --gzip_only: Only compress files, don't purge
When run, the script:
- Compresses old log files
- Removes oldest compressed logs when threshold exceeded
- Manages core dump file retention
- Preserves the most recent log file
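A hypothetical cron entry (the script path is an assumption; the flag values shown are the defaults):

```shell
# Crontab line: run the cleanup hourly
0 * * * * /home/yugabyte/bin/log_cleanup.sh --logs_disk_percent_max 10 --logs_purge_threshold 10
```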
Bulk Load Operations
For bulk loading data into production clusters, the bulk load tooling:
- Copies generated SSTable files to the production cluster
- Distributes files to all tablet replicas
- Creates staging directories automatically
- Verifies tablet locations on remote servers
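A hypothetical sketch of the copy-and-stage step (host names, paths, and the staging layout are all assumptions, not the actual tool's interface):

```shell
TABLET_ID=<tablet_id>                    # placeholder tablet UUID
SSTS=./generated                         # directory holding the generated SSTable files
for replica in node1 node2 node3; do     # placeholder replica hosts
  ssh "$replica" "mkdir -p /tmp/staging/$TABLET_ID"       # create the staging directory
  scp "$SSTS"/*.sst "$replica:/tmp/staging/$TABLET_ID/"   # copy the SSTable files
done
```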
Cluster Control Script (ybcontrol.py)
For managing multi-node clusters:
- Remote cluster operations via SSH
- Start/stop services across nodes
- Version management and deployment
- Rolling operations support
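A hypothetical sketch of one such rolling operation (host names and service management are assumptions):

```shell
for host in node1 node2 node3; do                  # placeholder node list
  ssh "$host" "sudo systemctl restart yb-tserver"  # restart one node at a time
  sleep 60                                         # let it rejoin before moving on
done
```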
Security Considerations
Secure Connections
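With TLS enabled, pass the certificate directory on every call; a sketch with placeholder paths and addresses:

```shell
yb-admin \
    --master_addresses node1:7100,node2:7100,node3:7100 \
    --certs_dir_name /opt/yugabyte/certs \
    list_all_tablet_servers
```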
For TLS-enabled clusters, always specify certificates with the --certs_dir_name flag.
Decommissioning Nodes
Properly remove nodes to maintain cluster health:
- Mark node for decommissioning
- Allow cluster to re-replicate data
- Verify no tablets remain on node
- Remove from master configuration
- Shut down services
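The first three steps above map onto yb-admin's blacklist commands; a sketch with placeholder addresses:

```shell
MASTERS="node1:7100,node2:7100,node3:7100"
# Mark the tablet server for decommissioning
yb-admin --master_addresses "$MASTERS" change_blacklist ADD node4:9100
# Poll until the data move reports 100% complete
yb-admin --master_addresses "$MASTERS" get_load_move_completion
```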
Best Practices
Daily Operations
- Monitor cluster health via master UI (http://master-ip:7000)
- Check tablet server status (http://tserver-ip:9000)
- Review logs for warnings and errors
- Verify replication is up to date
- Monitor disk usage and run cleanup as needed
Scheduled Maintenance
- Log rotation: Run log_cleanup.sh via cron
- Tablet health checks: Monitor for FAILED tablets
- Backup verification: Ensure backups complete successfully
- Performance metrics: Track latency and throughput trends
- Certificate rotation: Update TLS certificates before expiry
Emergency Procedures
- Node failure: Allow automatic recovery (15 min grace period)
- Disk full: Run immediate log cleanup, then expand storage
- Failed tablets: Use yb-check-failed-tablets.sh to identify and tombstone
- Split brain: Check master quorum and network connectivity
- Performance degradation: Check slow query logs and compaction status
Next Steps
- Backup and Restore - Protect your data
- Performance Tuning - Optimize cluster performance
- Monitoring - Set up comprehensive monitoring
- Troubleshooting - Resolve common issues

