30 Essential Commands in Linux: Quick Guide for SREs

1. System Observability & Performance

| Command | Purpose | Real-World SRE Example |
| --- | --- | --- |
| `top` / `htop` | Process monitoring. | `htop` (to visually identify which CPU core is pinned at 100%). |
| `uptime` | System load average. | `uptime` (to check if the load average exceeds the number of CPU cores). |
| `vmstat` | Memory & CPU stats. | `vmstat 1 5` (print five reports at one-second intervals to detect heavy context switching or swapping). |
| `iostat` | Disk I/O utilization. | `iostat -xz 1` (monitor extended per-device statistics, including latency, every second; `-z` hides idle devices). |
| `free -h` | Available RAM & Swap. | `free -h` (quick check before deploying a high-memory container). |
| `sar` | Historical performance. | `sar -u -f /var/log/sa/sa10` (analyze CPU usage from the 10th of the month; this path is the RHEL-family default, while Debian/Ubuntu keep the files under `/var/log/sysstat/`). |
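
The `uptime` check above (load average versus core count) can be scripted. This is a minimal sketch, assuming a Linux host with `/proc/loadavg` and `nproc`; `check_load` is a hypothetical helper name, and "load greater than cores" is only a rule of thumb, since Linux load also counts tasks blocked on I/O.

```shell
#!/usr/bin/env bash
# check_load LOAD CORES: hypothetical helper that flags load above the core count.
check_load() {
    local load="$1" cores="$2"
    # awk does the floating-point comparison that plain shell arithmetic cannot.
    awk -v l="$load" -v c="$cores" 'BEGIN {
        state = (l + 0 > c + 0) ? "OVERLOADED" : "ok"
        printf "load=%.2f cores=%d per_core=%.2f %s\n", l, c, l / c, state
    }'
}

# Live values: the first field of /proc/loadavg is the 1-minute load average.
if [ -r /proc/loadavg ]; then
    check_load "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
fi
```

Calling `check_load 8.00 4` reports `OVERLOADED`, while `check_load 2.00 4` reports `ok`.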

2. Networking & Connectivity

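| Command | Purpose | Real-World SRE Example |
| --- | --- | --- |
| `ip addr` | IP configuration. | `ip addr show eth0` (validate the IP assigned to a specific interface). |
| `ss -tulpn` | Open ports & sockets. | `ss -tulpn` (list listening TCP and UDP sockets with numeric ports and the owning process; run as root to see processes owned by other users). |
| `dig` | DNS lookups. | `dig +short blog.uptodeploy.com` (print only the answer records, typically the A record IPs, for a domain). |
| `curl -Iv` | HTTP(S) debugging. | `curl -Iv https://google.com` (send a HEAD request and inspect the TLS handshake and response headers). |
| `traceroute` | Network path routing. | `traceroute -I 8.8.8.8` (use ICMP echo probes to find the hop where packets are dropped). |
| `tcpdump` | Packet capturing. | `tcpdump -i eth0 port 443` (capture HTTPS traffic for deep debugging; payloads stay encrypted, but handshakes, resets, and retransmissions are visible). |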
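
Where `ss -tulpn` shows what is listening locally, a quick connect test confirms whether a port is reachable from where you stand. The following is a rough sketch, assuming bash (for its `/dev/tcp` pseudo-path) and coreutils `timeout`; `port_open` is a hypothetical helper name, not a standard tool.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: hypothetical helper, succeeds if a TCP connect works.
port_open() {
    local host="$1" port="$2"
    # /dev/tcp/HOST/PORT is a bash redirection feature, not a real file.
    # timeout caps the wait so filtered ports do not hang the script.
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

if port_open 127.0.0.1 22; then
    echo "127.0.0.1:22 is accepting connections"
else
    echo "127.0.0.1:22 is closed or filtered"
fi
```

The function only proves that the TCP handshake completes; it says nothing about whether the service behind the port is healthy.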

3. File System & Logs

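| Command | Purpose | Real-World SRE Example |
| --- | --- | --- |
| `df -h` | Disk space usage. | `df -h /` (check if the root partition is at 100% capacity). |
| `du -sh *` | Directory size. | `du -sh /var/log/*` (find which log file or directory is consuming the most space). |
| `lsof` | List open files. | `lsof -i :22` (list processes with sockets on port 22, including the sshd listener and active SSH sessions). |
| `tail -f` | Real-time log follow. | `tail -f /var/log/syslog` (monitor system events as they happen; on RHEL-family systems the file is `/var/log/messages`). |
| `grep -r` | Pattern searching. | `grep -r "error" /var/log/nginx/` (case-sensitive search for "error" across all Nginx logs; add `-i` to ignore case). |
| `find` | Locate files. | `find /etc -name "*.conf"` (locate all configuration files in /etc). |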
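
The `df -h /` check above is easy to turn into a threshold alert. This is a minimal sketch, assuming `df -P` (POSIX output format) and a filesystem name without spaces; `disk_usage_pct`, `check_disk`, and the 90% limit are illustrative choices, not values from the original guide.

```shell
#!/usr/bin/env bash
# disk_usage_pct PATH: print the used percentage of PATH's filesystem as an integer.
disk_usage_pct() {
    # -P keeps each filesystem on one line; column 5 is the capacity, e.g. "42%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_disk PATH [LIMIT]: warn and return 1 when usage reaches LIMIT percent.
check_disk() {
    local path="$1" limit="${2:-90}" pct
    pct="$(disk_usage_pct "$path")"
    if [ "$pct" -ge "$limit" ]; then
        echo "WARN: $path is ${pct}% full (limit ${limit}%)"
        return 1
    fi
    echo "ok: $path is ${pct}% full"
}

check_disk / 90 || true
```

The non-zero return code on a warning makes the function usable from cron jobs or health-check scripts.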

4. Process Management & Security

| Command | Purpose | Real-World SRE Example |
| --- | --- | --- |
| `ps aux` | List active processes. | `ps aux --sort=-%mem` (list processes by highest RAM consumption). |
| `kill -9` | Force termination. | `kill -9 1234` (force-kill a hung process with PID 1234 that ignores SIGTERM; a zombie is already dead and can only be cleared by its parent reaping it). |
| `systemctl` | Manage services. | `systemctl restart docker` (restart the Docker daemon). |
| `journalctl -xe` | Systemd logs. | `journalctl -u nginx.service -f` (follow logs for a specific service). |
| `sudo` | Superuser privileges. | `sudo visudo` (safely edit the sudoers file to manage permissions). |
| `chmod` / `chown` | Permissions & Ownership. | `chown -R www-data:www-data /var/www/html` (fix web server file ownership). |
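
Since `kill -9` gives a process no chance to clean up, a common pattern is to send SIGTERM first and escalate only if the process is still alive after a grace period. The sketch below illustrates that idea; `stop_gracefully` and the default 5-second grace period are assumptions for illustration.

```shell
#!/usr/bin/env bash
# stop_gracefully PID [GRACE_SECONDS]: hypothetical helper, SIGTERM then SIGKILL.
stop_gracefully() {
    local pid="$1" grace="${2:-5}" waited=0
    # If SIGTERM cannot be delivered, the process is already gone (or not ours).
    kill -TERM "$pid" 2>/dev/null || return 0
    # kill -0 sends no signal; it only tests whether the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Demo: start a throwaway background process and stop it.
sleep 300 &
demo_pid=$!
stop_gracefully "$demo_pid" 3
echo "stop sequence finished for PID $demo_pid"
```

For services managed by systemd, `systemctl stop` already performs a similar escalation, so this pattern mainly applies to processes started by hand.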

5. SRE Power Tools

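| Command | Purpose | Real-World SRE Example |
| --- | --- | --- |
| `strace` | Trace system calls. | `strace -p 1234` (debug why a process is stuck or failing). |
| `dmesg -T` | Kernel ring buffer. | `dmesg -T` (read kernel messages with human-readable timestamps, e.g. to spot OOM kills or disk errors). |
| `awk` | Text processing. | `awk '{print $1}' access.log` (extract the first field, which is the client IP in common access log formats). |
| `rsync` | Efficient file sync. | `rsync -avz ./data/ remote:/backup/` (sync data in archive mode with compression and delta transfer). |
| `openssl` | SSL/TLS management. | `openssl x509 -in cert.crt -text -noout` (check certificate expiration/details). |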
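
The `awk '{print $1}'` example above extends naturally into a "top talkers" report. This is a minimal sketch, assuming the client IP is the first whitespace-separated field; `top_ips` is a hypothetical helper, and the sample log lines are made up for the demo using documentation-reserved IP ranges.

```shell
#!/usr/bin/env bash
# top_ips FILE [N]: print the N most frequent first fields as "count ip".
top_ips() {
    local file="$1" n="${2:-5}"
    awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' "$file" |
        sort -k1,1nr -k2,2 |
        head -n "$n"
}

# Demo on an illustrative sample log (not real traffic).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
192.0.2.10 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512
192.0.2.10 - - [01/Jan/2025:00:00:02 +0000] "GET /api HTTP/1.1" 200 128
198.51.100.7 - - [01/Jan/2025:00:00:03 +0000] "GET / HTTP/1.1" 200 512
192.0.2.10 - - [01/Jan/2025:00:00:04 +0000] "GET /login HTTP/1.1" 302 0
203.0.113.5 - - [01/Jan/2025:00:00:05 +0000] "GET / HTTP/1.1" 404 64
198.51.100.7 - - [01/Jan/2025:00:00:06 +0000] "GET /api HTTP/1.1" 500 32
EOF
top_ips "$sample" 2
rm -f "$sample"
```

On the sample above, the two lines printed are `3 192.0.2.10` and `2 198.51.100.7`. Counting inside awk avoids the `sort | uniq -c` round trip, which helps on large logs.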
