How to monitor server resources: CPU, memory, disk, and network

A practical guide to server resource monitoring — which metrics actually matter, which tools to use, and how to catch problems before they become outages.

Server monitoring is one of those things everyone agrees is important and nobody wants to set up. The good news: you only need a few metrics to catch most problems before they escalate. The bad news: most default monitoring setups track the wrong things or alert on noise.

The four resources that matter

Every server problem eventually shows up as one of these:

CPU — sustained high usage, I/O wait, or single-core saturation
Memory — running out of RAM, swapping, or OOM killer events
Disk — running out of space or inodes, high I/O latency
Network — bandwidth saturation, packet loss, connection limits

Everything else (load average, context switches, open files) is usually a symptom of one of these four.

Quick CLI checks

Before installing anything, learn to read the standard Linux tools:

# CPU: top processes, load, I/O wait
top -bn1 | head -20

# Memory: read the "available" column, an estimate that counts reclaimable buffers/cache as usable
free -h

# Disk usage: watch for >85% on any partition
df -h

# Disk I/O: look for high await (latency); iostat ships in the sysstat package
iostat -x 1 3

# Network: number of listening TCP/UDP sockets (-H drops the header row),
# then a socket summary with TCP counts by state
ss -tulnH | wc -l
ss -s

# Inodes: often overlooked, same consequences as disk full
df -i

The most common silent killer is inode exhaustion on small-file workloads (email servers, session storage, cache directories).
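The space and inode checks above are easy to script. Here is a minimal sketch, assuming the POSIX `df -P` column layout (use% in column 5, mount point in column 6) and mount points without spaces. The helper name `over_threshold` is made up for illustration.

```shell
# over_threshold LIMIT: read `df -P` (or `df -Pi`) output on stdin and print
# every mount point whose use% is above LIMIT. Hypothetical helper name.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {                      # skip the header row
            pct = $5                  # e.g. "91%", or "-" when nothing is reported
            sub(/%/, "", pct)
            if (pct ~ /^[0-9]+$/ && pct + 0 > limit + 0)
                print $6 " " pct "%"
        }'
}

# Usage on a live system:
#   df -P  | over_threshold 85    # disk space
#   df -Pi | over_threshold 85    # inodes
```

Because the function only reads stdin, the same code covers both space and inodes, and it can be tried against saved `df` output before it goes into cron.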

Setting up automated monitoring

Option 1: Netdata (beginner-friendly, zero config)

# One-line install
wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh && sh /tmp/netdata-kickstart.sh

Netdata gives you a real-time dashboard with hundreds of metrics, auto-detection of services, and pre-configured alarms. It is the best option if you want monitoring without configuration work.

Option 2: Prometheus + Node Exporter + Grafana (power user)

# Install node_exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.9.0/node_exporter-1.9.0.linux-amd64.tar.gz
tar xzf node_exporter-1.9.0.linux-amd64.tar.gz
sudo mv node_exporter-1.9.0.linux-amd64/node_exporter /usr/local/bin/

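# Create an unprivileged user for the exporter (it does not need root)
sudo useradd --system --no-create-home --shell /usr/sbin/nologin node_exporter

# Run as systemd service
sudo tee /etc/systemd/system/node_exporter.service > /dev/null << 'EOF'
[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter
Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter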

This exposes metrics on port 9100, which Prometheus scrapes and Grafana visualises. It takes more setup than Netdata, but is far more customisable.
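Before wiring up Prometheus, it can help to confirm that the exporter is actually serving data. The sketch below pulls one value out of the plain-text metrics format. The helper name `metric_value` is hypothetical, and it only matches samples without labels, such as `node_load1`.

```shell
# metric_value NAME: read Prometheus text-format metrics on stdin and print the
# value of the first unlabelled sample called NAME. Hypothetical helper name.
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Usage on a host running node_exporter on its default port (assumes curl is installed):
#   curl -s http://localhost:9100/metrics | metric_value node_load1
```

An empty result means either the exporter is not reachable or the metric name is wrong, which narrows down the problem before you start debugging the Prometheus side.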

Option 3: Monit (lightweight process + resource monitoring)

sudo apt install monit

Monit is tiny and watches specific processes. It restarts them if they stop and can alert you. Good as a second layer alongside something like Netdata.

What to alert on

Do not alert on “CPU over 80%”. High CPU on its own often just means the server is busy doing its job. Alert on:

Metric	Threshold	Why
Disk usage	> 85%	You still have time to react
Disk usage	> 95%	Critical — clear logs or expand immediately
Memory + swap	swap > 1GB AND memory < 10% free	System is thrashing
CPU iowait	> 30% for 5+ minutes	Disk bottleneck, not CPU
Service down	port not listening	Simple, unambiguous
SSL expiry	< 14 days	No excuse for expired certs
Disk inode usage	> 85%	Silent disk-full scenario
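The compound memory rule in the table is the one most tools do not offer out of the box, so here is a rough sketch of it. It assumes "free" means the `MemAvailable` field of `/proc/meminfo` and treats 1GB as 1048576 kB. The function name `is_thrashing` is made up for illustration.

```shell
# is_thrashing: read /proc/meminfo-style text on stdin; succeed (exit 0) when
# more than 1 GiB of swap is in use AND less than 10% of RAM is available.
# Hypothetical helper name; thresholds taken from the table above.
is_thrashing() {
    awk '
        $1 == "MemTotal:"     { total  = $2 }   # all values are in kB
        $1 == "MemAvailable:" { avail  = $2 }
        $1 == "SwapTotal:"    { stotal = $2 }
        $1 == "SwapFree:"     { sfree  = $2 }
        END {
            swap_used_kb = stotal - sfree
            low_memory = (total > 0 && avail * 10 < total)
            exit !(swap_used_kb > 1048576 && low_memory)
        }'
}

# Usage on a live system:
#   is_thrashing < /proc/meminfo && echo "swap in use and memory nearly exhausted"
```

Requiring both conditions keeps the check quiet on servers that parked a few idle pages in swap long ago but still have plenty of memory available.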

Setting up alerts with Monit

# /etc/monit/monitrc
set mailserver smtp.example.com
set alert your@email.com

check filesystem root with path /
  if space usage > 85% then alert
  if inode usage > 85% then alert

check system example.com
  # a cycle is one polling interval ("set daemon <seconds>"); tune the load limit to your core count
  if loadavg (1min) > 8 for 5 cycles then alert
  if memory usage > 90% then alert
  if swap usage > 25% then alert

check process nginx with pidfile /var/run/nginx.pid
  start program = "/usr/bin/systemctl start nginx"
  stop program = "/usr/bin/systemctl stop nginx"
  if failed port 80 protocol http then restart

Disk monitoring specifics

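The fastest ways to fill a disk on a web server:

Log files: set up logrotate properly
WordPress backup plugins: they store backups in wp-content/uploads/ and never clean up
MySQL binary logs: set binlog_expire_logs_seconds (or expire_logs_days on MariaDB and older MySQL; it is deprecated since MySQL 8.0)
Session files: PHP sessions that are never cleaned up
Docker images: run docker system prune -f weekly (it also removes stopped containers and unused networks)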
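For session files and old backups, it is worth listing what a cleanup would touch before deleting anything. This is a small sketch with a hypothetical helper name, `stale_files`. The session path in the usage line is an assumption and varies by distribution and PHP version.

```shell
# stale_files DIR DAYS: list regular files under DIR that were last modified
# more than DAYS days ago. Hypothetical helper name; it only lists, never deletes.
stale_files() {
    find "$1" -type f -mtime +"$2" -print
}

# Usage (adjust the path to your own system):
#   stale_files /var/lib/php/sessions 7
```

Once the list looks right, the same `find` expression with `-delete` in place of `-print` does the actual cleanup.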

# Find what's using space
du -sh /* 2>/dev/null | sort -rh | head -10

# Find large files (one ls call for many files instead of one per file)
find / -type f -size +100M -exec ls -lh {} + 2>/dev/null

# Check largest directories under /var
du -sh /var/* 2>/dev/null | sort -rh | head -10

Practical monitoring workflow

Install Netdata or node_exporter on every server
Set up disk space and SSL expiry alerts (two of the most common avoidable causes of outages)
Add service health checks (is nginx/mysql/php-fpm running?)
Review dashboards weekly — look for trends, not spikes
Add more specific alerts only when you get burnt by something
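If your monitoring tool has no certificate check, the SSL expiry alert from step 2 can be approximated with a few lines of shell. The sketch below assumes `openssl` and GNU `date` are installed. Both helper names are hypothetical, and `example.com` is a placeholder.

```shell
# days_left EXPIRY_EPOCH NOW_EPOCH: whole days between two Unix timestamps.
days_left() {
    echo $(( ($1 - $2) / 86400 ))
}

# cert_days_left HOST [PORT]: days until the certificate served by HOST expires.
# Hypothetical helper; relies on openssl and on GNU date's -d option.
cert_days_left() {
    end=$(openssl s_client -connect "$1:${2:-443}" -servername "$1" </dev/null 2>/dev/null |
        openssl x509 -noout -enddate | cut -d= -f2)
    days_left "$(date -d "$end" +%s)" "$(date +%s)"
}

# Usage: warn when fewer than 14 days remain
#   [ "$(cert_days_left example.com)" -lt 14 ] && echo "certificate expires soon"
```

Run daily from cron, this covers the 14-day threshold without adding another service to maintain.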

The goal is not to monitor everything. It is to know about problems before your users do.