A Practical Guide to Monitoring Volume Resources on Debian
1. Core Concepts and Scope
2. Quick-Start Commands
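- df -hT
- df -i
- du -sh /var/log
- ncdu /var (requires installation: sudo apt install ncdu)
- lsblk
- sudo fdisk -l or sudo parted -l
- top or htop (sudo apt install htop)
- free -h
- vmstat 1 (watch the wa column, which is I/O wait)
3. In-Depth I/O and Process-Level Monitoring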
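- Install the tools: sudo apt install sysstat iotop dstat
- iostat -x 1
- vmstat 1: watch wa (share of CPU time spent waiting on I/O) and bi/bo (blocks read from and written to block devices).
- sudo iotop: shows each process's read/write rate and command line in real time.
- dstat -d --disk-util: shows utilization and throughput for several disks at once.
- sar -b 1 5 (I/O transfer rates) and sar -d 1 5 (detailed per-device statistics)
- cat /proc/diskstats: raw kernel block-device counters (well suited to scripted analysis)
4. Automated Inspection and Alerting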
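The counters in /proc/diskstats are cumulative, so scripted analysis usually means taking two snapshots and dividing the difference by the interval. The sketch below is one way to do that. The function name diskstats_rate and the output layout are illustrative assumptions; the field positions (3 = device name, 6 = sectors read, 10 = sectors written, counted in 512-byte units) follow the kernel's documented diskstats layout.

```shell
#!/usr/bin/env bash
# diskstats_rate BEFORE AFTER SECONDS
# BEFORE and AFTER are files holding two copies of /proc/diskstats taken
# SECONDS apart. Prints "device read_KiB/s write_KiB/s" for every device
# that moved data during the interval (idle devices are skipped).
# Hypothetical helper for illustration, not part of any package.
diskstats_rate() {
    local before="$1" after="$2" secs="$3"
    awk -v secs="$secs" '
        # First file: remember the cumulative sector counters per device.
        NR == FNR { r[$3] = $6; w[$3] = $10; next }
        # Second file: convert the sector delta to KiB per second.
        ($3 in r) {
            rd = ($6  - r[$3]) * 512 / 1024 / secs
            wr = ($10 - w[$3]) * 512 / 1024 / secs
            if (rd > 0 || wr > 0) printf "%s %.1f %.1f\n", $3, rd, wr
        }
    ' "$before" "$after"
}
```

Typical use: cat /proc/diskstats > /tmp/ds1; sleep 5; cat /proc/diskstats > /tmp/ds2; diskstats_rate /tmp/ds1 /tmp/ds2 5. For interactive work, iostat -x 1 reports the same kind of figures with less effort.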
Save the script below as check_mount.sh and make it executable: chmod +x check_mount.sh
Example: ./check_mount.sh /dev/mapper/vg0-root 85 95
#!/usr/bin/env bash
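set -euo pipefail
if [ $# -ne 3 ]; then
    echo "Usage: $0 <mount_point_or_device> <warn%> <crit%>"
    exit 3
fi
mp="$1"; warn="$2"; crit="$3"
# Accept either a device path or a mount point. df reports on whichever
# filesystem contains the path it is given, so an unmounted block device
# would yield figures for /dev; resolve a device to its mount point first.
target="$mp"
if [[ -b "$mp" ]]; then
    target=$(lsblk -dno MOUNTPOINT "$mp" | head -n 1) || true
    if [ -z "$target" ]; then
        echo "UNKNOWN: $mp is not mounted"
        exit 3
    fi
fi
# Column 5 of POSIX-format df output is the usage percentage.
used=$(df -P "$target" | awk 'NR==2{gsub(/%/,"",$5); print $5}') || used=""
if ! [[ "$used" =~ ^[0-9]+$ ]]; then
    echo "UNKNOWN: could not read usage for $mp"
    exit 3
fi
if [ "$used" -ge "$crit" ]; then
    echo "CRITICAL: $mp usage ${used}% (threshold ${crit}%)"
    exit 2
elif [ "$used" -ge "$warn" ]; then
    echo "WARNING: $mp usage ${used}% (threshold ${warn}%)"
    exit 1
else
    echo "OK: $mp usage ${used}%"
    exit 0
fi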
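
The script above checks one filesystem per run. To sweep every mounted filesystem in a single pass, a small filter over df -P output may be enough. This is a minimal sketch: over_threshold is a hypothetical helper name, and it assumes mount points contain no spaces, since it splits df output on whitespace.

```shell
#!/usr/bin/env bash
# over_threshold LIMIT
# Reads `df -P` output on stdin and prints "mount_point used%" for every
# filesystem whose usage is at or above LIMIT percent.
# Hypothetical helper; assumes mount points without embedded spaces.
over_threshold() {
    local limit="$1"
    awk -v limit="$limit" '
        NR > 1 {                      # skip the header line
            used = $5
            sub(/%/, "", used)        # "90%" -> "90"
            if (used + 0 >= limit) print $6, used
        }
    '
}
```

For example, df -P -x tmpfs -x devtmpfs | over_threshold 85 lists only the real filesystems that have crossed 85%.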
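A cron entry in /etc/crontab or /etc/cron.d format (note the user field) can append a usage snapshot to a log every minute:
* * * * * root df -h >> /var/log/df.log 2>&1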
/etc/systemd/system/disk-check.service:
[Unit]
Description=Disk usage check
[Service]
Type=oneshot
ExecStart=/usr/local/bin/check_mount.sh / 85 95
/etc/systemd/system/disk-check.timer:
[Unit]
Description=Run disk check every 5 minutes
[Timer]
OnCalendar=*:0/5
AccuracySec=1min
Persistent=true
[Install]
WantedBy=timers.target
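Reload systemd and enable the timer: sudo systemctl daemon-reload && sudo systemctl enable --now disk-check.timer
For continuous monitoring with dashboards:
- Netdata: install with bash <(curl -Ss https://my-netdata.io/kickstart.sh), then visit http://<server-IP>:19999
- Munin (sudo apt install munin munin-node), Zabbix, or Prometheus + Grafana
5. Practical Advice for LVM and Physical Disks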
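- Use lsblk to see the hierarchy from LVM logical volume to volume group to physical volume, along with the mount points, then run iostat -x 1 and watch %util and await for the underlying physical disk (e.g. /dev/sdX). Looking only at the mount point can misplace the bottleneck.
- Check df -h and df -i together: many "disk full" errors are actually inode exhaustion.
- Use ncdu to find large directories and files quickly, then decide whether to clean up or expand.
- Monitor disk health with SMART (sudo apt install smartmontools; sudo smartctl -a /dev/sda) alongside capacity and I/O alerts to reduce the risk of sudden failure.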
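
Tracing a logical volume down to its physical disk, as the first point suggests, can be scripted with lsblk's inverse-dependency view (-s). The following is a sketch under assumptions: backing_disks is a hypothetical helper name, and /dev/mapper/vg0-root in the usage line is only an example device.

```shell
#!/usr/bin/env bash
# backing_disks
# Reads the output of `lsblk -s -rno NAME,TYPE <device>` on stdin and
# prints the unique whole-disk names the device ultimately sits on.
# Hypothetical helper for illustration.
backing_disks() {
    # In raw mode each line is "name type"; keep only TYPE == disk.
    awk '$2 == "disk" { print $1 }' | sort -u
}
```

Usage: lsblk -s -rno NAME,TYPE /dev/mapper/vg0-root | backing_disks prints names such as sda, which can then be passed to iostat -x 1 sda to watch %util and await on the right device.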