A Practical Approach to Monitoring MinIO on Debian
1. Monitoring architecture and endpoints
2. Quick setup: Debian + Prometheus + Grafana
Firewall (open the MinIO ports): ufw allow 9000,9001/tcp
Prometheus scrape configuration:
global:
  scrape_interval: 15s
scrape_configs:
- job_name: 'minio-nodes'
  metrics_path: /minio/v2/metrics/node
  scheme: http
  static_configs:
  - targets: ['192.0.2.10:9000','192.0.2.11:9000','192.0.2.12:9000']
  # If a bearer token is required (example):
  # bearer_token: "<your-token>"
  # On newer Prometheus releases, the equivalent form is:
  # authorization:
  #   type: Bearer
  #   credentials: "<your-token>"
  # Note: relabel_configs only rewrites target labels and cannot add an
  # Authorization header, so set the token with one of the options above.
- job_name: 'minio-cluster'
  metrics_path: /minio/v2/metrics/cluster
  static_configs:
  - targets: ['192.0.2.10:9000']
- job_name: 'minio-buckets'
  metrics_path: /minio/v2/metrics/bucket
  static_configs:
  - targets: ['192.0.2.10:9000']
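Before pointing Prometheus at a target, it can help to confirm that the endpoint actually returns metrics. The following is a minimal Python sketch, not an official MinIO or Prometheus client: `fetch_metrics` and `parse_metrics` are hypothetical helper names, the parser handles only simple exposition lines, and the address and token are the placeholders used in the configuration above.

```python
import urllib.request


def fetch_metrics(base_url, path, token=None, timeout=5):
    """GET a metrics endpoint and return the response body as text."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    if token:
        # Same header Prometheus sends when a bearer token is configured.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Parse simple Prometheus text-format lines into {series: value}.

    Comment lines (# HELP / # TYPE) are skipped and optional timestamps
    are ignored. Malformed lines are dropped rather than raising.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Series with labels: everything up to the last closing brace.
            series, _, rest = line.rpartition("}")
            series += "}"
        else:
            series, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        try:
            samples[series] = float(fields[0])
        except ValueError:
            continue
    return samples
```

A possible use, assuming the node is reachable: `parse_metrics(fetch_metrics("http://192.0.2.10:9000", "/minio/v2/metrics/cluster", token="<your-token>"))`. An empty result or an HTTP 403 usually points at the path or the token rather than at Prometheus itself.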
3. Key alerting rule examples
# Metric names differ between MinIO releases; verify each one against the
# output of your metrics endpoint before deploying these rules.
groups:
- name: MinIO
  rules:
  - alert: DiskOffline
    expr: minio_offline_disks != 0
    for: 5m
    labels:
      severity: page
    annotations:
      summary: "A MinIO disk is offline"
  - alert: StorageSpaceExhausted
    expr: minio_disk_storage_free_bytes < 10737418240
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "MinIO free space is below 10 GiB"
  - alert: HighRequestLatency
    expr: histogram_quantile(0.95, sum(rate(minio_http_requests_duration_seconds_bucket[5m])) by (le, handler, method)) > 1
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "MinIO P95 request latency > 1s"
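Two numbers in these rules are worth checking by hand: the free-space threshold in bytes and the P95 estimate. The sketch below, in Python, shows how 10 GiB becomes 10737418240 and gives a simplified version of the linear interpolation that `histogram_quantile` performs over cumulative buckets. The function names and the bucket counts in the usage note are made up for illustration, and the real Prometheus function covers more edge cases than this.

```python
import math


def gib_to_bytes(gib):
    """Convert GiB to bytes (1 GiB = 1024**3 bytes)."""
    return gib * 1024 ** 3


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: iterable of (upper_bound, cumulative_count) pairs, including
    a final (math.inf, total) bucket, as in a Prometheus histogram.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least two buckets, ending with +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest
                # finite upper bound instead of infinity.
                return buckets[-2][0]
            # Interpolate linearly inside the bucket holding the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan
```

For example, with hypothetical buckets `[(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]` the 95th of 100 requests falls in the 0.5 s to 1.0 s bucket, so the estimate lies between those bounds and the `> 1` condition would not fire. The estimate can never exceed the largest finite bucket bound, which is why a threshold above that bound would never trigger.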
4. Supplementary host- and application-level monitoring
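Disk I/O: run iostat -x 1 and watch await, svctm (where your sysstat version still reports it), and util; use iotop for per-process I/O.
Service status: systemctl status minio; follow live logs with journalctl -u minio -f.
Time synchronization: confirm the clock is in sync (timedatectl status) to avoid cluster and signature-verification errors.
5. Common issues and troubleshooting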
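A first troubleshooting pass can script the service and clock checks from the previous section. This is a rough Python sketch under a few assumptions: the systemd unit is named `minio` as in the commands above, `timedatectl show` is available (it exists only on reasonably recent systemd versions), and `run` and `summarize` are hypothetical helper names.

```python
import subprocess


def run(cmd):
    """Run a command and return its stripped stdout.

    Returns an empty string if the binary is missing or the call times out.
    """
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return result.stdout.strip()


def summarize(service_state, ntp_synced):
    """Turn raw command output into a list of human-readable problems."""
    problems = []
    if service_state != "active":
        problems.append(f"minio service state is {service_state or 'unknown'}")
    if ntp_synced != "yes":
        problems.append("system clock is not reported as NTP-synchronized")
    return problems


def check_host():
    """Collect the service state and clock sync state from systemd tools."""
    state = run(["systemctl", "is-active", "minio"])
    synced = run(["timedatectl", "show", "--property=NTPSynchronized", "--value"])
    return summarize(state, synced)
```

Calling `check_host()` on a MinIO node returns an empty list when both checks pass. Anything it reports is a starting point for `journalctl -u minio -f`, not a diagnosis.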