A Practical Approach to Monitoring RabbitMQ on CentOS
1 Basic checks and the built-in management UI
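The `rabbitmqctl list_queues name messages_ready messages_unacknowledged consumers` check listed in this section prints one row per queue, which can be tedious to read by eye on a busy broker. The following is a minimal sketch of a filter for that output. `idle_queues` is a hypothetical helper name, and the sketch assumes the rows are tab-separated with exactly the four columns requested.

```shell
#!/bin/sh
# idle_queues (hypothetical helper): read the output of
#   rabbitmqctl list_queues name messages_ready messages_unacknowledged consumers
# on stdin and print queues that hold ready messages but have no consumers.
# Assumption: rows are tab-separated with four columns. Banner and header
# lines are skipped because their fourth column is not a number.
idle_queues() {
  awk -F '\t' 'NF == 4 && $4 ~ /^[0-9]+$/ && $4 + 0 == 0 && $2 + 0 > 0 {
    printf "%s ready=%s unacked=%s\n", $1, $2, $3
  }'
}
```

A possible invocation is `rabbitmqctl -q list_queues name messages_ready messages_unacknowledged consumers | idle_queues`. Empty output means no queue matched the condition.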
Enable the management plugin:
  rabbitmq-plugins enable rabbitmq_management
Open the management UI at http://<server IP>:15672.
Create a dedicated monitoring account:
  rabbitmqctl add_user monitor <password>
  rabbitmqctl set_user_tags monitor monitoring
  rabbitmqctl set_permissions -p / monitor ".*" ".*" ".*"
Common command-line checks:
  rabbitmqctl status
  rabbitmqctl list_queues name messages_ready messages_unacknowledged consumers
  rabbitmqctl list_connections name state recv_oct send_oct
  rabbitmqctl list_exchanges name type durable auto_delete

2 Prometheus + Grafana monitoring (recommended)
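Once a metrics endpoint from this section is reachable, it helps to confirm that the metric names you plan to graph are actually present before wiring up Prometheus. The following is a rough sketch of such a check. `missing_metrics` is a hypothetical helper name, and the sketch assumes the endpoint returns the standard Prometheus text format.

```shell
#!/bin/sh
# missing_metrics (hypothetical helper): read Prometheus text-format output on
# stdin and report each metric name passed as an argument that has no sample
# line. Returns 1 if anything is missing, 0 otherwise.
missing_metrics() {
  body=$(cat)
  status=0
  for name in "$@"; do
    # A sample line starts with the metric name followed by "{" or a space;
    # "# HELP" and "# TYPE" comment lines do not count.
    if ! printf '%s\n' "$body" | grep -q "^${name}[{ ]"; then
      echo "missing: $name"
      status=1
    fi
  done
  return "$status"
}
```

For example: `curl -s http://localhost:<port>/metrics | missing_metrics rabbitmq_queue_messages rabbitmq_connections`, where `<port>` is the HTTP port of the plugin (15692 by default) or of the exporter.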
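Built-in plugin: enable it, then confirm the endpoint returns metrics (the plugin listens on port 15692 by default):
  rabbitmq-plugins enable rabbitmq_prometheus
  curl http://localhost:15692/metrics
Key metrics:
  rabbitmq_queue_messages (total messages in queues)
  rabbitmq_queue_consumers (number of consumers)
  rabbitmq_node_mem_used, rabbitmq_node_disk_free (node resources)
  rabbitmq_connections (number of connections)
Exact metric names differ between the built-in plugin and a standalone exporter, so check the /metrics output before building dashboards.
Standalone exporter: edit rabbitmq.yml so that it points at the management API on port 15672 (5672 is the AMQP port) with an administrator account or a dedicated monitoring account, then start the exporter:
  ./bin/rabbitmq_exporter
Prometheus scrape target:
  targets: ['<exporterIP>:9419']

3 Zabbix monitoring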
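Place the monitoring scripts in /etc/zabbix/scripts/rabbitmq, and create the hidden auth file .rab.auth (containing USERNAME, PASSWORD, CONF, LOGLEVEL, LOGFILE, and PORT, where PORT is 15672).
In zabbix_agentd.conf, set:
  UnsafeUserParameters=1
  Timeout=15
  Include=/etc/zabbix/zabbix_agentd.conf.d/*.conf
The checks run in active mode, so ServerActive must also be configured.
If the agent has no permission to write /var/log/rabbitmq_zabbix.log, run:
  chown zabbix:zabbix /var/log/rabbitmq_zabbix.log

4 Key metrics and alert suggestions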
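The backlog and no-consumer conditions described in this section can be spot-checked against a single scrape before any alert rules are written. The following is a minimal sketch under stated assumptions: `queue_alerts` is a hypothetical helper name, each sample line is assumed to end with its value (no trailing timestamp), and the default limit of 1000 mirrors the backlog threshold suggested in this section. It looks at one snapshot only; a duration condition such as "for 5 minutes" would normally be left to the alerting system.

```shell
#!/bin/sh
# queue_alerts (hypothetical helper): read Prometheus text-format metrics on
# stdin and print one line for each sample that breaches a simple threshold.
#   $1 = backlog limit for rabbitmq_queue_messages (default 1000)
# Assumption: each sample line ends with its value, with no timestamp after it.
queue_alerts() {
  awk -v limit="${1:-1000}" '
    /^rabbitmq_queue_messages[{ ]/  && $NF + 0 > limit { print "BACKLOG " $0 }
    /^rabbitmq_queue_consumers[{ ]/ && $NF + 0 == 0    { print "NO_CONSUMERS " $0 }
  '
}
```

For example: `curl -s http://localhost:<port>/metrics | queue_alerts 1000`. Depending on how the endpoint is configured, the output may be aggregated across queues rather than labelled per queue; the patterns match both forms.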
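Suggested alert conditions:
  rabbitmq_queue_messages > 1000 for 5 minutes: raise an alert (Warning)
  rabbitmq_queue_messages_unacknowledged keeps growing or exceeds its threshold
  rabbitmq_queue_consumers == 0 (no consumers)
  rabbitmq_connections spikes or drops suddenly
  rabbitmq_node_mem_used / rabbitmq_node_disk_free approaches the watermark
Watermark settings in rabbitmq.conf:
  vm_memory_high_watermark.relative = 0.6
  disk_free_limit.absolute = 1GB
Mirroring policy for high availability (adjust the match pattern as needed; classic queue mirroring was removed in RabbitMQ 4.0, so this applies to 3.x releases only):
  rabbitmqctl set_policy ha-all "^ha\\." '{"ha-mode":"all"}'

5 Quick troubleshooting checklist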
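Prometheus: confirm that curl returns metrics from the endpoint and that firewall and security-group rules allow the scrape.
Zabbix: check UnsafeUserParameters=1, active mode (ServerActive), script and log file permissions, and that the template is linked to the host.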
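The Zabbix items in the checklist above can be partly automated. The following is a rough sketch that inspects an agent configuration file for the two settings named there. `check_agent_conf` is a hypothetical helper name, and the sketch looks only at the file it is given, not at files pulled in through Include.

```shell
#!/bin/sh
# check_agent_conf (hypothetical helper): verify that a zabbix_agentd.conf-style
# file sets UnsafeUserParameters=1 and a non-empty ServerActive.
# Prints one line per problem and returns 1 if any were found.
# Commented-out lines (starting with "#") are ignored by the patterns.
check_agent_conf() {
  conf=$1
  status=0
  if ! grep -Eq '^[[:space:]]*UnsafeUserParameters[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$conf"; then
    echo "UnsafeUserParameters=1 not set"
    status=1
  fi
  if ! grep -Eq '^[[:space:]]*ServerActive[[:space:]]*=[[:space:]]*[^[:space:]]' "$conf"; then
    echo "ServerActive not set"
    status=1
  fi
  return "$status"
}
```

For example: `check_agent_conf /etc/zabbix/zabbix_agentd.conf` (a typical location, assumed here). Ownership of the log file can then be confirmed with `ls -l /var/log/rabbitmq_zabbix.log`.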