RabbitMQ Failure Recovery Runbook on CentOS
1. Quick Recovery Procedure
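Check the service state: sudo systemctl status rabbitmq-server
View the last 50 log lines: sudo journalctl -u rabbitmq-server -n 50 --no-pager
Check node and cluster state: rabbitmqctl status and rabbitmqctl cluster_status
Restart the service: sudo systemctl restart rabbitmq-server
Stop and start the node manually: sudo rabbitmqctl stop, then sudo rabbitmq-server -detached
Check whether the management plugin is enabled: sudo rabbitmq-plugins list | grep rabbitmq_management
Enable the management plugin: sudo rabbitmq-plugins enable rabbitmq_management
2. Common Faults and Fixes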
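Check that the ports are listening: sudo netstat -tulnp | grep -E '5672|15672'
Check host name resolution: compare the output of hostname and ifconfig with /etc/hosts; add an entry such as 192.168.1.10 rabbitmq1, then save the file and restart the service.
Check users: rabbitmqctl list_users, then add an administrator and grant permissions if necessary: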
rabbitmqctl add_user admin StrongPass!
rabbitmqctl set_user_tags admin administrator
rabbitmqctl set_permissions -p "/" admin ".*" ".*" ".*"
If rabbitmqctl join_cluster reports nodedown or epmd error: timeout, open the required ports in the firewall:
sudo firewall-cmd --add-port={4369,25672,5672,15672}/tcp --permanent && sudo firewall-cmd --reload
Then verify connectivity: telnet <target-host> 4369 and telnet <target-host> 25672.
3. Cluster Recovery Strategies
the cluster_partition_handling setting in rabbitmq.conf):
rabbitmqctl forget_cluster_node --offline <failed-node-name>
Clean up the node with forget_cluster_node before rejoining it.
4. Data Protection and High Availability Recommendations
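vm_memory_high_watermark); when consumers cannot keep up, add consumers or speed up message processing so that flow control is not triggered and throughput does not drop sharply.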
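One way to see where consumers are falling behind is to filter the output of rabbitmqctl list_queues for queues with a large backlog. The sketch below is a hypothetical helper, not part of RabbitMQ: the function name flag_backlogged_queues and the default threshold of 10000 messages are assumptions chosen for illustration. It reads tab-separated name, messages, consumers rows on stdin and prints the queues whose message count exceeds the threshold.

```shell
# Hypothetical helper (name and default threshold are illustrative assumptions).
# Input : tab-separated rows "name<TAB>messages<TAB>consumers" on stdin,
#         e.g. from: rabbitmqctl list_queues -q name messages consumers
# Output: the rows whose message count is strictly greater than the threshold.
# Usage : flag_backlogged_queues [threshold]
flag_backlogged_queues() {
  awk -F'\t' -v t="${1:-10000}" '
    # Skip header or informational lines: the second column must be an integer.
    $2 ~ /^[0-9]+$/ && ($2 + 0) > (t + 0) { print $1 "\t" $2 "\t" $3 }
  '
}
```

It could be used as: sudo rabbitmqctl list_queues -q name messages consumers | flag_backlogged_queues 5000. Queues that show a high message count together with few consumers are the likely candidates for adding consumers or speeding up processing.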