假设搭建1主2从+1见证节点的高可用集群,节点信息如下:
sudo apt update && sudo apt upgrade -y。/etc/hosts中添加节点映射:192.168.254.110 u24-pg-110
192.168.254.111 u24-pg-111
192.168.254.112 u24-pg-112
192.168.254.113 u24-pg-113
su - postgres
ssh-keygen -t rsa # 按回车生成默认密钥
ssh-copy-id u24-pg-110 # 依次复制到所有节点
ssh-copy-id u24-pg-111
ssh-copy-id u24-pg-112
ssh-copy-id u24-pg-113
# 测试免密登录
ssh u24-pg-110 hostname # 应返回u24-pg-110
在所有节点执行以下命令安装PostgreSQL(以14为例)和repmgr工具:
sudo apt install -y postgresql postgresql-contrib
sudo apt install -y libcurl4-openssl-dev libjson-c-dev flex libselinux1-dev libzstd-dev liblz4-dev libxslt1-dev libxml2-dev libpam0g-dev libssl-dev libkrb5-dev zlib1g-dev libreadline-dev
tar -xf repmgr-5.5.0.tar.gz # 下载对应版本的repmgr源码
cd repmgr-5.5.0/
./configure
make
sudo make install
编辑/etc/postgresql/14/main/postgresql.conf,调整以下参数:
listen_addresses = '*' # 允许所有IP连接
wal_level = replica # 启用WAL日志(流复制必需)
max_wal_senders = 10 # 允许的最大WAL发送进程数(从节点数量+1)
wal_keep_size = 128MB # 保留的WAL日志大小(确保从节点能拉取)
archive_mode = on # 开启归档(可选,用于备份)
archive_command = 'test ! -f /postgresql/arch/%f && cp %p /postgresql/arch/%f' # 归档命令(可选)
hot_standby = on # 允许从节点处于热备状态(必需)
编辑/etc/postgresql/14/main/pg_hba.conf,添加以下规则:
# 允许从节点(111、112)连接复制
host replication replicator 192.168.254.111/32 md5
host replication replicator 192.168.254.112/32 md5
# 允许repmgr用户(所有节点)连接
host repmgr repmgr 192.168.254.0/24 trust
# 允许客户端连接(可选)
host all all 0.0.0.0/0 scram-sha-256
切换到postgres用户,执行以下SQL:
su - postgres
psql
CREATE USER replicator WITH REPLICATION LOGIN PASSWORD 'StrongReplicatorPass123!';
CREATE DATABASE repmgr WITH OWNER=repmgr;
ALTER USER repmgr SET search_path TO repmgr, "$user", public;
\q
创建/etc/repmgr/14/repmgr.conf(主节点专属配置):
node_id=1
node_name='primary'
conninfo='host=u24-pg-110 user=repmgr dbname=repmgr'
data_directory='/var/lib/postgresql/14/main'
sudo systemctl restart postgresql # 重启PostgreSQL使配置生效
repmgr -f /etc/repmgr/14/repmgr.conf primary register # 注册主节点
sudo systemctl enable repmgrd
sudo systemctl start repmgrd
在主节点执行(以从节点111为例):
repmgr -f /etc/repmgr/14/repmgr.conf standby clone -h u24-pg-110 -U replicator -d repmgr -R # -R自动创建standby.signal
注:
-R参数会自动在从节点数据目录创建standby.signal文件(标识为从节点),并配置primary_conninfo。
sudo chown -R postgres:postgres /var/lib/postgresql/14/main
编辑/etc/postgresql/14/main/postgresql.conf,确保以下参数(与主节点一致):
listen_addresses = '*'
hot_standby = on
与主节点类似,添加复制连接规则:
host replication replicator 192.168.254.110/32 md5
host repmgr repmgr 192.168.254.0/24 trust
sudo systemctl restart postgresql
repmgr -f /etc/repmgr/14/repmgr.conf standby register # 注册从节点
sudo systemctl enable repmgrd
sudo systemctl start repmgrd
在从节点执行:
psql -U postgres -c "SELECT * FROM pg_stat_replication;" # 查看主节点同步的从节点
psql -U postgres -c "SELECT pg_is_in_recovery();" # 应返回t(表示从节点)
见证节点不存储数据,仅用于仲裁,提升故障转移的可靠性。
与主从节点相同,安装PostgreSQL和repmgr。
创建/etc/repmgr/14/repmgr.conf:
node_id=4
node_name='witness'
conninfo='host=u24-pg-113 user=repmgr dbname=repmgr'
data_directory='/var/lib/postgresql/14/main'
sudo systemctl restart postgresql
repmgr -f /etc/repmgr/14/repmgr.conf witness register -h u24-pg-110 -U replicator
在任意节点执行:
repmgr -f /etc/repmgr/14/repmgr.conf cluster show
输出应显示主节点、从节点和见证节点的状态(如primary、standby、witness)。
sudo systemctl stop postgresql。repmgr -f /etc/repmgr/14/repmgr.conf cluster show,应显示某个从节点升级为primary。sudo systemctl start postgresql,它会自动变为standby。archive_mode=on)和数据库(pg_dumpall)。pg_stat_replication、repmgr cluster show等命令监控集群状态,或集成Prometheus+Granafa实现可视化。