Debian上PostgreSQL故障排查清单
一 快速定位与通用检查
sudo systemctl status postgresql 或 sudo systemctl status postgresql@<version>-mainsudo systemctl start postgresql,必要时 sudo systemctl restart postgresql/var/log/postgresql/postgresql-<version>-main.logsudo journalctl -u postgresql -xe 或 tail -f /var/log/syslog/var/run/postgresql/.s.PGSQL.5432ss -lntp | grep 5432 或 netstat -lntp | grep 5432sudo ufw allow 5432/tcpsudo firewall-cmd --add-port=5432/tcp --permanent && sudo firewall-cmd --reloadnc -vz <host> 5432 或 psql -h <host> -p 5432 -U <user> -d <db> -c "SELECT 1"/etc/postgresql/<version>/main/postgresql.conf/etc/postgresql/<version>/main/pg_hba.conffree -h、df -h、top/htopSELECT * FROM pg_tablespaces;sudo -u postgres psql -c "SELECT version();"psql -h <host> -p <port> -U <user> -d <db>二 常见故障场景与处理
postgresql.conf 语法与路径lc_messages 设置不存在);修复:sudo dpkg-reconfigure locales 启用所需 locale,再重启pg_archivecleanup);核查复制槽 SELECT * FROM pg_replication_slots; 并删除废弃槽 SELECT pg_drop_replication_slot('slot_name');EXPLAIN (ANALYZE, BUFFERS) 找瓶颈;按需创建索引、避免 SELECT *、优化 JOIN/子查询pg_stat_statements 定位最耗时语句:shared_preload_libraries='pg_stat_statements'(需重启),查询 SELECT * FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10;SELECT * FROM pg_stat_replication;,关注 state、sent_lsn、write_lsn、flush_lsn、replay_lsn 的延迟与卡点三 日志与性能分析工具
pg_stat_activity:当前连接、查询、等待事件pg_stat_database:提交/回滚、事务、读写统计pg_stat_statements:SQL 执行次数、总/平均耗时、行数postgresql.conf 中设置
log_destination = 'csvlog'logging_collector = onlog_directory = 'pg_log'log_filename = 'postgresql-%Y-%m-%d.log'log_checkpoints = onlog_connections = onlog_disconnections = onlog_lock_waits = onlog_min_duration_statement = 1000(记录超过 1 秒的慢查询)pgbadger /var/log/postgresql/*.log -o report.html,可配合定时任务做周报/日报四 应急与恢复建议
postgresql.conf、pg_hba.conf 与 pg_ident.confSELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE ...;ALTER SYSTEM SET default_transaction_read_only = on; SELECT pg_reload_conf();pg_checksums -c -D /var/lib/postgresql/<version>/main(若启用校验和)pg_check 可用于物理一致性检查(使用需谨慎,优先在测试环境验证)pg_dump -U <user> -h <host> -p <port> -F c -b -v -f backup.dump <db>recovery_target_time 或 recovery_target_lsn 配置 recovery.conf/postgresql.auto.conf 后启动SELECT pg_reload_conf();(无需重启的配置),或 sudo systemctl restart postgresql(需重启的配置)