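On CentOS, HDFS data can be recovered with the methods below. Validate each procedure in a test environment before running it on a production cluster.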
Trash recovery
Enable the trash in core-site.xml. Both values are in minutes, so deleted files are kept for 120 minutes here:
<property>
  <name>fs.trash.interval</name>
  <value>120</value>
</property>
<property>
  <name>fs.trash.checkpoint.interval</name>
  <value>120</value>
</property>
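To confirm that the trash setting is actually in effect, the configured value can be read back with hdfs getconf. The sketch below is only an illustration: check_trash is a hypothetical helper name, and it reports the configuration visible on the host where it runs, which may differ from what the NameNode uses.

```shell
# Sketch: report whether the HDFS trash is enabled in the local configuration.
# check_trash is a hypothetical helper; it only reads configuration.
check_trash() {
    # fs.trash.interval is expressed in minutes; 0 means the trash is disabled.
    interval=$(hdfs getconf -confKey fs.trash.interval) || return 1
    if [ "$interval" -gt 0 ]; then
        echo "trash enabled: deleted files kept for $interval minutes"
    else
        echo "trash disabled: deletes are permanent"
    fi
}
```

Running check_trash on a client node before relying on trash recovery shows whether deletes made from that node would have been kept at all.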
Copy the deleted file back out of the trash:
hdfs dfs -cp /user/username/.Trash/Current/deleted_file /path/to/restore
Snapshot recovery
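When a directory already has snapshots, it can help to see which of them still contain the lost file before copying anything back. The following is a minimal sketch under stated assumptions: find_in_snapshots is a hypothetical helper, and it assumes a Hadoop release whose hdfs dfs -ls supports the -C (paths only) option.

```shell
# Sketch: print every snapshot copy of a file that still exists.
# Usage: find_in_snapshots SNAPSHOTTABLE_DIR RELATIVE_PATH
# find_in_snapshots is a hypothetical helper name.
find_in_snapshots() {
    dir=$1
    rel=$2
    # -C prints bare paths, one snapshot directory per line (assumed available).
    hdfs dfs -ls -C "$dir/.snapshot" | while read -r snap; do
        # -test -e exits 0 when the path exists inside that snapshot.
        if hdfs dfs -test -e "$snap/$rel" </dev/null; then
            echo "$snap/$rel"
        fi
    done
}
```

Any path it prints can be used as the source of an hdfs dfs -cp restore.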
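Allow snapshots on the directory, create a snapshot, and later copy files back from it:
hdfs dfsadmin -allowSnapshot /path
hdfs dfs -createSnapshot /path snapshot_name
hdfs dfs -cp /path/.snapshot/snapshot_name/file /path/to/restore
Tool-based recovery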
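A quick health check can narrow down which paths need attention before a detailed fsck run. This sketch assumes that the fsck summary contains the word HEALTHY for an intact tree, as common Hadoop releases print; check_health is a hypothetical helper name.

```shell
# Sketch: summarize fsck output for one path.
# check_health is a hypothetical helper; it relies on the fsck summary
# containing "HEALTHY" for an intact tree (an assumption about the output).
check_health() {
    path=$1
    if hdfs fsck "$path" | grep -q "HEALTHY"; then
        echo "$path: healthy"
    else
        echo "$path: needs attention, rerun fsck with -files -blocks -locations"
        return 1
    fi
}
```

The non-zero return value makes the helper usable in scripts, for example to skip healthy directories.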
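Check file and block health with fsck, then copy data from another cluster with DistCp:
hdfs fsck /path -files -blocks -locations
hadoop distcp hdfs://source:port/path hdfs://dest:port/path
Manual recovery (high risk)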
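Because the manual procedure overwrites NameNode metadata, it seems prudent to save a copy of the current fsimage first, while the NameNode is still running. The sketch below uses hdfs dfsadmin -fetchImage for that; backup_fsimage and the timestamped directory layout are assumptions made for illustration.

```shell
# Sketch: download the latest fsimage into a timestamped local directory.
# Usage: backup_fsimage LOCAL_BACKUP_ROOT
# backup_fsimage and the directory naming are hypothetical.
backup_fsimage() {
    dest=$1/fsimage-$(date +%Y%m%d-%H%M%S)
    mkdir -p "$dest" || return 1
    # -fetchImage needs a running NameNode; its messages are sent to stderr
    # so that stdout carries only the backup location.
    hdfs dfsadmin -fetchImage "$dest" >&2 || return 1
    echo "$dest"
}
```

The printed directory is the place to return to if the manual recovery has to be rolled back.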
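Stop the NameNode and DataNode services (service names vary by distribution):
sudo systemctl stop hadoop-namenode hadoop-datanode
Overwrite the new cluster's NameNode metadata with the fsimage file.
Note: the chance of a successful recovery depends on how long ago the data was lost and on the state of the cluster. Prefer enabling the trash and snapshots, and back up data regularly.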
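One way to make snapshots a routine safeguard is to create a dated snapshot on a schedule. The following is a minimal sketch, not a complete backup policy: daily_snapshot and the daily-YYYYMMDD naming are assumptions, and the directory must already allow snapshots.

```shell
# Sketch: create a snapshot named after the current date.
# Usage: daily_snapshot SNAPSHOTTABLE_DIR
# daily_snapshot and the "daily-" prefix are hypothetical choices.
daily_snapshot() {
    dir=$1
    name=daily-$(date +%Y%m%d)
    hdfs dfs -createSnapshot "$dir" "$name"
}
```

Called once a day from a scheduler such as cron, this leaves one restore point per day; old snapshots still have to be removed separately.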