A Hands-On Guide to Configuring HDFS High Availability (HA) on Ubuntu
1. Architecture and Prerequisites
Five Ubuntu nodes are used throughout this guide: ubuntu1 and ubuntu2 host the active/standby NameNodes, ubuntu3–ubuntu5 host the JournalNodes, and ubuntu1–ubuntu3 form the ZooKeeper ensemble. Disable the firewall with sudo ufw disable, or open only the required ports.

2. Installing and Configuring ZooKeeper
On each ZooKeeper node, set the following in conf/zoo.cfg:

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=ubuntu1:2888:3888
server.2=ubuntu2:2888:3888
server.3=ubuntu3:2888:3888

Create the matching myid file (containing 1, 2, or 3) under dataDir on each node, then start each server with bin/zkServer.sh start and confirm with bin/zkServer.sh status (each node should report leader or follower).

3. Configuring Hadoop and HDFS HA
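Before moving on to Hadoop, the ZooKeeper steps above can be condensed into a small staging script. This is a sketch: the ./zk-staging directory and the MYID/STAGE variables are conveniences of the sketch, not part of the guide; copy the staged files into place on each node afterwards.

```shell
#!/usr/bin/env bash
# Stage the zoo.cfg and per-node myid file described above into a local
# directory, ready to be copied to the node (copying into /etc and
# /var/lib/zookeeper typically needs sudo, so we stage first).
set -euo pipefail

MYID="${MYID:-1}"               # set to 1, 2, or 3 depending on the node
STAGE="${STAGE:-./zk-staging}"  # assumption of this sketch
mkdir -p "$STAGE"

cat > "$STAGE/zoo.cfg" <<'EOF'
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=ubuntu1:2888:3888
server.2=ubuntu2:2888:3888
server.3=ubuntu3:2888:3888
EOF

# ZooKeeper reads dataDir/myid to learn which server.N entry it is.
echo "$MYID" > "$STAGE/myid"

echo "Staged: copy $STAGE/zoo.cfg to conf/ and $STAGE/myid to /var/lib/zookeeper/"
```

Run it once per node with the appropriate MYID (e.g. `MYID=2 bash stage-zk.sh` on ubuntu2).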
Install the same Hadoop release on every node, keeping versions and configuration identical, and set export JAVA_HOME=... in hadoop-env.sh.

In core-site.xml:

<property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
<property><name>ha.zookeeper.quorum</name><value>ubuntu1:2181,ubuntu2:2181,ubuntu3:2181</value></property>

In hdfs-site.xml:

<property><name>dfs.nameservices</name><value>mycluster</value></property>
<property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>ubuntu1:8020</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>ubuntu2:8020</value></property>
<property><name>dfs.namenode.http-address.mycluster.nn1</name><value>ubuntu1:9870</value></property>
<property><name>dfs.namenode.http-address.mycluster.nn2</name><value>ubuntu2:9870</value></property>
<property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://ubuntu3:8485;ubuntu4:8485;ubuntu5:8485/mycluster</value></property>
<property><name>dfs.journalnode.edits.dir</name><value>/data/hadoop/journal</value></property>
<property><name>dfs.client.failover.proxy.provider.mycluster</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
<property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
<property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/home/hadoop/.ssh/id_rsa</value></property>
<property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>

Distribute the configuration to the other nodes with scp -r $HADOOP_HOME/etc/hadoop ubuntu{2..5}:$HADOOP_HOME/etc/ and run source /etc/profile so the environment variables take effect.

4. Starting and Verifying HA
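The dfs.namenode.shared.edits.dir value above determines the JournalNode quorum, and JournalNodes (like ZooKeeper servers) should come in an odd number so a majority survives one failure. A quick local sanity check of that URI might look like this sketch (the parsing is plain bash string manipulation, not an HDFS tool):

```shell
#!/usr/bin/env bash
# Parse the qjournal:// URI from hdfs-site.xml and count the JournalNodes.
set -euo pipefail

uri="qjournal://ubuntu3:8485;ubuntu4:8485;ubuntu5:8485/mycluster"

hosts="${uri#qjournal://}"   # strip the scheme
hosts="${hosts%/*}"          # strip the trailing /mycluster journal id
IFS=';' read -r -a jns <<< "$hosts"

count="${#jns[@]}"
echo "JournalNodes: $count (${jns[*]})"
if (( count % 2 == 0 )); then
  echo "WARNING: an even JournalNode count gains no extra fault tolerance" >&2
fi
```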
Start the services in strict order:

1) On the ZooKeeper nodes: zkServer.sh start
2) On ubuntu3–ubuntu5: hdfs --daemon start journalnode
3) On nn1 (ubuntu1): hdfs namenode -format, then hdfs namenode -initializeSharedEdits, then hdfs --daemon start namenode
4) On nn2 (ubuntu2): hdfs namenode -bootstrapStandby
5) hdfs zkfc -formatZK (once), then hdfs --daemon start zkfc on both NameNodes
6) hdfs --daemon start datanode on the DataNodes (subsequent restarts can simply use start-dfs.sh)

Then verify the cluster:

- hdfs haadmin -getServiceState nn1 and hdfs haadmin -getServiceState nn2 should show one active and one standby.
- hdfs dfsadmin -report should list the expected Live Nodes and capacity.
- The web UIs at http://ubuntu1:9870 and http://ubuntu2:9870 should agree with the haadmin output.
- To test automatic failover, find the active NameNode's PID with jps, kill -9 <PID>, and confirm the standby becomes active.

5. Common Problems and Troubleshooting
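The "one active, one standby" check above is easy to script. In this sketch the check is factored into a function that takes the two state strings, so the logic can be exercised anywhere; on a real cluster the inputs would come from hdfs haadmin -getServiceState, which prints "active" or "standby".

```shell
#!/usr/bin/env bash
# Return 0 iff exactly one of the two NameNode states is "active" and the
# other is "standby"; anything else (two actives, two standbys, an error
# string) is reported as unhealthy.
check_ha_states() {
  local s1="$1" s2="$2"
  if [ "$s1" = "active" ] && [ "$s2" = "standby" ]; then return 0; fi
  if [ "$s1" = "standby" ] && [ "$s2" = "active" ]; then return 0; fi
  echo "unexpected HA states: nn1=$s1 nn2=$s2" >&2
  return 1
}

# On a live cluster:
#   check_ha_states "$(hdfs haadmin -getServiceState nn1)" \
#                   "$(hdfs haadmin -getServiceState nn2)"
check_ha_states active standby && echo "HA state looks healthy"
```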
- If nodes cannot reach each other, verify the firewall really is off (sudo ufw status) or that every required port is open.
- If a NameNode or JournalNode refuses to start after a failed or repeated format, clear the stale metadata (rm -rf /data/hadoop/journal/mycluster/* and /home/hadoop/dfs/data/* on the affected nodes) and restart.
- sshfence requires dfs.ha.fencing.ssh.private-key-files to point at a private key that allows passwordless SSH to the peer NameNode.
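As an alternative to ufw disable, the specific ports this guide uses can be opened individually. This sketch only prints the ufw commands for review rather than executing them (running them needs sudo); note the guide does not list DataNode transfer/IPC ports, so those would still need to be added.

```shell
#!/usr/bin/env bash
# Emit one `ufw allow` command per port used in this guide.
ports=(
  2181   # ZooKeeper client
  2888   # ZooKeeper peer connections
  3888   # ZooKeeper leader election
  8020   # NameNode RPC
  9870   # NameNode web UI
  8485   # JournalNode RPC
)
for p in "${ports[@]}"; do
  echo "ufw allow ${p}/tcp"
done
# DataNode ports are intentionally omitted: the guide does not list them.
```

Pipe the output through `sudo bash` once the list looks right.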