Steps to Set Up a Hadoop Cluster on CentOS
Give each node a static IP (192.168.1.101 through 192.168.1.106), set the gateway and DNS (e.g. 8.8.8.8), and confirm the nodes can reach one another (verify with ping); a sketch of the interface configuration follows the host list below. Set the hostnames (hadoop01 through hadoop06), for example on the first node: hostnamectl set-hostname hadoop01. Edit /etc/hosts and add the IP-to-hostname mappings (identical on every node):
192.168.1.101 hadoop01
192.168.1.102 hadoop02
192.168.1.103 hadoop03
...(add the remaining nodes in the same way)
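On CentOS the static address is usually set in the interface's ifcfg file; a minimal sketch for the first node, assuming the interface is named ens33 and the gateway is 192.168.1.1 (both are assumptions to adapt), in /etc/sysconfig/network-scripts/ifcfg-ens33:
DEVICE=ens33
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.1.101
PREFIX=24
GATEWAY=192.168.1.1
DNS1=8.8.8.8
After editing, restart networking (e.g. systemctl restart network on CentOS 7) and repeat the ping test between nodes.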
Stop the firewall: systemctl stop firewalld; disable it permanently: systemctl disable firewalld. Disable SELinux for the current session: setenforce 0; disable it permanently: edit /etc/selinux/config and change SELINUX=enforcing to SELINUX=disabled. Remove any preinstalled JDK: rpm -qa | grep jdk | xargs -n1 rpm -e --nodeps. Install JDK 1.8: yum install -y java-1.8.0-openjdk-devel. Verify: java -version (should report version 1.8.0). Edit /etc/profile and add:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export PATH=$PATH:$JAVA_HOME/bin
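The exact OpenJDK directory under /usr/lib/jvm can vary with the package build, so it is worth confirming the real path before relying on the JAVA_HOME value above (an optional check):
readlink -f /usr/bin/java
# prints something like /usr/lib/jvm/java-1.8.0-openjdk-<build>/jre/bin/java;
# JAVA_HOME should point at the directory above jre/bin (or at the
# /usr/lib/jvm/java-1.8.0-openjdk symlink if the package provides it)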
Apply the changes: source /etc/profile. Download Hadoop (version 3.3.4): wget https://downloads.apache.org/hadoop/core/hadoop-3.3.4/hadoop-3.3.4.tar.gz. Extract it to /opt: tar -zxvf hadoop-3.3.4.tar.gz -C /opt/. Rename the directory: mv /opt/hadoop-3.3.4 /opt/hadoop. Edit /etc/profile and add:
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
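Once /etc/profile has been re-sourced in the next step, a quick sanity check that PATH and HADOOP_HOME resolve correctly is to print the installed version:
hadoop version
# should report Hadoop 3.3.4 if the archive was extracted and the variables are set correctly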
Apply the changes: source /etc/profile. On the master node (hadoop01), generate an SSH key pair: ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa. Copy the public key to every node (a loop version is sketched after this list):
ssh-copy-id hadoop01
ssh-copy-id hadoop02
ssh-copy-id hadoop03
...(copy to the remaining nodes in the same way)
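With six nodes, the key distribution can also be done in one loop; a sketch assuming the hostnames defined earlier:
for host in hadoop01 hadoop02 hadoop03 hadoop04 hadoop05 hadoop06; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub "$host"   # prompts once per node for the password
done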
Test passwordless login: ssh hadoop02 (no password prompt should appear). Then go to the $HADOOP_HOME/etc/hadoop directory and edit the following core configuration files:
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop01:9000</value> <!-- NameNode RPC address -->
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/tmp</value> <!-- temporary directory -->
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value> <!-- number of data replicas (adjust to the number of nodes) -->
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop/dfs/name</value> <!-- NameNode metadata directory -->
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop/dfs/data</value> <!-- DataNode data directory -->
  </property>
</configuration>
mapred-site.xml (in Hadoop 3.x this file already exists, so the cp mapred-site.xml.template mapred-site.xml step from 2.x-era guides is unnecessary; edit the file directly):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value> <!-- use YARN as the resource manager -->
  </property>
</configuration>
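Depending on the environment, MapReduce jobs submitted to YARN on Hadoop 3.x may fail to find the MapReduce framework classes unless the application classpath is configured; an optional addition to mapred-site.xml, using the /opt/hadoop install path from above (a sketch, verify against your setup):
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/share/hadoop/mapreduce/lib/*</value>
  </property>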
yarn-site.xml:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop01</value> <!-- node that runs the ResourceManager -->
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value> <!-- shuffle service -->
  </property>
</configuration>
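Hadoop's start scripts launch the daemons over non-interactive SSH sessions, which do not read /etc/profile, so it is common to also set JAVA_HOME in Hadoop's own environment file (a sketch; the path matches the JAVA_HOME value used earlier):
echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk' >> $HADOOP_HOME/etc/hadoop/hadoop-env.sh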
Edit $HADOOP_HOME/etc/hadoop/workers and add the hostname of every DataNode:
hadoop01
hadoop02
hadoop03
...(add the remaining DataNodes in the same way)
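If Hadoop was unpacked and configured only on hadoop01, the installation and the profile changes generally have to be copied to every other node before formatting and starting the cluster; a sketch using scp with the hostnames from above:
for host in hadoop02 hadoop03 hadoop04 hadoop05 hadoop06; do
  scp -r /opt/hadoop "$host":/opt/        # Hadoop install, including etc/hadoop configs
  scp /etc/profile "$host":/etc/profile   # JAVA_HOME / HADOOP_HOME exports
done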
On hadoop01, format the NameNode (needed only before the first start):
hdfs namenode -format
Start HDFS (on hadoop01):
start-dfs.sh
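An optional quick check that HDFS is working (the paths are only examples):
hdfs dfs -mkdir -p /test
hdfs dfs -put /etc/hosts /test/
hdfs dfs -ls /test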
Check HDFS status: hdfs dfsadmin -report (shows the state of every DataNode). On the ResourceManager node (hadoop01), start YARN:
start-yarn.sh
Check YARN status: yarn node -list (shows the state of every NodeManager). Start the JobHistory Server:
mapred --daemon start historyserver
The JobHistory web UI is at http://hadoop01:19888. Other web UIs: http://hadoop01:9870 (NameNode and DataNode status) and http://hadoop01:8088 (ResourceManager and NodeManager status). Command-line checks: hdfs dfs -ls /, yarn node -list, and jps (the master node should show NameNode and ResourceManager; worker nodes should show DataNode and NodeManager). Notes: the JAVA_HOME and HADOOP_HOME environment variables must be identical on all nodes, and the hostnames in the workers file must match the mappings in /etc/hosts.
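As a final end-to-end check, the example MapReduce job shipped with the distribution can be submitted to YARN (a sketch; the jar name must match the installed version, shown here for 3.3.4):
yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 2 10
Progress also appears in the ResourceManager UI at http://hadoop01:8088 and, after completion, in the JobHistory UI at http://hadoop01:19888.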