Managing a Hadoop cluster on Linux covers several areas, including installation, configuration, monitoring, and maintenance. The key steps and best practices are outlined below.
Use the tar command to extract the downloaded archive:

```shell
tar -xzvf hadoop-3.x.x.tar.gz -C /opt
```
Edit /etc/profile or ~/.bashrc and add the Hadoop environment variables:

```shell
export HADOOP_HOME=/opt/hadoop-3.x.x
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```
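After editing, reload the profile and confirm the hadoop binary is on PATH; `hadoop version` ships with the standard distribution, so this is a quick sanity check:

```shell
# Reload the profile so the new variables take effect in this shell
source /etc/profile
echo "HADOOP_HOME=${HADOOP_HOME}"
# Prints the installed Hadoop version if PATH is set correctly
hadoop version
```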
Edit the configuration files under $HADOOP_HOME/etc/hadoop and copy them to every node. In core-site.xml, set the default filesystem and the temporary directory:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-3.x.x/tmp</value>
  </property>
</configuration>
```
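One way to push the edited configuration to the other nodes is rsync over SSH; `worker1` and `worker2` below are hypothetical hostnames, so substitute the nodes of your own cluster:

```shell
# Hypothetical worker hostnames -- replace with your cluster's nodes.
# Requires passwordless SSH from this machine to each worker.
for host in worker1 worker2; do
  rsync -av "$HADOOP_HOME/etc/hadoop/" "$host:$HADOOP_HOME/etc/hadoop/"
done
```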
In hdfs-site.xml, set the replication factor and the NameNode/DataNode storage directories:

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop-3.x.x/data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop-3.x.x/data/datanode</value>
  </property>
</configuration>
```
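The storage directories referenced in hdfs-site.xml are not always created automatically, so it is safest to create them and set ownership before formatting; the `hadoop` user below is an assumption, adjust it to whichever account runs your daemons:

```shell
# Create the NameNode and DataNode storage directories from hdfs-site.xml
mkdir -p /opt/hadoop-3.x.x/data/namenode /opt/hadoop-3.x.x/data/datanode
# Assumption: the daemons run as the 'hadoop' user -- adjust to your setup
chown -R hadoop:hadoop /opt/hadoop-3.x.x/data
```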
In yarn-site.xml, set the ResourceManager host and the resources each NodeManager offers to containers:

```xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>resourcemanager</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
  </property>
</configuration>
```
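Before fixing memory-mb and cpu-vcores, it helps to check what the machine actually has; the 25% head-room below is only a rough rule of thumb to leave room for the OS and HDFS daemons, not a hard requirement:

```shell
# Inspect physical resources to size the NodeManager settings
total_mb=$(free -m | awk '/^Mem:/{print $2}')
total_cores=$(nproc)
echo "Physical memory: ${total_mb} MB, cores: ${total_cores}"
# Reserve ~25% of RAM for the OS and other daemons (rule of thumb)
echo "Suggested yarn.nodemanager.resource.memory-mb: $((total_mb * 3 / 4))"
echo "Suggested yarn.nodemanager.resource.cpu-vcores: ${total_cores}"
```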
In mapred-site.xml, run MapReduce on YARN and point clients at the JobHistory Server:

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>jobhistoryserver:10020</value>
  </property>
</configuration>
```
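The JobHistory Server referenced above is not started by start-dfs.sh or start-yarn.sh; in Hadoop 3.x it is started separately on the host named in mapreduce.jobhistory.address:

```shell
# Start the MapReduce JobHistory Server (Hadoop 3.x daemon syntax)
mapred --daemon start historyserver
```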
Format the NameNode before the first start (this erases existing HDFS metadata, so run it only once, on the NameNode host):

```shell
hdfs namenode -format
```
Start HDFS and YARN:

```shell
start-dfs.sh
start-yarn.sh
```
Stop the cluster:

```shell
stop-yarn.sh
stop-dfs.sh
```
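After starting the daemons, a quick way to confirm the cluster is healthy; jps ships with the JDK, the other two commands with Hadoop:

```shell
# Java processes on this node -- expect NameNode/DataNode/ResourceManager/
# NodeManager entries depending on the node's role
jps
# HDFS capacity and live/dead DataNodes
hdfs dfsadmin -report
# NodeManagers registered with the ResourceManager
yarn node -list
```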
Daemon logs are written under the $HADOOP_HOME/logs directory. With the steps above you can manage and maintain a Hadoop cluster on Linux effectively. Keep in mind that every cluster has different requirements, so adjust the configuration to your actual situation.