[Ramblings] CentOS 7.9 -- Hadoop Cluster Installation

Preface

Java

Hadoop needs Java as a dependency, so install Java first. Use the following command to install Java 1.8 (OpenJDK) with yum:

yum -y install java-1.8.0-openjdk java-1.8.0-openjdk-devel

Edit the environment configuration file

vi /etc/profile

Append the following configuration at the end of the file:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.352.b08-2.el7_9.x86_64
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar 
export PATH=$PATH:$JAVA_HOME/bin
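
The JDK directory name above depends on the exact OpenJDK build that yum installed, so it may differ on your machine; one quick way to confirm the real path (a small sketch) is to resolve the java binary:

# Resolve the real path behind the `java` command
readlink -f "$(which java)"
# prints something like /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.352.b08-2.el7_9.x86_64/jre/bin/java
# everything before /jre/bin/java is the value to use for JAVA_HOME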

Reload the environment variables

source /etc/profile

Verify that Java was installed successfully

java

java -version
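
If the installation succeeded, java -version prints something along these lines (the exact version and build strings depend on the OpenJDK package yum installed, so treat this as a rough sample):

openjdk version "1.8.0_352"
OpenJDK Runtime Environment (build 1.8.0_352-b08)
OpenJDK 64-Bit Server VM (build 25.352-b08, mixed mode)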

Hadoop Cluster Deployment

Turn off the firewall

systemctl stop firewalld

Check whether it was stopped successfully

systemctl status firewalld
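
Stopping the service only lasts until the next reboot; to keep firewalld from coming back after a restart (optional, but common for a test cluster), disable it as well:

systemctl disable firewalld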

Passwordless SSH login

ssh-keygen -t rsa -P ''

cat ~/.ssh/id_rsa.pub

vi ~/.ssh/authorized_keys

# Copy the public key of each of the three machines into the file above, then distribute it to /root/.ssh/ on every server
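
Alternatively, the distribution can be done with ssh-copy-id, which appends the local public key to the remote authorized_keys for you (a sketch; run it on each machine against the other hosts, using the host names configured below or the raw IP addresses, and enter the root password once per host):

ssh-copy-id -i ~/.ssh/id_rsa.pub root@host01
ssh-copy-id -i ~/.ssh/id_rsa.pub root@host02
ssh-copy-id -i ~/.ssh/id_rsa.pub root@host03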

Test the passwordless login
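
For example, logging in to another node should no longer prompt for a password (host names as configured in the next section, or use the IPs directly):

ssh root@host02
exit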

Download Hadoop

Tsinghua mirror link

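Hadoop 2.7.7 is also kept in the Apache release archive; if the mirror layout has changed, the tarball can be fetched like this (the archive URL below is an assumption of convenience, not necessarily the link used in the original post):

wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.7/hadoop-2.7.7.tar.gz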

Modify the host names and configure the hostname

vi /etc/hosts
  • Add the following IP addresses and host names:
172.28.160.76 host01
172.28.160.77 host02
172.28.160.78 host03
  • Restart the network so the change takes effect:
service network restart
  • An error was reported when restarting the network

  • Fix: back up the ifcfg-ens18 file and delete it, then try restarting the network again:
service network restart
  • The network restarted successfully; check whether the modified hosts entries have taken effect

  • Make sure every machine can ping each of the host01-host03 hosts (see the check below)
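
A quick connectivity check, run from each of the three machines (a simple sketch):

ping -c 3 host01
ping -c 3 host02
ping -c 3 host03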

Install Hadoop

  • Upload the downloaded Hadoop tarball to each server (distribution is sketched after this list) and extract it:
tar -zxvf hadoop-2.7.7.tar.gz
  • Create the following directories under the hadoop-2.7.7 directory (same operation on all three machines):
mkdir  /root/hadoop-2.7.7/hdfs
mkdir  /root/hadoop-2.7.7/hdfs/tmp
mkdir  /root/hadoop-2.7.7/hdfs/name
mkdir  /root/hadoop-2.7.7/hdfs/data
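
Distributing the tarball from host01 to the other nodes can be done with scp before extracting, for example (assuming the file sits in /root, as the paths above suggest):

scp /root/hadoop-2.7.7.tar.gz root@host02:/root/
scp /root/hadoop-2.7.7.tar.gz root@host03:/root/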

Edit the core-site.xml file

  • Edit /root/hadoop-2.7.7/etc/hadoop/core-site.xml
  • Set up the configuration block as follows:
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/root/hadoop-2.7.7/hdfs/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://host01:9000</value>
    </property>
</configuration>

Edit the hadoop-env.sh file

Change export JAVA_HOME to point to the Java installation directory:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.352.b08-2.el7_9.x86_64

Edit the hdfs-site.xml file and add the following properties inside the configuration element

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/root/hadoop-2.7.7/hdfs/name</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/root/hadoop-2.7.7/hdfs/data</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>host01:9001</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
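
With three DataNodes the HDFS default replication factor of 3 applies as-is; if you prefer to make it explicit (an optional addition, not part of the original steps), a dfs.replication property can sit alongside the ones above:

    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>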

Copy mapred-site.xml.template to mapred-site.xml (see the command below) and add the following inside the configuration element:

    <property> 
        <name>mapreduce.framework.name</name>
        <value>yarn</value> 
    </property>
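
The copy step itself is a single command (paths assume the install location used above):

cp /root/hadoop-2.7.7/etc/hadoop/mapred-site.xml.template /root/hadoop-2.7.7/etc/hadoop/mapred-site.xml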

Edit the slaves file

This step only needs to be done on the master node; replace localhost with the host names (listing host01 here means the master also runs a DataNode and NodeManager):

host01
host02
host03

Edit yarn-site.xml and add the following inside the configuration element

<property>
     <name>yarn.resourcemanager.hostname</name>
     <value>host01</value>
</property>
<property>
     <description>The address of the applications manager interface in the RM.</description>
     <name>yarn.resourcemanager.address</name>
     <value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
     <description>The address of the scheduler interface.</description>
     <name>yarn.resourcemanager.scheduler.address</name>
     <value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
     <description>The http address of the RM web application.</description>
     <name>yarn.resourcemanager.webapp.address</name>
     <value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
     <description>The https address of the RM web application.</description>
     <name>yarn.resourcemanager.webapp.https.address</name>
     <value>${yarn.resourcemanager.hostname}:8090</value>
</property>
<property>
     <name>yarn.resourcemanager.resource-tracker.address</name>
     <value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
     <description>The address of the RM admin interface.</description>
     <name>yarn.resourcemanager.admin.address</name>
     <value>${yarn.resourcemanager.hostname}:8033</value>
</property>
<property>
     <name>yarn.nodemanager.aux-services</name>
     <value>mapreduce_shuffle</value>
</property>
<property>
     <name>yarn.scheduler.maximum-allocation-mb</name>
     <value>8192</value>
     <description>Maximum memory that can be allocated to a single container request, in MB (the default is 8192).</description>
</property>
<property>
     <name>yarn.nodemanager.vmem-pmem-ratio</name>
     <value>2.1</value>
</property>
<property>
     <name>yarn.nodemanager.resource.memory-mb</name>
     <value>2048</value>
</property>
<property>
     <name>yarn.nodemanager.vmem-check-enabled</name>
     <value>false</value>
</property>
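
If these configuration files were edited only on host01, sync the whole etc/hadoop directory to the other nodes before starting anything; a quick sketch with scp:

scp /root/hadoop-2.7.7/etc/hadoop/* root@host02:/root/hadoop-2.7.7/etc/hadoop/
scp /root/hadoop-2.7.7/etc/hadoop/* root@host03:/root/hadoop-2.7.7/etc/hadoop/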

Initialize (format) the NameNode

cd /root/hadoop-2.7.7/bin
./hadoop namenode -format
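
In Hadoop 2.x the hdfs launcher is the preferred entry point; the form above still works but prints a deprecation warning. The equivalent command is:

./hdfs namenode -format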

Start Hadoop

cd /root/hadoop-2.7.7/sbin

./start-all.sh

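After start-up, jps (shipped with the java-1.8.0-openjdk-devel package installed earlier) lists the running daemons on each node. With the slaves file above, host01 is expected to show NameNode, SecondaryNameNode, ResourceManager, DataNode and NodeManager, while host02 and host03 show DataNode and NodeManager:

jps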

Verify the services

http://172.28.160.76:50070/dfshealth.html#tab-overview
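
Port 50070 is the default NameNode web UI port in Hadoop 2.x; the YARN ResourceManager web UI should likewise be reachable on the port configured above (8088):

http://172.28.160.76:8088/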
