Hadoop Cluster Setup


Host Allocation

| host         | hostname | OS       |
| ------------ | -------- | -------- |
| 192.168.2.39 | ops39    | CentOS 7 |
| 192.168.2.38 | ops38    | CentOS 7 |
| 192.168.2.33 | ops33    | CentOS 7 |

Role Assignment

|      | ops39              | ops38                       | ops33                        |
| ---- | ------------------ | --------------------------- | ---------------------------- |
| HDFS | NameNode, DataNode | SecondaryNameNode, DataNode | DataNode                     |
| YARN | NodeManager        | NodeManager                 | ResourceManager, NodeManager |

Environment Preparation

Set Hostnames

hostnamectl set-hostname ops39   # run on 192.168.2.39
hostnamectl set-hostname ops38   # run on 192.168.2.38
hostnamectl set-hostname ops33   # run on 192.168.2.33

Configure the hosts File

Add the following entries to /etc/hosts on all three hosts:

vim /etc/hosts

192.168.2.33  ops33
192.168.2.38  ops38
192.168.2.39  ops39
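As a quick optional check, confirm that each name resolves and the hosts can reach each other:

ping -c 1 ops39
ping -c 1 ops38
ping -c 1 ops33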

Time Synchronization

Check the system time zone and put all machines in the same one, for example UTC+8:

date -R

cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
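Setting the time zone alone does not keep the clocks from drifting. If you also want the clocks synchronized, one common option is to run an NTP client such as chrony on every node; a minimal sketch, assuming the default CentOS 7 repositories and internet access:

yum install -y chrony      # install the chrony NTP client
systemctl start chronyd    # start the service now
systemctl enable chronyd   # and on every boot
chronyc tracking           # verify that the clock is being synchronized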

Disable the Firewall

systemctl stop firewalld
systemctl disable firewalld

Configure Passwordless SSH Login

Passwordless SSH serves two purposes:

  1. it lets the NameNode issue commands to the DataNodes
  2. it lets the ResourceManager issue commands to the NodeManagers

In addition, the NameNode host needs passwordless SSH to itself, and so does the ResourceManager host. Following the role assignment at the start of this article, that means (one possible setup is sketched after this list):

  • configure ops39 with passwordless login to ops38, ops33, and itself
  • configure ops33 with passwordless login to ops39, ops38, and itself
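A minimal way to do this (assuming the root account, which the scp commands later also use) is to generate a key on ops39 and copy it to every host, then repeat the same steps on ops33:

ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa   # generate a key pair (skip if one already exists)
ssh-copy-id root@ops39                     # copy the public key to each target host,
ssh-copy-id root@ops38                     # including the local machine itself
ssh-copy-id root@ops33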

Cluster Setup

Download Hadoop

cd /opt
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
tar -xf hadoop-3.3.1.tar.gz -C /usr/local/

Configure Hadoop Environment Variables

vim /etc/profile
    export HADOOP_HOME=/usr/local/hadoop-3.3.1
    export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
    
source /etc/profile

Download JDK 8

cd /opt 
wget https://mirrors.tuna.tsinghua.edu.cn/AdoptOpenJDK/8/jdk/x64/linux/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz
tar -xf OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz -C /usr/local
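The extracted directory name should match the JAVA_HOME used in hadoop-env.sh below; you can check it directly:

/usr/local/jdk8u292-b10/bin/java -version   # should report version 1.8.0_292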

Configure hadoop-env.sh

vim /usr/local/hadoop-3.3.1/etc/hadoop/hadoop-env.sh
    export JAVA_HOME=/usr/local/jdk8u292-b10
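With JAVA_HOME set here and the PATH exported earlier, the Hadoop scripts should now work; a quick sanity check:

hadoop version   # should print Hadoop 3.3.1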

Configure core-site.xml

mkdir /usr/local/hadoop-3.3.1/tmp
mkdir /usr/local/hadoop-3.3.1/hdfs_data

vim /usr/local/hadoop-3.3.1/etc/hadoop/core-site.xml

<configuration>
    <property>
        <!-- The NameNode address and port; the IP can also be replaced with the hostname -->
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.2.39:9000</value>
    </property>

    <property>
        <!-- Base directory for temporary data; use an absolute path and avoid symbols such as ~ -->
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop-3.3.1/tmp</value>
    </property>

    <property>
        <!-- Where DataNodes store HDFS block data on the local filesystem
             (this property is more commonly placed in hdfs-site.xml) -->
        <name>dfs.datanode.data.dir</name>
        <value>/usr/local/hadoop-3.3.1/hdfs_data</value>
    </property>
    <property>
        <!-- Static username used when accessing HDFS through the web UI -->
        <name>hadoop.http.staticuser.user</name>
        <value>root</value>
    </property>
</configuration>

Configure hdfs-site.xml

vim /usr/local/hadoop-3.3.1/etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <!-- The SecondaryNameNode address and port; the IP can also be replaced with the hostname -->
        <name>dfs.namenode.secondary.http-address</name>
        <value>192.168.2.38:9868</value>
    </property>
</configuration>

Configure mapred-site.xml

vim /usr/local/hadoop-3.3.1/etc/hadoop/mapred-site.xml

<configuration>
    <property>
        <!-- Run MapReduce on the YARN framework -->
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Configure yarn-site.xml

vim /usr/local/hadoop-3.3.1/etc/hadoop/yarn-site.xml

<configuration>
    <property>
        <!-- Use the default MapReduce shuffle as the auxiliary service -->
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <property>
        <!-- The ResourceManager RPC address -->
        <name>yarn.resourcemanager.address</name>
        <value>192.168.2.33:8032</value>
    </property>
    <property>
        <!-- The ResourceManager scheduler address -->
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>192.168.2.33:8030</value>
    </property>
    <property>
        <!-- The ResourceManager resource-tracker address (used by the NodeManagers) -->
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>192.168.2.33:8031</value>
    </property>
</configuration>

Configure workers

vim /usr/local/hadoop-3.3.1/etc/hadoop/workers

    ops39
    ops38
    ops33

Copy the Configured Hadoop to the Other Machines

scp -r /usr/local/hadoop-3.3.1 root@ops38:/usr/local
scp -r /usr/local/hadoop-3.3.1 root@ops33:/usr/local
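If the JDK and the /etc/profile changes have so far only been made on ops39, they need to be present on ops38 and ops33 as well, since the copied hadoop-env.sh points at /usr/local/jdk8u292-b10. Assuming the same paths on every node, one way is:

scp -r /usr/local/jdk8u292-b10 root@ops38:/usr/local
scp -r /usr/local/jdk8u292-b10 root@ops33:/usr/local
scp /etc/profile root@ops38:/etc/profile   # overwrites the target's profile; append the two export lines by hand if you prefer
scp /etc/profile root@ops33:/etc/profile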

Format HDFS

Run this once, on the NameNode host (ops39):

/usr/local/hadoop-3.3.1/bin/hdfs namenode -format

Start the Cluster

On the host where the ResourceManager runs (ops33):

/usr/local/hadoop-3.3.1/sbin/start-all.sh
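Once the scripts finish, check that the expected daemons are running on each node and that the web UIs respond (ports below are the Hadoop 3.x defaults):

jps   # run on every host; the processes should match the role-assignment table above

# NameNode web UI:        http://192.168.2.39:9870
# ResourceManager web UI: http://192.168.2.33:8088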

FAQ 1

Problem: the ResourceManager fails to start

Error log:

    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1699)
Caused by: java.net.BindException: Problem binding to [ops33:8032] java.net.BindException: Cannot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:913)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:809)
    at org.apache.hadoop.ipc.Server.bind(Server.java:640)
    at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1225)
    at org.apache.hadoop.ipc.Server.<init>(Server.java:3117)
    at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:1062)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server.<init>(ProtobufRpcEngine2.java:464)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2.getServer(ProtobufRpcEngine2.java:371)
    at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:853)
    at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:173)
    at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132)
    ... 17 more
Caused by: java.net.BindException: Cannot assign requested address
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:461)
    at sun.nio.ch.Net.bind(Net.java:453)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:222)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85)
    at org.apache.hadoop.ipc.Server.bind(Server.java:623)
    ... 25 more

Cause:

The NameNode and the ResourceManager are on different machines. yarn.resourcemanager.address points at ops33, so the ResourceManager can only bind that address on the machine where it is deployed; it cannot be started from another node (here, the start script was run on the wrong host).

Solution:

Run the start command on the ResourceManager host, 192.168.2.33 (ops33).
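An alternative that sidesteps the problem is to start the two layers separately on their respective master hosts (a sketch, relying on the passwordless SSH configured earlier):

# on ops39 (NameNode host)
/usr/local/hadoop-3.3.1/sbin/start-dfs.sh

# on ops33 (ResourceManager host)
/usr/local/hadoop-3.3.1/sbin/start-yarn.sh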

FAQ 2

Problem: creating a directory from the HDFS web UI fails

Log:

Permission denied: user=dr.who, access=READ_EXECUTE, inode="/user":root:supergroup:drwx-wx-wx

Cause:

dr.who is the static username Hadoop uses for HTTP access; its default can be seen in core-default.xml:

hadoop.http.staticuser.user=dr.who

In addition, the HDFS defaults in hdfs-default.xml show that permission checking is enabled by default:

dfs.permissions.enabled=true   # whether permission checking is enabled in HDFS; defaults to true

Solution:

Add the following to the Hadoop configuration file core-site.xml (the core-site.xml above already includes this setting):

<property>
    <!-- Use root as the static web user -->
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
</property>
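Alternatively, on a throwaway test cluster, permission checking can be disabled in hdfs-site.xml instead (not recommended for anything shared):

<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>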

FAQ 3

Problem: YARN cannot find the main class when running a MapReduce job

Log:

[2021-06-25 11:57:35.804]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Solution:

Run hadoop classpath on the command line:

[root@ops39 hadoop-3.3.1]# hadoop classpath
/usr/local/hadoop-3.3.1/etc/hadoop:/usr/local/hadoop-3.3.1/share/hadoop/common/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/common/*:/usr/local/hadoop-3.3.1/share/hadoop/hdfs:/usr/local/hadoop-3.3.1/share/hadoop/hdfs/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/hdfs/*:/usr/local/hadoop-3.3.1/share/hadoop/mapreduce/*:/usr/local/hadoop-3.3.1/share/hadoop/yarn:/usr/local/hadoop-3.3.1/share/hadoop/yarn/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/yarn/*

Add the output above as the value of the yarn.application.classpath property in yarn-site.xml:

    <property>
        <name>yarn.application.classpath</name>
        <value>/usr/local/hadoop-3.3.1/etc/hadoop:/usr/local/hadoop-3.3.1/share/hadoop/common/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/common/*:/usr/local/hadoop-3.3.1/share/hadoop/hdfs:/usr/local/hadoop-3.3.1/share/hadoop/hdfs/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/hdfs/*:/usr/local/hadoop-3.3.1/share/hadoop/mapreduce/*:/usr/local/hadoop-3.3.1/share/hadoop/yarn:/usr/local/hadoop-3.3.1/share/hadoop/yarn/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/yarn/*</value>
    </property>

Restart YARN and rerun the MapReduce job.
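To verify the fix, one quick option is to run the example job that ships with the distribution (the jar below is part of the stock 3.3.1 tarball):

# on ops33
/usr/local/hadoop-3.3.1/sbin/stop-yarn.sh
/usr/local/hadoop-3.3.1/sbin/start-yarn.sh

# then submit the bundled pi example from any node
hadoop jar /usr/local/hadoop-3.3.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar pi 2 10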