Host allocation
| IP | hostname | OS |
|---|---|---|
| 192.168.2.39 | ops39 | CentOS 7 |
| 192.168.2.38 | ops38 | CentOS 7 |
| 192.168.2.33 | ops33 | CentOS 7 |
Role assignment
| | ops39 | ops38 | ops33 |
|---|---|---|---|
| HDFS | NameNode, DataNode | SecondaryNameNode, DataNode | DataNode |
| YARN | NodeManager | NodeManager | ResourceManager, NodeManager |
Environment preparation
Set hostnames
Run the matching command on its respective host:
hostnamectl set-hostname ops39
hostnamectl set-hostname ops38
hostnamectl set-hostname ops33
Configure the hosts file (on all three machines)
vim /etc/hosts
192.168.2.33 ops33
192.168.2.38 ops38
192.168.2.39 ops39
Time synchronization
Check the system time zone and make sure every machine is in the same one, e.g. UTC+8:
date -R
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
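The commands above only align the time zone; if the system clocks themselves drift, an NTP-based sync is also worth adding. A minimal sketch using chrony (package and service names assume CentOS 7):
yum install -y chrony
systemctl enable chronyd
systemctl start chronyd
chronyc tracking    # check that the clock is being synchronized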
Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
Configure passwordless SSH login
Passwordless SSH serves two purposes:
- let the NameNode issue commands to the DataNodes
- let the ResourceManager issue commands to the NodeManagers
Accordingly, the NameNode also needs passwordless login to itself, and the same goes for the ResourceManager. Following the role assignment at the beginning of this article, that means (see the sketch after this list):
- configure ops39 with passwordless login to ops38, ops33, and to itself
- configure ops33 with passwordless login to ops39, ops38, and to itself
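A minimal sketch of the key distribution, assuming you operate as root and password login is still available for the initial copy (run on ops39, then repeat the same steps on ops33):
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa    # generate a key pair without a passphrase
ssh-copy-id root@ops39    # also authorize the key for the local host itself
ssh-copy-id root@ops38
ssh-copy-id root@ops33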
Cluster setup
Download Hadoop
cd /opt
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
tar -xf hadoop-3.3.1.tar.gz -C /usr/local/
Configure Hadoop environment variables
vim /etc/profile
export HADOOP_HOME=/usr/local/hadoop-3.3.1
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
source /etc/profile
Download JDK 8
cd /opt
wget https://mirrors.tuna.tsinghua.edu.cn/AdoptOpenJDK/8/jdk/x64/linux/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz
tar -xf OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz -C /usr/local
Configure hadoop-env.sh
vim /usr/local/hadoop-3.3.1/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/jdk8u292-b10
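As a quick sanity check, assuming /etc/profile has been re-sourced, both the JDK and Hadoop should now report their versions:
/usr/local/jdk8u292-b10/bin/java -version
hadoop version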
Configure core-site.xml
mkdir /usr/local/hadoop-3.3.1/tmp
mkdir /usr/local/hadoop-3.3.1/hdfs_data
vim /usr/local/hadoop-3.3.1/etc/hadoop/core-site.xml
<configuration>
<property>
<!-- NameNode address and port; the IP here can also be replaced with the hostname -->
<name>fs.defaultFS</name>
<value>hdfs://192.168.2.39:9000</value>
</property>
<property>
<!-- Temporary directory for data; use an absolute path and do not use symbols such as ~ -->
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-3.3.1/tmp</value>
</property>
<property>
<!-- Local filesystem directory where HDFS DataNode data is stored -->
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop-3.3.1/hdfs_data</value>
</property>
<property>
<!-- Static user name used when accessing HDFS from the web UI -->
<name>hadoop.http.staticuser.user</name>
<value>root</value>
</property>
</configuration>
Configure hdfs-site.xml
vim /usr/local/hadoop-3.3.1/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<!-- SecondaryNameNode address and port; the IP here can also be replaced with the hostname -->
<name>dfs.namenode.secondary.http-address</name>
<value>192.168.2.38:9868</value>
</property>
</configuration>
Configure mapred-site.xml
vim /usr/local/hadoop-3.3.1/etc/hadoop/mapred-site.xml
<configuration>
<property>
<!-- Run MapReduce on the YARN framework -->
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Configure yarn-site.xml
vim /usr/local/hadoop-3.3.1/etc/hadoop/yarn-site.xml
<configuration>
<property>
<!-- NodeManager auxiliary service: MapReduce's default shuffle handler -->
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<!-- ResourceManager address (for client job submission) -->
<name>yarn.resourcemanager.address</name>
<value>192.168.2.33:8032</value>
</property>
<property>
<!-- ResourceManager scheduler address (for ApplicationMasters) -->
<name>yarn.resourcemanager.scheduler.address</name>
<value>192.168.2.33:8030</value>
</property>
<property>
<!-- ResourceManager resource-tracker address (for NodeManager heartbeats) -->
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>192.168.2.33:8031</value>
</property>
</configuration>
Configure workers
The workers file lists the hosts where DataNode and NodeManager will run:
vim /usr/local/hadoop-3.3.1/etc/hadoop/workers
ops39
ops38
ops33
Copy the configured Hadoop to the other machines
scp -r /usr/local/hadoop-3.3.1 root@ops38:/usr/local
scp -r /usr/local/hadoop-3.3.1 root@ops33:/usr/local
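If the JDK and the /etc/profile changes were made only on ops39, replicate them to the other machines as well; a sketch assuming identical paths on every host:
scp -r /usr/local/jdk8u292-b10 root@ops38:/usr/local
scp -r /usr/local/jdk8u292-b10 root@ops33:/usr/local
scp /etc/profile root@ops38:/etc/profile
scp /etc/profile root@ops33:/etc/profile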
Format HDFS
Run this once, on the NameNode host (ops39):
/usr/local/hadoop-3.3.1/bin/hdfs namenode -format
Start the cluster
On the host where the ResourceManager runs (ops33):
/usr/local/hadoop-3.3.1/sbin/start-all.sh
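To check that everything came up, run jps on each host and open the web UIs (default Hadoop 3.x ports: NameNode UI 9870, ResourceManager UI 8088):
jps    # ops39: NameNode, DataNode, NodeManager; ops38: SecondaryNameNode, DataNode, NodeManager; ops33: ResourceManager, DataNode, NodeManager
# NameNode UI:        http://192.168.2.39:9870
# ResourceManager UI: http://192.168.2.33:8088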
FAQ 1
Problem: the ResourceManager fails to start
Error log:
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1699)
Caused by: java.net.BindException: Problem binding to [ops33:8032] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:913)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:809)
at org.apache.hadoop.ipc.Server.bind(Server.java:640)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1225)
at org.apache.hadoop.ipc.Server.<init>(Server.java:3117)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:1062)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server.<init>(ProtobufRpcEngine2.java:464)
at org.apache.hadoop.ipc.ProtobufRpcEngine2.getServer(ProtobufRpcEngine2.java:371)
at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:853)
at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:173)
at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132)
... 17 more
Caused by: java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:461)
at sun.nio.ch.Net.bind(Net.java:453)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:222)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85)
at org.apache.hadoop.ipc.Server.bind(Server.java:623)
... 25 more
Cause:
If the NameNode and the ResourceManager are not on the same machine, the ResourceManager cannot be started from a DataNode host; the start script has to be run on the machine where the ResourceManager is deployed.
Solution:
Run the start command on 192.168.2.33 (ops33), the host where the ResourceManager lives.
FAQ 2
Problem: creating a directory from the HDFS web UI fails with a permission error
Log:
Permission denied: user=dr.who, access=READ_EXECUTE, inode="/user":root:supergroup:drwx-wx-wx
Cause:
dr.who is the static user name Hadoop uses for HTTP access; its default can be found in core-default.xml:
hadoop.http.staticuser.user=dr.who
In addition, the HDFS defaults in hdfs-default.xml show that permission checking is enabled by default:
dfs.permissions.enabled=true  # whether HDFS permission checking is enabled, default true
Solution:
Add the following to core-site.xml:
<!-- Use root as the static web user -->
<property>
<name>hadoop.http.staticuser.user</name>
<value>root</value>
</property>
FAQ 3
Problem: the main class cannot be found when YARN runs a MapReduce job
Log:
[2021-06-25 11:57:35.804]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
Solution:
Run hadoop classpath on the command line:
[root@ops39 hadoop-3.3.1]# hadoop classpath
/usr/local/hadoop-3.3.1/etc/hadoop:/usr/local/hadoop-3.3.1/share/hadoop/common/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/common/*:/usr/local/hadoop-3.3.1/share/hadoop/hdfs:/usr/local/hadoop-3.3.1/share/hadoop/hdfs/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/hdfs/*:/usr/local/hadoop-3.3.1/share/hadoop/mapreduce/*:/usr/local/hadoop-3.3.1/share/hadoop/yarn:/usr/local/hadoop-3.3.1/share/hadoop/yarn/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/yarn/*
Add the output above to the yarn.application.classpath property in yarn-site.xml:
<property>
<name>yarn.application.classpath</name>
<value>/usr/local/hadoop-3.3.1/etc/hadoop:/usr/local/hadoop-3.3.1/share/hadoop/common/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/common/*:/usr/local/hadoop-3.3.1/share/hadoop/hdfs:/usr/local/hadoop-3.3.1/share/hadoop/hdfs/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/hdfs/*:/usr/local/hadoop-3.3.1/share/hadoop/mapreduce/*:/usr/local/hadoop-3.3.1/share/hadoop/yarn:/usr/local/hadoop-3.3.1/share/hadoop/yarn/lib/*:/usr/local/hadoop-3.3.1/share/hadoop/yarn/*</value>
</property>
Restart YARN and rerun the MapReduce job.
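To confirm the fix, you can rerun a sample job bundled with the distribution, e.g. the pi estimator (jar path assumes the default 3.3.1 layout):
hadoop jar /usr/local/hadoop-3.3.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar pi 2 10    # 2 map tasks, 10 samples each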