## Environment checks
- SELinux:

```bash
# Check the SELinux status: Enforcing = enforcing mode, Permissive = permissive mode, Disabled = disabled
getenforce
Disabled
# If it is not Disabled, turn it off and reboot
vim /etc/sysconfig/selinux
# Change the following line
SELINUX=disabled
# Reboot the server
reboot
```
- Firewall:

```bash
# Check the firewall status
service firewalld status
Redirecting to /bin/systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2020-08-29 22:27:22 CST; 19min ago
     Docs: man:firewalld(1)
 Main PID: 670 (firewalld)
   CGroup: /system.slice/firewalld.service
           └─670 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid

Aug 29 22:27:11 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewall daemon...
Aug 29 22:27:22 localhost.localdomain systemd[1]: Started firewalld - dynamic firewall daemon.

# Stop the firewall
service firewalld stop
# Disable the firewall on boot
chkconfig firewalld off
Note: Forwarding request to 'systemctl disable firewalld.service'.
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
```
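As the output shows, the `service`/`chkconfig` calls are just being redirected to systemd, so on CentOS 7 the same can be done with `systemctl` directly; a minimal sketch:

```bash
# Equivalent systemd commands (sketch, assuming CentOS 7)
systemctl stop firewalld      # stop the firewall now
systemctl disable firewalld   # keep it from starting on boot
systemctl status firewalld    # verify: should report inactive (dead) and disabled
```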
## Passwordless SSH login
- Before configuring passwordless login, make sure the network and the hostnames are already set up;
- Generate a key pair on the master node, then copy the public key to the other nodes:

```bash
# Generate a key pair, pressing Enter through all prompts
ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:6UbeXeQGqhYGaKGeSKl1JhxVuEajiCGC0QDNl+ghJ3U root@node1
The key's randomart image is:
+---[RSA 2048]----+
|*BooE+.          |
|B.O+*o           |
|+X===o. . .      |
|=+o*o . . . +    |
|o o. S . +       |
| = + . o         |
| * . .           |
| o               |
|                 |
+----[SHA256]-----+
# Copy the public key to all nodes. Note that node1 also needs it, because the start scripts log in via ssh;
# without an authorized_keys file on node1, "ssh node1" would also prompt for a password
ssh-copy-id -i .ssh/id_rsa.pub root@node1
ssh-copy-id -i .ssh/id_rsa.pub root@node2
ssh-copy-id -i .ssh/id_rsa.pub root@node3
# Try a passwordless login
ssh node2
# Since we are configuring HA, node2 likewise needs its own key pair, and its public key must be
# appended to node1's ~/.ssh/authorized_keys in the same way; the steps are not repeated here (see the sketch below)
```
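For completeness, a minimal sketch of the node2 side mentioned in the last comment, assuming the same root user and hostnames:

```bash
# Run on node2: generate node2's key pair and authorize it on node1
ssh-keygen -t rsa                             # press Enter through all prompts
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1   # appends node2's public key to node1's ~/.ssh/authorized_keys
ssh node1                                     # verify passwordless login from node2 to node1
```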
## JDK installation
- Check whether a bundled JDK is already installed:

```bash
# Look for installed Java packages
yum list installed | grep java
# If any are found, remove them
yum remove -y java-xxxx
```
- Install the JDK:

```bash
# After uploading the JDK package to the server, go to the directory it is in
mkdir /usr/java
tar -zxvf ./jdk-8u161-linux-x64.tar.gz -C /usr/java/
vim /etc/profile
## Add the following content
# JAVA_HOME
export JAVA_HOME=/usr/java/jdk1.8.0_161/
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib
# Make the changes take effect
source /etc/profile
# Check that the configuration works
java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
```
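The Hadoop daemons started later on node2 and node3 also need a JDK. A hypothetical sketch of copying this installation over, assuming passwordless ssh is already set up and the same paths are used on every node:

```bash
# Copy the extracted JDK to the other nodes
ssh node2 "mkdir -p /usr/java" && scp -r /usr/java/jdk1.8.0_161 root@node2:/usr/java/
ssh node3 "mkdir -p /usr/java" && scp -r /usr/java/jdk1.8.0_161 root@node3:/usr/java/
# Also add the same JAVA_HOME / PATH lines to /etc/profile on node2 and node3
```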
## ZooKeeper installation
- Upload the tarball to node1 and extract it:

```bash
mkdir -p /opt/bigdata
tar -zxvf zookeeper-3.4.14.tar.gz -C /opt/bigdata/
```
- ZooKeeper configuration and distribution:

```bash
cd /opt/bigdata/zookeeper-3.4.14/conf
# Make a copy of the sample configuration
cp zoo_sample.cfg zoo.cfg
# Change the following in zoo.cfg:
# Put the data directory under /data or /var, not under /tmp
dataDir=/data/zookeeper/
# Server list
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
# Do this on all three nodes
mkdir -p /data/zookeeper
# Write the matching id (1, 2 or 3) on each node
echo 1 > /data/zookeeper/myid
# Distribute the configured directory to the other nodes
scp -r /opt/bigdata/zookeeper-3.4.14 root@node2:/opt/bigdata/zookeeper-3.4.14/
```
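The comments above note that the data directory and myid are needed on all three nodes and that node3 needs the same package; a sketch of the remaining commands, run from node1 and assuming the same paths:

```bash
# Copy the configured ZooKeeper directory to node3 as well
ssh node3 "mkdir -p /opt/bigdata"
scp -r /opt/bigdata/zookeeper-3.4.14 root@node3:/opt/bigdata/zookeeper-3.4.14/
# Create the data directory and write the matching id on node2 and node3
ssh node2 "mkdir -p /data/zookeeper && echo 2 > /data/zookeeper/myid"
ssh node3 "mkdir -p /data/zookeeper && echo 3 > /data/zookeeper/myid"
```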
- Configure environment variables:

```bash
vim /etc/profile
# Add the following content
# ZOOKEEPER_HOME
export ZOOKEEPER_HOME=/opt/bigdata/zookeeper-3.4.14
export PATH=$PATH:$ZOOKEEPER_HOME/bin
source /etc/profile
```
- Start the ZooKeeper service:

```bash
zkServer.sh start
# After starting it on all nodes, check the status
zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/bigdata/zookeeper-3.4.14/bin/../conf/zoo.cfg
Mode: leader
```
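Besides `zkServer.sh status`, the ensemble can also be probed remotely with ZooKeeper's four-letter commands; a small sketch, assuming `nc` (netcat) is installed:

```bash
echo ruok | nc node1 2181   # a healthy server answers "imok"
echo stat | nc node2 2181   # shows the server's mode (leader/follower) and connection stats
echo stat | nc node3 2181
```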
- Troubleshooting. If you see an error like the one below:

```text
Cannot open channel to 2 at election address node2/192.168.0.252:3888
java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
	at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:534)
	at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:454)
	at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:435)
	at java.lang.Thread.run(Thread.java:748)
```

- Make sure the dataDir directory contains a myid file and that its value matches zoo.cfg;
- Make sure the firewall is turned off;
- Make sure the hostnames are configured correctly (a few quick check commands are sketched below);
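A sketch of commands corresponding to the checks above; run them on each node:

```bash
cat /data/zookeeper/myid     # should print this node's id (1, 2 or 3) and match the server.N entries in zoo.cfg
systemctl status firewalld   # should report inactive (dead)
hostname                     # should print node1/node2/node3 exactly as used in zoo.cfg
ping -c 1 node2              # hostnames must resolve to the right IPs (check /etc/hosts if not)
```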
## Hadoop cluster installation
- Cluster plan. My machine is not very powerful, so I only install on three nodes; feel free to add nodes if your hardware allows:

| Role \ Node     | node1 | node2 | node3 |
| --------------- | ----- | ----- | ----- |
| NameNode        | ✔     | ✔     |       |
| DataNode        |       | ✔     | ✔     |
| ZKFC            | ✔     | ✔     |       |
| ResourceManager | ✔     | ✔     |       |
| NodeManager     |       | ✔     | ✔     |
| JournalNode     | ✔     | ✔     | ✔     |
| ZooKeeper       | ✔     | ✔     | ✔     |
- Upload the tarball to node1 and extract it:

```bash
tar -zxvf hadoop-2.6.5.tar.gz -C /opt/bigdata/
```
- Environment variable configuration:

```bash
vim /etc/profile
# Add the following content
# HADOOP_HOME
export HADOOP_HOME=/opt/bigdata/hadoop-2.6.5
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
```
- Go to /opt/bigdata/hadoop-2.6.5/etc/hadoop to edit the configuration files. First, change JAVA_HOME in the hadoop-env.sh, mapred-env.sh and yarn-env.sh scripts to the actual JDK installation path (a one-liner for this is sketched below);
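As an alternative to editing the three scripts by hand, a sketch using GNU sed, assuming the JDK path used earlier (the `JAVA_HOME` line may be commented out in some of these scripts, hence the optional `#`):

```bash
cd /opt/bigdata/hadoop-2.6.5/etc/hadoop
sed -i -E 's|^#? *export JAVA_HOME=.*|export JAVA_HOME=/usr/java/jdk1.8.0_161|' \
    hadoop-env.sh mapred-env.sh yarn-env.sh
grep 'export JAVA_HOME' hadoop-env.sh mapred-env.sh yarn-env.sh   # verify
```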
- core-site.xml:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- For HA we point at the HDFS nameservice instead of a host:9000 address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>node1:2181,node2:2181,node3:2181</value>
    </property>
</configuration>
```
- hdfs-site.xml:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- HDFS HA settings; just adjust the node names as needed -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>node1:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>node2:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>node1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>node2:50070</value>
    </property>
    <!-- Replication factor; the default is 3, but I only have two datanodes -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/data/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/data/hadoop/hdfs/data</value>
    </property>
    <!-- Where the JournalNodes run and which directory they store their data in -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://node1:8485;node2:8485;node3:8485/mycluster</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/data/hadoop/hdfs/jn</value>
    </property>
    <!-- Failover proxy class and fencing method for HA role switching; we use passwordless ssh -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- Enable automatic failover: starts the zkfc processes -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
</configuration>
```
- mapred-site.xml:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```
- yarn-site.xml:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Enable YARN HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- ZooKeeper quorum -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>node1:2181,node2:2181,node3:2181</value>
    </property>
    <!-- YARN HA cluster id -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>mycluster</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <!-- Map each ResourceManager id to its node -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>node1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>node2</value>
    </property>
</configuration>
```
- Send the configured directory to the other nodes:

```bash
scp -r /opt/bigdata/hadoop-2.6.5/ root@node2:/opt/bigdata/hadoop-2.6.5/
```
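node3 needs the same directory (the command above only covers node2); a sketch, again assuming the same paths:

```bash
ssh node3 "mkdir -p /opt/bigdata"
scp -r /opt/bigdata/hadoop-2.6.5/ root@node3:/opt/bigdata/hadoop-2.6.5/
# Add the same HADOOP_HOME/PATH lines to /etc/profile on node2 and node3 if you
# want to run the hadoop/hdfs/yarn commands directly on those nodes as well
```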
- Manually start the journalnode service on node1, node2 and node3:

```bash
hadoop-daemon.sh start journalnode
# Check that it started successfully
jps
19530 JournalNode
21371 Jps
```
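The JournalNodes must be running on all three nodes before the NameNode is formatted in the next step. A sketch of starting them all from node1, assuming /etc/profile on every node exports HADOOP_HOME and PATH as above:

```bash
for host in node1 node2 node3; do
  echo "== $host =="
  ssh "$host" "source /etc/profile && hadoop-daemon.sh start journalnode && jps"
done
```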
- Format the NameNode metadata on node1 (node2 would work too):

```bash
# Look for the words "successfully formatted" at the end of the output
hdfs namenode -format
```
- Sync the NameNode metadata to node2:

```bash
# The output shows the cluster ID and other info; "successfully formatted" at the end again means it worked
hdfs namenode -bootstrapStandby
```
- Format ZKFC on node1:

```bash
# "Successfully created /hadoop-ha/mycluster in ZK" at the end means it worked
hdfs zkfc -formatZK
# You can also log in to ZooKeeper and check that hadoop-ha exists under the root:
zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, yarn-leader-election, hadoop-ha]
[zk: localhost:2181(CONNECTED) 1]
```
- Start HDFS and YARN:

```bash
# journalnode, namenode and other daemons were started by hand earlier; stop them first, then start the cluster
stop-dfs.sh
start-dfs.sh
start-yarn.sh
# Run this on node2: YARN HA does not start the second ResourceManager for you
# (the standby NameNode, by contrast, is started automatically)
yarn-daemon.sh start resourcemanager
```
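Once everything is up, the HA state of both NameNodes and ResourceManagers can be checked from the command line; a small sketch using the ids defined in the configuration above:

```bash
hdfs haadmin -getServiceState nn1   # expect "active" or "standby"
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
```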
- Open http://node1:50070; if the NameNode web UI comes up as in the screenshot below, everything is fine:

  *(screenshot of the NameNode web UI)*
## Other notes
- If the web UIs on ports 50070 and 8088 both look normal, things are generally fine. I did not go on to test killing a NameNode or ResourceManager; if you want to verify failover, you can do so yourself (a sketch of such a test follows below).
- I did the whole installation as the root user. With many services installed (such as Hive, Spark and HBase later on), a real setup would normally create separate users to separate permissions, but since this whole walkthrough is beginner-oriented and meant for casual practice, and production environments rarely install the vanilla services by hand anyway, I kept things simple and skipped that. If your machine has plenty of resources (multiple cores and 32 GB of RAM; 16 GB also works, but then you can only run one or two services at a time for testing), you can look for HDP and CDH materials to practice with.
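If you do want to try the failover test mentioned in the first note, a rough sketch, assuming nn1 on node1 is currently active:

```bash
hdfs haadmin -getServiceState nn1                # -> active
hdfs haadmin -getServiceState nn2                # -> standby
# On node1: kill the active NameNode process
kill -9 $(jps | awk '/ NameNode$/{print $1}')
hdfs haadmin -getServiceState nn2                # should report "active" after a short while
# Bring node1's NameNode back; it rejoins as standby
hadoop-daemon.sh start namenode
# The same idea works for YARN: kill the active ResourceManager, then check with
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
```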
- 我整个安装过程都是用root用户安装的,装了很多服务(比如后面的hive、spark、habse等)正常使用都会创建很多用户去区分权限的,但因为整个过程其实就是新手向,用于平时练练手,正式环境也不会用原生的服务安装这里简单起见就没管了。电脑资源充足(多核、32内存,16也行,但同时只能起一两个服务测试)可以去找HDP和CDH的资料练习。