Quickly Deploying a Hadoop Cluster

1. Change the Linux hostname

hostnamectl set-hostname dhf1

Or edit the configuration file:

vim /etc/sysconfig/network 

NETWORKING=yes
HOSTNAME=dhf1

2. Configure the IP address

vim /etc/sysconfig/network-scripts/ifcfg-eth0

systemctl restart network
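
A minimal static-IP ifcfg-eth0 looks roughly like the following; the device name and all address values here are placeholders, so adjust them to your network:

TYPE=Ethernet
BOOTPROTO=static
DEVICE=eth0
ONBOOT=yes
IPADDR=192.xxx.xxx.227
NETMASK=255.255.255.0
GATEWAY=192.xxx.xxx.1
DNS1=192.xxx.xxx.1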

3. Map hostnames to IP addresses

vim /etc/hosts

192.xxx.xxx.227 dhf1
192.xxx.xxx.228 dhf2
192.xxx.xxx.229 dhf3
192.xxx.xxx.230 dhf4
192.xxx.xxx.231 dhf5
192.xxx.xxx.232 dhf6
192.xxx.xxx.233 dhf7

4. Disable the firewall

systemctl status firewalld

systemctl stop firewalld

systemctl disable firewalld

5. Set up passwordless SSH

ssh-keygen -t rsa    (press Enter four times)

Running this command generates two files: id_rsa (private key) and id_rsa.pub (public key). Copy the public key to every machine that should be reachable without a password (including the local machine), for example:

ssh-copy-id dhf1
| Machine generating the key pair | Machines the public key is copied to |
| --- | --- |
| dhf1 | dhf1, dhf2, dhf3, dhf4, dhf5, dhf6, dhf7 |
| dhf2 | dhf1, dhf2 |
| dhf3 | dhf3, dhf4, dhf5, dhf6, dhf7 |
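
To avoid running ssh-copy-id by hand for every target, a small loop can distribute the key; this is only a convenience sketch, and the host list must match the row in the table above for the machine you are on (shown here for dhf1):

for host in dhf1 dhf2 dhf3 dhf4 dhf5 dhf6 dhf7; do
  ssh-copy-id "$host"
done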

6. Install the JDK and configure environment variables

Append the following to /etc/profile:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.272.b10-1.el7_9.x86_64
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib

source /etc/profile
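
Verify that the JDK is picked up:

java -version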

7. Reboot the machines

reboot

8. Cluster plan

| Hostname | IP | Installed software | Running processes |
| --- | --- | --- | --- |
| dhf1 | 192.xxx.xxx.227 | jdk, hadoop | NameNode, DFSZKFailoverController (zkfc) |
| dhf2 | 192.xxx.xxx.228 | jdk, hadoop | NameNode, DFSZKFailoverController (zkfc) |
| dhf3 | 192.xxx.xxx.229 | jdk, hadoop | ResourceManager |
| dhf4 | 192.xxx.xxx.230 | jdk, hadoop | ResourceManager |
| dhf5 | 192.xxx.xxx.231 | jdk, hadoop, zookeeper | DataNode, NodeManager, JournalNode, QuorumPeerMain |
| dhf6 | 192.xxx.xxx.232 | jdk, hadoop, zookeeper | DataNode, NodeManager, JournalNode, QuorumPeerMain |
| dhf7 | 192.xxx.xxx.233 | jdk, hadoop, zookeeper | DataNode, NodeManager, JournalNode, QuorumPeerMain |

Note: In Hadoop 2.0 and later, HDFS usually runs two NameNodes, one active and one standby. The active NameNode serves all client requests, while the standby NameNode does not; it only synchronizes the active NameNode's state so that it can take over quickly if the active one fails. Hadoop officially offers two HDFS HA solutions, NFS and QJM; here we use the simpler QJM. In this scheme the active and standby NameNodes share edit-log metadata through a group of JournalNodes, and a write is considered successful once it has reached a majority of the JournalNodes, so an odd number of JournalNodes is normally configured. A ZooKeeper cluster is also set up for ZKFC (DFSZKFailoverController) failover: when the active NameNode goes down, the standby NameNode is automatically switched to active. There are two ResourceManagers, one active and one standby, whose states are coordinated through ZooKeeper. The NameNodes and ResourceManagers are placed on separate machines for performance reasons: both consume a lot of resources, so they are kept apart and started on different hosts.

9. Install ZooKeeper

9.1 Install and configure the ZooKeeper cluster

(Run on dhf5)

cd /cdc/apache-zookeeper-3.5.8-bin/conf/

cp zoo_sample.cfg zoo.cfg

Edit zoo.cfg:

vim zoo.cfg

dataDir=/cdc/apache-zookeeper-3.5.8-bin/tmp

server.1=dhf5:2888:3888
server.2=dhf6:2888:3888
server.3=dhf7:2888:3888

Save and exit.

Then create a tmp directory:

mkdir /cdc/apache-zookeeper-3.5.8-bin/tmp

Then create an empty file:

touch /cdc/apache-zookeeper-3.5.8-bin/tmp/myid

Finally, write the server ID into the file:

echo 1 > /cdc/apache-zookeeper-3.5.8-bin/tmp/myid

9.2 Copy the configured ZooKeeper to the other nodes

scp -r /cdc/apache-zookeeper-3.5.8-bin/ dhf6:/cdc/

scp -r /cdc/apache-zookeeper-3.5.8-bin/ dhf7:/cdc/

Note: update the content of /cdc/apache-zookeeper-3.5.8-bin/tmp/myid on dhf6 and dhf7 accordingly:

dhf6:echo 2 > /cdc/apache-zookeeper-3.5.8-bin/tmp/myid

dhf7:echo 3 > /cdc/apache-zookeeper-3.5.8-bin/tmp/myid

10. Install Hadoop

10.1 Install and configure the Hadoop cluster

(Run on dhf1)

10.1.1 Add Hadoop to the environment variables

vim /etc/profile

export HADOOP_HOME=/cdc/hadoop-3.3.0
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
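
Reload the profile and confirm that the hadoop command is on the PATH:

source /etc/profile

hadoop version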

10.1.2 Configure HDFS

(All of Hadoop's configuration files are under $HADOOP_HOME/etc/hadoop.)

First, obtain the HADOOP_CLASSPATH with the hadoop classpath command; it prints something like:

/cdc/hadoop-3.3.0/etc/hadoop:/cdc/hadoop-3.3.0/share/hadoop/common/lib/*:/cdc/hadoop-3.3.0/share/hadoop/common/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs:/cdc/hadoop-3.3.0/share/hadoop/hdfs/lib/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs/*:/cdc/hadoop-3.3.0/share/hadoop/mapreduce/*:/cdc/hadoop-3.3.0/share/hadoop/yarn:/cdc/hadoop-3.3.0/share/hadoop/yarn/lib/*:/cdc/hadoop-3.3.0/share/hadoop/yarn/*
10.1.2.1 Edit hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.272.b10-1.el7_9.x86_64

export HADOOP_CLASSPATH=/cdc/hadoop-3.3.0/etc/hadoop:/cdc/hadoop-3.3.0/share/hadoop/common/lib/*:/cdc/hadoop-3.3.0/share/hadoop/common/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs:/cdc/hadoop-3.3.0/share/hadoop/hdfs/lib/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs/*:/cdc/hadoop-3.3.0/share/hadoop/mapreduce/*:/cdc/hadoop-3.3.0/share/hadoop/yarn:/cdc/hadoop-3.3.0/share/hadoop/yarn/lib/*:/cdc/hadoop-3.3.0/share/hadoop/yarn/*
10.1.2.2 Edit core-site.xml
<configuration>
    <!-- Set the HDFS nameservice to ns1 -->
	<property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
	</property>
    <!-- Hadoop temporary directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/cdc/hadoop-3.3.0/tmp</value>
    </property>
    <!-- ZooKeeper quorum addresses -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>dhf5:2181,dhf6:2181,dhf7:2181</value>
    </property>
	<property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
	</property>
	<property>
    	<name>hadoop.proxyuser.root.groups</name>
    	<value>*</value>
	</property>
</configuration>
10.1.2.3 Edit hdfs-site.xml
<configuration>
    <!-- HDFS nameservice ns1; must match core-site.xml -->
	<property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <!-- ns1 has two NameNodes: nn1 and nn2 -->
	<property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
	</property>
    <!-- RPC address of nn1 -->
	<property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>dhf1:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>dhf1:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>dhf2:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>dhf2:50070</value>
    </property>
    <!-- Where the NameNode metadata (edit log) is stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://dhf5:8485;dhf6:8485;dhf7:8485/ns1</value>
    </property>
    <!-- Where the JournalNodes store data on local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/cdc/hadoop-3.3.0/journal</value>
    </property>
    <!-- Enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Failover proxy provider implementation -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Fencing methods; list multiple methods on separate lines, one per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
        sshfence
        shell(/bin/true)
        </value>
    </property>
    <!-- The sshfence mechanism requires passwordless SSH -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- Timeout for the sshfence mechanism -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>
10.1.2.4 Edit mapred-site.xml
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
</configuration>   
10.1.2.5 Edit yarn-site.xml
<configuration>
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- ResourceManager cluster id -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yrc</value>
    </property>
    <!-- ResourceManager ids -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <!-- Hostnames of the two ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>dhf3</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>dhf4</value>
    </property>
    <property> 
        <name>yarn.resourcemanager.webapp.address.rm1</name> 
        <value>dhf3:8088</value>
    </property> 
    <property> 
        <name>yarn.resourcemanager.webapp.address.rm2</name> 
        <value>dhf4:8088</value>
    </property>
    <!-- ZooKeeper cluster addresses -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>dhf5:2181,dhf6:2181,dhf7:2181</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.application.classpath</name>
        <value>/cdc/hadoop-3.3.0/etc/hadoop:/cdc/hadoop-3.3.0/share/hadoop/common/lib/*:/cdc/hadoop-3.3.0/share/hadoop/common/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs:/cdc/hadoop-3.3.0/share/hadoop/hdfs/lib/*:/cdc/hadoop-3.3.0/share/hadoop/hdfs/*:/cdc/hadoop-3.3.0/share/hadoop/mapreduce/*:/cdc/hadoop-3.3.0/share/hadoop/yarn:/cdc/hadoop-3.3.0/share/hadoop/yarn/lib/*:/cdc/hadoop-3.3.0/share/hadoop/yarn/*</value>
    </property>   
</configuration>
10.1.2.6 Edit workers

(The workers file specifies the worker nodes. Because HDFS is started on dhf1 and YARN on dhf3, the workers file on dhf1 determines where the DataNodes run, and the workers file on dhf3 determines where the NodeManagers run.)

vim workers 

dhf5
dhf6
dhf7

10.2 Copy the configured Hadoop to the other nodes

scp -r /cdc/hadoop-3.3.0/ root@dhf2:/cdc/
scp -r /cdc/hadoop-3.3.0/ root@dhf3:/cdc/
scp -r /cdc/hadoop-3.3.0/ root@dhf4:/cdc/
scp -r /cdc/hadoop-3.3.0/ root@dhf5:/cdc/
scp -r /cdc/hadoop-3.3.0/ root@dhf6:/cdc/
scp -r /cdc/hadoop-3.3.0/ root@dhf7:/cdc/

11. Start the services

11.1 Start the ZooKeeper cluster

(Start ZooKeeper on dhf5, dhf6, and dhf7; the process name is QuorumPeerMain.)

cd /cdc/apache-zookeeper-3.5.8-bin/bin/

./zkServer.sh start

Check the status; there should be one leader and two followers:

./zkServer.sh status
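
On a healthy ensemble, one node's status output reports Mode: leader and the other two report Mode: follower (the exact surrounding output depends on the ZooKeeper version).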

11.2 Start the JournalNodes

cd /cdc/hadoop-3.3.0/;rm -rf journal/ns1/;rm -rf logs/; rm -rf tmp/;

(Run on dhf5, dhf6, and dhf7.)

cd /cdc/hadoop-3.3.0/sbin/

./hadoop-daemon.sh start journalnode

Run jps to verify: dhf5, dhf6, and dhf7 should each now show a JournalNode process.
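
For example, jps on dhf5 might print something like this (the PIDs are illustrative):

2345 QuorumPeerMain
3456 JournalNode
4567 Jps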

11.3 Format HDFS

(Run on dhf1)

hdfs namenode -format

Formatting creates files under the directory configured as hadoop.tmp.dir in core-site.xml, which here is /cdc/hadoop-3.3.0/tmp. Then copy /cdc/hadoop-3.3.0/tmp to /cdc/hadoop-3.3.0/ on dhf2:

scp -r /cdc/hadoop-3.3.0/tmp/ root@dhf2:/cdc/hadoop-3.3.0/   
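
As an optional alternative to copying the tmp directory by hand, the standby NameNode's metadata can be synchronized with Hadoop's built-in bootstrap command, run on dhf2 once the NameNode on dhf1 has been started:

hdfs namenode -bootstrapStandby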

11.4 Format ZK

(Run on dhf1)

hdfs zkfc -formatZK
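
Optionally, confirm that the HA znode was created in ZooKeeper, for example from dhf5:

/cdc/apache-zookeeper-3.5.8-bin/bin/zkCli.sh -server dhf5:2181
ls /hadoop-ha

It should list ns1.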

11.5 Start HDFS

(Run on dhf1)

cd /cdc/hadoop-3.3.0/sbin/

./start-dfs.sh
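
Once HDFS is up, you can check which NameNode is active (nn1 and nn2 are the ids configured in hdfs-site.xml):

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2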

11.6 Start YARN

(Run on dhf3)

cd /cdc/hadoop-3.3.0/sbin/

./start-yarn.sh
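
Likewise, the ResourceManager HA state can be checked (rm1 and rm2 are the ids from yarn-site.xml):

yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2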

12. Verification

Visit 192.xxx.xxx.228:50070 in a browser:

NameNode 'dhf2:9000' (active)

Visit 192.xxx.xxx.227:50070 in a browser:

NameNode 'dhf1:9000' (standby)

Check the DataNodes; all of them should be online.

First, upload a file to HDFS:

hadoop fs -mkdir /dhf

hadoop fs -put /test.txt /dhf

hadoop fs -ls /dhf

Then kill the active NameNode (on dhf2):

kill -9 16950
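
(The PID 16950 is specific to this run; find the NameNode PID on dhf2 first, for example with jps | grep NameNode.)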

Visit 192.xxx.xxx.227:50070 in a browser:

NameNode 'dhf1:9000' (active)

The NameNode on dhf1 has now become active.

hadoop fs -ls /dhf

The file uploaded earlier is still there.

Manually start the NameNode that was killed (on dhf2):

./hadoop-daemon.sh start namenode

Visit 192.xxx.xxx.228:50070 in a browser:

NameNode 'dhf2:9000' (standby)
