Notes on Installing a Hadoop Environment on CentOS 7
Environment
- Number of VMs: 3
- OS version: CentOS-7-x86_64-Minimal-2009.iso
Cluster overview
Software versions
- JDK: jdk-8u281-linux-x64.tar.gz
- Hadoop: hadoop-3.2.2.tar.gz
- ZooKeeper:
- HBase:
- Storm:
- Kafka:
- MySQL: mysql-8.0.22-linux-glibc2.12-x86_64
- Hive: apache-hive-3.1.2
- Flume:
- Spark:
Preparation
Apply the same settings on every host node.
1. Set the hostname
Here the hostname is set to master:
[root@192 ~]sudo hostnamectl set-hostname master
[root@master ~]vi /etc/hosts
# Add the following entry
192.168.11.212 master
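If the slave VMs cloned later are addressed by hostname, each node's /etc/hosts will eventually need an entry for every node. A minimal sketch; the slave hostnames and addresses below are assumptions, not values recorded in these notes:
192.168.11.212 master
192.168.11.213 slave1
192.168.11.214 slave2
192.168.11.215 slave3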
2. Set up a user
Create a hadoop user.
Important: do not configure the cluster as root.
- Create a hadoop group
[root@master ~]groupadd hadoop
- Create the hadoop user and add it to the hadoop group
[root@master ~]useradd hadoop -g hadoop
- Set a password for the hadoop user
[root@master ~]passwd hadoop
Grant sudo privileges
- As root, edit the /etc/sudoers file
[root@master ~]vi /etc/sudoers
Add the following lines:
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
hadoop ALL=(ALL) ALL
3. Install Java and set the environment variables
***Hadoop 3.0 and later only support Java 8 and newer.*** After downloading the JDK, extract it and place it under /soft (the directory can be changed), then configure the corresponding environment variables in /etc/profile.
Install the JDK
- Get the JDK (jdk-8u281-linux-x64.tar.gz) and upload it to /home/hadoop on the host
- Create the /soft directory and change its owner and group to hadoop; all software to be installed will live under this directory
# Create the /soft directory
[hadoop@master /]$ sudo mkdir /soft
# Change ownership
[hadoop@master /]$ sudo chown hadoop:hadoop /soft
- Extract jdk-8u281-linux-x64.tar.gz into /soft
# Extract from /home/hadoop into /soft
[hadoop@master ~]$ tar -xzvf jdk-8u281-linux-x64.tar.gz -C /soft
- Configure the environment variables in /etc/profile, then run source /etc/profile to apply them immediately
# Edit profile
[hadoop@master ~]$ sudo vi /etc/profile
# Environment variables
#jdk
export JAVA_HOME=/soft/jdk1.8.0_281
export JRE_HOME=/soft/jdk1.8.0_281/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
# Apply immediately
[hadoop@master ~]$ source /etc/profile
- Verify the installation
[hadoop@master ~]$ java -version
# Expected output
java version "1.8.0_281"
Java(TM) SE Runtime Environment (build 1.8.0_281-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.281-b09, mixed mode)
Before configuring passwordless SSH, clone master into 3 slave VMs, verify that their IPs match the mapping above, and connect to them with Xshell. This gives us three additional machines that already have Java installed.
4. Passwordless SSH login
Configure sshd
- Edit the sshd configuration file
[root@master ~]vi /etc/ssh/sshd_config
# Uncomment the following 3 lines (remove the leading "#"):
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
- Generate a key pair
[root@master ~]su - hadoop
[hadoop@master ~]ssh-keygen -t rsa
Press Enter at each prompt (no passphrase is needed). When the command finishes, two files are created in the hadoop user's home directory (/home/hadoop/.ssh):
id_rsa: private key; id_rsa.pub: public key
- Append the public key to the authorized-keys file
[hadoop@master ~]cat /home/hadoop/.ssh/id_rsa.pub >> /home/hadoop/.ssh/authorized_keys
- Set file permissions
[hadoop@master ~]chmod 700 /home/hadoop/.ssh
[hadoop@master ~]chmod 600 /home/hadoop/.ssh/authorized_keys
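The key generated above only enables passwordless login to master itself. Once the slave VMs exist, the public key also has to be copied to each of them; a sketch using ssh-copy-id, assuming the slave hostnames slave1-slave3 (they are not defined in these notes) resolve via /etc/hosts:
[hadoop@master ~]$ ssh-copy-id hadoop@slave1
[hadoop@master ~]$ ssh-copy-id hadoop@slave2
[hadoop@master ~]$ ssh-copy-id hadoop@slave3
# Verify: this should log in without prompting for a password
[hadoop@master ~]$ ssh hadoop@slave1 hostname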
5. Install Hadoop 3.2
Install and configure environment variables
- Download hadoop-3.2.2.tar.gz and upload it to /home/hadoop on the host
- Extract hadoop-3.2.2.tar.gz into /soft
# Extract from /home/hadoop into /soft
[hadoop@master ~]$ tar -xzvf hadoop-3.2.2.tar.gz -C /soft
- Append the following two lines to the end of /etc/profile, then run source /etc/profile to apply them immediately
# Edit profile
[hadoop@master ~]$ sudo vi /etc/profile
export HADOOP_HOME=/soft/hadoop-3.2.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Apply immediately
[hadoop@master ~]$ source /etc/profile
- Verify the installation
[hadoop@master ~]hadoop version
# Output
Hadoop 3.2.2
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r b3cbbb467e22ea829b3808f4b7b01d07e0bf3842
Compiled by rohithsharmaks on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 776eaf9eee9c0ffc370bcbc1888737
This command was run using /soft/hadoop-3.2.2/share/hadoop/common/hadoop-common-3.2.2.jar
- Create the data directories
These data directories can be placed anywhere, as long as the same paths are used in the configuration that follows.
Create a ***tmp folder under /soft/hadoop*** to serve as the temporary directory.
[hadoop@master ~]mkdir -p /soft/hadoop/tmp #temporary directory for temporary files
[hadoop@master ~]mkdir -p /soft/hadoop/hdfs/nn #namenode directory
[hadoop@master ~]mkdir -p /soft/hadoop/hdfs/dn #datanode directory
[hadoop@master ~]mkdir -p /soft/hadoop/yarn/nm #nodemanager directory
- Edit the configuration files, located under hadoop-3.2.2/etc/hadoop.
| File | Purpose |
|---|---|
| core-site.xml | Core configuration |
| hdfs-site.xml | HDFS storage configuration |
| mapred-site.xml | MapReduce configuration |
| yarn-site.xml | YARN configuration |
| workers | Lists the worker (slave) nodes; defaults to localhost |
| hadoop-env.sh | Hadoop environment variables |
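These notes leave workers at its default value (localhost), so every daemon runs on master. If the slave nodes are meant to run DataNode and NodeManager as well, the workers file would list them instead; a sketch with the assumed hostnames slave1-slave3:
# /soft/hadoop-3.2.2/etc/hadoop/workers
slave1
slave2
slave3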
Edit the configuration files
- Edit core-site.xml
Run:
[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/core-site.xml
Add the following:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/soft/hadoop/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>hadoop</value>
</property>
</configuration>
- Edit hadoop-env.sh
Run:
[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/hadoop-env.sh
Change ${JAVA_HOME} to your own JDK path, i.e. change
export JAVA_HOME=${JAVA_HOME}
to:
export JAVA_HOME=/soft/jdk1.8.0_281
- Edit hdfs-site.xml
Run:
[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/hdfs-site.xml
Add the following inside the <configuration> element:
<property>
<name>dfs.name.dir</name>
<value>/soft/hadoop/hdfs/nn</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/soft/hadoop/hdfs/dn</value>
<description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions</name>
<value>true</value>
<description>need permissions</description>
</property>
<property>
<name>dfs.http.address</name>
<value>0.0.0.0:50070</value>
</property>
Note: if dfs.permissions is set to false, files can be written to HDFS without any permission checks. That is convenient, but to guard against accidental deletion keep it set to true, or simply delete this property, since true is the default.
- Edit mapred-site.xml
If mapred-site.xml does not exist, copy mapred-site.xml.template and rename the copy to mapred-site.xml (in Hadoop 3.x the file is normally already present).
Run:
[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/mapred-site.xml
Add the following inside the <configuration> element:
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/soft/hadoop/yarn</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
- Edit yarn-site.xml
[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/yarn-site.xml
Add the following inside the <configuration> element:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Whether virtual memory limits will be enforced for containers</description>
</property>
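The configuration above is only applied on master. To run a true multi-node cluster, the same /soft/hadoop-3.2.2 installation, data directories, and /etc/profile additions would also have to exist on every worker node; a sketch using scp, with the assumed hostnames slave1-slave3:
[hadoop@master ~]$ scp -r /soft/hadoop-3.2.2 hadoop@slave1:/soft/
[hadoop@master ~]$ ssh hadoop@slave1 "mkdir -p /soft/hadoop/tmp /soft/hadoop/hdfs/nn /soft/hadoop/hdfs/dn /soft/hadoop/yarn/nm"
# Repeat for slave2 and slave3, and append the same HADOOP_HOME/PATH lines to /etc/profile on each node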
Starting Hadoop
The first time Hadoop is started, the NameNode must be formatted. Change to /soft/hadoop-3.2.2/bin and run:
[hadoop@master ~]./hadoop namenode -format
Then change to /soft/hadoop-3.2.2/sbin and start everything:
[hadoop@master sbin]$ ./start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [master]
Starting datanodes
Starting secondary namenodes [master]
2021-03-05 11:47:40,324 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers
Run jps to confirm the daemons started, then open the cluster page in a browser:
http://192.168.11.212:8088/cluster
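A quick smoke test from the shell (a sketch; it uses only commands that are on the PATH after the profile changes above):
[hadoop@master ~]$ jps                      # expect NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager
[hadoop@master ~]$ hdfs dfsadmin -report    # capacity and live DataNodes
[hadoop@master ~]$ hdfs dfs -mkdir -p /tmp/smoke
[hadoop@master ~]$ hdfs dfs -ls /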
6. Install MySQL
- Upload the MySQL archive to /soft and extract it
[hadoop@master soft]$ tar -xvf mysql-8.0.22-linux-glibc2.12-x86_64.tar.xz
- After extraction, delete the archive
[hadoop@master soft]$ rm -rf mysql-8.0.22-linux-glibc2.12-x86_64.tar.xz
- Rename the directory
[hadoop@master soft]$ mv mysql-8.0.22-linux-glibc2.12-x86_64/ mysql-8.0.22/
- Create a data directory under mysql-8.0.22
[hadoop@master soft]$ mkdir /soft/mysql-8.0.22/data
- Change the permissions on the mysql directory
[hadoop@master soft]$ chmod -R 755 /soft/mysql-8.0.22/
- Initialize MySQL
Be sure to note the temporary root password printed at the end of the initialization log.
[hadoop@master soft]$ cd /soft/mysql-8.0.22/bin/
[hadoop@master bin]$ ./mysqld --initialize --user=mysql --datadir=/soft/mysql-8.0.22/data --basedir=/soft/mysql-8.0.22
# Output
2021-03-05T07:10:12.910569Z 0 [Warning] [MY-010139] [Server] Changed limits: max_open_files: 1024 (requested 8161)
2021-03-05T07:10:12.910610Z 0 [Warning] [MY-010142] [Server] Changed limits: table_open_cache: 431 (requested 4000)
2021-03-05T07:10:12.911947Z 0 [Warning] [MY-011070] [Server] 'Disabling symbolic links using --skip-symbolic-links (or equivalent) is the default. Consider not using this option as it' is deprecated and will be removed in a future release.
2021-03-05T07:10:12.912379Z 0 [System] [MY-013169] [Server] /soft/mysql-8.0.22/bin/mysqld (mysqld 8.0.22) initializing of server in progress as process 20165
2021-03-05T07:10:12.937653Z 0 [Warning] [MY-010122] [Server] One can only use the --user switch if running as root
2021-03-05T07:10:13.073996Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2021-03-05T07:10:16.457296Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2021-03-05T07:10:24.775289Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: jai;A5_I-xyu
- Edit the my.cnf configuration file
[root@master /]# vi /etc/my.cnf
# Add the following
datadir=/soft/mysql-8.0.22/data
basedir=/soft/mysql-8.0.22
port=3306
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
symbolic-links=0
max_connections=600
innodb_file_per_table=1
lower_case_table_names=0
character_set_server=utf8
# Uncomment only while resetting the root password; comment it out again immediately afterwards
#skip-grant-tables
default_authentication_plugin=mysql_native_password
- Start the MySQL server to test it
[hadoop@master support-files]$ /soft/mysql-8.0.22/support-files/mysql.server start
Starting MySQL...... SUCCESS!
- Create symbolic links and restart the MySQL service
[root@master /]# ln -s /soft/mysql-8.0.22/support-files/mysql.server /etc/init.d/mysql
[root@master /]# ln -s /soft/mysql-8.0.22/bin/mysql /usr/bin/mysql
[hadoop@master mysql-8.0.22]$ service mysql restart
Shutting down MySQL.. SUCCESS!
Starting MySQL.... SUCCESS!
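If MySQL should also start automatically at boot, the init script linked above can be registered with chkconfig (a sketch; this step is not part of the original notes):
[root@master /]# chkconfig --add mysql
[root@master /]# chkconfig mysql on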
- Log in to MySQL and change the password
[hadoop@master mysql-8.0.22]$ mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.22 MySQL Community Server - GPL
Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> alter user 'root'@'master' identified with mysql_native_password by '123456QWEasd';
Query OK, 0 rows affected (0.00 sec)
mysql> select authentication_string from user where user = 'root';
+-------------------------------------------+
| authentication_string |
+-------------------------------------------+
| *C4FE36EE5830F8BBC49315A96EEADF30D7292EBE |
+-------------------------------------------+
1 row in set (0.00 sec)
mysql> update user set user.Host='%' where user.User='root';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> quit;
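To confirm that root can now connect over the network (the Hive JDBC URL configured later relies on this), a quick check from the shell (a sketch):
[hadoop@master ~]$ mysql -h 192.168.11.212 -u root -p -e "SELECT version();"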
7. Install and configure Hive
Configure environment variables
- Upload the Hive archive to /soft and extract it
[hadoop@master soft]$ tar -xvf apache-hive-3.1.2-bin.tar.gz
- Configure the environment
[root@master etc]# vi /etc/profile
# Add the following
export HIVE_HOME=/soft/apache-hive-3.1.2-bin
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export PATH=.:${HIVE_HOME}/bin:$PATH
- Apply the configuration:
[root@master etc]# source /etc/profile
Configuration changes
- Create local directories
[hadoop@master /]$ mkdir /soft/hive
[hadoop@master /]$ mkdir /soft/hive/warehouse
[hadoop@master /]$ cd /soft/
[hadoop@master soft]$ ls -l
total 0
drwxrwxr-x. 10 hadoop hadoop 184 Mar 5 16:33 apache-hive-3.1.2-bin
drwxr-x---. 5 hadoop hadoop 41 Mar 4 15:07 hadoop
drwxr-xr-x. 10 hadoop hadoop 161 Mar 4 15:36 hadoop-3.2.2
drwxrwxr-x. 3 hadoop hadoop 23 Mar 5 16:46 hive
drwxr-xr-x. 8 hadoop hadoop 273 Dec 9 20:50 jdk1.8.0_281
drwxr-xr-x. 10 hadoop hadoop 141 Mar 5 15:02 mysql-8.0.22
After creating these local folders, the /soft/hive/ and /soft/hive/warehouse directories also need to be created on HDFS. Run:
$HADOOP_HOME/bin/hadoop fs -mkdir -p /soft/hive/
$HADOOP_HOME/bin/hadoop fs -mkdir -p /soft/hive/warehouse
Grant read/write permissions on the newly created directories:
$HADOOP_HOME/bin/hadoop fs -chmod 777 /soft/hive/
$HADOOP_HOME/bin/hadoop fs -chmod 777 /soft/hive/warehouse
Check that both directories were created successfully:
[hadoop@master soft]$ $HADOOP_HOME/bin/hadoop fs -ls /soft/
2021-03-05 16:49:06,480 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxrwxrwx - hadoop supergroup 0 2021-03-05 16:48 /soft/hive
[hadoop@master soft]$ $HADOOP_HOME/bin/hadoop fs -ls /soft/hive/
2021-03-05 16:49:24,664 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxrwxrwx - hadoop supergroup 0 2021-03-05 16:48 /soft/hive/warehouse
Edit hive-site.xml
[hadoop@master soft]$ cd /soft/apache-hive-3.1.2-bin/conf/
[hadoop@master conf]$ cp hive-default.xml.template hive-site.xml
[hadoop@master conf]$ vi hive-site.xml
# Update the following parameters
<!-- Hive warehouse location on HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/soft/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/soft/hive</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<!-- MySQL connection URL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;serverTimezone=GMT%2B8&amp;useSSL=false</value>
</property>
<!-- JDBC driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- Metastore database username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- Metastore database password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456QWEasd</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
</description>
</property>
Then replace every occurrence of ${system:user.name} in the configuration file with root.
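One way to do that replacement in bulk is with sed (a sketch; it edits the file in place and keeps a .bak backup):
[hadoop@master conf]$ sed -i.bak 's/\${system:user\.name}/root/g' hive-site.xml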
Edit hive-env.sh
[hadoop@master conf]$ cp hive-env.sh.template hive-env.sh
[hadoop@master conf]$ vi hive-env.sh
# Add the following
export HADOOP_HOME=/soft/hadoop-3.2.2
export HIVE_CONF_DIR=/soft/apache-hive-3.1.2-bin/conf
export HIVE_AUX_JARS_PATH=/soft/apache-hive-3.1.2-bin/lib
Add the metastore JDBC driver
Since MySQL is used as the Hive metastore database, upload the MySQL JDBC driver jar to /soft/apache-hive-3.1.2-bin/lib.
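For example, assuming the Connector/J jar was downloaded to /home/hadoop (the file name below is an assumption and depends on the connector version you fetched):
[hadoop@master ~]$ cp /home/hadoop/mysql-connector-java-8.0.22.jar /soft/apache-hive-3.1.2-bin/lib/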
8. Hive Shell test
Switch to the Hive bin directory. Before initializing the schema, make sure Hadoop and Hive use the same guava.jar version; the two copies live in /soft/apache-hive-3.1.2-bin/lib and /soft/hadoop-3.2.2/share/hadoop/common/lib.
Fix: delete the lower-version guava jar and copy the higher-version one into that directory.
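A concrete version of that fix might look like the following (the guava version numbers are assumptions; check what ls actually reports on your machine):
[hadoop@master ~]$ ls /soft/apache-hive-3.1.2-bin/lib/guava-*.jar /soft/hadoop-3.2.2/share/hadoop/common/lib/guava-*.jar
[hadoop@master ~]$ rm /soft/apache-hive-3.1.2-bin/lib/guava-19.0.jar
[hadoop@master ~]$ cp /soft/hadoop-3.2.2/share/hadoop/common/lib/guava-27.0-jre.jar /soft/apache-hive-3.1.2-bin/lib/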
[hadoop@master bin]$ schematool -initSchema -dbType mysql
Start Hive
[hadoop@master sbin]$ cd /soft/apache-hive-3.1.2-bin/bin
[hadoop@master bin]$ hiveserver2
9. Starting and stopping Hadoop
Start the services
1. Start Hadoop
[hadoop@master bin]$ cd /soft/hadoop-3.2.2/sbin/
[hadoop@master sbin]$ start-all.sh
# Check that Hadoop started (all of the following must appear in the jps output)
[hadoop@master sbin]$ cd /soft/hadoop-3.2.2/bin/
[hadoop@master bin]$ jps
68035 Jps
63651 NameNode
67426 RunJar
63764 DataNode
63972 SecondaryNameNode
64197 ResourceManager
64309 NodeManager
64678 RunJar
2. Start MySQL
[hadoop@master bin]$ service mysql start
# Restart command
[hadoop@master bin]$ service mysql restart
3. Start Hive
[hadoop@master bin]$ cd /soft/apache-hive-3.1.2-bin/bin
# Start the Hive shell
[hadoop@master bin]$ hive
# Start HiveServer2 for JDBC connections (after the command returns, you can close the Xshell window and open a new one)
[hadoop@master bin]$ nohup hiveserver2 &
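A JDBC connection can then be checked with Beeline, which ships in the same bin directory (a sketch; the account and database follow the table in section 10):
[hadoop@master bin]$ beeline -u "jdbc:hive2://192.168.11.212:10000/db_hiveTest" -n hadoop -p hadoop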
Stop the services
1. Stop Hive
[hadoop@master hadoop]# ps -aux| grep hiveserver2
[hadoop@master hadoop]# kill -9 <PID>
2. Stop MySQL
[hadoop@master hadoop]# service mysql stop
Shutting down MySQL........... SUCCESS!
3. Stop Hadoop
[hadoop@master hadoop]# cd /soft/hadoop-3.2.2/sbin/
[hadoop@master sbin]# stop-all.sh
[hadoop@master sbin]$ ../bin/jps
70353 Jps
10. Configuration summary
Hadoop web pages:
http://192.168.11.212:8088/cluster    # Hadoop (YARN) cluster monitoring
http://192.168.11.212:50070/          # NameNode web UI
Hive web page:
http://192.168.11.212:10002/          # HiveServer2 web UI
Databases:
| Service | Database | Port | User | Password |
|---|---|---|---|---|
| MySQL | hive | 3306 | root | 123456QWEasd |
| Hive | db_hiveTest | 10000 | hadoop | hadoop |
Note: JDBC connection string for Hive: jdbc:hive2://192.168.11.212:10000/db_hiveTest