
Installing a Hadoop Environment on CentOS 7

Notes from setting up a Hadoop environment on CentOS 7.

Environment

  • Number of VMs: 3
  • OS image: CentOS-7-x86_64-Minimal-2009.iso

Cluster Overview

Software Versions

  • JDK version: jdk-8u281-linux-x64.tar.gz
  • Hadoop version: hadoop-3.2.2.tar.gz
  • ZooKeeper version
  • HBase version
  • Storm version
  • Kafka version
  • MySQL version: mysql-8.0.22-linux-glibc2.12-x86_64
  • Hive version: apache-hive-3.1.2
  • Flume version
  • Spark version

Preliminary Setup

Apply the same settings on every host node.

1. Set the hostname

Here the hostname is set to master; the commands are:

[root@192 ~]sudo hostnamectl set-hostname master
[root@master ~]vi /etc/hosts
# Add the following entry
192.168.11.212 master
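If you plan the full three-node cluster described above, every node's /etc/hosts should map all of the hosts. A minimal sketch; the slave IPs and hostnames below are hypothetical placeholders, so substitute your own:

[root@master ~]vi /etc/hosts
# Hypothetical slave entries; adjust to your actual addresses
192.168.11.212 master
192.168.11.213 slave1
192.168.11.214 slave2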

2. Create a user

Create the hadoop user

Important: never configure the cluster as root.

  • Create the hadoop group
[root@master ~]groupadd hadoop
  • Create the hadoop user and add it to the hadoop group
[root@master ~]useradd hadoop -g hadoop
  • Set a password for the hadoop user
[root@master ~]passwd hadoop

Grant sudo privileges

  • As root, edit the /etc/sudoers file (using visudo is safer, since it validates the syntax before saving)
[root@master ~]vi /etc/sudoers

Add the following lines:
## Allow root to run any commands anywhere
root    ALL=(ALL)       ALL
hadoop  ALL=(ALL)       ALL

3. Install Java and set the environment variables

Hadoop 3.0 and later require Java 8 or newer. After downloading the JDK, extract it under /soft (the directory can be changed), then configure the related environment variables in /etc/profile.

Install the JDK

  • Prepare the JDK (jdk-8u281-linux-x64.tar.gz) and upload it to /home/hadoop on the host
  • Create the /soft directory and change its owner and group to hadoop; all the software to be installed goes under this directory
# Create the soft directory
[hadoop@master /]$ sudo mkdir /soft
# Change ownership
[hadoop@master /]$ sudo chown hadoop:hadoop /soft
  • Extract jdk-8u281-linux-x64.tar.gz into /soft
# Extract from /home/hadoop into /soft
[hadoop@master ~]$ tar -xzvf jdk-8u281-linux-x64.tar.gz -C /soft
  • Configure the environment variables in /etc/profile, then run source /etc/profile so they take effect immediately
# Edit profile
[hadoop@master ~]$ sudo vi /etc/profile

# Environment variables

#jdk

export JAVA_HOME=/soft/jdk1.8.0_281
export JRE_HOME=/soft/jdk1.8.0_281/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin

# Apply immediately
[hadoop@master ~]$ source /etc/profile
  • Verify the installation
[hadoop@master ~]$ java -version

# Expected output
java version "1.8.0_281"
Java(TM) SE Runtime Environment (build 1.8.0_281-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.281-b09, mixed mode)

Before configuring passwordless SSH, clone the master to create the slave machines, verify that their IPs match the plan above, and connect to them with Xshell. This gives us the additional machines, each with Java already installed.

4. Passwordless SSH login

Configure sshd

  • Edit the sshd configuration file
[root@master ~]vi /etc/ssh/sshd_config
# Uncomment the following 3 lines (newer OpenSSH releases have removed RSAAuthentication, in which case that line can be skipped):
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile   .ssh/authorized_keys
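After editing sshd_config, restart the service so the changes take effect:

[root@master ~]systemctl restart sshd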
  • Generate a key pair
[root@master ~]su - hadoop
[hadoop@master ~]ssh-keygen -t rsa

No passphrase is needed; just press Enter at each prompt. When the command finishes, two files are generated in the hadoop user's home directory (/home/hadoop/.ssh):

id_rsa: private key; id_rsa.pub: public key

  • Append the public key to the authorized keys file
[hadoop@master ~]cat /home/hadoop/.ssh/id_rsa.pub >> /home/hadoop/.ssh/authorized_keys
  • Set file access permissions
[hadoop@master ~]chmod 700 /home/hadoop/.ssh
[hadoop@master ~]chmod 600 /home/hadoop/.ssh/authorized_keys
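The steps above only enable passwordless login from master to itself. Once the slave machines exist, the usual approach is ssh-copy-id; slave1 and slave2 below are hypothetical hostnames matching whatever you put in /etc/hosts:

# Copy the public key to each slave (asks for the hadoop password once per host)
[hadoop@master ~]ssh-copy-id hadoop@slave1
[hadoop@master ~]ssh-copy-id hadoop@slave2
# Verify: these should log in without prompting for a password
[hadoop@master ~]ssh master date
[hadoop@master ~]ssh slave1 date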

5. Install Hadoop 3.2

Install and configure environment variables

  • Download hadoop-3.2.2.tar.gz and upload it to /home/hadoop on the host
  • Extract hadoop-3.2.2.tar.gz into /soft
# Extract from /home/hadoop into /soft
[hadoop@master ~]$ tar -xzvf hadoop-3.2.2.tar.gz -C /soft
  • Append the following two lines to /etc/profile, then run source /etc/profile so they take effect immediately
# Edit profile
[hadoop@master ~]$ sudo vi /etc/profile

export HADOOP_HOME=/soft/hadoop-3.2.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Apply immediately
[hadoop@master ~]$ source /etc/profile
  • Verify the installation
[hadoop@master ~]hadoop version
# Output
Hadoop 3.2.2
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r b3cbbb467e22ea829b3808f4b7b01d07e0bf3842
Compiled by rohithsharmaks on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 776eaf9eee9c0ffc370bcbc1888737
This command was run using /soft/hadoop-3.2.2/share/hadoop/common/hadoop-common-3.2.2.jar
  • Create the data directories under the Hadoop installation directory

These data directories can be placed wherever you like, as long as the corresponding paths are specified in the configuration that follows.

Create a tmp folder under /soft/hadoop to serve as our temporary directory.

[hadoop@master ~]mkdir -p /soft/hadoop/tmp        # temp directory for temporary files
[hadoop@master ~]mkdir -p /soft/hadoop/hdfs/nn    # namenode directory
[hadoop@master ~]mkdir -p /soft/hadoop/hdfs/dn    # datanode directory
[hadoop@master ~]mkdir -p /soft/hadoop/yarn/nm    # nodemanager directory
  • Edit the configuration files, located under /soft/hadoop-3.2.2/etc/hadoop.
File overview:
  • core-site.xml: core configuration
  • hdfs-site.xml: HDFS storage settings
  • mapred-site.xml: MapReduce settings
  • yarn-site.xml: YARN settings
  • workers: specifies the worker (slave) nodes; defaults to localhost (see the sketch after this list)
  • hadoop-env.sh: Hadoop-related environment variables
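start-all.sh reads the workers file to decide where to launch DataNodes and NodeManagers. A sketch of pointing it at the slaves (slave1/slave2 are hypothetical hostnames; leave the default localhost for a single-node setup):

[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/workers
# One worker hostname per line
slave1
slave2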

Modify the configuration files

  • Modify core-site.xml

Enter:

[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/core-site.xml

Add the following (fs.default.name is the legacy key; current Hadoop prefers fs.defaultFS, but both still work):

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
    </property>  
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/soft/hadoop/tmp</value>
    </property>  
    <property>
      <name>hadoop.proxyuser.hadoop.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hadoop.groups</name>
     <value>hadoop</value>
    </property>
</configuration>
  • Modify hadoop-env.sh

Enter:

[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/hadoop-env.sh

Change ${JAVA_HOME} to your own JDK path:

export JAVA_HOME=${JAVA_HOME}

Change it to:

export JAVA_HOME=/soft/jdk1.8.0_281
  • Modify hdfs-site.xml

Enter:

[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/hdfs-site.xml

Add the following inside the <configuration> node:

<property>
   <name>dfs.name.dir</name>
   <value>/soft/hadoop/hdfs/nn</value>
   <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
<property>
   <name>dfs.data.dir</name>
   <value>/soft/hadoop/hdfs/dn</value>
   <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
   <name>dfs.replication</name>
   <value>2</value>
</property>
<property>
   <name>dfs.permissions</name>
   <value>true</value>
   <description>need permissions</description>
</property>
<property>
   <name>dfs.http.address</name>
   <value>0.0.0.0:50070</value>
</property>

Note: if dfs.permissions is set to false, files can be written to DFS without any permission checks. Convenient as that is, it invites accidental deletion, so set it to true or simply delete this property node, since true is the default.

  • Modify mapred-site.xml

If mapred-site.xml does not exist, copy mapred-site.xml.template and rename it to mapred-site.xml (Hadoop 3.x ships mapred-site.xml directly, so this mainly applies to older 2.x releases). Enter:

[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/mapred-site.xml

Add the following inside the <configuration> node (mapred.job.tracker is a legacy Hadoop 1.x key kept from the original; YARN ignores it):

<property>
	<name>mapred.job.tracker</name>
	<value>master:9001</value>
</property>
<property>  
    <name>mapreduce.jobhistory.address</name>  
    <value>master:10020</value>  
</property>
<property>
    <name>mapred.local.dir</name>
    <value>/soft/hadoop/yarn</value>
</property>
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
  • Modify yarn-site.xml
[hadoop@master ~]vi /soft/hadoop-3.2.2/etc/hadoop/yarn-site.xml

Add the following inside the <configuration> node:

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
        <description>Whether virtual memory limits will be enforced for containers</description>
    </property>
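For a true multi-node cluster, YARN also needs to know which host runs the ResourceManager. The following property is not part of the original configuration, but is commonly added alongside the ones above (assuming master hosts the ResourceManager):

    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>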

Starting Hadoop

The first time Hadoop starts, the NameNode must be formatted. Switch to /soft/hadoop-3.2.2/bin and enter (hdfs namenode -format is the newer equivalent of this command):

[hadoop@master ~]./hadoop namenode -format


Switch to /soft/hadoop-3.2.2/sbin and start everything:

[hadoop@master sbin]$ ./start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [master]
Starting datanodes
Starting secondary namenodes [master]
2021-03-05 11:47:40,324 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers

Run jps to check that the daemons started, then open the following pages in a browser:

http://192.168.11.212:8088/cluster


http://192.168.11.212:50070


6. Install MySQL

  • Upload the MySQL archive to /soft and extract it
[hadoop@master soft]$ tar -xvf mysql-8.0.22-linux-glibc2.12-x86_64.tar.xz
  • After extraction, delete the archive
[hadoop@master soft]$ rm -rf mysql-8.0.22-linux-glibc2.12-x86_64.tar.xz
  • Rename the directory
[hadoop@master soft]$ mv mysql-8.0.22-linux-glibc2.12-x86_64/ mysql-8.0.22/
  • Create a data directory under mysql-8.0.22
[hadoop@master soft]$ mkdir /soft/mysql-8.0.22/data
  • Change permissions on the MySQL directory
[hadoop@master soft]$ chmod -R 755 /soft/mysql-8.0.22/
  • Initialize MySQL (this is a binary distribution, so no compilation is involved)

Be sure to record the temporary root password printed at the end of the initialization log.

[hadoop@master soft]$ cd /soft/mysql-8.0.22/bin/
[hadoop@master bin]$ ./mysqld --initialize --user=mysql --datadir=/soft/mysql-8.0.22/data --basedir=/soft/mysql-8.0.22
# Output
2021-03-05T07:10:12.910569Z 0 [Warning] [MY-010139] [Server] Changed limits: max_open_files: 1024 (requested 8161)
2021-03-05T07:10:12.910610Z 0 [Warning] [MY-010142] [Server] Changed limits: table_open_cache: 431 (requested 4000)
2021-03-05T07:10:12.911947Z 0 [Warning] [MY-011070] [Server] 'Disabling symbolic links using --skip-symbolic-links (or equivalent) is the default. Consider not using this option as it' is deprecated and will be removed in a future release.
2021-03-05T07:10:12.912379Z 0 [System] [MY-013169] [Server] /soft/mysql-8.0.22/bin/mysqld (mysqld 8.0.22) initializing of server in progress as process 20165
2021-03-05T07:10:12.937653Z 0 [Warning] [MY-010122] [Server] One can only use the --user switch if running as root
2021-03-05T07:10:13.073996Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2021-03-05T07:10:16.457296Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2021-03-05T07:10:24.775289Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: jai;A5_I-xyu
  • Edit the my.cnf configuration file
[root@master /]# vi /etc/my.cnf
# Add the following settings
datadir=/soft/mysql-8.0.22/data
basedir=/soft/mysql-8.0.22
port=3306
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
symbolic-links=0
max_connections=600
innodb_file_per_table=1
lower_case_table_names=0
character_set_server=utf8
# Uncomment skip-grant-tables only while resetting the root password; re-comment it immediately afterwards
#skip-grant-tables
default_authentication_plugin=mysql_native_password
  • Test-start the MySQL server
[hadoop@master support-files]$ /soft/mysql-8.0.22/support-files/mysql.server start
Starting MySQL...... SUCCESS!
  • Add symlinks and restart the MySQL service
[root@master /]# ln -s /soft/mysql-8.0.22/support-files/mysql.server /etc/init.d/mysql
[root@master /]# ln -s /soft/mysql-8.0.22/bin/mysql /usr/bin/mysql
[hadoop@master mysql-8.0.22]$ service mysql restart
Shutting down MySQL.. SUCCESS! 
Starting MySQL.... SUCCESS! 
  • Log in to MySQL and change the password
[hadoop@master mysql-8.0.22]$ mysql -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.22 MySQL Community Server - GPL

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> alter user 'root'@'master' identified with mysql_native_password by '123456QWEasd';
Query OK, 0 rows affected (0.00 sec)
mysql> select authentication_string from user where user = 'root';
+-------------------------------------------+
| authentication_string                     |
+-------------------------------------------+
| *C4FE36EE5830F8BBC49315A96EEADF30D7292EBE |
+-------------------------------------------+
1 row in set (0.00 sec)
mysql> update user set user.Host='%' where user.User='root';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> quit;
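Optionally, pre-create the database that the Hive metastore (section 7) will use. This is only a sketch: the JDBC URL configured later uses createDatabaseIfNotExist=true, so Hive can also create it automatically:

mysql> CREATE DATABASE hive DEFAULT CHARACTER SET utf8;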

7. Install and Configure Hive

Configure environment variables

  • Upload the Hive archive to /soft and extract it
[hadoop@master soft]$ tar -xvf apache-hive-3.1.2-bin.tar.gz
  • Configure the environment
[root@master etc]# vi /etc/profile
# Add the following settings
export HIVE_HOME=/soft/apache-hive-3.1.2-bin
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export PATH=.:${HIVE_HOME}/bin:$PATH
  • Apply the configuration:
[root@master etc]# source /etc/profile

Configuration changes

  • Create directories
[hadoop@master /]$ mkdir /soft/hive
[hadoop@master /]$ mkdir /soft/hive/warehouse
[hadoop@master /]$ cd /soft/
[hadoop@master soft]$ ls -l
total 0
drwxrwxr-x. 10 hadoop hadoop 184 Mar  5 16:33 apache-hive-3.1.2-bin
drwxr-x---.  5 hadoop hadoop  41 Mar  4 15:07 hadoop
drwxr-xr-x. 10 hadoop hadoop 161 Mar  4 15:36 hadoop-3.2.2
drwxrwxr-x.  3 hadoop hadoop  23 Mar  5 16:46 hive
drwxr-xr-x.  8 hadoop hadoop 273 Dec  9 20:50 jdk1.8.0_281
drwxr-xr-x. 10 hadoop hadoop 141 Mar  5 15:02 mysql-8.0.22

After creating these local directories, the corresponding /soft/hive/ and /soft/hive/warehouse directories must also be created in HDFS. Run:

$HADOOP_HOME/bin/hadoop fs -mkdir -p /soft/hive/
$HADOOP_HOME/bin/hadoop fs -mkdir -p /soft/hive/warehouse

Grant read/write permission on the newly created directories:

$HADOOP_HOME/bin/hadoop fs -chmod 777 /soft/hive/
$HADOOP_HOME/bin/hadoop fs -chmod 777 /soft/hive/warehouse 

Check that both directories were created successfully:

[hadoop@master soft]$ $HADOOP_HOME/bin/hadoop fs -ls /soft/
2021-03-05 16:49:06,480 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxrwxrwx   - hadoop supergroup          0 2021-03-05 16:48 /soft/hive
[hadoop@master soft]$ $HADOOP_HOME/bin/hadoop fs -ls /soft/hive/
2021-03-05 16:49:24,664 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxrwxrwx   - hadoop supergroup          0 2021-03-05 16:48 /soft/hive/warehouse

Modify hive-site.xml

[hadoop@master soft]$ cd /soft/apache-hive-3.1.2-bin/conf/
[hadoop@master conf]$ cp hive-default.xml.template hive-site.xml
[hadoop@master conf]$ vi hive-site.xml
# Modify the following configuration parameters
<!-- Hive warehouse location in HDFS -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/soft/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/soft/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>
<!-- MySQL connection URL -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;serverTimezone=GMT%2B8&amp;useSSL=false</value>
  </property>
<!-- JDBC driver class -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
<!-- connection user name -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
<!-- connection password -->
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456QWEasd</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>
    </description>
  </property>

Then replace every occurrence of ${system:java.io.tmpdir} in the file with /soft/hive/tmp (create that directory if it does not exist and grant it read/write permission), and replace every occurrence of ${system:user.name} with root.
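A sketch of doing both replacements with sed instead of hand-editing the large file (GNU sed syntax; consider backing up hive-site.xml first):

[hadoop@master conf]$ mkdir -p /soft/hive/tmp && chmod 777 /soft/hive/tmp
# Replace the ${system:...} placeholders throughout hive-site.xml
[hadoop@master conf]$ sed -i 's#${system:java.io.tmpdir}#/soft/hive/tmp#g' hive-site.xml
[hadoop@master conf]$ sed -i 's#${system:user.name}#root#g' hive-site.xml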

Modify hive-env.sh

[hadoop@master conf]$ cp hive-env.sh.template hive-env.sh
[hadoop@master conf]$ vi hive-env.sh
# Add the following settings
export HADOOP_HOME=/soft/hadoop-3.2.2
export HIVE_CONF_DIR=/soft/apache-hive-3.1.2-bin/conf
export HIVE_AUX_JARS_PATH=/soft/apache-hive-3.1.2-bin/lib

Add the JDBC driver jar

The Hive metastore here is backed by MySQL (rather than Hive's default embedded Derby database), so the MySQL JDBC driver jar must be uploaded to /soft/apache-hive-3.1.2-bin/lib.
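A sketch of placing the driver, assuming the connector jar was uploaded to /home/hadoop (the file name mysql-connector-java-8.0.22.jar is a hypothetical example matching the MySQL version installed above):

[hadoop@master ~]$ cp /home/hadoop/mysql-connector-java-8.0.22.jar /soft/apache-hive-3.1.2-bin/lib/

Note that with Connector/J 8 the configured class com.mysql.jdbc.Driver is deprecated in favor of com.mysql.cj.jdbc.Driver, but it still loads.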

8. Hive Shell Test

Switch to the Hive bin directory. First make sure the guava.jar versions shipped with Hadoop and Hive match; the two copies live in the following directories: /soft/apache-hive-3.1.2-bin/lib and /soft/hadoop-3.2.2/share/hadoop/common/lib

Fix: delete the lower-version jar and copy the higher-version jar into its directory, as sketched below.
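A sketch of the swap; the stock tarballs should contain guava-27.0-jre.jar (Hadoop 3.2.2) and guava-19.0.jar (Hive 3.1.2), but verify the actual file names in your own directories first:

[hadoop@master ~]$ ls /soft/hadoop-3.2.2/share/hadoop/common/lib/guava-*.jar
[hadoop@master ~]$ ls /soft/apache-hive-3.1.2-bin/lib/guava-*.jar
# Delete Hive's older guava and copy Hadoop's newer one into its place
[hadoop@master ~]$ rm /soft/apache-hive-3.1.2-bin/lib/guava-19.0.jar
[hadoop@master ~]$ cp /soft/hadoop-3.2.2/share/hadoop/common/lib/guava-27.0-jre.jar /soft/apache-hive-3.1.2-bin/lib/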

[hadoop@master bin]$ schematool  -initSchema -dbType mysql
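If initialization succeeds, the metastore schema version can be verified with schematool's info mode (an optional sanity check, not part of the original steps):

[hadoop@master bin]$ schematool -info -dbType mysql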

Start Hive

[hadoop@master sbin]$ cd /soft/apache-hive-3.1.2-bin/bin
[hadoop@master bin]$ hiveserver2

9. Starting and Stopping Hadoop

Start the Hadoop services

1. Start Hadoop

[hadoop@master bin]$ cd /soft/hadoop-3.2.2/sbin/
[hadoop@master sbin]$ start-all.sh 
# Check that Hadoop started (all of the daemons below must appear in the jps output)
[hadoop@master sbin]$ cd /soft/hadoop-3.2.2/bin/
[hadoop@master bin]$ jps
68035 Jps
63651 NameNode
67426 RunJar
63764 DataNode
63972 SecondaryNameNode
64197 ResourceManager
64309 NodeManager
64678 RunJar

2. Start MySQL

[hadoop@master bin]$ service mysql start
# Restart command
[hadoop@master bin]$ service mysql restart

3. Start Hive

[hadoop@master bin]$ cd /soft/apache-hive-3.1.2-bin/bin
# Start the Hive shell
[hadoop@master bin]$ hive
# Start HiveServer2 for JDBC connections (after it launches, you can close the Xshell window and open a new one)
[hadoop@master bin]$ nohup hiveserver2 &

Stop the Hadoop services

1. Stop Hive

[hadoop@master hadoop]$ ps aux | grep hiveserver2
[hadoop@master hadoop]$ kill -9 <PID>

2. Stop MySQL

[hadoop@master hadoop]$ service mysql stop
Shutting down MySQL........... SUCCESS!

3. Stop Hadoop

[hadoop@master hadoop]$ cd /soft/hadoop-3.2.2/sbin/
[hadoop@master sbin]$ stop-all.sh
[hadoop@master sbin]$ ../bin/jps
70353 Jps

10. Configuration Reference

Hadoop web pages:

http://192.168.11.212:8088/cluster  // Hadoop cluster monitoring

http://192.168.11.212:50070/  // NameNode information

Hive web pages:

http://192.168.11.212:10002/  // HiveServer2 web UI

Databases:

  • MySQL: database hive, port 3306, user root, password 123456QWEasd
  • Hive: database db_hiveTest, port 10000, user hadoop, password hadoop

Note: JDBC connection string for Hive: jdbc:hive2://192.168.11.212:10000/db_hiveTest
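As a quick end-to-end check of that JDBC string, Hive's bundled beeline client can connect to HiveServer2 using the credentials from the table above:

[hadoop@master bin]$ beeline -u "jdbc:hive2://192.168.11.212:10000/db_hiveTest" -n hadoop -p hadoop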
