Setting Up a Spark Standalone Cluster

This article uses Spark 3.1.2 as an example.

Unless otherwise noted, run the commands below on all nodes.

I. System Resources and Component Planning

| Node | Hostname | CPU/Memory | NIC | Disk | IP Address | OS |
| --- | --- | --- | --- | --- | --- | --- |
| Master | master | 2C/4G | ens33 | 128G | 192.168.0.10 | CentOS 7 |
| Worker1 | worker1 | 2C/4G | ens33 | 128G | 192.168.0.11 | CentOS 7 |
| Worker2 | worker2 | 2C/4G | ens33 | 128G | 192.168.0.12 | CentOS 7 |

II. System Software Installation and Configuration

1. Install basic packages

yum -y install vim lrzsz bash-completion

2. Configure name resolution

echo 192.168.0.10 master >> /etc/hosts
echo 192.168.0.11 worker1 >> /etc/hosts
echo 192.168.0.12 worker2 >> /etc/hosts
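
To confirm that the names resolve on every node, query them directly; getent reads /etc/hosts the same way the system resolver does:

for host in master worker1 worker2; do getent hosts $host; done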

3. Configure NTP

yum -y install chrony

systemctl start chronyd
systemctl enable chronyd
systemctl status chronyd

chronyc sources
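
To confirm the clock is actually synchronized, and not just that sources are reachable, chrony's tracking report shows the current reference and offset:

chronyc tracking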

4. Disable SELinux and the firewall

systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
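
A quick check that both changes took effect; note that SELinux switches to Permissive immediately and only reports Disabled after a reboot:

getenforce            # expect "Permissive" now, "Disabled" after a reboot
firewall-cmd --state  # expect "not running"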

III. Building the Spark Standalone Cluster

1. Set up passwordless SSH

On the Master node, configure passwordless SSH to all nodes (including the Master itself):

ssh-keygen -t rsa

for host in master worker1 worker2; do ssh-copy-id -i ~/.ssh/id_rsa.pub $host; done
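
Verify that each node can now be reached without a password prompt:

for host in master worker1 worker2; do ssh $host hostname; done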

2. Install the JDK

Download the JDK:

Reference: www.oracle.com/java/techno…

Extract the JDK archive:

tar -xf /root/jdk-8u291-linux-x64.tar.gz -C /usr/local/

Set the environment variables in the current shell:

export JAVA_HOME=/usr/local/jdk1.8.0_291/
export PATH=$PATH:/usr/local/jdk1.8.0_291/bin/

Append the same variables to /etc/profile so they persist across logins:

export JAVA_HOME=/usr/local/jdk1.8.0_291/
PATH=$PATH:/usr/local/jdk1.8.0_291/bin/
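
A minimal sketch for appending these lines and applying them in the current shell; the quoted EOF keeps $PATH from being expanded while the file is written:

cat >> /etc/profile << 'EOF'
export JAVA_HOME=/usr/local/jdk1.8.0_291/
PATH=$PATH:/usr/local/jdk1.8.0_291/bin/
EOF
source /etc/profile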

Check the Java version:

java -version

3. Install Spark

Download Spark:

Reference: spark.apache.org/downloads.h…

Extract the Spark archive:

tar -zxf /root/spark-3.1.2-bin-hadoop3.2.tgz -C /usr/local/

Set the environment variable in the current shell:

export PATH=$PATH:/usr/local/spark-3.1.2-bin-hadoop3.2/bin/:/usr/local/spark-3.1.2-bin-hadoop3.2/sbin/

Append the PATH entry to /etc/profile:

PATH=$PATH:/usr/local/spark-3.1.2-bin-hadoop3.2/bin/:/usr/local/spark-3.1.2-bin-hadoop3.2/sbin/
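
To apply the change and confirm the Spark binaries are on the PATH:

source /etc/profile
spark-submit --version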

4. Configure the Spark Standalone cluster

Create the spark-env.sh file, setting JAVA_HOME and the Master's host and port:

cat > /usr/local/spark-3.1.2-bin-hadoop3.2/conf/spark-env.sh << EOF
export JAVA_HOME=/usr/local/jdk1.8.0_291/
SPARK_MASTER_HOST=master
SPARK_MASTER_PORT=7077
EOF
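
Optionally, you can cap each Worker's resources in the same file. The values below are example settings sized for the 2C/4G nodes in this plan, not required defaults; if omitted, Spark uses all available cores and most of the memory:

cat >> /usr/local/spark-3.1.2-bin-hadoop3.2/conf/spark-env.sh << EOF
SPARK_WORKER_CORES=2
SPARK_WORKER_MEMORY=3g
EOF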

Create the workers file, which lists the Worker nodes:

cat > /usr/local/spark-3.1.2-bin-hadoop3.2/conf/workers << EOF
worker1
worker2
EOF
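
If you performed the configuration steps only on the Master rather than on all nodes, a loop like this (a sketch, assuming the same /usr/local layout on every node) pushes both files to the Workers:

for host in worker1 worker2; do
  scp /usr/local/spark-3.1.2-bin-hadoop3.2/conf/spark-env.sh \
      /usr/local/spark-3.1.2-bin-hadoop3.2/conf/workers \
      $host:/usr/local/spark-3.1.2-bin-hadoop3.2/conf/
done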

5. Start the Spark Standalone cluster

Option 1:

On the Master node, start the entire cluster with one command:

start-all.sh
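
Note that Hadoop ships a start-all.sh of its own; if a Hadoop sbin directory is also on your PATH, call Spark's script by its full path to avoid starting the wrong stack:

/usr/local/spark-3.1.2-bin-hadoop3.2/sbin/start-all.sh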

Option 2:

On the Master node, start only the Spark Master:

start-master.sh

On each Worker node, start the Spark Worker, pointing it at the Master's URL:

start-worker.sh spark://master:7077

Check the Spark processes on each node; jps should report a Master process on master and a Worker process on worker1 and worker2:

jps

6. Access the Spark web UIs

Master web UI:

http://192.168.0.10:8080

Worker web UI:

http://192.168.0.11:8081
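
As a smoke test, you can submit the bundled SparkPi example from the Master node; the jar name below matches the prebuilt spark-3.1.2-bin-hadoop3.2 package and may differ for other builds:

spark-submit \
  --master spark://master:7077 \
  --class org.apache.spark.examples.SparkPi \
  /usr/local/spark-3.1.2-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.1.2.jar 100

The application should appear in the Master web UI, and the driver output ends with an approximation of Pi.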

7. Stop the Spark Standalone cluster

Option 1:

On the Master node, stop the entire cluster:

stop-all.sh

Option 2:

On each Worker node, stop the Spark Worker:

stop-worker.sh

On the Master node, stop the Spark Master:

stop-master.sh
