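Note: Linux systems such as CentOS 7 ship Python 2.7 by default, but some software needs Python 3.x. This post walks through upgrading Python to 3.x, then installs MySQL, Redis and Airflow on top of it.
Airflow docs: https://airflow.apache.org/docs/apache-airflow/1.10.1/scheduler.html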
I. Install the Python environment
1. Install dependencies
yum install gcc make zlib-devel ncurses-devel gdbm-devel nss-devel openssl-devel readline-devel libffi-devel curl bzip2-devel
2. Download the source package
#wget https://www.python.org/ftp/python/3.7.10/Python-3.7.10.tar.xz
wget https://zhengyansheng.oss-cn-beijing.aliyuncs.com/Python-3.7.10.tar.xz
tar xf Python-3.7.10.tar.xz
cd Python-3.7.10
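3. Compile and install Python from source
The ssl module is built only if the OpenSSL headers (the openssl-devel package on CentOS) are present at compile time; without it, pip3 fails with SSL errors. Python 3.7's configure has no --with-ssl option and ignores it with a warning; if OpenSSL lives in a non-standard location, pass --with-openssl=DIR instead.
./configure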
make
sudo make altinstall
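Before moving on, it may be worth confirming that the freshly built interpreter can actually import the ssl module, since pip depends on it. The following is a minimal sketch; the function name has_ssl is made up for illustration, and the path in the example assumes the default /usr/local prefix.

```shell
# has_ssl INTERPRETER: succeed only if the interpreter exists and can import ssl.
# (has_ssl is an illustrative name, not part of Python or pip.)
has_ssl() {
    local interp="$1"
    # Fail early when the interpreter is missing or not executable.
    command -v "$interp" >/dev/null 2>&1 || return 1
    # Prints the OpenSSL version the interpreter was linked against.
    "$interp" -c 'import ssl; print(ssl.OPENSSL_VERSION)'
}

# Example (path assumes the default /usr/local prefix):
# has_ssl /usr/local/bin/python3.7 && echo "ssl OK" || echo "ssl missing"
```

If the check fails, install the OpenSSL headers and rebuild before creating any symlinks.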
4. Make python3 the default python
unlink /usr/bin/python
ln -sv /usr/local/bin/python3.7 /usr/bin/python
# -f replaces /usr/bin/pip3 if it already exists and creates it otherwise
ln -sfv /usr/local/bin/pip3.7 /usr/bin/pip3
5. Switch the pip index mirror
cat > /etc/pip.conf << EOF
[global]
trusted-host = mirrors.aliyun.com
index-url = http://mirrors.aliyun.com/pypi/simple/
[list]
format=columns
EOF
6. Pin pip to 20.2.4
pip3.7 install --upgrade pip==20.2.4
# Check the versions
python --version
pip3.7 --version
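Note: changing the system default Python breaks the command-line tools that depend on Python 2 (yum, urlgrabber-ext-down, yum-config-manager). Edit each of the following files:
1. vi /usr/bin/yum
2. vi /usr/libexec/urlgrabber-ext-down
3. vi /usr/bin/yum-config-manager
and change the shebang on the first line back to Python 2.x: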
#!/usr/bin/python2.7
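The three manual edits can also be scripted. The sketch below is one possible way to do it with sed, assuming the files start with a shebang that points directly at a python binary; the function name pin_python2_shebang is hypothetical. It keeps a .bak copy of each file and preserves any interpreter flags such as -tt.

```shell
# pin_python2_shebang FILE: rewrite a Python shebang on line 1 to /usr/bin/python2.7.
# (pin_python2_shebang is an illustrative name, not an existing tool.)
pin_python2_shebang() {
    local file="$1"
    [ -f "$file" ] || return 1
    # Only line 1 is touched, and only when the interpreter path contains "python".
    # Shebangs of the form "#!/usr/bin/env python" are deliberately left alone.
    sed -i.bak '1s|^#! *[^ ]*python[^ ]*|#!/usr/bin/python2.7|' "$file"
}

# Example (run as root):
# for f in /usr/bin/yum /usr/libexec/urlgrabber-ext-down /usr/bin/yum-config-manager; do
#     pin_python2_shebang "$f"
# done
```

Running it twice is harmless, because a line that already names python2.7 is rewritten to the same value.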
II. Install MySQL
1. Install
# Check for an existing MySQL installation
yum list installed | grep mysql
# Remove any packages found
yum remove ....xxxx
wget http://repo.mysql.com/mysql57-community-release-el7-8.noarch.rpm
rpm -ivh mysql57-community-release-el7-8.noarch.rpm
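After a successful install, two new repo files (mysql-community.repo and mysql-community-source.repo) appear under /etc/yum.repos.d/.
Install the MySQL server
yum install mysql-server
Check the MySQL version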
mysql -V
2. Start the database
# 1. Start mysql
service mysqld start
# 2. View the initial root password. MySQL 5.7 writes a temporary password to the log; if none was generated, mysql -uroot -p accepts an empty password.
grep "password" /var/log/mysqld.log
echo explicit_defaults_for_timestamp=1 >> /etc/my.cnf
systemctl restart mysqld.service
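3. Create the database
# Log in with the root password found above
mysql -uroot -p
Change the root password
-- MySQL 5.7 has no mysql.user.password column, so use ALTER USER instead of UPDATE.
-- The default validate_password policy may reject a password as weak as 123456; choose a stronger one in practice.
ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';
grant all privileges on *.* to root@"%" identified by "123456";
flush privileges;
Create the airflow database
CREATE DATABASE `airflow` /*!40100 DEFAULT CHARACTER SET utf8 */;
-- Create the account used in sql_alchemy_conn before granting privileges to it.
CREATE USER 'airflow_user'@'%' IDENTIFIED BY '123456';
GRANT ALL ON airflow.* TO 'airflow_user'@'%';
FLUSH PRIVILEGES;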
III. Install Redis
1. Install
# 1. Install remi yum repo
yum install -y epel-release yum-utils
yum install -y http://rpms.remirepo.net/enterprise/remi-release-7.rpm
yum-config-manager --enable remi
# 2. Install redis latest version
yum install -y redis
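2. Configure
# vi /etc/redis.conf
# Listen on all interfaces so that workers on other hosts can connect; restrict access with a firewall or requirepass.
bind 0.0.0.0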
3. Start
# 1. Start redis
systemctl start redis && systemctl enable redis
systemctl status redis
# 2. View redis
ps -ef |grep redis
# 3. Test
redis-cli ping
# 4. View version
redis-cli --version
IV. Install Airflow
1. Install
For a Docker-based deployment, see:
https://github.com/airflow-cn/airflow-video/blob/master/3-deploy.md
# 1. Set env
export AIRFLOW_HOME=~/airflow
# 2. Install apache-airflow 2.1.0
AIRFLOW_VERSION=2.1.0
PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
If fetching CONSTRAINT_URL times out, download the constraints file first (switching https to http may help):
wget http://raw.githubusercontent.com/apache/airflow/constraints-2.1.0/constraints-3.7.txt
pip3.7 install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
or, using the downloaded file:
pip3.7 install "apache-airflow==${AIRFLOW_VERSION}" --constraint constraints-3.7.txt
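The constraint URL is derived from two values: the Airflow version and the major.minor version of the interpreter that pip installs into. The sketch below wraps that derivation in two small helper functions; the names py_minor and constraint_url are made up for illustration. Asking python3.7 directly, rather than python, avoids picking up Python 2.7 on a machine where the symlink was not changed.

```shell
# py_minor "Python 3.7.10" -> 3.7
py_minor() {
    echo "$1" | cut -d " " -f 2 | cut -d "." -f 1-2
}

# constraint_url AIRFLOW_VERSION PYTHON_VERSION: build the constraints file URL.
constraint_url() {
    local airflow_version="$1" python_version="$2"
    echo "https://raw.githubusercontent.com/apache/airflow/constraints-${airflow_version}/constraints-${python_version}.txt"
}

# Example (2>&1 also captures the version banner from Python 2, which prints to stderr):
# PYTHON_VERSION="$(py_minor "$(python3.7 --version 2>&1)")"
# pip3.7 install "apache-airflow==2.1.0" --constraint "$(constraint_url 2.1.0 "$PYTHON_VERSION")"
```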
2. Initialize the database
# 1. Set up database
## https://airflow.apache.org/docs/apache-airflow/2.1.0/howto/set-up-database.html#
pip3.7 install pymysql
airflow config get-value core sql_alchemy_conn # This step reports an error but still creates airflow.cfg under $AIRFLOW_HOME (e.g. /opt/module/airflow/airflow.cfg)
# 2. Initialize the database
# vi ~/airflow/airflow.cfg
[core]
sql_alchemy_conn = mysql+pymysql://airflow_user:123456@localhost:3306/airflow
airflow db init
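As an alternative to editing airflow.cfg, Airflow also reads any setting from an environment variable named AIRFLOW__{SECTION}__{KEY}, which takes precedence over the file. The sketch below shows that naming rule; the helper airflow_env_name is a hypothetical convenience function, and the connection string simply reuses the example credentials from this post.

```shell
# airflow_env_name SECTION KEY: print the environment variable Airflow reads for that option.
# (airflow_env_name is an illustrative helper, not an Airflow command.)
airflow_env_name() {
    printf 'AIRFLOW__%s__%s\n' "$1" "$2" | tr '[:lower:]' '[:upper:]'
}

# Same value as the airflow.cfg entry, supplied through the environment instead.
export "$(airflow_env_name core sql_alchemy_conn)=mysql+pymysql://airflow_user:123456@localhost:3306/airflow"

# Example: confirm what Airflow will use, then initialize.
# airflow config get-value core sql_alchemy_conn
# airflow db init
```

This keeps the password out of the config file, which can be convenient when the same airflow.cfg is copied to several hosts.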
Error 1:
Global variable explicit_defaults_for_timestamp needs to be on (1) for mysql
Fix: add the following to the MySQL config file, then restart mysqld
vim /etc/my.cnf   # /usr/my.cnf on some older installs
[mysqld]
explicit_defaults_for_timestamp=1
Error 2:
Native table 'performance_schema'.'session_variables' has the wrong structure
Fix: run
mysql_upgrade -u root -p --force
service mysqld restart
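The first error can persist if the option ends up outside the [mysqld] section, for example when it is appended to the end of a my.cnf whose last section is something else. A rough way to check this is sketched below with awk; the function name has_mysqld_option is made up, and the parsing is deliberately simple (no include directives, no quoted values).

```shell
# has_mysqld_option FILE KEY: succeed if "KEY = ..." appears inside the [mysqld] section of FILE.
# (has_mysqld_option is an illustrative name; the parser ignores !include directives.)
has_mysqld_option() {
    local file="$1" key="$2"
    awk -v key="$key" '
        # Every section header switches the flag on or off.
        /^[ \t]*\[/ { in_mysqld = ($0 ~ /^[ \t]*\[mysqld\][ \t]*$/) }
        # Record a hit only while inside [mysqld].
        in_mysqld && $0 ~ ("^[ \t]*" key "[ \t]*=") { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Example:
# has_mysqld_option /etc/my.cnf explicit_defaults_for_timestamp || echo "option is not under [mysqld]"
```

After restarting mysqld, the live value can also be checked from the mysql client with SHOW GLOBAL VARIABLES LIKE 'explicit_defaults_for_timestamp';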
3. Create a user
# Create superuser
airflow users create \
--username admin \
--firstname l \
--lastname a \
--role Admin \
--email xxxx@qq.com
Enter the password when prompted: 123456
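4. Start the services
# Add -D to run in the background: airflow webserver --port 8080 -D
airflow webserver --port 8080
Note: a yellow banner in the web UI means the scheduler is not running.
In that state, enabling a DAG does not trigger any runs.
Start the scheduler; once it is running, an enabled DAG runs on its schedule (every second in this demo).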
airflow scheduler
5. Distributed deployment
5.1 Install dependencies
pip install 'apache-airflow[celery]'
pip install 'celery[redis]'
5.2 Set the executor
[core]
# The executor class that airflow should use. Choices include
# ``SequentialExecutor``, ``LocalExecutor``, ``CeleryExecutor``, ``DaskExecutor``,
# ``KubernetesExecutor``, ``CeleryKubernetesExecutor`` or the
# full import path to the class when using a custom executor.
# executor = SequentialExecutor
executor = CeleryExecutor
[celery]
# broker_url = redis://redis:6379/0
broker_url = redis://hadoop102:6379/0
# result_backend = db+postgresql://postgres:airflow@postgres/airflow
result_backend = redis://hadoop102:6379/0
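Before starting any workers, it can help to confirm that the Redis host named in broker_url is reachable from each machine. The sketch below splits a simple redis://host:port/db URL into its parts with shell parameter expansion; parse_redis_url is a hypothetical helper, and it assumes the URL carries no password and always includes a port and database number.

```shell
# parse_redis_url URL: print "host port db" for a redis://host:port/db URL.
# (Illustrative helper; URLs with credentials or without a port are not handled.)
parse_redis_url() {
    local rest="${1#redis://}"      # hadoop102:6379/0
    local hostport="${rest%%/*}"    # hadoop102:6379
    local db="${rest#*/}"           # 0
    echo "${hostport%%:*} ${hostport##*:} ${db}"
}

# Example: a reachable broker answers PONG.
# set -- $(parse_redis_url "redis://hadoop102:6379/0")
# redis-cli -h "$1" -p "$2" -n "$3" ping
```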
5.3 Start
# 1. Start webserver
airflow webserver -p 8000
# 2. Start scheduler
airflow scheduler
# 3. Start celery worker
airflow celery worker
# 4. Start celery flower
airflow celery flower
5.4 Web UIs
Webserver
Flower
5.5 Demo
Startup
Scheduler log
Worker log