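In day-to-day work we sometimes need to handle a large computing task that can be split into multiple compute-bound subtasks, which can then run in parallel. This kind of distributed computing task can be abstracted as the following model:
Here, the business system App publishes a distributed Job to the task queue Backend (the queue can be a storage system, a message broker, and so on), and idle Workers fetch and execute Jobs from the queue that have not yet run, which achieves distributed task execution.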
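The App / Backend / Worker model can be sketched with nothing but the standard library. This is a minimal illustration and not how Apalis is implemented: the `Job` struct, the in-memory `Backend` alias, and the `publish` and `run_workers` helpers are all hypothetical names introduced here.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical job payload; stands in for whatever the App publishes.
#[derive(Debug, Clone)]
struct Job {
    id: u32,
}

/// In-memory stand-in for the Backend (a real one would be Redis, a database, ...).
type Backend = Arc<Mutex<VecDeque<Job>>>;

/// App side: publish `n` jobs to the queue.
fn publish(backend: &Backend, n: u32) {
    let mut queue = backend.lock().unwrap();
    for id in 0..n {
        queue.push_back(Job { id });
    }
}

/// Worker side: spawn `workers` threads that drain the queue.
/// Returns the ids of all executed jobs, sorted.
fn run_workers(backend: &Backend, workers: usize) -> Vec<u32> {
    let done = Arc::new(Mutex::new(Vec::new()));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let backend = Arc::clone(backend);
            let done = Arc::clone(&done);
            thread::spawn(move || loop {
                // Take one job; the lock is released at the end of this statement.
                let job = backend.lock().unwrap().pop_front();
                match job {
                    Some(job) => done.lock().unwrap().push(job.id), // "execute" the job
                    None => break, // queue drained, this worker stops
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let mut ids: Vec<u32> = done.lock().unwrap().clone();
    ids.sort();
    ids
}

fn main() {
    let backend: Backend = Arc::new(Mutex::new(VecDeque::new()));
    publish(&backend, 5);
    let executed = run_workers(&backend, 3);
    println!("executed jobs: {:?}", executed);
}
```

Because a job is removed from the queue in the same locked step that hands it to a worker, each job is executed exactly once no matter how many workers are polling.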
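Apalis provides this kind of functionality. It describes itself as a simple, extensible multithreaded job and message processing library, with the following features:
- A simple, predictable task handling model with built-in concurrent and parallel workflows.
- Workers are easy to scale and support graceful shutdown.
- Work queues can be backed by Redis, Sqlite, Postgres, MySQL, and others.
- Apalis also ships a web API and UI for managing your distributed jobs visually.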
Apalis works much like the model described above:
sequenceDiagram
participant App
participant Worker
participant Backend
App->>+Backend: Add job to queue
Backend-->>+Worker: Job data
Worker->>+Backend: Update job status to 'Running'
Worker->>+App: Started job
loop job execution
Worker-->>-App: Report job progress
end
Worker->>+Backend: Update job status to 'completed'
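The status updates in the diagram can be read as a small state machine. The sketch below is only a reading aid: the `Status` and `Event` enums and the `next` function are hypothetical, and the initial `Pending` state is an assumption about what a freshly queued job looks like.

```rust
/// Hypothetical job states, named after the statuses in the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Pending,
    Running,
    Completed,
}

/// Events mirroring the arrows in the diagram.
#[derive(Debug, Clone, Copy)]
enum Event {
    Fetched,  // the Backend hands the job data to a Worker
    Finished, // the Worker reports the job as done
}

/// Returns the next status, or `None` if the transition is not allowed.
fn next(status: Status, event: Event) -> Option<Status> {
    match (status, event) {
        (Status::Pending, Event::Fetched) => Some(Status::Running),
        (Status::Running, Event::Finished) => Some(Status::Completed),
        _ => None,
    }
}

fn main() {
    let mut status = Status::Pending;
    for event in [Event::Fetched, Event::Finished] {
        status = next(status, event).expect("invalid transition");
        println!("{:?} -> {:?}", event, status);
    }
}
```

In this reading, fetching a job that is already `Running` is not a valid transition.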
At the time of writing, Apalis is at version 0.7.1.
Example
To add the dependencies to your project, add the following to Cargo.toml:
[dependencies]
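apalis = { version = "0.7", features = ["limit"] } # Limit for concurrency
apalis-redis = { version = "0.7" } # Use redis for persistence
tokio = { version = "1", features = ["full"] } # Assumed versions for the other crates the example uses
serde = { version = "1", features = ["derive"] }
env_logger = "0.11"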
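Here we define a Worker that handles emails. It uses Redis as the work queue, which stores Email values as the Job data for the Worker to consume. send_email() is the function that actually processes a Job: whenever the Worker picks up an Email that has not been executed yet, send_email() is invoked.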
use apalis::prelude::*;
use apalis_redis::RedisStorage;
use serde::{Deserialize, Serialize};

#[derive(Debug, Deserialize, Serialize)]
struct Email {
    to: String,
}

/// A function called for every job
async fn send_email(job: Email, _data: Data<usize>) -> Result<(), Error> {
    // Execute the job; a real handler would send the email here
    println!("sending email to {}", job.to);
    Ok(())
}

#[tokio::main]
async fn main() {
    std::env::set_var("RUST_LOG", "debug");
    env_logger::init();
    let redis_url = std::env::var("REDIS_URL").expect("Missing env variable REDIS_URL");
    let conn = apalis_redis::connect(redis_url).await.expect("Could not connect");
    let storage = RedisStorage::new(conn);
    WorkerBuilder::new("email-worker")
        .concurrency(2)
        .data(0usize)
        .backend(storage)
        .build_fn(send_email)
        .run()
        .await;
}
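Next, we need an App acting as the producer that publishes the distributed jobs; the App is usually a business process running on another server. It publishes Email data to the Redis work queue through produce_route_jobs().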
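// This can live in another part of the program or in another application, e.g. an HTTP server.
// Assumes the `anyhow` crate for the `Result` alias, as in the bug-reproduction code in this post.
async fn produce_route_jobs(storage: &mut RedisStorage<Email>) -> anyhow::Result<()> {
    storage
        .push(Email {
            to: "test@example.com".to_string(),
        })
        .await?;
    Ok(())
}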
With that, a complete example program is done.
BUG@0.7.1
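While trying out apalis-mysql I ran into a bug, and it is one that almost any ordinary use of apalis-mysql will currently hit. The maintainers have been notified.
The bug's issue can be found at github.com/geofmureith… . The code that triggers it is as follows: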
use std::time::Duration;

use anyhow::Result;
use apalis::layers::retry::backoff::ExponentialBackoffMaker;
use apalis::layers::retry::backoff::MakeBackoff;
use apalis::layers::retry::RetryPolicy;
use apalis::prelude::*;
use apalis_redis::RedisStorage;
use apalis_sql::mysql::MySqlPool;
use apalis_sql::mysql::MysqlStorage;
use email_service::{send_email, Email};
use tokio::signal::ctrl_c;
use tokio::time::sleep;

async fn produce_mysql_jobs(storage: &MysqlStorage<Email>) -> Result<()> {
    let mut storage = storage.clone();
    sleep(Duration::from_millis(100)).await;
    storage
        .push(Email {
            to: format!("test@example.com"),
            text: "Test background job from apalis".to_string(),
            subject: "Background email job".to_string(),
        })
        .await?;
    Ok(())
}

async fn mysql() -> Result<()> {
    std::env::set_var("RUST_LOG", "debug,sqlx::query=error");
    tracing_subscriber::fmt::init();
    let database_url = std::env::var("DATABASE_URL")
        .unwrap_or_else(|_| "mysql://root:strong_password@localhost:3306/apalis-jobs".to_string());
    let pool = MySqlPool::connect(&database_url).await?;
    // Setup migrations
    MysqlStorage::setup(&pool).await?;
    // Create a storage that consumes `Email`
    let mysql: MysqlStorage<Email> = MysqlStorage::new(pool);
    Monitor::new()
        .register({
            WorkerBuilder::new("tasty-avocado")
                .concurrency(8)
                .enable_tracing()
                .backend(mysql)
                .build_fn(send_email)
        })
        .run_with_signal(ctrl_c())
        .await?;
    Ok(())
}

async fn mysql_producer() -> Result<()> {
    std::env::set_var("RUST_LOG", "debug,sqlx::query=error");
    tracing_subscriber::fmt::init();
    let database_url = std::env::var("DATABASE_URL")
        .unwrap_or_else(|_| "mysql://root:strong_password@localhost:3306/apalis-jobs".to_string());
    let pool = MySqlPool::connect(&database_url).await?;
    // Setup migrations
    MysqlStorage::setup(&pool).await?;
    // Create a storage that consumes `Email`
    let mysql: MysqlStorage<Email> = MysqlStorage::new(pool);
    produce_mysql_jobs(&mysql).await?;
    Ok(())
}

async fn redis() -> Result<()> {
    std::env::set_var("RUST_LOG", "debug,sqlx::query=error");
    tracing_subscriber::fmt::init();
    let redis_url = std::env::var("REDIS_URL")
        .unwrap_or_else(|_| "redis://localhost:6379".to_string());
    let conn = apalis_redis::connect(redis_url)
        .await
        .expect("Could not connect");
    let storage = RedisStorage::new(conn);
    Monitor::new()
        .register({
            WorkerBuilder::new("email-worker-shadow")
                .enable_tracing()
                .concurrency(8)
                .backend(storage)
                .build_fn(send_email)
        })
        .run_with_signal(ctrl_c())
        .await?;
    Ok(())
}

async fn produce_redis_jobs(storage: &RedisStorage<Email>) -> Result<()> {
    let mut storage = storage.clone();
    sleep(Duration::from_millis(100)).await;
    storage
        .push(Email {
            to: format!("test@example.com"),
            text: "Test background job from apalis".to_string(),
            subject: "Background email job".to_string(),
        })
        .await?;
    Ok(())
}

async fn redis_producer() -> Result<()> {
    std::env::set_var("RUST_LOG", "debug,sqlx::query=error");
    tracing_subscriber::fmt::init();
    let redis_url = std::env::var("REDIS_URL")
        .unwrap_or_else(|_| "redis://localhost:6379".to_string());
    let conn = apalis_redis::connect(redis_url)
        .await
        .expect("Could not connect");
    let storage = RedisStorage::new(conn);
    produce_redis_jobs(&storage).await?;
    Ok(())
}

#[tokio::main]
async fn main() -> Result<()> {
    mysql_producer().await
}
Here, redis() and redis_producer() form a distributed system that uses Redis as the work queue, while mysql() and mysql_producer() form one that uses MySQL as the work queue.
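Expected behavior, shown with apalis-redis
First, we can use apalis-redis to observe the example's expected behavior:
- In process 1, run redis() to start a Worker that subscribes to Email.
- In process 2, run redis_producer() to produce one Email.
The output below shows attempt=1 (attempt is the number of times this Job has been tried). The expected behavior is that the job is fetched and executed by exactly one Worker thread.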
Finished `dev` profile [unoptimized + debuginfo] target(s) in 2.60s
Running `target/debug/example`
2025-05-07T08:16:02.549371Z DEBUG task{task_id="01JTMX1SZC9EHD3JQ95CD46BMD" attempt=1}: apalis::layers::tracing::on_request: task.start
2025-05-07T08:16:04.550793Z INFO task{task_id="01JTMX1SZC9EHD3JQ95CD46BMD" attempt=1}: email_service: Sending email to test@example.com, is_shutting_down false, count 1
2025-05-07T08:16:06.553781Z INFO task{task_id="01JTMX1SZC9EHD3JQ95CD46BMD" attempt=1}: email_service: Sending email to test@example.com, is_shutting_down false, count 2
2025-05-07T08:16:08.556312Z INFO task{task_id="01JTMX1SZC9EHD3JQ95CD46BMD" attempt=1}: email_service: Sending email to test@example.com, is_shutting_down false, count 3
2025-05-07T08:16:10.558700Z INFO task{task_id="01JTMX1SZC9EHD3JQ95CD46BMD" attempt=1}: email_service: Sending email to test@example.com, is_shutting_down false, count 4
.......
2025-05-07T08:16:20.567444Z INFO task{task_id="01JTMX1SZC9EHD3JQ95CD46BMD" attempt=1}: email_service: Shutting down email job
2025-05-07T08:16:20.567517Z INFO task{task_id="01JTMX1SZC9EHD3JQ95CD46BMD" attempt=1}: email_service: Shut down email job
2025-05-07T08:16:20.567689Z DEBUG task{task_id="01JTMX1SZC9EHD3JQ95CD46BMD" attempt=1}: apalis::layers::tracing::on_response: task.done done_in=18018ms result=()
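Demonstrating the bug with apalis-mysql
Compared with the apalis-redis run, the scenario that triggers the bug only swaps the work queue Backend in the example, from Redis to MySQL.
- In process 1, run mysql() to start a Worker that subscribes to Email.
- In process 2, run mysql_producer() to produce one Email.
The output after running is as follows: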
Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.45s
Running `target/debug/example`
2025-05-07T08:17:30.010863Z DEBUG sqlx_mysql::connection::tls: not performing TLS upgrade: TLS support not compiled in
2025-05-07T08:17:30.063521Z DEBUG sqlx_mysql::connection::tls: not performing TLS upgrade: TLS support not compiled in
2025-05-07T08:17:30.063620Z DEBUG sqlx_mysql::connection::tls: not performing TLS upgrade: TLS support not compiled in
2025-05-07T08:17:42.757838Z DEBUG task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=1}: apalis::layers::tracing::on_request: task.start
2025-05-07T08:17:42.872156Z DEBUG task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=2}: apalis::layers::tracing::on_request: task.start
2025-05-07T08:17:42.984470Z DEBUG task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=3}: apalis::layers::tracing::on_request: task.start
2025-05-07T08:17:43.213097Z DEBUG task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=4}: apalis::layers::tracing::on_request: task.start
2025-05-07T08:17:43.321480Z DEBUG task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=5}: apalis::layers::tracing::on_request: task.start
2025-05-07T08:17:43.432377Z DEBUG task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=6}: apalis::layers::tracing::on_request: task.start
2025-05-07T08:17:44.760357Z INFO task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=1}: email_service: Sending email to test@example.com, is_shutting_down false, count 1
2025-05-07T08:17:44.874307Z INFO task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=2}: email_service: Sending email to test@example.com, is_shutting_down false, count 1
...
2025-05-07T08:17:53.330770Z INFO task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=5}: email_service: Shut down email job
2025-05-07T08:17:53.330820Z DEBUG task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=5}: apalis::layers::tracing::on_response: task.done done_in=10009ms result=()
2025-05-07T08:17:53.443285Z INFO task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=6}: email_service: Sending email to test@example.com, is_shutting_down true, count 5
2025-05-07T08:17:53.443420Z INFO task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=6}: email_service: Shutting down email job
2025-05-07T08:17:53.443478Z INFO task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=6}: email_service: Shut down email job
2025-05-07T08:17:53.443541Z DEBUG task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=6}: apalis::layers::tracing::on_response: task.done done_in=10011ms result=()
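In the log above, a single task_id shows up with several different attempt values. One way to check a longer log for this pattern is sketched below. It assumes the `task_id="..."` and `attempt=N` layout seen in these lines; the `field` and `attempts_per_task` helpers are hypothetical and not part of Apalis.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Return the text that follows `key`, up to the first character in `delims`.
fn field<'a>(line: &'a str, key: &str, delims: &[char]) -> Option<&'a str> {
    let start = line.find(key)? + key.len();
    let rest = &line[start..];
    let end = rest.find(|c: char| delims.contains(&c)).unwrap_or(rest.len());
    Some(&rest[..end])
}

/// Map each task_id to the set of attempt numbers seen in the log.
fn attempts_per_task(log: &str) -> BTreeMap<String, BTreeSet<u32>> {
    let mut seen: BTreeMap<String, BTreeSet<u32>> = BTreeMap::new();
    for line in log.lines() {
        let id = field(line, "task_id=\"", &['"']);
        let attempt = field(line, "attempt=", &['}', ' ']).and_then(|a| a.parse::<u32>().ok());
        // Lines without both fields (e.g. sqlx connection messages) are skipped.
        if let (Some(id), Some(attempt)) = (id, attempt) {
            seen.entry(id.to_string()).or_default().insert(attempt);
        }
    }
    seen
}

fn main() {
    // Two lines taken from the MySQL run above
    let log = r#"2025-05-07T08:17:42.757838Z DEBUG task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=1}: apalis::layers::tracing::on_request: task.start
2025-05-07T08:17:42.872156Z DEBUG task{task_id="01JTMX4VT0BZHV10TGEJXQ2RSB" attempt=2}: apalis::layers::tracing::on_request: task.start"#;
    for (task, attempts) in attempts_per_task(log) {
        if attempts.len() > 1 {
            println!("task {task} was started with attempts {attempts:?}");
        }
    }
}
```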
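Some of the differences in the log output, the maintainer explained, come from the different order of .concurrency(8).enable_tracing() in my two examples. The key bug, however, is the appearance of attempt = 1,2,3..6: normally a Job that is already running should not be executed by multiple Workers, just as one email should not be delivered multiple times. In Apalis, a Job is only attempted (that is, tried for execution) in two situations:
- The Job's status is Pending: the job has been submitted but has not run yet.
- The Job's status is Failed and its attempt count has not exceeded the maximum number of attempts; this is effectively a retry-on-failure mechanism.
Source of calculate_status: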
pub fn calculate_status<Res>(ctx: &SqlContext, res: &Response<Res>) -> State {
    match &res.inner {
        Ok(_) => State::Done,
        Err(e) => match &e {
            Error::Abort(_) => State::Killed,
            Error::Failed(_) if ctx.max_attempts() as usize <= res.attempt.current() => {
                State::Killed
            }
            _ => State::Failed,
        },
    }
}
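To make the decision table in calculate_status easier to try out, here is a self-contained sketch over plain values. The `State` and `Error` enums below are simplified stand-ins, not the real Apalis types, and the function takes `max_attempts` and the current attempt directly instead of `SqlContext` and `Response`.

```rust
/// Simplified stand-in for the job states used by `calculate_status`.
#[derive(Debug, PartialEq, Eq)]
enum State {
    Done,
    Failed,
    Killed,
}

/// Simplified stand-in for the error variants matched in the source.
#[allow(dead_code)] // the payloads are only illustrative
#[derive(Debug)]
enum Error {
    Abort(String),
    Failed(String),
}

/// Same decision table as the source above, over plain values.
fn calculate_status(max_attempts: usize, attempt: usize, result: &Result<(), Error>) -> State {
    match result {
        Ok(_) => State::Done,
        Err(Error::Abort(_)) => State::Killed,
        // A failure on the last allowed attempt ends the job for good.
        Err(Error::Failed(_)) if max_attempts <= attempt => State::Killed,
        // Any other failure leaves the job in Failed.
        Err(_) => State::Failed,
    }
}

fn main() {
    println!("{:?}", calculate_status(3, 1, &Ok(())));
    println!("{:?}", calculate_status(3, 1, &Err(Error::Failed("smtp timeout".to_string()))));
    println!("{:?}", calculate_status(3, 3, &Err(Error::Failed("smtp timeout".to_string()))));
    println!("{:?}", calculate_status(3, 1, &Err(Error::Abort("bad address".to_string()))));
}
```

With max_attempts set to 3, a failure on attempt 1 yields Failed while the same failure on attempt 3 yields Killed. None of the branches produces Pending or Running.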
Source of stream_jobs:
let fetch_query = "SELECT id FROM Jobs
WHERE (status = 'Pending' OR (status = 'Failed' AND attempts < max_attempts)) AND run_at < ?1 AND job_type = ?2 ORDER BY priority DESC LIMIT ?3";
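The WHERE clause of this query can be read as a predicate over a single row. The sketch below restates only the status, attempts and run_at conditions; the job_type filter, the priority ordering and the LIMIT are left out. `JobRow` and `is_fetchable` are hypothetical names, and representing run_at as a Unix timestamp is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Pending,
    Running,
    Failed,
    Done,
}

/// Hypothetical in-memory view of one row of the Jobs table.
struct JobRow {
    status: Status,
    attempts: u32,
    max_attempts: u32,
    run_at: u64, // scheduled time, assumed to be a Unix timestamp
}

/// True when the row matches the status, attempts and run_at conditions of the query.
fn is_fetchable(job: &JobRow, now: u64) -> bool {
    let eligible = job.status == Status::Pending
        || (job.status == Status::Failed && job.attempts < job.max_attempts);
    eligible && job.run_at < now
}

fn main() {
    let now = 100;
    for status in [Status::Pending, Status::Running, Status::Failed, Status::Done] {
        let job = JobRow { status, attempts: 1, max_attempts: 3, run_at: 50 };
        println!("{:?}: fetchable = {}", status, is_fetchable(&job, now));
    }
}
```

With these values only the Pending and Failed rows are fetchable; a row in Running never matches.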
While a Job is executing normally, however, its status should be Running, and it should not be attempted multiple times.
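Why the Running status matters can be shown with a toy model. The sketch below is not Apalis code and does not claim to identify the actual cause in apalis-mysql; it only contrasts a fetch that merely reads the status with a fetch that flips Pending to Running in the same locked step. The `Table` alias and the function names are hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Pending,
    Running,
}

/// Hypothetical single-row "jobs table" guarded by a lock.
type Table = Arc<Mutex<Status>>;

/// Fetch that only reads: every poller that sees Pending gets the job.
fn fetch_without_claim(table: &Table) -> bool {
    *table.lock().unwrap() == Status::Pending
}

/// Fetch that claims: the check and the update happen under one lock acquisition,
/// so only the first poller sees Pending.
fn fetch_with_claim(table: &Table) -> bool {
    let mut status = table.lock().unwrap();
    if *status == Status::Pending {
        *status = Status::Running;
        true
    } else {
        false
    }
}

/// Run `pollers` threads against one fresh Pending job and count how many got it.
fn count_executions(pollers: usize, fetch: fn(&Table) -> bool) -> usize {
    let table: Table = Arc::new(Mutex::new(Status::Pending));
    let handles: Vec<_> = (0..pollers)
        .map(|_| {
            let table = Arc::clone(&table);
            thread::spawn(move || fetch(&table))
        })
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .filter(|got| *got)
        .count()
}

fn main() {
    println!("without claim: {}", count_executions(6, fetch_without_claim)); // prints 6
    println!("with claim:    {}", count_executions(6, fetch_with_claim)); // prints 1
}
```

Six pollers against a read-only fetch all receive the job, which resembles the attempt=1..6 pattern in the log; with the claiming fetch only one of them does.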
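That is the basic information about the bug in the current Apalis version. For more detail, see the Issue discussion with the maintainers linked earlier, or debug the source yourself. This bug makes the current version of apalis-mysql unusable, so take special care :)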