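1. Overview
apollo_scripts is the script management module of the Apollo autonomous driving platform. It automates building, deploying, running, and testing the system. The module consists of shell scripts and Python tools that handle environment configuration, module startup, device initialization, and code quality checks, covering the full life cycle from development through deployment and maintenance.
2. Software Architecture Diagram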
graph TB
subgraph "User Interface Layer"
A1[Command-line interface CLI]
A2[Docker container interface]
A3[IDE integration interface]
A4[Web management interface]
end
subgraph "Script Management Layer"
A[Apollo Scripts Manager]
subgraph "Build Script Group"
B[Build Scripts]
B1[apollo_buildify.sh - code formatting]
B2[apollo_action.sh - build action management]
B3[apollo_clean.sh - clean build artifacts]
B4[apollo_format.sh - code format check]
B5[buildifier.sh - BUILD file formatting]
B6[clang_format.sh - C++ code formatting]
B7[yapf.sh - Python code formatting]
B8[apollo_lint.sh - code style check]
B9[apollo_ci.sh - continuous integration]
end
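subgraph "Runtime Script Group"
C[Runtime Scripts]
C1[env.sh - environment initialization]
C2[apollo_base.sh - base function library]
C3[bootstrap.sh - system bootstrap]
C4[cyber_launch.sh - Cyber component launch]
C5[Module launch script group]
C51[dreamview.sh - visualization UI]
C52[perception.sh - perception module]
C53[localization.sh - localization module]
C54[planning.sh - planning module]
C55[control.sh - control module]
C56[canbus.sh - CAN bus communication]
C57[routing.sh - routing module]
C58[prediction.sh - prediction module]
end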
subgraph "Test Script Group"
D[Test Scripts]
D1[replay.sh - data replay]
D2[record_bag.sh - data recording]
D3[performance_test.sh - performance tests]
D4[unit_test_runner.sh - unit tests]
D5[integration_test.sh - integration tests]
D6[functional_test.sh - functional tests]
end
subgraph "Utility Script Group"
E[Utility Scripts]
E1[device_setup.sh - hardware device setup]
E2[data_management.sh - data management]
E3[configuration_tools.sh - configuration management]
E4[model_download.sh - model download]
E5[map_tools.sh - map tools]
E6[common_functions.sh - common function library]
E7[log_analyzer.sh - log analysis]
E8[diagnostic_tool.sh - diagnostics]
end
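subgraph "Configuration Management Group"
F[Configuration Scripts]
F1[apollo_config.sh - system configuration]
F2[switch_vehicle.sh - vehicle switching]
F3[install_scripts.sh - installation script]
F4[environment_setup.sh - environment setup]
F5[vehicle_calibrations.sh - vehicle calibration]
F6[security_config.sh - security configuration]
end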
subgraph "Maintenance Script Group"
G[Maintenance Scripts]
G1[monitor.sh - system monitoring]
G2[log_management.sh - log management]
G3[data_cleaner.sh - data cleanup]
G4[ota.sh - over-the-air update]
G5[health_check.sh - health check]
G6[backup_restore.sh - backup and restore]
end
subgraph "Deployment Script Group"
H[Deployment Scripts]
H1[apollo_deploy.sh - deployment script]
H2[remote_deploy.sh - remote deployment]
H3[package_builder.sh - package build]
H4[image_creator.sh - image creation]
H5[container_manager.sh - container management]
end
end
subgraph "Base Support Layer"
J[Operating system layer - Linux Ubuntu]
K[Docker container engine]
L[Bazel build system]
M[Python runtime]
N[Bash shell environment]
O[Protobuf compiler]
P[CMake build tool]
end
subgraph "External Dependencies"
Q[Hardware drivers]
R[Sensor devices]
S[Network services]
T[Cloud service platform]
end
A1 --> A
A2 --> A
A3 --> A
A4 --> A
A --> B
A --> C
A --> D
A --> E
A --> F
A --> G
A --> H
B --> L
B --> P
B --> O
C --> N
C --> M
C --> K
D --> K
D --> J
E --> J
E --> Q
F --> J
F --> S
G --> J
G --> T
H --> K
H --> S
H --> T
J -.-> A
K -.-> A
L -.-> A
M -.-> A
N -.-> A
O -.-> A
P -.-> A
Q -.-> E
R -.-> E
S -.-> F
T -.-> H
3. Call Flow Diagram
flowchart TD
Start([User starts Apollo]) --> PreCheck{Check runtime environment}
PreCheck -->|Environment OK| InitEnv[Initialize environment variables]
PreCheck -->|Environment error| ErrorEnv[Report environment error]
ErrorEnv --> End([End])
InitEnv --> LoadConfig[Load system configuration]
LoadConfig --> CheckDocker{Check Docker environment}
CheckDocker -->|Inside Docker| InDockerOps[In-container operations]
CheckDocker -->|Outside Docker| OutDockerOps[Host operations]
InDockerOps --> DetectArch[Detect system architecture]
OutDockerOps --> DetectArch
DetectArch -->|x86_64| SetupX86[Configure x86_64 environment]
DetectArch -->|aarch64| SetupARM[Configure ARM environment]
SetupX86 --> DeviceSetup[Initialize hardware devices]
SetupARM --> DeviceSetup
DeviceSetup --> CreateDirs[Create required directory structure]
CreateDirs --> SetupDevices[Configure CAN/GPU devices]
SetupDevices --> VerifySetup{Verify device configuration}
VerifySetup -->|Succeeded| SystemReady[System ready]
VerifySetup -->|Failed| RetrySetup[Retry configuration]
RetrySetup --> VerifySetup
SystemReady --> WaitForCmd{Wait for user command}
WaitForCmd -->|Start module| StartModule[Start the specified module]
WaitForCmd -->|Stop module| StopModule[Stop the specified module]
WaitForCmd -->|Build system| BuildSystem[Build the Apollo system]
WaitForCmd -->|Run tests| RunTests[Run the test suite]
WaitForCmd -->|Deploy system| DeploySystem[Deploy to target]
WaitForCmd -->|Monitor system| MonitorSystem[Monitor system status]
WaitForCmd -->|Record data| RecordData[Record data]
WaitForCmd -->|Clean data| CleanData[Clean historical data]
%% Module start flow
StartModule --> ParseModule[Parse module arguments]
ParseModule --> CheckModule{Check module status}
CheckModule -->|Not running| LaunchModule[Launch module]
CheckModule -->|Already running| NotifyRunning[Report that the module is already running]
NotifyRunning --> ReturnReady[Return to system ready]
LaunchModule --> FindLaunchFile[Locate launch file]
FindLaunchFile --> ExecuteLaunch[Run cyber_launch start]
ExecuteLaunch --> VerifyLaunch{Verify launch status}
VerifyLaunch -->|Launch succeeded| LogSuccess[Log success]
VerifyLaunch -->|Launch failed| LogFailure[Log failure]
LogSuccess --> ReturnReady
LogFailure --> ReturnReady
%% Module stop flow
StopModule --> IdentifyProcess[Identify module process]
IdentifyProcess --> KillProcess[Terminate module process]
KillProcess --> VerifyStop{Verify stop status}
VerifyStop -->|Stopped| LogStop[Log stop]
VerifyStop -->|Still running| ForceKill[Force kill]
ForceKill --> VerifyStop
LogStop --> ReturnReady
%% Build flow
BuildSystem --> ParseBuildArgs[Parse build arguments]
ParseBuildArgs --> CheckBuildEnv{Check build environment}
CheckBuildEnv -->|Environment OK| CleanBuild[Clean build cache]
CheckBuildEnv -->|Environment error| SetupBuildEnv[Set up build environment]
SetupBuildEnv --> CleanBuild
CleanBuild --> DetermineTargets[Determine build targets]
DetermineTargets --> ExecuteBuild[Run Bazel build]
ExecuteBuild --> VerifyBuild{Verify build result}
VerifyBuild -->|Build succeeded| PostBuild[Post-build processing]
VerifyBuild -->|Build failed| ReportBuildErr[Report build error]
PostBuild --> ReturnReady
ReportBuildErr --> ReturnReady
%% Test flow
RunTests --> SetupTestEnv[Set up test environment]
SetupTestEnv --> RunUnitTest[Run unit tests]
RunUnitTest --> RunIntegrationTest[Run integration tests]
RunIntegrationTest --> RunFunctionalTest[Run functional tests]
RunFunctionalTest --> GenTestReport[Generate test report]
GenTestReport --> ReturnReady
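%% Deployment flow
DeploySystem --> ValidateTarget[Validate deployment target]
ValidateTarget --> PreparePackage[Prepare deployment package]
PreparePackage --> TransferPackage[Transfer deployment package]
TransferPackage --> InstallPackage[Install deployment package]
InstallPackage --> ConfigureDeploy[Configure deployment environment]
ConfigureDeploy --> VerifyDeploy{Verify deployment result}
VerifyDeploy -->|Deployment succeeded| LogDeploy[Log deployment success]
VerifyDeploy -->|Deployment failed| Rollback[Roll back deployment]
LogDeploy --> ReturnReady
Rollback --> ReturnReady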
%% Monitoring flow
MonitorSystem --> CollectMetrics[Collect system metrics]
CollectMetrics --> AnalyzeData[Analyze metric data]
AnalyzeData --> CheckThresholds{Check thresholds}
CheckThresholds -->|Normal| ContinueMonitor[Continue monitoring]
CheckThresholds -->|Abnormal| RaiseAlert[Raise alert]
ContinueMonitor --> CollectMetrics
RaiseAlert --> NotifyAdmin[Notify administrator]
NotifyAdmin --> ContinueMonitor
%% Data recording flow
RecordData --> DecideStorage[Decide storage location]
DecideStorage --> CreateTaskDir[Create task directory]
CreateTaskDir --> SelectChannels[Select channels to record]
SelectChannels --> StartRecording[Start recording]
StartRecording --> MonitorDisk{Monitor disk space}
MonitorDisk -->|Enough space| ContinueRecord[Continue recording]
MonitorDisk -->|Insufficient space| StopRecord[Stop recording]
ContinueRecord --> MonitorDisk
StopRecord --> CompressData[Compress data]
CompressData --> ReturnReady
%% Data cleanup flow
CleanData --> ScanOldData[Scan old data]
ScanOldData --> FilterData[Select data to clean]
FilterData --> ConfirmClean[Confirm cleanup]
ConfirmClean --> ExecuteClean[Run cleanup]
ExecuteClean --> VerifyClean{Verify cleanup result}
VerifyClean -->|Cleanup succeeded| LogClean[Log cleanup]
VerifyClean -->|Cleanup failed| ReportCleanErr[Report cleanup error]
LogClean --> ReturnReady
ReportCleanErr --> ReturnReady
ReturnReady --> WaitForCmd
subgraph "Core Flow"
CoreFlow[SystemReady]
end
subgraph "Error Handling Flow"
ErrFlow[Error Handling]
ErrFlow --> LogError[Log error]
ErrFlow --> AttemptRecovery[Attempt recovery]
ErrFlow --> NotifyUser[Notify user]
end
LogFailure -.-> ErrFlow
ReportBuildErr -.-> ErrFlow
ReportCleanErr -.-> ErrFlow
4. UML Class Diagram
classDiagram
%% Base abstraction layer
class ApolloScriptBase {
<<abstract>>
+String TOP_DIR
+String APOLLO_ROOT_DIR
+String ARCH
+Boolean APOLLO_IN_DOCKER
+int APOLLO_OUTSIDE_DOCKER
+String CMDLINE_OPTIONS
+Boolean ENABLE_PROFILER
+String APOLLO_BIN_PREFIX
+Map env_vars
+initialize_environment()
+set_lib_path()
+create_data_dir()
+determine_bin_prefix()
+setup_device()
+decide_task_dir()
+check_in_docker()
+pathprepend(String value, String var)
+pathappend(String value, String var)
+info(String msg)
+warning(String msg)
+error(String msg)
+ok(String msg)
+fatal(String msg)
+check_function_exists(String func_name)
+is_stopped_customized_path(String module_path, String module)
}
%% Build system layer
class BuildSystem {
+String DISABLED_TARGETS
+String SHORTHAND_TARGETS
+int USE_GPU
+Boolean USE_ESD_CAN
+int USE_OPT
+String BUILD_TYPE
+determine_build_targets(String... components)
+determine_disabled_targets(String... components)
+_chk_n_set_gpu_arg(String arg)
+_determine_perception_disabled()
+build(String... targets)
+clean()
+verify_build()
+setup_build_environment()
+configure_build_options()
}
class BuildOptimizer {
+int MAX_JOBS
+String BUILD_CACHE_DIR
+Boolean USE_INCREMENTAL_BUILD
+optimize_build_performance()
+enable_cache_mechanism()
+limit_concurrent_jobs()
}
%% Module management layer
class ModuleLauncher {
+String LAUNCH_FILE_PATH
+String MODULE_STATUS
+start(String module, String... args)
+start_customized_path(String module_path, String module, String... args)
+stop(String module)
+check_module_status(String module)
+wait_for_exit(String module)
+list_running_modules()
}
class ModuleRegistry {
+Map registered_modules
+List essential_modules
+register_module(ModuleInfo info)
+unregister_module(String module_name)
+get_module_info(String module_name)
+get_essential_modules()
+validate_module_dependencies()
}
class ModuleInfo {
+String name
+String path
+String launch_file
+List dependencies
+Boolean is_essential
+String description
+ModuleInfo(String name, String path, String launch_file)
+getName()
+getPath()
+getLaunchFile()
+getDependencies()
}
%% Device management layer
class DeviceSetup {
+String CAN_DEVICE_PATTERN
+String GPU_DEVICE_PATTERN
+int NUM_CAN_PORTS
+setup_device_for_amd64()
+setup_device_for_aarch64()
+setup_can_devices()
+check_gpu_devices()
+setup_shared_mem()
+initialize_hardware()
+validate_device_access()
}
class HardwareValidator {
+List required_devices
+Map device_paths
+validate_required_hardware()
+check_device_permissions()
+test_device_functionality()
+generate_hardware_report()
}
%% Configuration management layer
class ConfigManager {
+String VEHICLE_NAME
+String BRIDGE_PORT
+String DASHBOARD_PORT
+String CONFIG_DIR
+load_config()
+validate_config()
+apply_config()
+save_config()
+switch_vehicle(String vehicle_id)
+validate_vehicle_config(String vehicle_id)
}
class VehicleConfig {
+String vehicle_id
+String model
+String calibration_file
+Map parameters
+VehicleConfig(String vehicle_id)
+getCalibrationFile()
+getParameter(String key)
+setParameter(String key, Object value)
+validate()
}
%% Data management layer
class DataManager {
+String BAG_PATH
+String LOG_PATH
+String TASK_DIR
+String DATA_RETENTION_DAYS
+manage_logs()
+clean_data()
+backup_data()
+record_bag(List channels)
+stop_record()
+rotate_logs()
+compress_old_data()
}
class DataRecorder {
+String RECORDING_TASK_ID
+String CURRENT_BAG_FILE
+Boolean is_recording
+start_recording(List channels)
+stop_recording()
+pause_recording()
+resume_recording()
+get_recording_status()
}
%% Test management layer
class TestRunner {
+String TEST_FILTER
+String TEST_TIMEOUT
+String TEST_REPORT_DIR
+run_unit_tests()
+run_integration_tests()
+run_functional_tests()
+generate_test_report()
+analyze_coverage()
+validate_test_results()
}
class TestCaseManager {
+List test_cases
+TestResultAggregator aggregator
+add_test_case(TestCase tc)
+run_all_tests()
+get_test_results()
+generate_coverage_report()
}
class TestCase {
+String name
+String description
+String command
+int timeout
+TestCase(String name, String command)
+execute()
+getName()
+getTimeout()
}
%% Deployment management layer
class DeploymentManager {
+String TARGET_HOST
+String DEPLOY_PATH
+String DEPLOY_PACKAGE
+deploy_to_remote()
+rollback_version()
+verify_deployment()
+update_config()
+check_target_compatibility()
}
class PackageBuilder {
+String PACKAGE_FORMAT
+List components
+String OUTPUT_DIR
+create_package(List components)
+extract_package(String path)
+verify_package_integrity()
+install_package(String package_path)
+calculate_checksum(String file_path)
}
%% Monitoring layer
class MonitorSystem {
+String METRICS_INTERVAL
+Map system_metrics
+List alert_handlers
+collect_cpu_usage()
+collect_memory_usage()
+collect_disk_usage()
+collect_network_stats()
+send_alert(String message)
+log_event(String event)
+start_monitoring()
+stop_monitoring()
}
class MetricsCollector {
+SystemMetrics current_metrics
+List sources
+collect_system_metrics()
+collect_process_metrics()
+collect_network_metrics()
+aggregate_metrics()
}
class AlertHandler {
+String handler_type
+String destination
+handle_alert(Alert alert)
+send_notification(String message)
+log_alert(Alert alert)
}
%% Execution management layer
class ScriptExecutor {
+String current_command
+ExecutionResult last_result
+execute_command(String cmd)
+handle_error(Error error)
+log_operation(String operation)
+validate_execution_env()
}
class ExecutionResult {
+int exit_code
+String stdout
+String stderr
+long execution_time
+ExecutionResult(int code, String out, String err)
+isSuccessful()
+getExitCode()
+getStdout()
+getStderr()
}
%% Main controller
class MainController {
+BuildSystem build_system
+ModuleLauncher module_launcher
+ConfigManager config_manager
+DataManager data_manager
+TestRunner test_runner
+DeploymentManager deployment_manager
+MonitorSystem monitor_system
+PackageBuilder package_builder
+initialize_system()
+process_command(String[] args)
+manage_lifecycle()
+handle_shutdown()
}
%% Inheritance relationships
ApolloScriptBase <|-- BuildSystem
ApolloScriptBase <|-- ModuleLauncher
ApolloScriptBase <|-- DeviceSetup
ApolloScriptBase <|-- ConfigManager
ApolloScriptBase <|-- DataManager
ApolloScriptBase <|-- TestRunner
ApolloScriptBase <|-- DeploymentManager
ApolloScriptBase <|-- MonitorSystem
%% Associations
BuildSystem --> BuildOptimizer : uses
ModuleLauncher --> ModuleRegistry : manages
ModuleRegistry --> ModuleInfo : contains
DeviceSetup --> HardwareValidator : uses
ConfigManager --> VehicleConfig : manages
DataManager --> DataRecorder : uses
TestRunner --> TestCaseManager : uses
TestCaseManager --> TestCase : contains
DeploymentManager --> PackageBuilder : uses
MonitorSystem --> MetricsCollector : uses
MonitorSystem --> AlertHandler : uses
ScriptExecutor --> ExecutionResult : creates
%% Main controller associations
MainController --> BuildSystem : orchestrates
MainController --> ModuleLauncher : orchestrates
MainController --> DeviceSetup : orchestrates
MainController --> ConfigManager : orchestrates
MainController --> DataManager : orchestrates
MainController --> TestRunner : orchestrates
MainController --> DeploymentManager : orchestrates
MainController --> MonitorSystem : orchestrates
MainController --> PackageBuilder : orchestrates
MainController --> ScriptExecutor : uses
5. State Machine
stateDiagram-v2
[*] --> SystemInit : Start script
SystemInit --> EnvSetup : Initialize environment variables
EnvSetup --> CheckDocker : Check Docker environment
CheckDocker --> InDockerState : Inside container, set up container environment
CheckDocker --> OutDockerState : Outside container, set up host environment
InDockerState --> DetectPlatform : Detect platform architecture
OutDockerState --> DetectPlatform
DetectPlatform --> SetupAMD64 : x86_64, configure x86_64 environment
DetectPlatform --> SetupARM64 : aarch64, configure ARM64 environment
SetupAMD64 --> DeviceInitialization : Initialize devices
SetupARM64 --> DeviceInitialization
DeviceInitialization --> CreateDataDirs : Create data directories
CreateDataDirs --> SetupHardware : Configure hardware devices
SetupHardware --> SystemReady : System ready
SystemReady --> WaitForCommand : Wait for user command
WaitForCommand --> BuildProcess : Build command
WaitForCommand --> ModuleStart : Start module
WaitForCommand --> ModuleStop : Stop module
WaitForCommand --> TestProcess : Run tests
WaitForCommand --> DeployProcess : Deploy command
WaitForCommand --> MonitorProcess : Monitor command
WaitForCommand --> RecordProcess : Record data
WaitForCommand --> CleanProcess : Cleanup command
%% Build process states
state BuildProcess {
[*] --> ParseArgs : Parse arguments
ParseArgs --> ValidateEnv : Validate environment
ValidateEnv --> PrepareBuild : Environment valid, prepare build
ValidateEnv --> BuildError : Environment invalid
PrepareBuild --> SelectTargets : Select build targets
SelectTargets --> ExecuteBazel : Run Bazel build
ExecuteBazel --> PostBuild : Build succeeded, post-build processing
ExecuteBazel --> BuildError : Build failed
PostBuild --> BuildComplete : Build complete
BuildError --> [*]
BuildComplete --> [*]
}
%% Module start states
state ModuleStart {
[*] --> ParseModuleArgs : Parse module arguments
ParseModuleArgs --> CheckModuleStatus : Check module status
CheckModuleStatus --> ModuleRunning : Module already running
CheckModuleStatus --> LocateLaunchFile : Module not running, locate launch file
ModuleRunning --> [*]
LocateLaunchFile --> LaunchViaCyber : Launch via cyber_launch
LaunchViaCyber --> WaitLaunchResult : Wait for launch result
WaitLaunchResult --> VerifyModule : Launch succeeded, verify module status
WaitLaunchResult --> ModuleStartError : Launch failed
VerifyModule --> ModuleStarted : Verification passed
VerifyModule --> ModuleStartError : Verification failed
ModuleStartError --> [*]
ModuleStarted --> [*]
}
%% Module stop states
state ModuleStop {
[*] --> IdentifyModule : Identify module
IdentifyModule --> FindProcess : Find process
FindProcess --> KillProcess : Process found, terminate it
FindProcess --> ModuleNotRunning : Process not found
KillProcess --> VerifyStop : Verify stop status
VerifyStop --> ModuleStopped : Stopped
VerifyStop --> ForceKill : Still running, force kill
ForceKill --> VerifyStop
ModuleNotRunning --> [*]
ModuleStopped --> [*]
}
%% Test process states
state TestProcess {
[*] --> SetupTestEnv : Set up test environment
SetupTestEnv --> RunUnitTests : Run unit tests
RunUnitTests --> RunIntegrationTests : Passed, run integration tests
RunUnitTests --> TestsFailed : Unit tests failed
RunIntegrationTests --> RunFunctionalTests : Passed, run functional tests
RunIntegrationTests --> TestsFailed : Integration tests failed
RunFunctionalTests --> GenerateReports : Passed, generate reports
RunFunctionalTests --> TestsFailed : Functional tests failed
GenerateReports --> TestsComplete : Tests complete
TestsFailed --> [*]
TestsComplete --> [*]
}
%% Deployment process states
state DeployProcess {
[*] --> ValidateTarget : Validate deployment target
ValidateTarget --> PreparePackage : Target valid, prepare package
ValidateTarget --> DeployError : Target invalid
PreparePackage --> TransferPackage : Transfer package
TransferPackage --> InstallPackage : Transfer succeeded, install package
TransferPackage --> DeployError : Transfer failed
InstallPackage --> ConfigureSystem : Install succeeded, configure system
InstallPackage --> DeployError : Install failed
ConfigureSystem --> VerifyDeploy : Configuration succeeded, verify deployment
ConfigureSystem --> DeployError : Configuration failed
VerifyDeploy --> DeploySuccess : Verification succeeded
VerifyDeploy --> DeployError : Verification failed
DeployError --> [*]
DeploySuccess --> [*]
}
%% Monitoring process states
state MonitorProcess {
[*] --> InitializeMonitors : Initialize monitors
InitializeMonitors --> CollectMetrics : Collect metrics
CollectMetrics --> AnalyzeData : Analyze data
AnalyzeData --> CheckThresholds : Check thresholds
CheckThresholds --> ContinueMonitor : Normal, continue monitoring
CheckThresholds --> TriggerAlert : Abnormal, trigger alert
ContinueMonitor --> CollectMetrics : Collection loop
TriggerAlert --> NotifyAdmin : Notify administrator
NotifyAdmin --> ContinueMonitor
}
%% Recording process states
state RecordProcess {
[*] --> SelectChannels : Select channels to record
SelectChannels --> CreateTaskDir : Create task directory
CreateTaskDir --> StartBagRecord : Start bag recording
StartBagRecord --> MonitorDiskUsage : Monitor disk usage
MonitorDiskUsage --> ContinueRecord : Enough space, continue recording
MonitorDiskUsage --> StopAndAlert : Insufficient space, stop and alert
ContinueRecord --> MonitorDiskUsage : Monitoring loop
StopAndAlert --> CompressData : Compress data
CompressData --> RecordComplete : Recording complete
RecordComplete --> [*]
}
%% Cleanup process states
state CleanProcess {
[*] --> ScanData : Scan data
ScanData --> IdentifyOldData : Identify old data
IdentifyOldData --> ConfirmClean : Confirm cleanup
ConfirmClean --> ExecuteClean : Run cleanup
ExecuteClean --> VerifyClean : Verify cleanup
VerifyClean --> CleanComplete : Cleanup succeeded
VerifyClean --> CleanError : Cleanup failed
CleanError --> [*]
CleanComplete --> [*]
}
%% Error handling states
state ErrorHandling {
[*] --> LogError : Log error
LogError --> AssessSeverity : Assess severity
AssessSeverity --> SystemShutdown : Fatal error, shut down
AssessSeverity --> AttemptRecovery : Non-fatal error, attempt recovery
AttemptRecovery --> ReturnToReady : Recovery succeeded
AttemptRecovery --> SystemShutdown : Recovery failed, shut down
SystemShutdown --> [*]
ReturnToReady --> [*]
}
%% Transitions into error handling
BuildProcess --> ErrorHandling : Build error
ModuleStart --> ErrorHandling : Start error
ModuleStop --> ErrorHandling : Stop error
TestProcess --> ErrorHandling : Test error
DeployProcess --> ErrorHandling : Deployment error
RecordProcess --> ErrorHandling : Recording error
CleanProcess --> ErrorHandling : Cleanup error
%% Return to the system-ready state
BuildComplete --> SystemReady
ModuleStarted --> SystemReady
ModuleStopped --> SystemReady
TestsComplete --> SystemReady
DeploySuccess --> SystemReady
RecordComplete --> SystemReady
CleanComplete --> SystemReady
ReturnToReady --> SystemReady
ModuleRunning --> SystemReady
ModuleNotRunning --> SystemReady
%% System shutdown
SystemReady --> SystemShutdown : Shutdown signal received
ErrorHandling --> SystemShutdown : Shutdown on system error
SystemShutdown --> [*]
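6. Source Code Analysis
6.1. Core Initialization Scripts
6.1.1. apollo_base.sh Initialization Flow
apollo_base.sh is the foundation of all Apollo scripts: it initializes environment variables and defines the shared helper functions.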
#!/usr/bin/env bash
TOP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd -P)"
source ${TOP_DIR}/scripts/apollo.bashrc
ARCH="$(uname -m)"
APOLLO_OUTSIDE_DOCKER=0
CMDLINE_OPTIONS=
SHORTHAND_TARGETS=
DISABLED_TARGETS=
: ${CROSSTOOL_VERBOSE:=0}
: ${NVCC_VERBOSE:=0}
: ${HIPCC_VERBOSE:=0}
: ${USE_ESD_CAN:=false}
USE_GPU=-1
use_cpu=-1
use_gpu=-1
use_nvidia=-1
use_amd=-1
ENABLE_PROFILER=true
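The initialization flow consists of:
- Resolving the top-level directory path
- Loading the base bash configuration
- Detecting the system architecture
- Initializing flags and configuration defaults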
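The `: ${VAR:=default}` lines in the excerpt assign a default only when the caller has not already set the variable. A minimal sketch of that idiom follows; the wrapper function `apply_defaults` is a hypothetical name introduced here so the behavior can be exercised repeatedly, and is not part of apollo_base.sh.

```shell
#!/usr/bin/env bash
# Sketch of the default-assignment idiom seen in apollo_base.sh.
# apply_defaults is a hypothetical wrapper, not a function from the source.
apply_defaults() {
  # ':' is the shell no-op; the expansion assigns only if the variable
  # is unset or empty, so values exported by the caller are preserved.
  : "${CROSSTOOL_VERBOSE:=0}"
  : "${USE_ESD_CAN:=false}"
}

apply_defaults
echo "CROSSTOOL_VERBOSE=${CROSSTOOL_VERBOSE} USE_ESD_CAN=${USE_ESD_CAN}"
```

A value set before the script runs, for example `CROSSTOOL_VERBOSE=1 ./script.sh`, survives the call unchanged.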
6.1.2. Environment Path Configuration
function set_lib_path() {
local CYBER_SETUP="${APOLLO_ROOT_DIR}/cyber/setup.bash"
[ -e "${CYBER_SETUP}" ] && . "${CYBER_SETUP}"
pathprepend ${APOLLO_ROOT_DIR}/modules/tools PYTHONPATH
pathprepend ${APOLLO_ROOT_DIR}/modules/teleop/common PYTHONPATH
pathprepend /apollo/modules/teleop/common/scripts
}
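This function sets up the module search paths. It sources the Cyber RT setup file if that file exists, then prepends two module directories to PYTHONPATH so that Python modules can be imported. The third pathprepend call names no variable; assuming the helper defaults to PATH, it prepends the teleop scripts directory to PATH rather than PYTHONPATH.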
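The pathprepend helper itself is not shown in this document. The sketch below is one plausible implementation under two assumptions: the calling convention is `pathprepend DIR [VAR]` with VAR defaulting to PATH, and an entry that is already present is left where it is. The real helper in apollo.bashrc may behave differently.

```shell
#!/usr/bin/env bash
# Hypothetical re-implementation of pathprepend, for illustration only.
# Usage: pathprepend DIR [VAR]   (VAR defaults to PATH; this is an assumption)
pathprepend() {
  local dir="$1"
  local var="${2:-PATH}"
  local current="${!var:-}"   # indirect expansion: value of the named variable
  case ":${current}:" in
    *":${dir}:"*) return 0 ;; # already listed, leave the order unchanged
  esac
  # Prepend DIR, adding a ':' separator only when the variable was non-empty.
  export "${var}=${dir}${current:+:${current}}"
}

pathprepend /opt/demo/tools DEMO_PYTHONPATH
echo "${DEMO_PYTHONPATH}"
```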
6.1.3. Data Directory Creation
function create_data_dir() {
local DATA_DIR="${APOLLO_ROOT_DIR}/data"
mkdir -p "${DATA_DIR}/log"
mkdir -p "${DATA_DIR}/bag"
mkdir -p "${DATA_DIR}/core"
}
Creates the required data directories for logs (log), recorded data bags (bag), and core dumps (core).
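Because `mkdir -p` accepts several paths and succeeds when they already exist, the same layout can be created in one idempotent call. The sketch below takes the root directory as a parameter so it can be tried against a scratch directory; `create_data_dirs` and its argument are illustrative names, not part of the source.

```shell
#!/usr/bin/env bash
# Hypothetical variant of create_data_dir that takes the root as an argument.
create_data_dirs() {
  local root="$1"
  # One call creates all three directories and is safe to repeat.
  mkdir -p "${root}/data/log" "${root}/data/bag" "${root}/data/core"
}

scratch="$(mktemp -d)"
create_data_dirs "${scratch}"
ls "${scratch}/data"
rm -rf "${scratch}"
```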
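6.2. Device Initialization Mechanism
6.2.1. Device Initialization Flow
Depending on the system architecture, the script calls a different device initialization function: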
function setup_device() {
if [ "$(uname -s)" != "Linux" ]; then
info "Not on Linux, skip mapping devices."
return
fi
if [[ "${ARCH}" == "x86_64" ]]; then
setup_device_for_amd64
else
setup_device_for_aarch64
fi
}
6.2.2. Device Initialization on x86_64
function setup_device_for_amd64() {
# setup CAN device
local NUM_PORTS=8
for i in $(seq 0 $((${NUM_PORTS} - 1))); do
if [[ -e /dev/can${i} ]]; then
continue
elif [[ -e /dev/zynq_can${i} ]]; then
# soft link if sensorbox exist
sudo ln -s /dev/zynq_can${i} /dev/can${i}
else
break
# sudo mknod --mode=a+rw /dev/can${i} c 52 ${i}
fi
done
# Check Nvidia device
if [[ ! -e /dev/nvidia0 ]]; then
warning "No device named /dev/nvidia0"
fi
if [[ ! -e /dev/nvidiactl ]]; then
warning "No device named /dev/nvidiactl"
fi
if [[ ! -e /dev/nvidia-uvm ]]; then
warning "No device named /dev/nvidia-uvm"
fi
if [[ ! -e /dev/nvidia-uvm-tools ]]; then
warning "No device named /dev/nvidia-uvm-tools"
fi
if [[ ! -e /dev/nvidia-modeset ]]; then
warning "No device named /dev/nvidia-modeset"
fi
}
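This function prepares the CAN device nodes used for vehicle communication. For each of up to 8 ports it keeps an existing /dev/canN, otherwise symlinks /dev/zynq_canN to it, and stops at the first port that has neither. It then prints a warning for each NVIDIA GPU device node that is missing.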
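The five near-identical NVIDIA checks could be expressed as a loop over a list of paths. The sketch below shows that shape; `warn_missing_devices` is a hypothetical helper, and it writes plain text with echo instead of calling the warning function from apollo.bashrc so that it runs on its own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one warning line per path that does not exist.
warn_missing_devices() {
  local dev
  for dev in "$@"; do
    if [ ! -e "${dev}" ]; then
      echo "No device named ${dev}"
    fi
  done
}

# Same device list as the excerpt above.
warn_missing_devices /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm \
  /dev/nvidia-uvm-tools /dev/nvidia-modeset
```

Adding or removing a device node then means editing one argument list rather than a whole if block.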
6.2.3. Device Initialization on aarch64
function setup_device_for_aarch64() {
local can_dev="/dev/can0"
local socket_can_dev="can0"
if [ ! -e "${can_dev}" ]; then
warning "No CAN device named ${can_dev}. "
fi
if [[ -x "$(command -v ip)" ]]; then
if ! ip link show type can | grep "${socket_can_dev}" &> /dev/null; then
warning "No SocketCAN device named ${socket_can_dev}."
else
sudo modprobe can
sudo modprobe can_raw
sudo modprobe mttcan
sudo ip link set "${socket_can_dev}" type can bitrate 500000 sjw 4 berr-reporting on loopback off
sudo ip link set up "${socket_can_dev}"
fi
else
warning "ip command not found."
fi
}
6.3. Module Management Mechanism
6.3.1. Module Startup Flow
The core function for starting a module:
function start_customized_path() {
MODULE_PATH=$1
MODULE=$2
shift 2
is_stopped_customized_path "${MODULE_PATH}" "${MODULE}"
if [ $? -eq 1 ]; then
# todo(zero): Better to move nohup.out to data/log/nohup.out
eval "nohup cyber_launch start ${APOLLO_ROOT_DIR}/modules/${MODULE_PATH}/launch/${MODULE}.launch &"
sleep 0.5
is_stopped_customized_path "${MODULE_PATH}" "${MODULE}"
if [ $? -eq 0 ]; then
ok "Launched module ${MODULE}."
return 0
else
error "Could not launch module ${MODULE}. Is it already built?"
return 1
fi
else
info "Module ${MODULE} is already running - skipping."
return 2
fi
}
6.3.2. Module Status Check
function is_stopped_customized_path() {
MODULE_PATH=$1
MODULE=$2
NUM_PROCESSES="$(pgrep -f "modules/${MODULE_PATH}/launch/${MODULE}.launch" | grep -cv '^1$')"
if [ "${NUM_PROCESSES}" -eq 0 ]; then
return 1
else
return 0
fi
}
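This function reports whether any launch process for the module exists. Note the return convention: it returns 1 when no matching process is found (the module is stopped) and 0 when at least one is running. An exit status of 0 therefore means "running", despite the function's name, which is why start_customized_path launches only when the status is 1.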
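Since the name and the exit status point in opposite directions, a thin wrapper with a positive name can make call sites easier to read. The sketch below is an illustration only: `count_launch_processes` and `module_is_running` are hypothetical names, and the process count is factored into its own function so it can be replaced in a test.

```shell
#!/usr/bin/env bash
# Hypothetical seam: print how many launch processes match the module.
# Uses the same pgrep pattern as the excerpt above.
count_launch_processes() {
  pgrep -f "modules/$1/launch/$2.launch" | grep -cv '^1$'
}

# Hypothetical wrapper: exit status 0 when the module is running, 1 otherwise.
module_is_running() {
  local count
  count="$(count_launch_processes "$1" "$2")"
  [ "${count}" -gt 0 ]
}

if module_is_running planning planning; then
  echo "planning is running"
else
  echo "planning is stopped"
fi
```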
6.4. Build System Implementation
6.4.1. Determining Build Targets
function determine_build_targets() {
local targets_all
if [[ "$#" -eq 0 ]]; then
targets_all="$(python3 ${TOP_DIR}/scripts/find_all_package.py)"
echo "${targets_all}"
return
fi
for component in $@; do
local build_targets
if [ "${component}" = "cyber" ]; then
build_targets="cyber"
elif [[ -d "${TOP_DIR}/modules/${component}" ]]; then
build_targets="modules/${component}"
else
error "Directory ${TOP_DIR}/modules/${component} not found. Exiting ..."
exit 1
fi
if [ -z "${targets_all}" ]; then
targets_all="${build_targets}"
else
targets_all="${targets_all} ${build_targets}"
fi
done
echo "${targets_all}"
}
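6.4.2. Build Option Parsing
The script parses build options with getopts. The leading colon in the option string turns on silent error reporting, which is what lets the \? and : branches at the end receive the offending option letter in OPTARG:
while getopts ":cdef:g:hij:mn:pt:uv" opt; do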
case $opt in
c)
ACTION=clean
;;
d)
if [ -z "${SHORTHAND_TARGETS}" ]; then
SHORTHAND_TARGETS="all"
fi
USE_DBG=1
;;
e)
ENABLE_PROFILER=false
;;
f)
ADDTIONAL_OPTIONS="${ADDTIONAL_OPTIONS} --compilation_mode=${OPTARG}"
;;
g)
ADDTIONAL_OPTIONS="${ADDTIONAL_OPTIONS} --cxxopt=-g${OPTARG}"
;;
h)
usage
exit 0
;;
i)
USE_OPT=1
;;
j)
ADDTIONAL_OPTIONS="${ADDTIONAL_OPTIONS} -j${OPTARG}"
;;
m)
USE_GPU=0
;;
n)
ADDTIONAL_OPTIONS="${ADDTIONAL_OPTIONS} --jobs=${OPTARG}"
;;
p)
ACTION=build
;;
t)
if [ -z "${SHORTHAND_TARGETS}" ]; then
SHORTHAND_TARGETS="all"
fi
ADDTIONAL_OPTIONS="${ADDTIONAL_OPTIONS} --test_timeout=${OPTARG}"
;;
u)
USE_GPU=1
;;
v)
set -x
;;
\?)
echo "Invalid option: -$OPTARG" >&2
exit 1
;;
:)
echo "Option -$OPTARG requires an argument." >&2
exit 1
;;
esac
done
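The two error branches at the end of the loop are only reachable when the option string starts with a colon (silent mode); without it, getopts prints its own message and never yields `:`. The reduced sketch below demonstrates this with two options; `parse_demo_args` and its options are hypothetical and much smaller than the real option set.

```shell
#!/usr/bin/env bash
# Hypothetical parser with one flag (-c) and one option that takes a value (-j).
parse_demo_args() {
  local opt OPTIND=1 OPTARG
  local action="build" jobs=""
  # Leading ':' = silent mode: getopts reports errors through opt and OPTARG.
  while getopts ":cj:" opt; do
    case "${opt}" in
      c) action="clean" ;;
      j) jobs="${OPTARG}" ;;
      \?) echo "invalid option: -${OPTARG}"; return 1 ;;
      :) echo "option -${OPTARG} requires an argument"; return 1 ;;
    esac
  done
  echo "action=${action} jobs=${jobs}"
}

parse_demo_args -c -j 4
```

Declaring OPTIND as a local set to 1 lets the function be called more than once in the same shell.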
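6.5. Configuration Management Scripts
6.5.1. Switching Vehicle Configuration
The following function switches the active vehicle configuration at runtime: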
function switch_vehicle() {
local vehicle_id=$1
local vehicle_dir="${APOLLO_ROOT_DIR}/modules/calibration/data/${vehicle_id}"
if [ ! -d "${vehicle_dir}" ]; then
error "Invalid vehicle id: ${vehicle_id}. Directory does not exist: ${vehicle_dir}"
usage
# Stop here so the existing "current" link is not removed for an invalid id
return 1
fi
# Point the "current" symbolic link at the selected vehicle's calibration data
rm -rf "${APOLLO_ROOT_DIR}/modules/calibration/data/current"
ln -s "${vehicle_dir}" "${APOLLO_ROOT_DIR}/modules/calibration/data/current"
ok "Successfully switched to vehicle: ${vehicle_id}"
}
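The remove-then-link step can also be written as a single `ln -sfn`, which replaces an existing symbolic link instead of creating a new link inside the directory it points to. The sketch below validates the target first and leaves the old link untouched on failure; `switch_link` and the directory names are hypothetical and stand in for the calibration paths.

```shell
#!/usr/bin/env bash
# Hypothetical helper: repoint LINK at TARGET_DIR, refusing invalid targets.
switch_link() {
  local target_dir="$1" link_path="$2"
  if [ ! -d "${target_dir}" ]; then
    echo "Invalid target: ${target_dir}" >&2
    return 1
  fi
  # -f replaces an existing link; -n treats LINK itself as the thing to replace.
  ln -sfn "${target_dir}" "${link_path}"
}

scratch="$(mktemp -d)"
mkdir "${scratch}/vehicle_a" "${scratch}/vehicle_b"
switch_link "${scratch}/vehicle_a" "${scratch}/current"
switch_link "${scratch}/vehicle_b" "${scratch}/current"
readlink "${scratch}/current"
rm -rf "${scratch}"
```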
7. Design Patterns
7.1. Template Method Pattern
function start_customized_path() {
MODULE_PATH=$1
MODULE=$2
shift 2
is_stopped_customized_path "${MODULE_PATH}" "${MODULE}" # check current status
if [ $? -eq 1 ]; then # skeleton of the algorithm
eval "nohup cyber_launch start ${APOLLO_ROOT_DIR}/modules/${MODULE_PATH}/launch/${MODULE}.launch &"
sleep 0.5
is_stopped_customized_path "${MODULE_PATH}" "${MODULE}"
if [ $? -eq 0 ]; then
ok "Launched module ${MODULE}."
return 0
else
error "Could not launch module ${MODULE}. Is it already built?"
return 1
fi
else
info "Module ${MODULE} is already running - skipping."
return 2
fi
}
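This function fixes the general flow for starting a module (check, launch, verify), while the concrete module start scripts supply the module path and name as arguments. Bash has no subclassing, so this is the template method pattern only in a loose sense: the callers play the role that subclasses would play in an object-oriented design.
7.2. Strategy Pattern
Device initialization uses the strategy pattern to handle the different architectures: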
function setup_device() {
if [ "$(uname -s)" != "Linux" ]; then
info "Not on Linux, skip mapping devices."
return
fi
if [[ "${ARCH}" == "x86_64" ]]; then
setup_device_for_amd64 # x86_64 strategy
else
setup_device_for_aarch64 # aarch64 strategy
fi
}
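Here setup_device_for_amd64 and setup_device_for_aarch64 are two device setup strategies, and the script selects one based on the current architecture. Note that any architecture other than x86_64 falls through to the aarch64 strategy.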
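The selection step can be separated from the execution step by mapping the architecture string to a handler name. The sketch below does this with a case statement; unlike the excerpt, it rejects unknown architectures instead of defaulting to aarch64, which is a deliberate difference. `select_device_setup` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical selector: print the name of the setup function for an architecture.
select_device_setup() {
  case "$1" in
    x86_64) echo "setup_device_for_amd64" ;;
    aarch64) echo "setup_device_for_aarch64" ;;
    *)
      echo "Unsupported architecture: $1" >&2
      return 1
      ;;
  esac
}

# Example: resolve the handler for the current machine without running it.
if handler="$(select_device_setup "$(uname -m)")"; then
  echo "would call ${handler}"
fi
```

Supporting a new architecture then only requires a new case branch and a matching setup function.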
7.3. Factory Pattern
The build system uses a factory-style function to produce build targets:
function determine_build_targets() {
# ...
for component in $@; do
local build_targets
if [ "${component}" = "cyber" ]; then
build_targets="cyber"
elif [[ -d "${TOP_DIR}/modules/${component}" ]]; then
build_targets="modules/${component}"
else
error "Directory ${TOP_DIR}/modules/${component} not found. Exiting ..."
exit 1
fi
# ...
done
# ...
}
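The function maps each input component to a build target (cyber, or a directory under modules/), and exits with an error for an unknown component. This is the factory pattern in a loose sense: callers name what they want and the function decides what to produce.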
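The per-component mapping can be isolated into a function that handles exactly one component and reports failure through its exit status rather than calling exit. The sketch below assumes the same directory layout as the excerpt; `resolve_build_target` and its explicit top-directory argument are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical single-component version of the mapping in determine_build_targets.
resolve_build_target() {
  local top_dir="$1" component="$2"
  if [ "${component}" = "cyber" ]; then
    echo "cyber"
  elif [ -d "${top_dir}/modules/${component}" ]; then
    echo "modules/${component}"
  else
    echo "Directory ${top_dir}/modules/${component} not found" >&2
    return 1
  fi
}

scratch="$(mktemp -d)"
mkdir -p "${scratch}/modules/planning"
resolve_build_target "${scratch}" cyber
resolve_build_target "${scratch}" planning
rm -rf "${scratch}"
```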
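7.4. Singleton Pattern
Environment variables and global configuration are initialized once and then reused by every script that sources them, which resembles the singleton pattern: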
TOP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd -P)"
This keeps global state consistent across scripts.
7.5. Adapter Pattern
if [ -f /.dockerenv ]; then
APOLLO_IN_DOCKER=true
else
APOLLO_IN_DOCKER=false
fi
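The script detects whether it is running inside or outside a Docker container and exposes the result through a single environment variable, APOLLO_IN_DOCKER, so that other scripts do not need to repeat the check.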
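Wrapping the check in a function makes the marker path a parameter, which helps when exercising both outcomes outside a container. This is a sketch only: `detect_docker` is a hypothetical name, and /.dockerenv is used as the default marker because the excerpt uses it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the /.dockerenv check; prints "true" or "false".
detect_docker() {
  local marker="${1:-/.dockerenv}"
  if [ -f "${marker}" ]; then
    echo true
  else
    echo false
  fi
}

APOLLO_IN_DOCKER="$(detect_docker)"
echo "APOLLO_IN_DOCKER=${APOLLO_IN_DOCKER}"
```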
7.6. Command Pattern
Module management follows the command pattern by wrapping each operation in a function of its own:
function start() {
MODULE=$1
shift
start_customized_path $MODULE $MODULE "$@"
}
function stop() {
MODULE=$1
pkill -f "modules/${MODULE}/launch/${MODULE}.launch" || true
sleep 1
}
7.7. Observer Pattern
The monitoring script follows the observer pattern by polling the system state and reacting to it:
while true; do
check_system_status
check_module_health
sleep $MONITOR_INTERVAL
done
The monitor acts as the observer: every MONITOR_INTERVAL seconds it checks the system status and the health of the modules.
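For experimentation it helps to bound the loop and pass the check in as a command name, so the polling logic can be run without the real health checks. In the sketch below `run_monitor` and the two stand-in checks are hypothetical; the loop in the source runs forever and calls check_system_status and check_module_health directly.

```shell
#!/usr/bin/env bash
# Hypothetical bounded polling loop.
# Usage: run_monitor ITERATIONS INTERVAL_SECONDS CHECK_COMMAND
run_monitor() {
  local iterations="$1" interval="$2" check_cmd="$3"
  local i=0 failures=0
  while [ "${i}" -lt "${iterations}" ]; do
    # A non-zero exit status from the check counts as one failure.
    if ! "${check_cmd}"; then
      failures=$((failures + 1))
      echo "check failed on iteration ${i}"
    fi
    sleep "${interval}"
    i=$((i + 1))
  done
  echo "failures=${failures}"
}

# Stand-in checks used only for this demonstration.
always_ok() { return 0; }
always_bad() { return 1; }

run_monitor 2 0 always_ok
```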
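Together these patterns make Apollo Scripts extensible, maintainable, and flexible, and give the Apollo autonomous driving platform dependable script support.
8. Summary
The apollo_scripts module automates building, deploying, running, and testing the Apollo system through a set of shell scripts. It is highly modular: base scripts provide the shared functionality and specialized scripts carry out the specific tasks, and together they cover the whole workflow.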