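Chapter 12: Production-Grade Practice, from Scripts to Engineering
Java/Kotlin developers are used to the standardized project model of Maven/Gradle: a conventional directory layout, declarative dependencies, lifecycle management, and a plugin ecosystem. The Python ecosystem was fragmented for a long time, but since the early 2020s it has been converging on pyproject.toml plus a modern toolchain. Starting from project structure, this chapter covers dependency management, code quality, testing strategy, CI/CD, security practices, and observability, to help you move Python from "writing scripts" to "engineered delivery".
12.1 Project Structure
Java/Kotlin Comparison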
my-project/
├── pom.xml
├── src/
│ ├── main/
│ │ ├── java/com/example/app/
│ │ └── resources/
│ └── test/
│ ├── java/com/example/app/
│ └── resources/
├── .mvn/
└── mvnw / mvnw.cmd
my-project/
├── build.gradle.kts
├── settings.gradle.kts
├── src/
│ ├── main/kotlin/com/example/
│ ├── main/resources/
│ ├── test/kotlin/com/example/
│ └── test/resources/
└── gradle/
└── wrapper/
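Python Implementation
import sys

from my_package.core import main

if __name__ == "__main__":
    sys.exit(main())
> For a complete reference of pyproject.toml fields, see [1.4 pyproject.toml](01-environment-tooling.md); this section focuses on configuration practice for production-grade projects.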
"""
[build-system]
# Build backend: the tool that packages the source code
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "my-package"
version = "1.0.0"
description = "A production-ready Python package"
readme = "README.md"
requires-python = ">=3.10"
license = {text = "MIT"}
authors = [
{name = "Developer", email = "dev@example.com"},
]
# Dependencies: the counterpart of Maven's <dependencies>
dependencies = [
"pydantic>=2.0",
"httpx>=0.25",
"structlog>=23.0",
]
# Optional dependency groups (extras): roughly comparable to Maven's <scope>
[project.optional-dependencies]
dev = [
"ruff>=0.4",
"mypy>=1.8",
"pre-commit>=3.6",
]
test = [
"pytest>=8.0",
"pytest-asyncio>=0.23",
"pytest-cov>=5.0",
"hypothesis>=6.100",
]
prod = [
"gunicorn>=22.0",
"uvicorn[standard]>=0.30",
]
# Entry point: the counterpart of Maven's mainClass
[project.scripts]
my-cli = "my_package.core:main"
# Tool configuration: all tools share this one file
[tool.ruff]
target-version = "py310"
line-length = 100
[tool.mypy]
python_version = "3.10"
strict = true
[tool.pytest.ini_options]
testpaths = ["tests"]
asyncio_mode = "auto"
"""
"""
[project]
name = "my-monorepo"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = []
[tool.uv.workspace]
members = ["packages/*"]
[tool.uv.sources]
# Dependencies between packages resolve to the local workspace members
core = { workspace = true }
api = { workspace = true }
"""
"""
[project]
name = "core"
version = "0.1.0"
dependencies = ["pydantic>=2.0"]
"""
"""
[project]
name = "api"
version = "0.1.0"
dependencies = ["core", "fastapi>=0.110"]
"""
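Key Differences
| Dimension | Maven/Gradle | Python (pyproject.toml) |
|---|---|---|
| Config file | pom.xml / build.gradle.kts | pyproject.toml (single file) |
| Directory convention | src/main/java by strong convention | src layout recommended, not enforced |
| Package index | Maven Central (plus private repositories) | PyPI plus private indexes |
| Multi-module | Built in | uv workspace / pip install -e |
| Build lifecycle | compile → test → package | No standard lifecycle; each tool is run on its own |
| Entry point | mainClass / public static void main | __main__.py + [project.scripts] |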
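Common Pitfalls
When to Use
- src layout: every installable package (the default for almost all cases)
- flat layout: script-only projects that are never packaged or published
- monorepo: several tightly coupled packages developed together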
12.2 Dependency Management
Java/Kotlin Comparison
<dependencies>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.17.0</version>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter</artifactId>
<version>5.10.2</version>
<scope>test</scope>
</dependency>
</dependencies>
dependencies {
implementation("com.fasterxml.jackson.module:jackson-module-kotlin:2.17.0")
testImplementation("org.junit.jupiter:junit-jupiter:5.10.2")
api("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.8.0")
}
Python Implementation
"""
[project]
dependencies = [
"pydantic>=2.0,<3",
"httpx>=0.25,<1",
]
[dependency-groups]
dev = [
"pytest>=8.0,<9",
"ruff>=0.4,<1",
]
"""
"""
[tool.poetry.dependencies]
python = "^3.10"
pydantic = "^2.0"
httpx = "^0.25"
[tool.poetry.group.dev.dependencies]
pytest = "^8.0"
ruff = "^0.4"
[tool.poetry.group.test.dependencies]
pytest-cov = "^5.0"
hypothesis = "^6.100"
"""
"""
[[tool.uv.index]]
name = "private"
url = "https://pypi.example.com/simple/"
# If authentication is required, supply credentials through environment variables or a netrc file
# Poetry: configuring a private repository
[[tool.poetry.source]]
name = "private"
url = "https://pypi.example.com/simple/"
priority = "supplemental"
"""
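Key Differences
| Dimension | Maven/Gradle | Python (uv/Poetry) |
|---|---|---|
| Dependency declaration | XML / Kotlin DSL | pyproject.toml (TOML) |
| Version locking | None by default in Maven; opt-in dependency locking in Gradle | Lock file (uv.lock / poetry.lock) |
| Transitive dependencies | Automatic | Automatic (resolved by pip/uv) |
| Dependency scope | compile/test/runtime/provided | dependencies / [project.optional-dependencies] / [dependency-groups] |
| Conflict resolution | Nearest definition wins (Maven) / highest version wins by default (Gradle) | One version per package that satisfies every constraint; resolution fails if none exists |
| Private repositories | settings.xml / repositories {} | [[tool.uv.index]] / netrc |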
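One detail the comparison hides: Poetry's caret constraints are narrower than they look for 0.x releases. `^0.25` allows only `>=0.25,<0.26`, whereas the uv example earlier wrote `httpx>=0.25,<1`. The sketch below expands a caret constraint into an explicit range, following Poetry's documented rule that the leftmost non-zero component may not change. `caret_to_range` is an illustrative helper, not part of Poetry, and it handles plain numeric versions only.

```python
def caret_to_range(constraint: str) -> str:
    """Expand a Poetry caret constraint such as '^2.0' into a '>=X,<Y' range."""
    if not constraint.startswith("^"):
        raise ValueError(f"not a caret constraint: {constraint!r}")
    parts = [int(p) for p in constraint[1:].split(".")]
    # Bump the leftmost non-zero component; if all are zero, bump the last one given.
    index = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = parts[: index + 1]
    upper[-1] += 1
    lower = ".".join(str(p) for p in parts)
    return f">={lower},<{'.'.join(str(p) for p in upper)}"


for spec in ["^2.0", "^0.25", "^6.100", "^0.0.3"]:
    print(spec, "->", caret_to_range(spec))
```

When migrating between tools, writing the range out explicitly avoids accidentally widening or narrowing what the resolver may pick.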
Common Pitfalls
try:
    import boto3

    HAS_BOTO3 = True
except ImportError:
    # boto3 is an optional dependency; code that needs it should check this flag first
    HAS_BOTO3 = False
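A flag such as `HAS_BOTO3` only helps if the code fails with a clear message when the optional feature is actually used. One way to do that, sketched with the standard library; `require_optional`, the `upload` function, and the extra name `aws` are hypothetical.

```python
import importlib
import importlib.util
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency or raise an error naming the extra to install."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install it with: pip install 'my-package[{extra}]'"
        )
    return importlib.import_module(module_name)


def upload(bucket: str, key: str, body: bytes) -> None:
    # Resolved only when the feature is used, so importing this module never fails.
    boto3 = require_optional("boto3", extra="aws")
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body)
```

The error message then tells the user which extra to install instead of surfacing a bare `ModuleNotFoundError` from deep inside the call stack.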
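When to Use
- uv: first choice for new projects; very fast and compatible with the pip ecosystem
- Poetry: existing projects, or when its plugin ecosystem is needed
- pip-tools: the simplest setups and Docker image builds
- Lock files: required for production deployments
12.3 Code Quality
Java/Kotlin Comparison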
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-checkstyle-plugin</artifactId>
<version>3.3.1</version>
<configuration>
<configLocation>google_checks.xml</configLocation>
</configuration>
<executions>
<execution>
<goals><goal>check</goal></goals>
<phase>verify</phase>
</execution>
</executions>
</plugin>
plugins {
id("org.jlleitschuh.gradle.ktlint") version "12.1.0"
id("io.gitlab.arturbosch.detekt") version "1.23.6"
}
ktlint {
android.set(false)
outputColorName.set("RED")
}
Python Implementation
"""
[tool.ruff]
# Target Python version
target-version = "py310"
# Line length (Black defaults to 88; this project uses 100)
line-length = 100
# Excluded directories
exclude = [".git", ".venv", "node_modules", "*.egg-info"]
# Linter rules
[tool.ruff.lint]
# Enabled rule sets
select = [
    "E", # pycodestyle errors
    "W", # pycodestyle warnings
    "F", # pyflakes
    "I", # isort (import sorting)
    "N", # pep8-naming
    "UP", # pyupgrade (upgrades syntax to the target version)
    "B", # flake8-bugbear (common bug patterns)
    "SIM", # flake8-simplify (simplification suggestions)
    "C4", # flake8-comprehensions (better comprehensions)
    "RUF", # Ruff-specific rules
]
# Ignored rules
ignore = [
    "E501", # line too long (left to the formatter)
]
# Per-file rule exceptions
[tool.ruff.lint.per-file-ignores]
"__init__.py" = ["F401"] # unused imports are normal in __init__.py (re-exports)
"tests/*" = ["S101"] # allow assert in tests
# isort configuration
[tool.ruff.lint.isort]
known-first-party = ["my_package"]
# Formatter configuration
[tool.ruff.format]
quote-style = "double"
indent-style = "space"
"""
"""
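[tool.mypy]
python_version = "3.10"
# strict mode: turns on all strict checks
strict = true
# Additional settings (already implied by strict; listed here for visibility)
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = true
disallow_any_generics = true
# Per-module overrides for third-party libraries: keep missing-import errors enabled
[[tool.mypy.overrides]]
module = ["httpx.*", "structlog.*"]
ignore_missing_imports = false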
"""
"""
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
        args: [--maxkb=500]
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0
    hooks:
      - id: mypy
        additional_dependencies: [pydantic]
        # Check the whole src/ tree rather than only the staged files
        args: [src/]
        pass_filenames: false
"""
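# B006: mutable default argument (flagged)
def append_to(element, target=[]):
    target.append(element)
    return target

# Fix: default to None and create the list inside the function
def append_to(element, target: list | None = None):
    if target is None:
        target = []
    target.append(element)
    return target

# B007: an unused loop variable should be named _ (this form is fine)
for _ in range(10):
    pass
# Flagged: i is never used in the loop body
for i in range(10):
    pass

# C416: unnecessary comprehension (flagged), followed by the fix
items = [1, 2, 3]
result = [x for x in items]
result = list(items)

import os

path = "example.txt"  # hypothetical path, for illustration
# SIM115: file opened without a context manager (flagged); f is never closed
if os.path.exists(path):
    f = open(path)
# Fix: the with statement closes the file even if an exception is raised
if os.path.exists(path):
    with open(path) as f:
        ...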
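Key Differences
| Dimension | Java/Kotlin | Python (Ruff + mypy) |
|---|---|---|
| Linter | Checkstyle / ktlint / detekt | Ruff (one tool covers them all) |
| Formatter | Google Java Format / ktlint | Ruff format (replaces Black) |
| Type checking | Built into the compiler | mypy (a separate run) |
| Static analysis | SpotBugs / Error Prone | Ruff bugbear rules |
| Auto-fix | IDE-assisted | Ruff --fix (from the command line) |
| Commit checks | Maven verify phase | pre-commit hooks |
| Speed | Slower (JVM startup) | Very fast (native Rust binary) |
Common Pitfalls
When to Use
- Ruff: every Python project; replaces flake8 + Black + isort + pyupgrade
- mypy strict: production projects, especially with several contributors
- pre-commit: every project; keeps non-compliant code out of the repository
12.4 Testing Strategy
Java/Kotlin Comparison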
import static org.junit.jupiter.api.Assertions.*;
import org.junit.jupiter.api.*;
import org.junit.jupiter.params.*;
import org.junit.jupiter.params.provider.*;
class CalculatorTest {
private Calculator calc;
@BeforeEach
void setUp() {
calc = new Calculator();
}
@Test
void testAdd() {
assertEquals(5, calc.add(2, 3));
}
@ParameterizedTest
@CsvSource({"2,3,5", "0,0,0", "-1,1,0"})
void testAddParams(int a, int b, int expected) {
assertEquals(expected, calc.add(a, b));
}
@Test
@Disabled("TODO: fix later")
void testDisabled() { }
}
import static org.mockito.Mockito.*;
class ServiceTest {
@Test
void testWithMock() {
Repository repo = mock(Repository.class);
when(repo.findById(1L)).thenReturn(Optional.of(new User("Alice")));
Service service = new Service(repo);
User result = service.getUser(1L);
assertEquals("Alice", result.getName());
verify(repo).findById(1L);
}
}
class CalculatorTest {
private lateinit var calc: Calculator
@BeforeEach
fun setUp() {
calc = Calculator()
}
@Test
fun `add two numbers`() {
assertEquals(5, calc.add(2, 3))
}
@Test
fun `mock repository`() {
val repo = mockk<Repository>()
every { repo.findById(1L) } returns User("Alice")
val service = Service(repo)
assertEquals("Alice", service.getUser(1L).name)
verify { repo.findById(1L) }
}
}
Python Implementation
import pytest

from my_package.core import Calculator, UserRepository, UserService
from my_package.db import create_connection  # hypothetical module path for this helper


@pytest.fixture
def calculator():
    """Each test function gets its own Calculator instance."""
    return Calculator()


@pytest.fixture
def user_service():
    """A service wired to a repository; shut down after the test finishes."""
    repo = UserRepository()
    service = UserService(repo)
    yield service
    service.shutdown()


@pytest.fixture(scope="session")
def db_connection():
    """One database connection shared by the whole test session."""
    conn = create_connection()
    yield conn
    conn.close()


@pytest.fixture(scope="module")
def shared_cache():
    """Shared by all tests in the same test file."""
    return {}
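For readers used to `@BeforeEach`/`@AfterEach`: in a `yield` fixture, the code before `yield` is setup and the code after it is teardown. The standard library's `contextlib.contextmanager` has the same shape, so it can serve as a pytest-free sketch of the control flow. `FakeService` and the `events` list are illustrative only.

```python
from contextlib import contextmanager

events: list[str] = []


class FakeService:
    def shutdown(self) -> None:
        events.append("shutdown")


@contextmanager
def user_service():
    service = FakeService()
    events.append("setup")
    try:
        yield service        # the test body runs here
    finally:
        service.shutdown()   # teardown runs even if the body raised


with user_service() as svc:
    events.append("test body")

print(events)  # ['setup', 'test body', 'shutdown']
```

pytest performs the equivalent of the `try`/`finally` itself, which is why fixture code after `yield` needs no explicit guard.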
import pytest


class TestCalculator:
    """Test class: purely an organizational device; no base class is required."""

    def test_add(self, calculator):
        assert calculator.add(2, 3) == 5
        assert calculator.add(-1, 1) == 0

    def test_divide_by_zero(self, calculator):
        with pytest.raises(ZeroDivisionError, match="division by zero"):
            calculator.divide(1, 0)

    def test_add_negative(self, calculator):
        assert calculator.add(-5, -3) == -8


@pytest.mark.parametrize(
    "a, b, expected",
    [
        (2, 3, 5),
        (0, 0, 0),
        (-1, 1, 0),
        (100, 200, 300),
    ],
)
def test_add_parametrized(calculator, a, b, expected):
    assert calculator.add(a, b) == expected


# Stacked parametrize decorators run the test for every (x, y) combination
@pytest.mark.parametrize("x", [1, 2])
@pytest.mark.parametrize("y", [10, 20])
def test_multiply_matrix(calculator, x, y):
    assert calculator.multiply(x, y) == x * y
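Stacking two `parametrize` decorators yields the Cartesian product of their values. The set of cases can be previewed with `itertools.product`; the order in which pytest actually runs them may differ from the order shown here.

```python
from itertools import product

xs = [1, 2]
ys = [10, 20]

# Every (x, y) pair that the stacked decorators would generate
cases = list(product(xs, ys))
print(cases)       # [(1, 10), (1, 20), (2, 10), (2, 20)]
print(len(cases))  # 4 test invocations
```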
import time


@pytest.mark.slow
def test_large_computation(calculator):
    time.sleep(2)  # stands in for an expensive computation
    assert calculator.fibonacci(100) > 0


@pytest.mark.integration
def test_database_query(user_service):
    result = user_service.find_by_name("Alice")
    assert result is not None
"""
[tool.pytest.ini_options]
markers = [
"slow: marks tests as slow (deselect with '-m \"not slow\"')",
"integration: marks tests as integration tests",
]
testpaths = ["tests"]
asyncio_mode = "auto"
"""
pytest fixtures in depth: dependency injection and scopes
import pytest


@pytest.fixture(scope="session")
def db_connection():
    """Created once for the whole test session (for example, a database connection)."""
    print("\n[setup] creating database connection")
    conn = {"connected": True}
    yield conn
    print("\n[teardown] closing database connection")
    conn["connected"] = False


@pytest.fixture(scope="function")
def clean_db(db_connection):
    """Reset the data before each test function (depends on db_connection)."""
    db_connection["data"] = []
    return db_connection


@pytest.fixture
def user_client(clean_db):
    """Fixtures can depend on other fixtures."""

    def create_user(name):
        clean_db["data"].append(name)
        return {"name": name, "id": len(clean_db["data"])}

    return create_user


def test_create_user(user_client):
    u1 = user_client("Alice")
    u2 = user_client("Bob")
    assert u1["id"] == 1
    assert u2["id"] == 2


# A parametrized fixture runs every dependent test once per parameter
@pytest.fixture(params=["sqlite", "postgres", "mysql"])
def db_engine(request):
    return request.param


def test_all_engines(db_engine):
    assert isinstance(db_engine, str)
from unittest.mock import AsyncMock, Mock, call, patch

import pytest


def test_mock_basic():
    mock_repo = Mock()
    mock_repo.find_by_id.return_value = {"name": "Alice"}
    # side_effect takes precedence over return_value once it is set
    mock_repo.find_by_id.side_effect = lambda x: {"name": f"User-{x}"}
    result = mock_repo.find_by_id(1)
    assert result == {"name": "User-1"}
    mock_repo.find_by_id.assert_called_once_with(1)
    mock_repo.find_by_id.assert_has_calls([call(1)])


def test_with_patch():
    # patch as a context manager
    with patch("my_package.core.requests.get") as mock_get:
        mock_get.return_value.status_code = 200
        mock_get.return_value.json.return_value = {"data": "hello"}
        from my_package.core import fetch_data

        result = fetch_data("https://api.example.com")
        assert result == {"data": "hello"}
        mock_get.assert_called_once_with("https://api.example.com")


# patch as a decorator: the mock is passed in as an argument
@patch("my_package.core.requests.get")
def test_with_decorator(mock_get):
    mock_get.return_value.status_code = 200
    mock_get.return_value.json.return_value = {"data": "hello"}
    from my_package.core import fetch_data

    result = fetch_data("https://api.example.com")
    assert result == {"data": "hello"}


# patch inside a fixture: reusable across tests
@pytest.fixture
def mock_http():
    with patch("my_package.core.requests.get") as mock_get:
        mock_get.return_value.status_code = 200
        mock_get.return_value.json.return_value = {"data": "hello"}
        yield mock_get


def test_with_fixture(mock_http):
    from my_package.core import fetch_data

    result = fetch_data("https://api.example.com")
    assert result == {"data": "hello"}


def test_mock_exception():
    mock_repo = Mock()
    mock_repo.find.side_effect = ValueError("not found")
    with pytest.raises(ValueError, match="not found"):
        mock_repo.find(1)


# Collected as an async test because asyncio_mode = "auto" is configured
async def test_async_mock():
    mock_client = Mock()
    mock_client.fetch = AsyncMock(return_value={"status": "ok"})
    result = await mock_client.fetch("/api")
    assert result == {"status": "ok"}
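`AsyncMock` lives in `unittest.mock` (Python 3.8+) and can be exercised without pytest at all. The following self-contained sketch drives a coroutine with `asyncio.run`; `load_status` is a made-up function standing in for the code under test.

```python
import asyncio
from unittest.mock import AsyncMock, Mock


async def load_status(client) -> str:
    """Code under test: awaits the client and extracts one field."""
    response = await client.fetch("/api")
    return response["status"]


client = Mock()
client.fetch = AsyncMock(return_value={"status": "ok"})

status = asyncio.run(load_status(client))
print(status)  # ok

# AsyncMock records awaits separately from plain calls.
client.fetch.assert_awaited_once_with("/api")
```

`assert_awaited_once_with` is stricter than `assert_called_once_with`: it fails if the coroutine was created but never awaited.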
import asyncio

import pytest


# With asyncio_mode = "auto" a plain @pytest.fixture works for async fixtures;
# in strict mode use @pytest_asyncio.fixture instead.
@pytest.fixture
async def async_client():
    """Async fixture."""
    client = await create_client()  # create_client: an async factory defined elsewhere
    yield client
    await client.close()


@pytest.mark.asyncio
async def test_async_operation(async_client):
    """Async test function."""
    result = await async_client.fetch("/api/users")
    assert len(result) > 0


@pytest.mark.asyncio
async def test_concurrent_requests(async_client):
    """Concurrent requests."""
    tasks = [async_client.fetch(f"/api/users/{i}") for i in range(10)]
    results = await asyncio.gather(*tasks)
    assert len(results) == 10
"""
[tool.coverage.run]
source = ["my_package"]
branch = true
[tool.coverage.report]
fail_under = 80
show_missing = true
exclude_lines = [
"pragma: no cover",
"if TYPE_CHECKING:",
"raise NotImplementedError",
"if __name__ == .__main__.:",
]
"""
from hypothesis import given, settings, strategies as st


@given(st.integers(), st.integers())
def test_add_commutative(a, b):
    """Addition is commutative: a + b == b + a for all integers."""
    assert a + b == b + a


@given(st.lists(st.integers()))
def test_sort_idempotent(lst):
    """Sorting is idempotent: sorting twice gives the same result."""
    assert sorted(sorted(lst)) == sorted(lst)


@given(st.text())
def test_reverse_twice(s):
    """Reversing twice returns the original string."""
    assert s == s[::-1][::-1]


# Bounded input keeps each example cheap; >= because F(1) == F(2) under the usual definition
@given(st.integers(min_value=1, max_value=500))
@settings(max_examples=200)
def test_fibonacci_monotonic(n):
    """The Fibonacci sequence is non-decreasing."""
    from my_package.core import fibonacci

    assert fibonacci(n) >= fibonacci(n - 1)


UserStrategy = st.builds(
    dict,
    name=st.text(min_size=1, max_size=50),
    age=st.integers(min_value=0, max_value=150),
    email=st.emails(),
)


@given(UserStrategy)
def test_user_validation(user):
    """Every generated user passes Pydantic validation."""
    from my_package.models import User

    validated = User(**user)
    assert validated.name == user["name"]
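A property is only as good as its statement. Whether a monotonicity property for Fibonacci should use `>` or `>=` depends on the definition: with F(0) = 0 and F(1) = 1, the sequence is non-decreasing but not strictly increasing, since F(1) = F(2) = 1. The stdlib-only check below uses a hypothetical iterative `fibonacci` (the real one in `my_package.core` is not shown) to locate the counterexample that a property-based tool would be expected to report for a strict property.

```python
def fibonacci(n: int) -> int:
    """Iterative Fibonacci with F(0) = 0 and F(1) = 1 (assumed definition)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Values of n for which each candidate property fails
strict_failures = [n for n in range(1, 50) if not fibonacci(n) > fibonacci(n - 1)]
weak_failures = [n for n in range(1, 50) if not fibonacci(n) >= fibonacci(n - 1)]

print(strict_failures)  # [2]
print(weak_failures)    # []
```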
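Key Differences
| Dimension | JUnit 5 / kotlin-test | pytest |
|---|---|---|
| Test discovery | @Test annotation | Automatic discovery of test_-prefixed functions |
| Assertions | assertEquals / assertThrows | assert + pytest.raises |
| Lifecycle | @BeforeEach / @AfterEach | fixtures (more flexible) |
| Parameterization | @ParameterizedTest | @pytest.mark.parametrize |
| Mocking | Mockito / MockK | unittest.mock |
| Async tests | No built-in support | pytest-asyncio |
| Property-based testing | jqwik | Hypothesis |
| Coverage | JaCoCo | pytest-cov |
| Test base class | Not required in JUnit 5 (annotations); required in JUnit 3 | Not required; plain functions |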
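On the coverage row: the `exclude_lines` entries in the coverage configuration earlier are regular expressions matched against source lines, which is why the last entry uses `.` in place of the quote characters. One pattern then covers both quoting styles. A quick check with `re`, assuming coverage.py's matching behaves like `re.search`:

```python
import re

pattern = re.compile(r"if __name__ == .__main__.:")

lines = [
    'if __name__ == "__main__":',
    "if __name__ == '__main__':",
    "if __name__ == __main__:",  # no quotes: each '.' still needs a character to match
]
matches = [bool(pattern.search(line)) for line in lines]
print(matches)  # [True, True, False]
```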
Common Pitfalls
mock_repo.user.name = "Alice"
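A related trap: `Mock` objects accept any attribute access or assignment, as in the line above, so a misspelled method name passes silently. Passing `spec=` restricts the mock to the real interface. `UserRepository` below is a stand-in class defined only for this sketch.

```python
from unittest.mock import Mock


class UserRepository:
    def find_by_id(self, user_id: int) -> dict:
        raise NotImplementedError


loose = Mock()
loose.find_by_idd(1)  # typo, yet no error: Mock invents the attribute on the fly

strict = Mock(spec=UserRepository)
strict.find_by_id.return_value = {"name": "Alice"}
print(strict.find_by_id(1))  # {'name': 'Alice'}

try:
    strict.find_by_idd(1)  # the same typo now fails fast
except AttributeError as exc:
    print("caught:", exc)
```

`unittest.mock.create_autospec` goes one step further and also checks call signatures.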
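When to Use
- pytest: the default choice for every Python project
- pytest-asyncio: projects that use asyncio
- Hypothesis: data processing, algorithms, validation logic
- pytest-cov: continuous coverage monitoring, with a minimum threshold enforced in CI
12.5 CI/CD
Java/Kotlin Comparison
Python Implementation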
name: CI
on:
  push:
    branches: [main, develop]
    tags: ["v*"]
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10", "3.11", "3.12", "3.13"]
    env:
      # Make uv sync / uv run use the interpreter selected by the matrix
      UV_PYTHON: ${{ matrix.python-version }}
    steps:
      - uses: actions/checkout@v4
      - name: Install uv
        uses: astral-sh/setup-uv@v4
        with:
          version: "latest"
      - name: Set up Python ${{ matrix.python-version }}
        run: uv python install ${{ matrix.python-version }}
      - name: Install dependencies
        run: uv sync --all-extras
      - name: Lint (Ruff)
        run: uv run ruff check .
      - name: Format check (Ruff)
        run: uv run ruff format --check .
      - name: Type check (mypy)
        run: uv run mypy src/
      - name: Run tests
        run: uv run pytest --cov=my_package --cov-report=term-missing --cov-fail-under=80
      - name: Security audit
        run: uv run pip-audit
  publish:
    needs: test
    # Publish only for version tags; a ref can never be both refs/heads/main and a tag
    if: startsWith(github.ref, 'refs/tags/v')
    runs-on: ubuntu-latest
    permissions:
      id-token: write  # required for Trusted Publishing
      contents: read   # required by actions/checkout once permissions are restricted
    steps:
      - uses: actions/checkout@v4
      - name: Install uv
        uses: astral-sh/setup-uv@v4
      - name: Build package
        run: uv build
      - name: Publish to PyPI
        run: uv publish
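# === Docker multi-stage build ===
# The counterpart of a Maven-based Docker build: a build stage plus a minimal runtime image
# Stage 1: install dependencies (takes advantage of the Docker layer cache)
FROM python:3.12-slim AS builder
WORKDIR /app
# Copy the dependency manifests first so this layer stays cached until they change
COPY pyproject.toml uv.lock ./
# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
# Install dependencies into the virtual environment
RUN uv sync --frozen --no-dev --no-install-project
# Copy the source and install the project itself
COPY src/ ./src/
RUN uv sync --frozen --no-dev
# Stage 2: minimal runtime image
FROM python:3.12-slim AS runtime
WORKDIR /app
# Copy the virtual environment from the builder stage
COPY --from=builder /app/.venv /app/.venv
# Copy the source code
COPY --from=builder /app/src/ ./src/
# Run as a non-root user (security best practice)
RUN useradd --create-home appuser
USER appuser
ENV PATH="/app/.venv/bin:$PATH"
# Unbuffered output so logs appear in real time (Dockerfile comments must be on their own line)
ENV PYTHONUNBUFFERED=1
EXPOSE 8000
# Entry point: run the package with python -m
CMD ["python", "-m", "my_package"]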
"""
[project.urls]
Homepage = "https://github.com/example/my-package"
Documentation = "https://my-package.readthedocs.io"
Repository = "https://github.com/example/my-package"
Changelog = "https://github.com/example/my-package/releases"
[tool.uv]
dev-dependencies = [
"twine>=5.0",
"build>=1.0",
]
"""
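Key Differences
| Dimension | Java CI/CD | Python CI/CD |
|---|---|---|
| Build tool | mvn / gradlew | uv / poetry |
| Environment management | SDKMAN / JDK installation | uv python install |
| Matrix testing | Multiple JDK versions | Multiple Python versions |
| Artifacts | JAR / WAR / Docker | wheel / sdist / Docker |
| Publish target | Maven Central / Artifactory | PyPI / TestPyPI |
| Authentication | GPG signing + settings.xml | Trusted Publishing / API token |
| Docker image | eclipse-temurin:21-jre | python:3.12-slim |
Common Pitfalls
When to Use
- GitHub Actions: the simplest CI option and a common default in the Python ecosystem
- Matrix testing: needed whenever several Python versions are supported
- Multi-stage Docker builds: for production deployments
- Trusted Publishing: the recommended way to publish to PyPI (no token to manage)
12.6 Security Practices
Java/Kotlin Comparison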
<plugin>
<groupId>org.owasp</groupId>
<artifactId>dependency-check-maven</artifactId>
<version>9.2.0</version>
<executions>
<execution>
<goals><goal>check</goal></goals>
</execution>
</executions>
</plugin>
data class UserRequest(
@field:NotBlank val name: String,
@field:Email val email: String,
@field:Min(0) val age: Int,
)
fun createUser(@Valid request: UserRequest): User { ... }
Python Implementation
"""
[tool.uv]
dev-dependencies = [
"pip-audit>=2.7",
]
"""
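import os

# Required setting: raises KeyError at startup if it is missing
DATABASE_URL = os.environ["DATABASE_URL"]
# Optional setting with a fallback; never use a real secret as the default
API_KEY = os.environ.get("API_KEY", "default")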
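Reading variables one by one means the process stops at the first missing name. A small fail-fast loader that reports every missing variable at once can be sketched with the standard library; `load_required` and `MissingConfigError` are hypothetical names, and the `env` parameter exists so the function can be tested without touching the real environment.

```python
import os
from collections.abc import Mapping


class MissingConfigError(RuntimeError):
    """Raised when required environment variables are absent or empty."""


def load_required(names: list[str], env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the named variables, reporting every missing one in a single error."""
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise MissingConfigError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


fake_env = {"DATABASE_URL": "postgresql://localhost/mydb"}
print(load_required(["DATABASE_URL"], env=fake_env))

try:
    load_required(["DATABASE_URL", "API_KEY", "SECRET"], env=fake_env)
except MissingConfigError as exc:
    print(exc)  # missing environment variables: API_KEY, SECRET
```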
"""
DATABASE_URL=postgresql://user:pass@localhost:5432/mydb
API_KEY=dev-key-12345
DEBUG=true
"""
import os

from dotenv import load_dotenv

# Load variables from a local .env file into os.environ (intended for development)
load_dotenv()
DATABASE_URL = os.environ["DATABASE_URL"]

from pydantic_settings import BaseSettings, SettingsConfigDict


class AppConfig(BaseSettings):
    """Type-safe configuration, comparable to Spring Boot's @ConfigurationProperties."""

    database_url: str
    api_key: str
    debug: bool = False
    max_connections: int = 10
    allowed_origins: list[str] = ["*"]
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
    )


config = AppConfig()
print(config.database_url)
print(config.max_connections)


class DatabaseConfig(BaseSettings):
    # With env_prefix, the fields are read from DB_HOST, DB_PORT and DB_NAME
    host: str = "localhost"
    port: int = 5432
    name: str
    model_config = SettingsConfigDict(env_prefix="DB_")
from datetime import date

from pydantic import BaseModel, EmailStr, Field, ValidationError, field_validator


class UserCreate(BaseModel):
    """Validation model for the request body."""

    name: str = Field(min_length=1, max_length=100)
    email: EmailStr
    age: int = Field(ge=0, le=150)
    birth_date: date | None = None
    # Anchored so that the whole value, not just a substring, must match
    password: str = Field(min_length=8, pattern=r"^[A-Za-z0-9!@#$%^&*]{8,}$")

    @field_validator("name")
    @classmethod
    def name_must_not_contain_special_chars(cls, v: str) -> str:
        if any(c in v for c in "!@#$%^&*"):
            raise ValueError("name must not contain special characters")
        return v.strip()


try:
    user = UserCreate(
        name="Alice",
        email="alice@example.com",
        age=30,
        password="secure123!",
    )
except ValidationError as e:
    # Catch the specific validation error rather than a bare Exception
    print(e)
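Character-class rules are easy to get subtly wrong when a pattern is unanchored: a search-style match succeeds as soon as any eight consecutive characters fit. Pydantic v2 is assumed here to apply `pattern` with search semantics, so the difference is illustrated with the standard `re` module only.

```python
import re

UNANCHORED = re.compile(r"[A-Za-z0-9!@#$%^&*]{8,}")
ANCHORED = re.compile(r"^[A-Za-z0-9!@#$%^&*]{8,}$")

candidate = "bad pass secure123!"  # contains spaces, which the class does not allow

unanchored_ok = bool(UNANCHORED.search(candidate))  # True: "secure123!" alone satisfies it
anchored_ok = bool(ANCHORED.search(candidate))      # False: the whole string must match
clean_ok = bool(ANCHORED.search("secure123!"))      # True

print(unanchored_ok, anchored_ok, clean_ok)
```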
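import ast

# Parse untrusted literals with ast.literal_eval instead of eval()
user_input = "[1, 2, 3, 'hello']"
result = ast.literal_eval(user_input)

import json

# Prefer JSON over pickle for data that crosses a trust boundary:
# pickle.loads() can execute arbitrary code during deserialization
data = {"name": "Alice", "age": 30}
serialized = json.dumps(data)
deserialized = json.loads(serialized)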
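To make the risk concrete, the sketch below shows pickle invoking a callable while loading, using a harmless function (`sorted`) in place of whatever an attacker would choose, and shows `ast.literal_eval` rejecting an expression that `eval` would have executed. `Payload` is an illustrative class.

```python
import ast
import json
import pickle


class Payload:
    """Stands in for attacker-controlled pickle data."""

    def __reduce__(self):
        # pickle will call sorted([3, 1, 2]) while loading; an attacker
        # could name any importable callable here instead.
        return (sorted, ([3, 1, 2],))


loaded = pickle.loads(pickle.dumps(Payload()))
print(loaded)  # [1, 2, 3]: a function ran during deserialization

try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True  # literal_eval only accepts literal syntax
print(rejected)

print(json.loads('{"name": "Alice", "age": 30}'))
```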
"""
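[tool.ruff.lint]
select = [
    # ... other rules ...
    "S", # flake8-bandit: security rules
]
ignore = [
    "S101", # allow assert (needed in tests)
    "S105", # allow hard-coded password strings (development only; keep the check enabled in CI)
]
[tool.bandit]
# Bandit: a dedicated security scanner
# $ bandit -r src/
skips = ["B101", "B601"]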
"""
"""
repos:
  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.8
    hooks:
      - id: bandit
        args: ["-c", "pyproject.toml"]
        additional_dependencies: ["bandit[toml]"]
"""
bandit: security vulnerability scanning for Python
vs Java: OWASP Dependency-Check / SpotBugs Security
pip install bandit
bandit -r src/
bandit -r src/ -f json -o bandit-report.json
bandit -r src/ -ll
Common checks:
| Check | Risk level | Description |
|---|---|---|
| B102: use of exec | High | Code injection risk |
| B106: hard-coded password | High | Credential leak |
| B301: use of pickle | High | Deserialization attack |
| B608: hard-coded SQL | Medium | SQL injection risk |
| B324: weak hashlib hash | Medium | MD5/SHA1 are insecure |
password = "hardcoded_password"
exec(user_input)
import pickle; pickle.loads(data)
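Key Differences
| Dimension | Java/Kotlin | Python |
|---|---|---|
| Dependency scanning | OWASP Dependency-Check | pip-audit / safety |
| Secret management | Vault / KMS / env | Environment variables / pydantic-settings |
| Input validation | Bean Validation | Pydantic |
| Deserialization safety | Native Java serialization is also risky with untrusted data; JSON is the usual choice | pickle is unsafe with untrusted data; use JSON |
| Code security scanning | SpotBugs / FindSecBugs | Bandit |
| Secret rotation | Spring Cloud Config | External configuration service + reload |
Common Pitfalls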
import structlog
logger = structlog.get_logger()
logger.info("user login", user_id=user.id)
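Structured logs make it easy to attach fields, and just as easy to leak secrets. structlog processors are plain callables that receive `(logger, method_name, event_dict)` and return the event dict, so a redaction step can be sketched and tested without structlog installed. The key list and the name `redact_sensitive` are assumptions to adapt per project.

```python
SENSITIVE_KEYS = {"password", "token", "api_key", "authorization"}  # illustrative list


def redact_sensitive(logger, method_name, event_dict):
    """structlog-style processor: mask values whose key looks sensitive."""
    for key in event_dict:
        if key.lower() in SENSITIVE_KEYS:
            event_dict[key] = "***"
    return event_dict


event = redact_sensitive(
    None, "info", {"event": "user login", "user_id": 42, "password": "hunter2"}
)
print(event)  # {'event': 'user login', 'user_id': 42, 'password': '***'}
```

In a real configuration the function would go into the `processors` list ahead of the renderer, so the masked value is what gets serialized.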
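When to Use
- pip-audit: run on every CI build
- pydantic-settings: any project that needs configuration management
- Pydantic validation: API entry points, configuration parsing, data processing
- Bandit: regular scans in CI
- JSON instead of pickle: the default for serialization, especially of untrusted data
12.7 Observability
Java/Kotlin Comparison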
<appender name="json" class="ch.qos.logback.core.ConsoleAppender">
<encoder class="net.logstash.logback.encoder.LogstashEncoder">
<includeMdc>true</includeMdc>
</encoder>
</appender>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
import io.github.oshai.kotlinlogging.KotlinLogging
private val logger = KotlinLogging.logger {}
class UserService {
fun createUser(name: String) {
logger.info { "Creating user: $name" }
}
}
Python Implementation
import logging

import structlog

structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    context_class=dict,
    logger_factory=structlog.PrintLoggerFactory(),
)
logger = structlog.get_logger()

# Key-value pairs become fields of the JSON log line
logger.info("user_created", user_id=42, name="Alice", email="alice@example.com")
logger.error(
    "database_connection_failed",
    host="db.example.com",
    port=5432,
    error="connection refused",
)

# Bound context is merged into every later log call in the same context
structlog.contextvars.bind_contextvars(
    request_id="req-123",
    service="user-api",
    version="1.0.0",
)
logger.info("processing_request")
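`bind_contextvars` builds on the standard `contextvars` module, which plays the role that MDC/ThreadLocal plays on the JVM but is also isolated per asyncio task. The stdlib-only sketch below shows that isolation; the variable and function names are illustrative.

```python
import asyncio
import contextvars

request_id: contextvars.ContextVar[str] = contextvars.ContextVar("request_id", default="-")
seen: list[tuple[str, str]] = []


async def handle(name: str) -> None:
    request_id.set(name)    # visible only inside this task's context
    await asyncio.sleep(0)  # yield so the tasks interleave
    seen.append((name, request_id.get()))


async def main() -> None:
    # gather wraps each coroutine in a task, and each task gets its own context copy
    await asyncio.gather(handle("req-1"), handle("req-2"), handle("req-3"))


asyncio.run(main())
print(seen)
print(request_id.get())  # "-": the tasks' values never leaked out
```

Every task reads back its own value even though the three handlers interleave on one thread, which a plain thread-local could not guarantee.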
import logging

from pythonjsonlogger import jsonlogger


class CustomJsonFormatter(jsonlogger.JsonFormatter):
    def add_fields(self, log_record, record, message_dict):
        super().add_fields(log_record, record, message_dict)
        log_record["level"] = record.levelname
        log_record["logger"] = record.name
        if hasattr(record, "request_id"):
            log_record["request_id"] = record.request_id


handler = logging.StreamHandler()
formatter = CustomJsonFormatter(
    "%(asctime)s %(levelname)s %(message)s",
    rename_fields={"asctime": "timestamp", "levelname": "level"},
)
handler.setFormatter(formatter)
root_logger = logging.getLogger()
root_logger.addHandler(handler)
root_logger.setLevel(logging.INFO)
logger = logging.getLogger("my_package")
logger.info("user_created", extra={"user_id": 42, "name": "Alice"})
import random
import time

from prometheus_client import Counter, Gauge, Histogram, Summary, start_http_server

REQUEST_COUNT = Counter(
    "http_requests_total",
    "Total HTTP requests",
    ["method", "endpoint", "status"],
)
REQUEST_LATENCY = Histogram(
    "http_request_duration_seconds",
    "HTTP request latency",
    ["method", "endpoint"],
    buckets=[0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0],
)
ACTIVE_CONNECTIONS = Gauge(
    "db_active_connections",
    "Active database connections",
)
TASK_DURATION = Summary(
    "task_duration_seconds",
    "Task duration",
    ["task_name"],
)


def handle_request(method: str, endpoint: str):
    REQUEST_COUNT.labels(method=method, endpoint=endpoint, status="200").inc()
    with REQUEST_LATENCY.labels(method=method, endpoint=endpoint).time():
        time.sleep(random.uniform(0.01, 0.1))  # simulated work


def process_task(task_name: str):
    with TASK_DURATION.labels(task_name=task_name).time():
        time.sleep(random.uniform(0.1, 0.5))  # simulated work


class BusinessMetrics:
    """Wrapper around business-level metrics."""

    def __init__(self):
        self.orders_created = Counter(
            "orders_created_total",
            "Total orders created",
            ["region", "category"],
        )
        self.order_value = Histogram(
            "order_value_dollars",
            "Order value distribution",
            ["region"],
            buckets=[10, 50, 100, 500, 1000, 5000],
        )
        self.inventory_level = Gauge(
            "inventory_items_remaining",
            "Current inventory level",
            ["product_id"],
        )

    def record_order(self, region: str, category: str, value: float):
        self.orders_created.labels(region=region, category=category).inc()
        self.order_value.labels(region=region).observe(value)

    def update_inventory(self, product_id: str, count: int):
        self.inventory_level.labels(product_id=product_id).set(count)


metrics = BusinessMetrics()
metrics.record_order("us", "electronics", 299.99)
metrics.update_inventory("prod-123", 42)
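Choosing `buckets` is easier with a mental model of what a Prometheus histogram stores: each bucket is cumulative and counts the observations less than or equal to its upper bound (`le`), with a final `+Inf` bucket holding the total. The function below is a stdlib-only sketch of that bookkeeping, not the prometheus-client implementation.

```python
from bisect import bisect_left

BUCKETS = [0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]


def cumulative_counts(observations: list[float], bounds: list[float]) -> dict[str, int]:
    """Count observations per 'le' (less-than-or-equal) bucket, Prometheus-style."""
    per_bucket = [0] * (len(bounds) + 1)  # the last slot is +Inf
    for value in observations:
        # Index of the first bound that is >= value
        per_bucket[bisect_left(bounds, value)] += 1
    counts: dict[str, int] = {}
    running = 0
    for label, n in zip([*map(str, bounds), "+Inf"], per_bucket):
        running += n
        counts[label] = running
    return counts


counts = cumulative_counts([0.02, 0.07, 0.07, 0.3, 12.0], BUCKETS)
print(counts)
```

With these bounds, the 12-second outlier shows up only in `+Inf`, a hint that the largest bucket is too small for that workload.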
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor


def setup_tracing(service_name: str = "my-service"):
    resource = Resource.create({SERVICE_NAME: service_name})
    provider = TracerProvider(resource=resource)
    otlp_exporter = OTLPSpanExporter(endpoint="http://localhost:4317")
    provider.add_span_processor(BatchSpanProcessor(otlp_exporter))
    trace.set_tracer_provider(provider)
    return trace.get_tracer(__name__)


tracer = setup_tracing("user-api")


# save_to_database and send_welcome_email are application functions defined elsewhere
def create_user(name: str, email: str) -> dict:
    with tracer.start_as_current_span("create_user") as span:
        span.set_attribute("user.name", name)
        span.set_attribute("user.email", email)
        with tracer.start_as_current_span("db.insert_user"):
            user_id = save_to_database(name, email)
        span.set_attribute("db.user_id", user_id)
        with tracer.start_as_current_span("notification.send_welcome"):
            send_welcome_email(email)
        return {"id": user_id, "name": name}
import asyncio
import shutil
from dataclasses import dataclass
from enum import Enum


class HealthStatus(str, Enum):
    UP = "UP"
    DOWN = "DOWN"
    DEGRADED = "DEGRADED"


@dataclass
class HealthCheck:
    status: HealthStatus
    checks: dict[str, dict]

    def to_dict(self) -> dict:
        return {
            "status": self.status.value,
            "checks": self.checks,
        }


async def check_database() -> dict:
    """Check the database connection."""
    try:
        # Placeholder: replace with a real probe such as a SELECT 1 query
        return {"status": "UP", "response_time_ms": 5}
    except Exception as e:
        return {"status": "DOWN", "error": str(e)}


async def check_redis() -> dict:
    """Check the Redis connection."""
    try:
        # Placeholder: replace with a real probe such as PING
        return {"status": "UP", "used_memory_mb": 128}
    except Exception as e:
        return {"status": "DOWN", "error": str(e)}


async def check_disk_space() -> dict:
    """Check disk space."""
    usage = shutil.disk_usage("/")
    percent = usage.used / usage.total * 100
    status = "UP" if percent < 90 else "DEGRADED" if percent < 95 else "DOWN"
    return {"status": status, "usage_percent": round(percent, 1)}


async def health_check() -> HealthCheck:
    """Aggregate the individual health checks."""
    checks = {}
    overall_status = HealthStatus.UP
    results = await asyncio.gather(
        check_database(),
        check_redis(),
        check_disk_space(),
        return_exceptions=True,
    )
    check_names = ["database", "redis", "disk_space"]
    for name, result in zip(check_names, results):
        if isinstance(result, Exception):
            checks[name] = {"status": "DOWN", "error": str(result)}
            overall_status = HealthStatus.DOWN
        else:
            checks[name] = result
            if result["status"] == "DOWN":
                overall_status = HealthStatus.DOWN
            elif result["status"] == "DEGRADED" and overall_status == HealthStatus.UP:
                overall_status = HealthStatus.DEGRADED
    return HealthCheck(status=overall_status, checks=checks)
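A health endpoint that waits indefinitely on a hung dependency is itself unhealthy. One possible refinement, sketched here under the assumption that each check is an async callable returning a dict, is to bound every check with `asyncio.wait_for`. `run_check`, `fast_ok`, and `hangs` are illustrative names.

```python
import asyncio


async def run_check(name: str, check, timeout: float = 1.0) -> tuple[str, dict]:
    """Run one check, converting timeouts and errors into a DOWN result."""
    try:
        return name, await asyncio.wait_for(check(), timeout)
    except asyncio.TimeoutError:
        return name, {"status": "DOWN", "error": f"timed out after {timeout}s"}
    except Exception as exc:
        return name, {"status": "DOWN", "error": str(exc)}


async def fast_ok() -> dict:
    return {"status": "UP"}


async def hangs() -> dict:
    await asyncio.sleep(10)  # simulates an unresponsive dependency
    return {"status": "UP"}


async def demo() -> dict:
    pairs = await asyncio.gather(
        run_check("database", fast_ok),
        run_check("redis", hangs, timeout=0.05),
    )
    return dict(pairs)


results = asyncio.run(demo())
print(results)
```

The slow check is cancelled after its timeout, so the aggregate response time is bounded by the largest timeout rather than by the slowest dependency.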
"""
[project]
dependencies = [
"structlog>=23.0",
"prometheus-client>=0.20",
"opentelemetry-api>=1.24",
"opentelemetry-sdk>=1.24",
"opentelemetry-exporter-otlp>=1.24",
"pydantic-settings>=2.0",
]
[project.optional-dependencies]
observability = [
"opentelemetry-instrumentation-httpx>=0.45b0",
"opentelemetry-instrumentation-redis>=0.45b0",
"opentelemetry-instrumentation-sqlalchemy>=0.45b0",
]
"""
"""
import logging

import structlog
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from prometheus_client import start_http_server


def setup_observability(
    service_name: str,
    log_level: str = "INFO",
    otlp_endpoint: str = "http://localhost:4317",
    metrics_port: int = 8000,
) -> None:
    # 1. Structured logging
    structlog.configure(
        processors=[
            structlog.contextvars.merge_contextvars,
            structlog.processors.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.StackInfoRenderer(),
            structlog.processors.format_exc_info,
            structlog.processors.JSONRenderer(),
        ],
        wrapper_class=structlog.make_filtering_bound_logger(
            getattr(logging, log_level)
        ),
        context_class=dict,
        logger_factory=structlog.PrintLoggerFactory(),
    )
    # 2. Distributed tracing
    resource = Resource.create({SERVICE_NAME: service_name})
    provider = TracerProvider(resource=resource)
    otlp_exporter = OTLPSpanExporter(endpoint=otlp_endpoint)
    provider.add_span_processor(BatchSpanProcessor(otlp_exporter))
    trace.set_tracer_provider(provider)
    # 3. Prometheus metrics endpoint (pick a port that the application itself does not use)
    start_http_server(metrics_port)
"""
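Key Differences
| Dimension | Java/Kotlin | Python |
|---|---|---|
| Structured logging | Logback JSON + Kotlin Logging | structlog / python-json-logger |
| Metrics | Micrometer + Prometheus | prometheus-client |
| Distributed tracing | Micrometer Tracing / Sleuth | OpenTelemetry |
| Health checks | Spring Boot Actuator | Custom endpoint |
| Auto-instrumentation | Spring auto-configuration | opentelemetry-instrumentation-* |
| Log context | MDC / ThreadLocal | structlog contextvars |
Common Pitfalls
When to Use
- structlog: all microservices and API projects
- prometheus-client: services that need metrics monitoring
- OpenTelemetry: distributed systems and microservice architectures
- Health checks: every service deployed to production
- python-json-logger: simple cases and projects with existing logging code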
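Chapter Summary
| Topic | Java/Kotlin tools | Recommended Python tools |
|---|---|---|
| Project structure | Maven/Gradle standard layout | src layout + pyproject.toml |
| Dependency management | Maven/Gradle | uv (first choice) / Poetry |
| Code quality | Checkstyle + SpotBugs + ktlint | Ruff + mypy |
| Testing | JUnit 5 + Mockito + JaCoCo | pytest + Hypothesis + pytest-cov |
| CI/CD | Jenkins / GitHub Actions | GitHub Actions + uv |
| Security | OWASP + Vault | pip-audit + pydantic-settings |
| Observability | Logback + Micrometer + Sleuth | structlog + prometheus-client + OTel |
Core principle: Python's engineering toolchain is rapidly converging on pyproject.toml as the single configuration entry point and on uv as the unified package manager. The trend resembles the Java ecosystem's convergence from Ant to Maven to Gradle. Adopt the modern toolchain early rather than spending time on outdated tools.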