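Chapter 9: Standard Library Essentials (Common Modules for Production Development)
Java/Kotlin developers are used to managing a dependency for almost everything: Jackson for JSON parsing, Apache Commons for CSV, JCommander for command-line parsing, and even logging goes through two layers, SLF4J plus Logback. Python takes a different approach: the standard library is a toolbox that works out of the box. This chapter picks the eight groups of standard library modules that come up most often in production work, and gives each one a Java/Kotlin comparison and a Python demo. Once you know them, most everyday coding tasks (around 80% by this chapter's estimate) need no pip install of a third-party package.
9.1 pathlib: Object-Oriented Path Operations
Java/Kotlin comparison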
import java.nio.file.Path;
import java.nio.file.Paths;
import java.io.IOException;
import java.nio.file.Files;
Path path = Paths.get("/home/user/docs/report.txt");
Path parent = path.getParent();
String filename = path.getFileName().toString();
String stem = filename.split("\\.")[0];
String ext = filename.substring(filename.lastIndexOf('.') + 1);
Path resolved = path.resolveSibling("backup/report.txt");
boolean exists = Files.exists(path);
String content = Files.readString(path);
try (var stream = Files.list(Paths.get("/home/user/docs"))) {
stream.filter(p -> p.toString().endsWith(".txt"))
.forEach(System.out::println);
}
val path = java.nio.file.Paths.get("/home/user/docs/report.txt")
val name = path.fileName.toString()
val parent = path.parent
Python implementation
from pathlib import Path
p = Path("/home/user/docs/report.txt")
p.name
p.stem
p.suffix
p.suffixes
p.parent
p.parents
p.parts
docs = Path("/home/user/docs")
report = docs / "report.txt"
backup = docs / "backup" / "report.txt"
print(backup)
abs_path = Path("docs/report.txt").resolve()
print(abs_path)
rel_path = Path("/home/user/docs/report.txt").relative_to("/home/user")
print(rel_path)
print(docs.exists())
print(docs.is_dir())
print(docs.is_file())
# List the direct children of a directory
for child in Path("/home/user/docs").iterdir():
    print(f"{'DIR ' if child.is_dir() else 'FILE'} {child.name}")
# Recursively find every .py file
for py_file in Path("/home/user/project").rglob("*.py"):
    print(py_file)
# Non-recursive match on *.md
for md_file in Path("/home/user/docs").glob("*.md"):
    print(md_file)
p = Path("/tmp/demo.txt")
p.write_text("Hello, pathlib!", encoding="utf-8")
p.write_bytes(b"binary data here")
content = p.read_text(encoding="utf-8")
data = p.read_bytes()
import shutil  # Path has no copy2() method; copying goes through shutil
new_path = Path("/tmp/demo_copy.txt")
shutil.copy2(p, new_path)
print(Path.home())
print(Path.cwd())
Key differences
| Feature | Java NIO Path | Python pathlib |
|---|---|---|
| Joining paths | resolve() / resolveSibling() | / operator |
| File name without extension | No direct method; split manually | .stem |
| Extension | No direct method; substring manually | .suffix |
| Read a file | Files.readString() | .read_text() |
| Write a file | Files.writeString() | .write_text() |
| Recursive traversal | Files.walk() | .rglob() |
| Pattern matching | Requires PathMatcher | .glob() / .rglob() |
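Common pitfalls
# Join with the / operator instead of concatenating strings
Path("/home") / "user"
p = Path("/tmp/test.txt")
# str() is only needed for legacy APIs that reject Path objects; open(p) also works
open(str(p))
# An f-string formats the Path via str(); for a real file URI, p.as_uri() is the safer route
f"file://{p}"
# A relative path stays relative until you call .resolve()
Path("docs") / "report.txt"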
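When to use
- Prefer pathlib over os.path in new code. pathlib has been part of the standard library since Python 3.4, and its object-oriented API is easier to read.
- For cross-platform path handling, pathlib is the most convenient choice because it picks the right path flavor for the current OS.
- When interacting with legacy code that only accepts strings, convert with str(path).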
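The pathlib operations above can be tried without touching real directories. The following is a minimal sketch that assumes a throwaway directory created by tempfile; the helper name summarize_tree and the file names are made up for illustration.

```python
import tempfile
from pathlib import Path


def summarize_tree(root: Path) -> dict[str, int]:
    """Count the files under root, grouped by suffix."""
    counts: dict[str, int] = {}
    for item in root.rglob("*"):
        if item.is_file():
            counts[item.suffix] = counts.get(item.suffix, 0) + 1
    return counts


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    # The / operator builds child paths; write_text creates the files
    (root / "docs" / "report.txt").write_text("hello", encoding="utf-8")
    (root / "docs" / "notes.md").write_text("# notes", encoding="utf-8")
    archive = root / "archive.tar.gz"
    archive.write_bytes(b"")
    summary = summarize_tree(root)
    # .suffix is only the last extension; .suffixes lists all of them
    archive_parts = (archive.stem, archive.suffix, archive.suffixes)

print(summary)
print(archive_parts)
```

Note that .stem strips only the final extension, which matters for names like archive.tar.gz.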
9.2 os, sys, shutil: System Operations
Java/Kotlin comparison
import java.lang.System;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
String home = System.getenv("HOME");
Map<String, String> env = System.getenv();
System.setProperty("app.mode", "dev");
public static void main(String[] args) { ... }
new File("/tmp/old.txt").renameTo(new File("/tmp/new.txt"));
Files.move(src, dst);
Files.delete(path);
Files.copy(src, dst);
System.exit(0);
val home = System.getenv("HOME")
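Python implementation
import os
import sys
import shutil
import tempfile
from pathlib import Path  # used by the shutil examples below
home = os.environ.get("HOME")  # None if the variable is not set
home = os.environ["HOME"]  # raises KeyError if the variable is not set
path = os.environ.get("PATH", "/bin")  # with a default value
os.environ["MY_VAR"] = "hello"
del os.environ["MY_VAR"]
for key, value in os.environ.items():
    print(f"{key}={value}")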
os.path.join("/home", "user", "docs")
os.path.basename("/home/user/file.txt")
os.path.dirname("/home/user/file.txt")
os.path.exists("/tmp")
os.path.isfile("/tmp/file.txt")
os.path.isdir("/tmp")
os.path.getsize("/tmp/file.txt")
print(sys.argv[0])
print(sys.argv[1:])
print(sys.path)
sys.path.append("/my/custom/modules")
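# sys.exit() raises SystemExit, so in a real script only the first call reached would run
sys.exit(0)  # success
sys.exit(1)  # failure
sys.exit("error msg")  # prints the message to stderr and exits with code 1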
src = Path("/tmp/source.txt")
dst = Path("/tmp/dest.txt")
shutil.copy2(src, dst)
shutil.copytree("src_dir", "dst_dir")
shutil.move("old_location", "new_location")
shutil.rmtree("dir_to_delete")
usage = shutil.disk_usage("/tmp")
print(f"Total: {usage.total}, Used: {usage.used}, Free: {usage.free}")
print(os.name)
print(os.cpu_count())
print(os.getpid())
print(os.getcwd())
os.chdir("/tmp")
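# delete=False keeps the file after the with-block; remove it yourself when done
with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False) as f:
    f.write("temporary content")
    temp_path = f.name
print(f"Temp file: {temp_path}")
# os.system runs the command through the shell; subprocess.run is generally preferred in new code
exit_code = os.system("ls -la")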
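Key differences
| Feature | Java | Python |
|---|---|---|
| Environment variables | System.getenv() | os.environ (dict-like interface) |
| Command-line arguments | main(String[] args) | sys.argv (list) |
| Module search path | classpath (fixed at startup) | sys.path (mutable at runtime) |
| Recursive directory delete | Walk the tree and call Files.delete | shutil.rmtree() in one call |
| Recursive directory copy | Hand-written or a third-party library | shutil.copytree() |
| Temporary files | Files.createTempFile() | tempfile.NamedTemporaryFile() |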
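Common pitfalls
# Changes to os.environ affect only this process and the children it starts afterwards
os.environ["MY_VAR"] = "value"
# sys.exit() raises SystemExit, so finally blocks still run
try:
    sys.exit(1)
finally:
    print("this line still runs")
# shutil.rmtree deletes the whole tree without confirmation; check the target first
import os
import shutil
target = "/tmp/some_dir"
if os.path.exists(target):
    shutil.rmtree(target)
# An absolute component discards everything before it: the result is "/absolute/path"
os.path.join("/home", "/absolute/path")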
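When to use
- pathlib: first choice for path operations (see 9.1)
- os.environ: reading configuration and Docker environment variables
- sys.argv: getting arguments in simple scripts (use argparse for anything more complex, see 9.6)
- sys.path: plugin systems and dynamic imports
- shutil: bulk file operations (copying, moving, deleting directory trees)
- tempfile: creating temporary files, for example in tests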
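To see how shutil, tempfile, and os.environ fit together, here is a small sketch that runs entirely inside a temporary directory. The helper backup_dir, the directory names, and the APP_MODE variable are hypothetical and only serve the illustration.

```python
import os
import shutil
import tempfile
from pathlib import Path


def backup_dir(src: Path, dst: Path) -> int:
    """Copy src to dst, replacing any previous copy, and return the file count."""
    if dst.exists():
        shutil.rmtree(dst)  # copytree refuses to overwrite an existing directory by default
    shutil.copytree(src, dst)
    return sum(1 for p in dst.rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    src = base / "src_dir"
    (src / "nested").mkdir(parents=True)
    (src / "a.txt").write_text("a", encoding="utf-8")
    (src / "nested" / "b.txt").write_text("b", encoding="utf-8")
    copied = backup_dir(src, base / "dst_dir")
    copied_again = backup_dir(src, base / "dst_dir")  # second run replaces the first copy

# APP_MODE is a made-up variable name; .get() supplies a default when it is unset
app_mode = os.environ.get("APP_MODE", "dev")
print(copied, copied_again, app_mode)
```

TemporaryDirectory removes everything on exit, which avoids the manual cleanup that NamedTemporaryFile(delete=False) requires.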
9.3 json, csv, tomllib: Data Serialization
Java/Kotlin comparison
import com.fasterxml.jackson.databind.ObjectMapper;
ObjectMapper mapper = new ObjectMapper();
String json = mapper.writeValueAsString(user);
mapper.writeValue(new File("user.json"), user);
User user = mapper.readValue(jsonStr, User.class);
Map<String, Object> map = mapper.readValue(jsonStr, Map.class);
@Serializable
data class User(val name: String, val age: Int)
val json = Json.encodeToString(User("Alice", 30))
val user = Json.decodeFromString<User>(json)
Python implementation
import json
import csv
import io
data = {
"name": "Alice",
"age": 30,
"skills": ["Python", "Java"],
"address": {
"city": "Beijing",
"zip": "100000"
},
"active": True,
"score": None
}
json_str = json.dumps(data, ensure_ascii=False, indent=2)
print(json_str)
parsed = json.loads(json_str)
print(type(parsed))
print(parsed["name"])
print(parsed["skills"][0])
# Write to and read from a file
with open("/tmp/data.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False, indent=2)
with open("/tmp/data.json", "r", encoding="utf-8") as f:
    loaded = json.load(f)
# Custom serialization, option 1: a default= function
from datetime import datetime
def custom_serializer(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj)} is not JSON serializable")
json_str = json.dumps({"time": datetime.now()}, default=custom_serializer)
# Option 2: subclass JSONEncoder
class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)
json_str = json.dumps({"time": datetime.now()}, cls=CustomEncoder)
rows = [
["name", "age", "city"],
["Alice", "30", "Beijing"],
["Bob", "25", "Shanghai"],
]
with open("/tmp/data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(rows)
with open("/tmp/data.csv", "r", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
data = [
{"name": "Alice", "age": "30", "city": "Beijing"},
{"name": "Bob", "age": "25", "city": "Shanghai"},
]
with open("/tmp/data_dict.csv", "w", newline="", encoding="utf-8") as f:
    fieldnames = ["name", "age", "city"]
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(data)
with open("/tmp/data_dict.csv", "r", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row["name"], row["age"])
# tomllib ships with Python 3.11+; on older versions fall back to the third-party tomli backport
try:
    import tomllib
except ImportError:
    import tomli as tomllib
toml_content = """
[database]
host = "localhost"
port = 5432
name = "myapp"
[database.pool]
min_size = 5
max_size = 20
[logging]
level = "INFO"
handlers = ["console", "file"]
[server]
hosts = ["0.0.0.0", "::"]
debug = false
"""
config = tomllib.loads(toml_content)
print(config["database"]["host"])
print(config["database"]["pool"]["min_size"])
print(config["logging"]["handlers"])
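Key differences
| Feature | Java (Jackson) | Python |
|---|---|---|
| JSON support | Third-party library | Built-in json |
| CSV support | Third-party library (Apache Commons CSV) | Built-in csv |
| TOML support | Third-party library | Built-in tomllib (3.11+, read-only) |
| Custom serialization | @JsonSerialize / JsonSerializer | default parameter / JSONEncoder subclass |
| CSV dict mode | Manual mapping | DictReader / DictWriter |
| Date serialization | @JsonFormat / JavaTimeModule | Hand-written default function |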
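Unlike Jackson, the json module does not map JSON onto typed objects by itself. One possible way to get a typed round trip is to combine it with dataclasses, as in the sketch below. The User class and the helpers to_json and from_json are illustrative names, not part of any library.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class User:
    name: str
    age: int
    created_at: datetime


def to_json(user: User) -> str:
    def encode_extra(obj):
        # Called only for objects json cannot serialize on its own
        if isinstance(obj, datetime):
            return obj.isoformat()
        raise TypeError(f"Object of type {type(obj)} is not JSON serializable")

    return json.dumps(asdict(user), default=encode_extra, ensure_ascii=False)


def from_json(text: str) -> User:
    raw = json.loads(text)
    # JSON has no datetime type, so the ISO string is converted back by hand
    raw["created_at"] = datetime.fromisoformat(raw["created_at"])
    return User(**raw)


original = User("张三", 30, datetime(2025, 1, 15, 10, 30, tzinfo=timezone.utc))
encoded = to_json(original)
restored = from_json(encoded)
print(encoded)
```

The decoding side has to know which fields hold timestamps; that knowledge lives in from_json here, whereas Jackson would derive it from the field types.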
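Common pitfalls
# By default non-ASCII characters are escaped as \uXXXX
json.dumps({"name": "张三"})
# ensure_ascii=False keeps them readable
json.dumps({"name": "张三"}, ensure_ascii=False)
# JSON numbers come back as int or float depending on how they are written
data = json.loads('{"int": 42, "float": 3.14}')
print(type(data["int"]))
print(type(data["float"]))
# Open CSV files with newline="" so the csv module controls line endings
with open("file.csv", "w", newline="") as f:
    writer = csv.writer(f)
# tomllib.load() requires a file opened in binary mode
with open("config.toml", "rb") as f:
    config = tomllib.load(f)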
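When to use
- json: API data exchange, config files, structured logs. It is built in, and fast enough for most workloads.
- csv: data import/export and Excel-compatible files. Use the built-in csv for simple cases; for large data volumes consider pandas.
- tomllib: reading pyproject.toml and similar config files. It is read-only; writing TOML needs a third-party package such as tomli-w.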
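The csv module returns every field as a string, so type conversion is up to the caller. A minimal sketch, assuming in-memory data through io.StringIO instead of a file; load_people is a hypothetical helper.

```python
import csv
import io

raw_csv = "name,age,city\nAlice,30,Beijing\nBob,25,Shanghai\n"


def load_people(text: str) -> list[dict]:
    """Parse CSV text into dicts, converting the age column to int."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [{**row, "age": int(row["age"])} for row in reader]


people = load_people(raw_csv)
average_age = sum(p["age"] for p in people) / len(people)
print(people)
print(average_age)
```

For a handful of columns this is usually enough; once conversions and aggregations pile up, pandas becomes the more comfortable tool.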
9.4 re: Regular Expressions
Java/Kotlin comparison
import java.util.regex.Matcher;
import java.util.regex.Pattern;
Pattern pattern = Pattern.compile("(?<name>[a-zA-Z]+)\\s+(?<age>\\d+)");
Matcher matcher = pattern.matcher("Alice 30");
if (matcher.find()) {
String name = matcher.group("name");
String age = matcher.group("age");
}
String result = Pattern.compile("\\d+").matcher("age: 30").replaceAll("N");
val regex = Regex("""(?<name>[a-zA-Z]+)\s+(?<age>\d+)""")
val match = regex.find("Alice 30")
match?.groups?.get("name")?.value
Python implementation
import re
pattern = re.compile(r"(?P<name>[a-zA-Z]+)\s+(?P<age>\d+)")
# match() anchors at the start of the string
m = pattern.match("Alice 30 Bob 25")
if m:
    print(m.group("name"))
    print(m.group("age"))
    print(m.group(0))
    print(m.group(1))
    print(m.groups())
    print(m.groupdict())
# search() scans for the first match anywhere in the string
m = re.search(r"\d+", "Alice is 30 years old")
if m:
    print(m.group())
numbers = re.findall(r"\d+", "Alice 30, Bob 25, Charlie 28")
print(numbers)
pairs = re.findall(r"(\w+)\s+(\d+)", "Alice 30, Bob 25")
print(pairs)
for m in re.finditer(r"(?P<name>\w+)\s+(?P<age>\d+)", "Alice 30, Bob 25"):
    print(f"{m.group('name')}: {m.group('age')}")
result = re.sub(r"\d+", "N", "Alice 30, Bob 25")
print(result)
# sub() also accepts a function that receives each match object
def censor_age(match):
    age = int(match.group())
    return "***" if age < 18 else str(age)
result = re.sub(r"\d+", censor_age, "Alice 15, Bob 25, Charlie 12")
print(result)
result = re.sub(r"\d+", "N", "a1 b2 c3 d4", count=2)
print(result)
parts = re.split(r"[,\s]+", "Alice,30 Bob 25 Charlie,28")
print(parts)
text = "Hello\nWorld"
re.findall(r"^hello", text)
re.findall(r"^hello", text, re.IGNORECASE)
re.findall(r"^hello", text, re.I | re.MULTILINE)
pattern = re.compile(r"""
    (?P<protocol>https?)://   # protocol
    (?P<domain>[\w.]+)        # domain
    (?::(?P<port>\d+))?       # optional port
    (?P<path>/\S*)?           # optional path
""", re.VERBOSE)
m = pattern.search("Visit https://example.com:8080/api/users")
if m:
    print(m.groupdict())
log_pattern = r'(?P<timestamp>\d{4}-\d{2}-\d{2}) (?P<level>\w+) (?P<message>.+)'
log_line = '2024-01-15 ERROR Database connection failed'
match = re.match(log_pattern, log_line)
if match:
    print(f"timestamp: {match.group('timestamp')}")
    print(f"level: {match.group('level')}")
    print(f"message: {match.group('message')}")
    # Match objects also support subscript access by group name
    print(f"timestamp (v2): {match['timestamp']}")
result = re.sub(
    r'(?P<name>\w+)=(?P<value>\w+)',
    lambda m: f"{m['name'].upper()}={m['value']}",
    'name=Alice age=30 city=Beijing'
)
print(f"result: {result}")
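Key differences
| Feature | Java Matcher | Python re |
|---|---|---|
| Named group syntax | (?<name>...) | (?P<name>...) |
| Avoiding double backslashes | Not available (must write "\\d+") | Raw strings: r"\d+" |
| Find all matches | while (matcher.find()) loop | findall() / finditer() |
| Replacement | replaceAll() (accepts a function since Java 9) / appendReplacement() | sub() accepts a callback function |
| Readable regex | Pattern.COMMENTS flag | re.VERBOSE allows whitespace and comments |
| Compilation | Pattern.compile() | re.compile() |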
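Common pitfalls
# match() only matches at the start of the string: returns None here
re.match(r"\d+", "abc 123")
# search() scans the whole string: finds "123"
re.search(r"\d+", "abc 123")
# findall() with no group returns whole matches: ['1', '2']
re.findall(r"\d+", "a1 b2")
# With one group it returns the group contents: ['1', '2']
re.findall(r"(\d+)", "a1 b2")
# With several groups it returns tuples: [('1', '2'), ('3', '4')]
re.findall(r"(\d)(\d+)", "a12 b34")
# Greedy .+ takes as much as possible: ['<a> <b>']
re.findall(r"<.+>", "<a> <b>")
# Lazy .+? stops at the first closing bracket: ['<a>', '<b>']
re.findall(r"<.+?>", "<a> <b>")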
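When to use
- Simple string work: prefer string methods such as str.split(), str.replace(), and str.startswith(); regular expressions are the last resort.
- Complex pattern matching: emails, URLs, IP addresses, and similar formats.
- Data cleaning: extracting structured data from unstructured text.
- Performance-sensitive code: precompile with re.compile() so the pattern is not looked up or compiled again on every call.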
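The difference between match(), search(), and fullmatch() is a frequent source of validation bugs. The sketch below uses a made-up HEX_COLOR pattern and helper to show a precompiled pattern used for both extraction and strict validation.

```python
import re

# Precompiled once at module level and reused by every call
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def is_hex_color(text: str) -> bool:
    """Validate the whole string; fullmatch() rejects trailing characters."""
    return HEX_COLOR.fullmatch(text) is not None


sample = "theme: #1a2b3c and #ZZZZZZ, accent #FFAA00!"
found = HEX_COLOR.findall(sample)    # every match in the text
at_start = HEX_COLOR.match(sample)   # only looks at position 0
anywhere = HEX_COLOR.search(sample)  # first match at any position

print(found)
print(at_start, anywhere.group())
print(is_hex_color("#1a2b3c"), is_hex_color("#1a2b3c;"))
```

For validation, fullmatch() is usually what is wanted; match() would accept "#1a2b3c;" because it only anchors the start.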
9.5 logging: Structured Logging
Java/Kotlin comparison
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class MyApp {
private static final Logger log = LoggerFactory.getLogger(MyApp.class);
public void doWork() {
log.debug("Processing item: {}", itemId);
log.info("User {} logged in", username);
log.warn("Retry attempt {}/{}", attempt, maxAttempts);
log.error("Failed to connect", exception);
}
}
import io.github.microutils.kotlinlogging.KotlinLogging
private val logger = KotlinLogging.logger {}
Python implementation
import logging
import logging.config
logging.basicConfig(
level=logging.DEBUG,
format="%(asctime)s [%(levelname)s] %(name)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
logger = logging.getLogger(__name__)
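logger.debug("debug - variable value: x=%d", 42)
logger.info("info - service started")
logger.warning("warning - disk space low, 90% used")
logger.error("error - database connection failed")
logger.critical("critical - system cannot start")
try:
    1 / 0
except ZeroDivisionError:
    # exc_info=True attaches the traceback; logger.exception(...) is the shorthand
    logger.error("calculation failed", exc_info=True)
logger.info("User %s logged in from %s", "Alice", "192.168.1.1")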
LOGGING_CONFIG = {
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"standard": {
"format": "%(asctime)s [%(levelname)s] %(name)s:%(lineno)d - %(message)s",
"datefmt": "%Y-%m-%d %H:%M:%S",
},
"json": {
"format": '{"time":"%(asctime)s","level":"%(levelname)s","logger":"%(name)s","msg":"%(message)s"}',
},
},
"handlers": {
"console": {
"class": "logging.StreamHandler",
"level": "DEBUG",
"formatter": "standard",
"stream": "ext://sys.stdout",
},
"file": {
"class": "logging.handlers.RotatingFileHandler",
"level": "INFO",
"formatter": "standard",
"filename": "/tmp/app.log",
"maxBytes": 10 * 1024 * 1024,
"backupCount": 5,
"encoding": "utf-8",
},
},
"loggers": {
"urllib3": {"level": "WARNING"},
"requests": {"level": "WARNING"},
"myapp": {"level": "DEBUG", "handlers": ["console", "file"], "propagate": False},
},
"root": {
"level": "WARNING",
"handlers": ["console"],
},
}
logging.config.dictConfig(LOGGING_CONFIG)
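# "myapp.service" inherits level and handlers from the "myapp" logger configured above
app_logger = logging.getLogger("myapp.service")
app_logger.info("service started")
app_logger.debug("debug details")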
Key differences
| Feature | SLF4J/Logback | Python logging |
|---|---|---|
| Logger naming | Fully qualified class name | Module name via __name__ |
| Deferred formatting | {} placeholders | %s placeholders |
| Configuration | XML (logback.xml) | dictConfig (Python dict) / basicConfig |
| Log levels | TRACE < DEBUG < INFO < WARN < ERROR | DEBUG < INFO < WARNING < ERROR < CRITICAL |
| Logging exceptions | log.error("msg", exception) | logger.error("msg", exc_info=True) or logger.exception() |
| Log rotation | <rollingPolicy> | RotatingFileHandler / TimedRotatingFileHandler |
| Quieting third-party logs | <logger level="WARN"> | loggers section of the config dict |
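Common pitfalls
# The f-string is built even when DEBUG is disabled
logger.debug(f"user: {user}")
# %s arguments are formatted only if the record is actually emitted
logger.debug("user: %s", user)
# Create the logger once at module level
logger = logging.getLogger(__name__)
def bad():
    # Works (getLogger returns the same cached object) but is redundant on every call
    logger = logging.getLogger(__name__)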
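structlog: modern structured logging
For production-grade projects, the third-party structlog package is recommended over the plain logging module. See 12.7 Observability.
import logging  # needed for logging.INFO in the configure() call below
import structlog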
logger = structlog.get_logger()
logger.info("user_login", user_id=42, ip="192.168.1.1")
structlog.configure(
processors=[
structlog.processors.add_log_level,
structlog.processors.JSONRenderer(),
],
wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
)
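When to use
- Scripts and tools: basicConfig is enough; a single call sets everything up.
- Production projects and services: dictConfig is the recommended approach; it covers handlers, formatters, the logger hierarchy, and log rotation.
- JSON logs: JSON output is recommended in production because ELK and Grafana Loki can ingest it easily. The python-json-logger library can help.
- Avoid print() for logging: it has no levels, no timestamps, and cannot be collected by the logging system.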
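The cost difference between f-strings and %s placeholders can be made visible with a small experiment. This is a sketch under stated assumptions: Expensive and ListHandler are helper classes written for the demo, and "demo.lazy" is an arbitrary logger name.

```python
import logging


class Expensive:
    """Counts how many times it is rendered to a string."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "expensive-repr"


class ListHandler(logging.Handler):
    """Collects formatted messages in memory instead of printing them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("demo.lazy")
logger.setLevel(logging.INFO)  # DEBUG records are dropped
logger.propagate = False       # keep the demo isolated from the root logger
handler = ListHandler()
logger.addHandler(handler)

lazy, eager = Expensive(), Expensive()
logger.debug("value: %s", lazy)   # never formatted: DEBUG is disabled
logger.debug(f"value: {eager}")   # f-string is built before the call
renders_after_debug = (lazy.renders, eager.renders)

logger.info("value: %s", lazy)    # formatted once, when the record is emitted
print(renders_after_debug, handler.messages)
```

The same reasoning applies to SLF4J's {} placeholders: the argument is passed through and only rendered if the level is enabled.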
9.6 argparse: Command-Line Parsing
Java/Kotlin comparison
import com.beust.jcommander.Parameter;
import com.beust.jcommander.Parameters;
@Parameters(commandDescription = "Database migration tool")
public class MigrateCommand {
@Parameter(names = {"--host"}, description = "Database host", required = true)
private String host;
@Parameter(names = {"--port"}, description = "Database port")
private int port = 5432;
@Parameter(names = {"--dry-run"}, description = "Dry run mode")
private boolean dryRun = false;
}
Python implementation
import argparse
def main():
    parser = argparse.ArgumentParser(
        description="Database migration tool",
        epilog="Example: python migrate.py --host localhost --port 5432 up",
    )
    parser.add_argument(
        "command",
        choices=["up", "down", "status"],
        help="Migration command: up | down | status",
    )
    # -h is reserved for --help, so --host uses -H as its short form
    parser.add_argument(
        "--host", "-H",
        default="localhost",
        help="Database host (default: localhost)",
    )
    parser.add_argument(
        "--port", "-p",
        type=int,
        default=5432,
        help="Database port (default: 5432)",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Dry-run mode, nothing is executed",
    )
    parser.add_argument(
        "--verbose", "-v",
        action="count",
        default=0,
        help="Verbosity: -v, -vv, -vvv",
    )
    parser.add_argument(
        "--tables",
        nargs="+",
        help="Tables to migrate (space separated)",
    )
    args = parser.parse_args()
    print(f"command: {args.command}")
    print(f"host: {args.host}")
    print(f"port: {args.port}")
    print(f"dry run: {args.dry_run}")
    print(f"verbosity: {args.verbose}")
    print(f"tables: {args.tables}")
def create_parser():
    parser = argparse.ArgumentParser(description="Project management tool")
    subparsers = parser.add_subparsers(dest="subcommand", help="Subcommands")
    build_parser = subparsers.add_parser("build", help="Build the project")
    build_parser.add_argument("--release", action="store_true", help="Release build")
    build_parser.add_argument("--target", choices=["web", "mobile", "desktop"], default="web")
    deploy_parser = subparsers.add_parser("deploy", help="Deploy the project")
    deploy_parser.add_argument("environment", choices=["dev", "staging", "prod"])
    deploy_parser.add_argument("--skip-tests", action="store_true")
    deploy_parser.add_argument("--version", required=True, help="Version to deploy")
    return parser
parser = create_parser()
# Passing a list to parse_args() bypasses sys.argv, which is handy for demos and tests
args = parser.parse_args(["deploy", "prod", "--version", "1.2.3", "--skip-tests"])
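Key differences
| Feature | JCommander (Java) | argparse (Python) |
|---|---|---|
| Dependency | Third-party library | Built into the standard library |
| Defining parameters | @Parameter annotations | add_argument() calls |
| Boolean switch | boolean field, set when the flag is present | action="store_true" |
| Type conversion | Automatic (from the field type) | type=int/float/... argument |
| Subcommands | addCommand() with @Parameters classes | add_subparsers() |
| Automatic help | @Parameter(help = true) plus usage() | help= argument; --help is generated automatically |
| Required parameters | required = true | required=True |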
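Common pitfalls
# Dashes in option names become underscores on the namespace: args.dry_run
parser.add_argument("--dry-run", action="store_true")
# Options may come before or after positional arguments
args = parser.parse_args(["--host", "localhost", "up"])
# Use a mutually exclusive group for options that must not be combined
group = parser.add_mutually_exclusive_group()
group.add_argument("--verbose", action="store_true")
group.add_argument("--quiet", action="store_true")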
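When to use
- Any script that takes command-line arguments: argparse is the standard answer in Python and needs no third-party package.
- Simple scripts: for one or two parameters, call add_argument directly.
- Complex CLI tools: subcommands, mutually exclusive options, and type conversion.
- Alternatives: click (third-party, decorator-based style) and typer (built on click, driven by type hints).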
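Because parse_args() accepts an explicit list, a parser can be exercised without a real command line. The sketch below is one way to do that; build_parser and the host name db.internal are made up for the example. It also shows two failure modes: reusing -h, which argparse reserves for --help, and combining mutually exclusive flags.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="migrate", description="Database migration tool")
    parser.add_argument("command", choices=["up", "down", "status"])
    parser.add_argument("--host", "-H", default="localhost")  # -h is taken by --help
    parser.add_argument("--port", "-p", type=int, default=5432)
    parser.add_argument("--dry-run", action="store_true")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--verbose", action="store_true")
    group.add_argument("--quiet", action="store_true")
    return parser


parser = build_parser()
args = parser.parse_args(["-H", "db.internal", "-p", "6543", "--dry-run", "up"])

# Registering -h a second time is rejected when the argument is added
try:
    argparse.ArgumentParser().add_argument("--host", "-h")
    conflict = None
except argparse.ArgumentError as exc:
    conflict = str(exc)

# Invalid combinations make argparse print usage to stderr and raise SystemExit(2)
try:
    parser.parse_args(["--verbose", "--quiet", "up"])
    exit_code = 0
except SystemExit as exc:
    exit_code = exc.code

print(args, conflict, exit_code)
```

Keeping parser construction in its own function, separate from main(), is what makes this kind of testing straightforward.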
9.7 sqlite3: Built-in Database
Java/Kotlin comparison
import java.sql.*;
String url = "jdbc:sqlite:/tmp/myapp.db";
Connection conn = DriverManager.getConnection(url);
Statement stmt = conn.createStatement();
stmt.executeUpdate("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)");
PreparedStatement pstmt = conn.prepareStatement("INSERT INTO users (name, age) VALUES (?, ?)");
pstmt.setString(1, "Alice");
pstmt.setInt(2, 30);
pstmt.executeUpdate();
ResultSet rs = stmt.executeQuery("SELECT * FROM users");
while (rs.next()) {
String name = rs.getString("name");
int age = rs.getInt("age");
}
conn.setAutoCommit(false);
try {
conn.commit();
} catch (SQLException e) {
conn.rollback();
}
try (Connection c = DriverManager.getConnection(url);
Statement s = c.createStatement()) {
}
Python implementation
import sqlite3
conn = sqlite3.connect("/tmp/myapp.db")
conn.row_factory = sqlite3.Row
cursor = conn.cursor()
cursor.execute("""
CREATE TABLE IF NOT EXISTS users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
age INTEGER,
email TEXT UNIQUE
)
""")
cursor.execute(
"INSERT INTO users (name, age, email) VALUES (?, ?, ?)",
("Alice", 30, "alice@example.com")
)
cursor.execute(
"INSERT INTO users (name, age, email) VALUES (?, ?, ?)",
("Bob", 25, "bob@example.com")
)
users = [
("Charlie", 28, "charlie@example.com"),
("Diana", 35, "diana@example.com"),
("Eve", 22, "eve@example.com"),
]
cursor.executemany(
"INSERT INTO users (name, age, email) VALUES (?, ?, ?)",
users
)
conn.commit()
cursor.execute("SELECT * FROM users WHERE name = ?", ("Alice",))
row = cursor.fetchone()
if row:
    print(row["name"])
    print(row["age"])
    print(row["id"])
    print(dict(row))
cursor.execute("SELECT name, age FROM users WHERE age > ?", (25,))
rows = cursor.fetchall()
for row in rows:
    print(f"{row['name']}: {row['age']}")
cursor.execute("SELECT * FROM users")
for row in cursor:
    print(row["name"], row["age"])
# Explicit transaction handling
try:
    cursor.execute("UPDATE users SET age = ? WHERE name = ?", (31, "Alice"))
    cursor.execute("UPDATE users SET age = ? WHERE name = ?", (26, "Bob"))
    conn.commit()
except sqlite3.Error as e:
    conn.rollback()
    print(f"Transaction failed: {e}")
# "with conn" commits on success and rolls back on an exception; it does not close the connection
with conn:
    conn.execute("UPDATE users SET age = age + 1 WHERE name = ?", ("Alice",))
conn.close()
from contextlib import closing
def query_users(db_path: str, min_age: int) -> list[dict]:
    """Return users older than min_age."""
    # closing() guarantees conn.close(); a bare "with sqlite3.connect(...)" would leave it open
    with closing(sqlite3.connect(db_path)) as conn:
        conn.row_factory = sqlite3.Row
        cursor = conn.execute(
            "SELECT name, age, email FROM users WHERE age > ? ORDER BY age",
            (min_age,)
        )
        return [dict(row) for row in cursor]
results = query_users("/tmp/myapp.db", 25)
for user in results:
    print(user)
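Key differences
| Feature | JDBC (Java) | sqlite3 (Python) |
|---|---|---|
| Dependency | Third-party JDBC driver | Built into the standard library |
| Connecting | DriverManager.getConnection() | sqlite3.connect() |
| Parameter placeholders | ? only | ? or :name |
| Result set | ResultSet (needs while (rs.next())) | fetchone() / fetchall() / iteration |
| Access by column name | rs.getString("name") | row["name"] (requires row_factory) |
| Closing resources | try-with-resources | conn.close() or contextlib.closing(); with conn: does not close |
| Transactions | setAutoCommit(false) + commit/rollback | with conn: commits or rolls back automatically |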
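Common pitfalls
# Writes are not persisted until you commit
cursor.execute("INSERT INTO users (name) VALUES (?)", ("Alice",))
conn.commit()
name = "Alice"
# Wrong: building SQL with an f-string invites SQL injection
cursor.execute(f"SELECT * FROM users WHERE name = '{name}'")
# Right: pass values as parameters
cursor.execute("SELECT * FROM users WHERE name = ?", (name,))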
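When to use
- Prototyping and tests: SQLite needs zero configuration, which suits quick validation.
- Embedded applications: local storage for desktop apps, mobile apps, and CLI tools.
- Small web applications: read-heavy, single-machine deployments.
- Not a good fit: high write concurrency or horizontal scaling (use PostgreSQL/MySQL instead).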
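For tests and quick experiments, an in-memory database avoids leaving files behind. The following sketch assumes a throwaway users table and shows named placeholders plus the automatic rollback of a with conn: block when a statement fails.

```python
import sqlite3
from contextlib import closing

# closing() closes the connection at the end; ":memory:" keeps the database in RAM
with closing(sqlite3.connect(":memory:")) as conn:
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT UNIQUE)"
    )

    # Named placeholders take a dict instead of a tuple
    with conn:
        conn.execute(
            "INSERT INTO users (name, email) VALUES (:name, :email)",
            {"name": "Alice", "email": "alice@example.com"},
        )

    # The second insert violates UNIQUE, so the whole block is rolled back
    try:
        with conn:
            conn.execute(
                "INSERT INTO users (name, email) VALUES (?, ?)",
                ("Bob", "bob@example.com"),
            )
            conn.execute(
                "INSERT INTO users (name, email) VALUES (?, ?)",
                ("Alice2", "alice@example.com"),
            )
    except sqlite3.IntegrityError:
        rolled_back = True
    else:
        rolled_back = False

    names = [row["name"] for row in conn.execute("SELECT name FROM users ORDER BY id")]

print(rolled_back, names)
```

Bob's row disappears together with the failed insert, which is the all-or-nothing behavior the Java version gets from setAutoCommit(false) with an explicit rollback.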
9.8 datetime, zoneinfo: Dates and Times
Java/Kotlin comparison
import java.time.*;
import java.time.format.DateTimeFormatter;
LocalDateTime now = LocalDateTime.now();
LocalDate today = LocalDate.now();
ZonedDateTime beijing = ZonedDateTime.now(ZoneId.of("Asia/Shanghai"));
LocalDate date = LocalDate.of(2025, 1, 15);
LocalTime time = LocalTime.of(10, 30, 0);
LocalDateTime dt = LocalDateTime.of(date, time);
ZonedDateTime tokyo = beijing.withZoneSameInstant(ZoneId.of("Asia/Tokyo"));
DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
String formatted = now.format(fmt);
LocalDateTime parsed = LocalDateTime.parse("2025-01-15 10:30:00", fmt);
Duration d = Duration.between(start, end);
Period p = Period.between(date1, date2);
import kotlinx.datetime.*
import kotlinx.datetime.Clock.System.now
val now: Instant = Clock.System.now()
val localDt = now.toLocalDateTime(TimeZone.of("Asia/Shanghai"))
Python implementation
from datetime import datetime, date, time, timedelta, timezone
from zoneinfo import ZoneInfo
now_naive = datetime.now()
print(now_naive)
now_utc = datetime.now(timezone.utc)
print(now_utc)
now_beijing = datetime.now(ZoneInfo("Asia/Shanghai"))
print(now_beijing)
today = date.today()
current_time = datetime.now().time()
d = date(2025, 1, 15)
t = time(10, 30, 0)
dt = datetime(2025, 1, 15, 10, 30, 0)
dt_aware = datetime(2025, 1, 15, 10, 30, 0, tzinfo=ZoneInfo("Asia/Shanghai"))
dt_from_ts = datetime.fromtimestamp(1705282200, tz=ZoneInfo("Asia/Shanghai"))
now = datetime.now(ZoneInfo("Asia/Shanghai"))
now.year
now.month
now.day
now.hour
now.minute
now.second
now.microsecond
now.weekday()
now.isoweekday()
now.isoformat()
delta = timedelta(days=7, hours=3, minutes=30)
print(delta)
tomorrow = date.today() + timedelta(days=1)
yesterday = date.today() - timedelta(days=1)
next_week = datetime.now() + timedelta(weeks=1)
two_hours_later = datetime.now() + timedelta(hours=2)
d1 = date(2025, 1, 15)
d2 = date(2025, 3, 20)
diff = d2 - d1
print(diff.days)
dt1 = datetime(2025, 1, 15, 10, 0)
dt2 = datetime(2025, 1, 15, 14, 30)
diff = dt2 - dt1
print(diff.total_seconds())
beijing = datetime(2025, 1, 15, 10, 30, tzinfo=ZoneInfo("Asia/Shanghai"))
tokyo = beijing.astimezone(ZoneInfo("Asia/Tokyo"))
print(tokyo)
new_york = beijing.astimezone(ZoneInfo("America/New_York"))
print(new_york)
utc = beijing.astimezone(timezone.utc)
print(utc)
now = datetime.now(ZoneInfo("Asia/Shanghai"))
print(now.strftime("%Y-%m-%d %H:%M:%S"))
print(now.strftime("%Y年%m月%d日"))
print(now.strftime("%A, %B %d, %Y"))
print(now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ"))  # the Z suffix means UTC, so convert first
s = "2025-01-15 10:30:00"
parsed = datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
print(parsed)
s = "2025-01-15 10:30:00+08:00"
parsed = datetime.strptime(s, "%Y-%m-%d %H:%M:%S%z")
print(parsed)
iso_str = now.isoformat()
parsed = datetime.fromisoformat(iso_str)
import time as time_module  # aliased so it does not shadow the datetime.time class imported above
ts = datetime.now(timezone.utc).timestamp()
dt = datetime.fromtimestamp(ts, tz=ZoneInfo("Asia/Shanghai"))
print(time_module.time())
import calendar
def is_workday(d: date) -> bool:
    return d.weekday() < 5  # Monday is 0, Friday is 4
_, days_in_month = calendar.monthrange(2025, 2)  # 28
_, days_in_month = calendar.monthrange(2024, 2)  # 29, leap year
def calculate_age(birth_date: date) -> int:
    today = date.today()
    # Subtract one if this year's birthday has not happened yet
    return today.year - birth_date.year - (
        (today.month, today.day) < (birth_date.month, birth_date.day)
    )
age = calculate_age(date(1990, 6, 15))
print(age)
Key differences
| Feature | Java Time API | Python datetime |
|---|---|---|
| Current time | LocalDateTime.now() | datetime.now() |
| Time zone support | ZonedDateTime | datetime + ZoneInfo |
| Zone conversion | .withZoneSameInstant() | .astimezone() |
| Time difference | Duration / Period | timedelta (one type, no month or year units) |
| Formatting | DateTimeFormatter | strftime() |
| Parsing | LocalDateTime.parse() | strptime() / fromisoformat() |
| ISO format | .toString() | .isoformat() / fromisoformat() |
| naive vs aware | Separate types, checked at compile time | tzinfo attribute, checked at runtime |
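Since the naive/aware distinction only exists at runtime in Python, mistakes surface as exceptions rather than compile errors. A minimal sketch, assuming the system has time zone data available (on Windows this typically means installing the tzdata package):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

naive = datetime(2025, 1, 15, 10, 30)
beijing = datetime(2025, 1, 15, 10, 30, tzinfo=ZoneInfo("Asia/Shanghai"))

# Ordering a naive value against an aware one raises TypeError
try:
    naive < beijing
    comparable = True
except TypeError:
    comparable = False

# astimezone() changes the wall-clock fields but not the instant
utc = beijing.astimezone(timezone.utc)
new_york = beijing.astimezone(ZoneInfo("America/New_York"))
same_instant = utc == beijing == new_york

print(comparable, utc.isoformat(), new_york.isoformat(), same_instant)
```

Equality between aware datetimes compares instants, so the three values above are equal even though their displayed hours differ.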
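Common pitfalls
# Naive and aware datetimes cannot be ordered or subtracted (TypeError)
naive = datetime(2025, 1, 15, 10, 0)
aware = datetime(2025, 1, 15, 10, 0, tzinfo=ZoneInfo("Asia/Shanghai"))
# replace(tzinfo=...) attaches a zone without shifting the clock time; use astimezone() to convert
naive = naive.replace(tzinfo=ZoneInfo("Asia/Shanghai"))
# datetime.now() with no argument is naive local time; pass a tz to get an aware value
datetime.now()
datetime.now(timezone.utc)
datetime.now(ZoneInfo("Asia/Shanghai"))
# timedelta has no month unit; the third-party python-dateutil provides relativedelta
from dateutil.relativedelta import relativedelta
next_month = date.today() + relativedelta(months=1)
# On the server side, prefer an aware UTC "now"
from datetime import datetime, timezone
now = datetime.now(timezone.utc)
print(f"UTC now: {now.isoformat()}")
naive = datetime(2024, 1, 1)
aware = datetime(2024, 1, 1, tzinfo=timezone.utc)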
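When to use
- datetime: the basis of all date and time work. In server-side code, always use aware datetimes (with a time zone).
- date: when only the date matters (birthdays, deadlines).
- timedelta: adding, subtracting, and measuring intervals.
- ZoneInfo (3.9+): time zone handling; it supersedes pytz for most uses.
- time.time(): wall-clock Unix timestamps (for example cache expiry). For measuring elapsed time, time.perf_counter() or time.monotonic() is the better fit.
- Third-party libraries: python-dateutil for complex date logic (business-day calculations, recurrence rules); pendulum or arrow for heavy date manipulation.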
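When pulling in python-dateutil just for month arithmetic feels heavy, the standard library can cover the simple case. The add_months helper below is a hypothetical sketch, not a library function; it clamps the day to the length of the target month, which is one of several reasonable policies.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift d by a number of months, clamping the day to the target month's length."""
    # Work with a zero-based month index so divmod handles year rollover
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return d.replace(year=year, month=month, day=min(d.day, last_day))


print(add_months(date(2025, 1, 31), 1))
print(add_months(date(2024, 1, 31), 1))
print(add_months(date(2025, 11, 15), 3))
print(add_months(date(2025, 3, 31), -1))
```

relativedelta applies the same end-of-month clamping, so this helper should agree with it for plain month offsets.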
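Summary: standard library quick reference
| Scenario | Python standard library | Java/Kotlin counterpart |
|---|---|---|
| Path operations | pathlib | java.nio.file.Path |
| Environment variables | os.environ | System.getenv() |
| Command-line arguments | argparse | JCommander / picocli |
| JSON | json | Jackson / Gson |
| CSV | csv | Apache Commons CSV |
| TOML | tomllib (3.11+) | No standard solution |
| Regular expressions | re | java.util.regex |
| Logging | logging | SLF4J / Logback |
| Embedded database | sqlite3 | JDBC + SQLite driver |
| Dates and times | datetime + zoneinfo | java.time |
| File copy/delete | shutil | Files.copy/move/delete |
| Temporary files | tempfile | Files.createTempFile() |
Core principle: the Python standard library covers most production needs (around 80% by this chapter's estimate). Check the standard library first and only then consider third-party packages. Before running pip install, ask yourself whether the standard library can already do the job.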