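File I/O, JSON, Regex, and Date/Time: Java APIs vs. the Python Standard Library
Abstract: Java and Python both offer solid support for the utilities developers reach for every day, but their API design styles differ sharply.
Before We Begin
As a Java engineer, you already know Files.readString(), ObjectMapper, Pattern/Matcher for regular expressions, and LocalDateTime for dates inside out. Python's equivalents are usually more concise, but the API style is different and takes some getting used to.
This article helps you build a quick mapping between the two, so that when you see a Java API you can immediately think of its Python counterpart.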
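1. File I/O Comparison
1.1 Reading Files
Java:
// Read the whole file into one string
String content = Files.readString(Path.of("file.txt"));
// Read all lines into a list
List<String> lines = Files.readAllLines(Path.of("file.txt"));
// Read line by line
try (BufferedReader reader = Files.newBufferedReader(Path.of("file.txt"))) {
    String line;
    while ((line = reader.readLine()) != null) {
        System.out.println(line);
    }
}
Python:
# Read the whole file into one string
with open("file.txt", "r", encoding="utf-8") as f:
    content = f.read()
# Read all lines into a list (each line keeps its trailing "\n")
with open("file.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
# Read line by line
with open("file.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line, end="")  # line already ends with "\n"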
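For readers used to the one-liner Files.readString(), pathlib offers a similarly compact form. The following is a minimal sketch; the helper name read_lines and the temporary sample file are assumptions made for illustration, not part of the original article.

```python
import tempfile
from pathlib import Path


def read_lines(path: Path) -> list:
    """Return the file's lines without their trailing newline characters."""
    return path.read_text(encoding="utf-8").splitlines()


# Demo on a throwaway file so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "file.txt"
    sample.write_text("first\nsecond\n", encoding="utf-8")
    content = sample.read_text(encoding="utf-8")  # like Files.readString()
    lines = read_lines(sample)                    # like Files.readAllLines()
```

Unlike readlines(), splitlines() drops the line terminators, which is closer to what Files.readAllLines() returns.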
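1.2 Writing Files
Java:
// Append to the file, creating it if it does not exist
Files.writeString(Path.of("output.txt"), "Hello",
        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
// Write a list of lines (replaces existing content by default)
Files.write(Path.of("output.txt"), List.of("line1", "line2"));
Python:
# Mode "a" appends and creates the file if needed; mode "w" would truncate it
with open("output.txt", "a", encoding="utf-8") as f:
    f.write("Hello\n")
# Mode "w" overwrites; writelines() does not add line separators for you
with open("output.txt", "w", encoding="utf-8") as f:
    f.writelines(["line1\n", "line2\n"])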
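The file mode decides whether existing content survives, which plays the role of Java's StandardOpenOption flags. A small sketch, using a temporary file chosen purely for illustration:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "output.txt"

    # "w" truncates: the second write replaces the first
    with open(target, "w", encoding="utf-8") as f:
        f.write("first\n")
    with open(target, "w", encoding="utf-8") as f:
        f.write("second\n")
    after_overwrite = target.read_text(encoding="utf-8")

    # "a" appends to whatever is already there
    with open(target, "a", encoding="utf-8") as f:
        f.write("third\n")
    after_append = target.read_text(encoding="utf-8")
```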
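1.3 File Operations at a Glance
| Operation | Java | Python |
|---|---|---|
| Read everything | Files.readString() | open().read() |
| Read all lines into a list | Files.readAllLines() | f.readlines() |
| Read line by line | Files.lines() / BufferedReader.readLine() | for line in f: |
| Write | Files.writeString() | open().write() |
| Copy | Files.copy(src, dst) | shutil.copy() |
| Move | Files.move(src, dst) | shutil.move() |
| Delete | Files.delete() | os.remove() |
| Existence check | Files.exists() | os.path.exists() |
| Create directory | Files.createDirectory() / Files.createDirectories() | os.mkdir() / os.makedirs() |
| List directory | Files.list() | os.listdir() / Path.iterdir() / Path.glob() |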
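Several of the operations in the table can be tried together in a scratch directory. This is only a sketch; the file and directory names (a.txt, b.txt, x/y) are made up for the demo.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "a.txt")
    with open(src, "w", encoding="utf-8") as f:
        f.write("data")

    nested = os.path.join(tmp, "x", "y")
    os.makedirs(nested, exist_ok=True)       # creates missing parents too

    copied = shutil.copy(src, nested)        # returns the path of the new copy
    moved = shutil.move(copied, os.path.join(tmp, "b.txt"))

    listing = sorted(os.listdir(tmp))        # names only, not full paths
    os.remove(src)
    still_there = os.path.exists(src)
```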
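1.4 Path Objects
Java:
Path path = Path.of("dir", "subdir", "file.txt");
path.getParent();            // dir/subdir
path.getFileName();          // file.txt
path.getRoot();              // null for a relative path
path.resolve("other.txt");   // dir/subdir/file.txt/other.txt
path.relativize(otherPath);  // relative path from this path to otherPath
Python:
from pathlib import Path
path = Path("dir") / "subdir" / "file.txt"
path.parent           # dir/subdir
path.name             # "file.txt"
path.suffix           # ".txt"
path.exists()
path.is_file()
path.is_dir()
path.iterdir()        # iterate over entries; the path must be a directory
path.glob("*.txt")    # entries matching a glob pattern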
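The pure-path operations can be checked without touching the file system. The sketch below uses PurePosixPath so that the separators are the same on every platform; that choice, and the sibling name other.txt, are assumptions for illustration.

```python
from pathlib import PurePosixPath

path = PurePosixPath("dir") / "subdir" / "file.txt"

parent = str(path.parent)                    # like getParent()
name = path.name                             # like getFileName()
stem, suffix = path.stem, path.suffix        # name split at the last dot
sibling = str(path.with_name("other.txt"))   # like resolveSibling("other.txt")
relative = str(path.relative_to("dir"))      # like Path.of("dir").relativize(path)
```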
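2. JSON Handling Comparison
2.1 Jackson's ObjectMapper in Java
ObjectMapper mapper = new ObjectMapper();
// Serialize an object to a JSON string
String json = mapper.writeValueAsString(user);
// Deserialize from a JSON string
User parsed = mapper.readValue(json, User.class);
// Deserialize from a file (readValue() takes a File, so convert the Path)
User fromFile = mapper.readValue(Path.of("user.json").toFile(), User.class);
// Deserialize into a generic Map
Map<String, Object> map = mapper.readValue(json,
        new TypeReference<Map<String, Object>>() {});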
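2.2 Python's json Module
import json
# Here user is a plain dict; json.dumps() cannot serialize arbitrary objects
json_str = json.dumps(user)
user = json.loads(json_str)
# Read from a file
with open("user.json", encoding="utf-8") as f:
    user = json.load(f)
# Write to a file
with open("user.json", "w", encoding="utf-8") as f:
    json.dump(user, f, indent=2)
# Pretty-print and keep non-ASCII characters unescaped
json_str = json.dumps(data, indent=2, ensure_ascii=False)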
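The effect of ensure_ascii is easiest to see on a value that contains non-ASCII text. A short sketch with a made-up sample dict:

```python
import json

user = {"name": "张三", "age": 30, "tags": ["a", "b"]}

escaped = json.dumps(user)                       # non-ASCII becomes \uXXXX escapes
readable = json.dumps(user, ensure_ascii=False)  # characters are kept as-is
round_tripped = json.loads(readable)             # back to an equal dict
```

Both strings decode to the same data; ensure_ascii=False only changes how the text looks on disk or in logs.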
2.3 Serializing Data Classes
Java:
@JsonInclude(JsonInclude.Include.NON_NULL)  // per class: omit null fields
public class User {
    private String name;
    private int age;
}
// Or configure it globally on the mapper
ObjectMapper mapper = new ObjectMapper();
mapper.setSerializationInclusion(JsonInclude.Include.NON_NULL);
Python:
import json
from dataclasses import asdict, dataclass
@dataclass
class User:
    name: str
    age: int = 0
user = User("Alice", 30)
json_str = json.dumps(asdict(user))  # asdict() turns the dataclass into a dict
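The json module has no counterpart to Include.NON_NULL, and it does not rebuild dataclasses on load. One possible approach is sketched below; the email field and the helper names to_json and from_json are hypothetical additions for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class User:
    name: str
    age: int = 0
    email: Optional[str] = None  # hypothetical optional field


def to_json(user: User) -> str:
    """Serialize a User, leaving out fields whose value is None."""
    data = {k: v for k, v in asdict(user).items() if v is not None}
    return json.dumps(data)


def from_json(text: str) -> User:
    """Rebuild a User from a JSON object whose keys match the field names."""
    return User(**json.loads(text))


encoded = to_json(User("Alice", 30))
decoded = from_json(encoded)
```

This only works for flat dataclasses; nested objects would need their own conversion step.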
2.4 JSON Handling at a Glance
| Operation | Java (Jackson) | Python (json) |
|---|---|---|
| Serialize | writeValueAsString(obj) | dumps(obj) |
| Deserialize | readValue(json, Class) | loads(json_str) |
| Pretty-print | writerWithDefaultPrettyPrinter() | dumps(obj, indent=2) |
| Date formatting | @JsonFormat | default parameter |
| Ignore null | setSerializationInclusion | no built-in option; filter out None values before dumps() |
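The default parameter mentioned in the table is a fallback function that json.dumps() calls for any object it cannot serialize itself. A minimal sketch for dates, where the function name encode_extra and the sample event are assumptions:

```python
import json
from datetime import date, datetime


def encode_extra(value):
    """Fallback for json.dumps(): called only for objects json cannot handle."""
    if isinstance(value, (date, datetime)):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


event = {"title": "release", "day": date(2024, 1, 15)}
encoded = json.dumps(event, default=encode_extra)
```

Raising TypeError for unknown types keeps the default behavior for everything the function does not explicitly support.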
3. Regular Expression Comparison
3.1 Basic Usage
Java:
Pattern pattern = Pattern.compile("^\\d{4}-\\d{2}-\\d{2}$");
Matcher matcher = pattern.matcher("2024-01-15");
if (matcher.matches()) {
    String date = matcher.group(0);
}
String result = "hello world".replaceAll("world", "python");
String[] parts = "a,b,c".split(",");
Python:
import re
# Precompiled pattern
pattern = re.compile(r"^\d{4}-\d{2}-\d{2}$")
if pattern.match("2024-01-15"):
    print("matched")
# Module-level function, no explicit compile step
if re.match(r"^\d{4}-\d{2}-\d{2}$", "2024-01-15"):
    print("matched")
result = re.sub(r"world", "python", "hello world")
parts = re.split(r",", "a,b,c")
3.2 Common Methods Compared
| Operation | Java | Python |
|---|---|---|
| Compile | Pattern.compile() | re.compile() |
| Full match | matcher.matches() | re.fullmatch() |
| Match at start | matcher.lookingAt() | re.match() |
| Search | matcher.find() | re.search() |
| Find all | loop over matcher.find() / matcher.results() (Java 9+) | re.findall() / re.finditer() |
| Replace | matcher.replaceAll() | re.sub() |
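The difference between matching at the start, matching the whole string, and searching is a common stumbling block when coming from Java. The sketch below runs one unanchored pattern through each call; the sample text is made up.

```python
import re

DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
text = "2024-01-15 and 2024-02-20"

starts_with_date = DATE.match(text) is not None      # like lookingAt()
is_exactly_date = DATE.fullmatch(text) is not None   # like matches()
first_hit = DATE.search("due " + text)               # like find()
all_dates = DATE.findall(text)                       # every non-overlapping match
```

Here re.match() succeeds because the text begins with a date, while re.fullmatch() fails because more text follows it.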
3.3 Capture Groups
Java, numbered groups:
Pattern p = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
Matcher m = p.matcher("2024-01-15");
if (m.matches()) {
    String year = m.group(1);
    String month = m.group(2);
    String day = m.group(3);
}
Java, named groups with (?<name>...):
Pattern named = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})");
Matcher nm = named.matcher("2024-01");
if (nm.matches()) {
    String year = nm.group("year");
}
Python, numbered groups:
match = re.match(r"(\d{4})-(\d{2})-(\d{2})", "2024-01-15")
if match:
    year, month, day = match.groups()
Python, named groups with (?P<name>...):
match = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-01")
if match:
    year = match.group("year")
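With named groups, groupdict() returns all captures at once, which often replaces several group("...") calls. A small sketch; the helper name parse_date_parts is hypothetical.

```python
import re
from typing import Optional

DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def parse_date_parts(text: str) -> Optional[dict]:
    """Return year, month, and day as ints, or None if text is not a date."""
    match = DATE.fullmatch(text)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


parts = parse_date_parts("2024-01-15")
invalid = parse_date_parts("2024-1-15")
```

Checking for None before using the match object avoids the AttributeError that a failed match would otherwise cause.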
3.4 Non-Capturing Groups
// Java: (?:...) groups without capturing
Pattern p = Pattern.compile("(?:\\d{4})-(?:\\d{2})");
# Python: same syntax
pattern = re.compile(r"(?:\d{4})-(?:\d{2})")
4. Date and Time Comparison
4.1 Java's Date/Time API
// Current values
LocalDate date = LocalDate.now();
LocalTime time = LocalTime.now();
LocalDateTime datetime = LocalDateTime.now();
ZonedDateTime zdt = ZonedDateTime.now();
// Parsing (ISO-8601 by default)
LocalDate parsed = LocalDate.parse("2024-01-15");
LocalDateTime parsed2 = LocalDateTime.parse("2024-01-15T10:30:00");
// Formatting
date.format(DateTimeFormatter.ofPattern("yyyy-MM-dd"));
datetime.format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
// Arithmetic: java.time types are immutable, so each call returns a new object
date.plusDays(7);
date.minusMonths(1);
datetime.plusHours(2);
// Number of days between two dates
long days = ChronoUnit.DAYS.between(start, end);
4.2 Python's datetime
from datetime import datetime, date, time, timedelta, timezone
# Current values
now = datetime.now()
today = date.today()
# Parsing: date.fromisoformat() is the counterpart of LocalDate.parse()
parsed = date.fromisoformat("2024-01-15")
parsed2 = datetime.strptime("2024-01-15 10:30:00", "%Y-%m-%d %H:%M:%S")
# Formatting
now.strftime("%Y-%m-%d")
now.strftime("%Y-%m-%d %H:%M:%S")
# Arithmetic: timedelta has no month unit, so minusMonths() has no direct equivalent
later = now + timedelta(days=7)
earlier = now - timedelta(hours=2)
# Timezone-aware current time in UTC
utc = datetime.now(timezone.utc)
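Date arithmetic in Python goes through timedelta, including the difference between two dates. The following sketch uses fixed sample dates so the results are reproducible.

```python
from datetime import date, datetime, timedelta

start = date.fromisoformat("2024-01-15")
end = start + timedelta(days=7)
days_between = (end - start).days   # like ChronoUnit.DAYS.between(start, end)

stamp = datetime.strptime("2024-01-15 10:30:00", "%Y-%m-%d %H:%M:%S")
iso_text = stamp.isoformat()        # ISO-8601 text, with "T" as separator
formatted = (stamp + timedelta(hours=2)).strftime("%Y-%m-%d %H:%M")
```

Subtracting two dates yields a timedelta, and its days attribute is the whole-day count.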
4.3 Date/Time at a Glance
| Operation | Java | Python |
|---|---|---|
| Current date | LocalDate.now() | date.today() |
| Current time | LocalTime.now() | datetime.now().time() |
| Current date-time | LocalDateTime.now() | datetime.now() |
| Parse a string | LocalDate.parse() | datetime.strptime() / fromisoformat() |
| Format | .format(formatter) | .strftime(format) |
| Add or subtract | .plusDays() | + timedelta() |
| Date difference | ChronoUnit.DAYS.between() | (end - start).days |
| Time zones | ZonedDateTime | zoneinfo (standard library, Python 3.9+) / pytz (third-party) |
4.4 Time Zone Handling
Java:
ZonedDateTime tokyo = ZonedDateTime.now(ZoneId.of("Asia/Tokyo"));
ZonedDateTime converted = tokyo.withZoneSameInstant(ZoneId.of("America/New_York"));
Python (zoneinfo is part of the standard library since Python 3.9):
from datetime import datetime
from zoneinfo import ZoneInfo
tokyo = datetime.now(ZoneInfo("Asia/Tokyo"))
converted = tokyo.astimezone(ZoneInfo("America/New_York"))
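astimezone() changes the wall-clock representation but not the instant, just like withZoneSameInstant(). The sketch below deliberately uses fixed UTC offsets instead of ZoneInfo so that it does not depend on a time zone database being installed; the names JST and EST and the sample timestamp are assumptions, and fixed offsets ignore daylight saving time.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9), "JST")    # fixed offset, no DST rules
EST = timezone(timedelta(hours=-5), "EST")   # fixed offset, no DST rules

tokyo = datetime(2024, 1, 15, 10, 30, tzinfo=JST)
new_york = tokyo.astimezone(EST)             # same instant, different wall clock
same_instant = tokyo == new_york             # aware datetimes compare by instant
```

For real applications, ZoneInfo("Asia/Tokyo") and ZoneInfo("America/New_York") are the better choice because they track daylight saving changes.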
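5. String Handling Comparison
5.1 Common Methods
Java:
String s = " Hello, World! ";
s.trim();                      // "Hello, World!"
s.toLowerCase();
s.toUpperCase();
s.contains("World");           // true
s.startsWith(" H");
s.endsWith("! ");
s.indexOf("o");                // 5, or -1 if not found
s.substring(2, 7);             // "ello,"
s.replace("World", "Python");
s.split(",");
s.isEmpty();
s.isBlank();                   // Java 11+
Python:
s = " Hello, World! "
s.strip()                      # "Hello, World!"
s.lower()
s.upper()
"World" in s                   # True; the in operator replaces contains()
s.startswith(" H")
s.endswith("! ")
s.find("o")                    # 5, or -1 if not found
s.index("o")                   # 5, but raises ValueError if not found
s[2:7]                         # "ello,"
s.replace("World", "Python")
s.split(",")
s.isdigit()                    # True only if every character is a digit
s.isalnum()                    # True only if every character is alphanumeric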
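Two details tend to surprise Java developers: find() and index() differ in how they report a miss, and there are no isEmpty() or isBlank() methods. A short sketch of the usual idioms:

```python
s = " Hello, World! "

stripped = s.strip()
found_at = s.find("o")        # index of the first match
missing = s.find("xyz")       # -1, the same convention as Java's indexOf()

try:
    s.index("xyz")            # index() raises instead of returning -1
    raised = False
except ValueError:
    raised = True

has_world = "World" in s      # counterpart of contains()
is_empty = not s              # counterpart of isEmpty()
is_blank = not s.strip()      # counterpart of isBlank()
```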
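5.2 String Concatenation
Java:
String s1 = "Hello" + " " + "World";
String s2 = String.join("-", "a", "b", "c");
String s3 = String.format("Name: %s, Age: %d", "Alice", 30);
// Text block (Java 15+)
String json = """
        {
          "name": "Alice",
          "age": 30
        }
        """;
Python:
s1 = "Hello" + " " + "World"
s2 = "-".join(["a", "b", "c"])
s3 = "Name: %s, Age: %d" % ("Alice", 30)  # printf style, closest to String.format()
name = "Alice"
age = 30
s4 = f"Name: {name}, Age: {age}"  # f-string, the idiomatic choice
# Triple-quoted string; named json_text so it does not shadow the json module
json_text = """
{
  "name": "Alice",
  "age": 30
}
"""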
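5.3 Formatting Compared
| Format | Java (String.format) | Python (% operator) |
|---|---|---|
| String | %s | %s |
| Integer | %d | %d |
| Float | %f | %f |
| Width | %10s | %10s |
| Decimal places | %.2f | %.2f |
| Thousands separator | %,d | not supported by %; use the format spec :,d, as in f"{n:,d}" |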
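The same conversions can be written with f-string format specs, which is where the thousands separator lives in Python. A brief sketch with made-up sample values:

```python
name, age, price, count = "Alice", 30, 3.14159, 1234567

printf_style = "Name: %s, Age: %d" % (name, age)   # % operator
two_decimals = f"{price:.2f}"                      # like %.2f
padded = f"{name:>10}"                             # right-aligned in 10 columns, like %10s
thousands = f"{count:,d}"                          # like Java's %,d
```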
6. Collection Transformations
6.1 Streams vs. List Comprehensions
Java:
List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
// Keep the even numbers and square them
List<Integer> result = numbers.stream()
        .filter(n -> n % 2 == 0)
        .map(n -> n * n)
        .collect(Collectors.toList());
int sum = numbers.stream().mapToInt(Integer::intValue).sum();
// Group people by city
Map<String, List<Person>> byCity = people.stream()
        .collect(Collectors.groupingBy(Person::getCity));
Python:
numbers = [1, 2, 3, 4, 5]
result = [n**2 for n in numbers if n % 2 == 0]
total = sum(numbers)
# groupby() only groups adjacent items, so sort by the same key first
from itertools import groupby
by_city = {k: list(v) for k, v in groupby(sorted(people, key=lambda p: p.city), key=lambda p: p.city)}
# Alternative with pandas (third-party; assumes people converts cleanly to a DataFrame, e.g. a list of dicts)
import pandas as pd
df = pd.DataFrame(people)
by_city = df.groupby("city").apply(lambda x: x.to_dict("records"))
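Grouping can also be done with collections.defaultdict, which needs neither sorting nor a third-party library. This is one possible alternative rather than the article's own approach; the Person dataclass and the sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    city: str


people = [Person("Ann", "Paris"), Person("Bob", "Rome"), Person("Cy", "Paris")]

by_city = defaultdict(list)
for person in people:
    by_city[person.city].append(person.name)  # missing keys start as an empty list

grouped = dict(by_city)
```

The result preserves the order in which each city first appears, much like collecting into a LinkedHashMap would in Java.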
7. In Practice: Reading and Processing a JSON File
7.1 Java Implementation
ObjectMapper mapper = new ObjectMapper();
mapper.registerModule(new JavaTimeModule());
Path filePath = Path.of("data.json");
// readValue() takes a File, so convert the Path
List<User> users = mapper.readValue(filePath.toFile(),
        mapper.getTypeFactory().constructCollectionType(List.class, User.class));
// Names of users older than 20, sorted alphabetically
List<String> names = users.stream()
        .filter(u -> u.getAge() > 20)
        .map(User::getName)
        .sorted()
        .collect(Collectors.toList());
String output = mapper.writeValueAsString(names);
Files.writeString(Path.of("output.json"), output);
7.2 Python Implementation
import json
with open("data.json", encoding="utf-8") as f:
    users = json.load(f)
# Names of users older than 20, sorted alphabetically
names = sorted([
    u["name"] for u in users
    if u["age"] > 20
])
with open("output.json", "w", encoding="utf-8") as f:
    json.dump(names, f, indent=2)
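The same pipeline can be wrapped in a function and exercised on sample data. The sketch below is one way to do that; the function name extract_names, the min_age parameter, and the sample users are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def extract_names(src: Path, dst: Path, min_age: int = 20) -> list:
    """Write the sorted names of users older than min_age to dst and return them."""
    users = json.loads(src.read_text(encoding="utf-8"))
    names = sorted(u["name"] for u in users if u["age"] > min_age)
    dst.write_text(json.dumps(names, indent=2), encoding="utf-8")
    return names


# Demo in a temporary directory with made-up users
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp) / "data.json", Path(tmp) / "output.json"
    sample = [
        {"name": "Bob", "age": 25},
        {"name": "Al", "age": 19},
        {"name": "Ann", "age": 40},
    ]
    src.write_text(json.dumps(sample), encoding="utf-8")
    names = extract_names(src, dst)
    written = json.loads(dst.read_text(encoding="utf-8"))
```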
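8. Summary
| Category | Java style | Python style |
|---|---|---|
| File reading | Files.readString() | open().read() |
| JSON | ObjectMapper | json.dumps() |
| Regex | Pattern/Matcher | re.search/sub |
| Dates | LocalDateTime | datetime |
| Formatting | String.format() | f-string |
| List processing | Stream API | list comprehensions |
The Python standard library follows the idea that simple cases should be simple to write. Java tends to offer complete but more elaborate APIs, while Python offers concise building blocks that you combine as needed. Getting comfortable with four modules, pathlib, json, re, and datetime, is enough to handle roughly 80% of everyday development scenarios.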