07 - Python for Java Engineers, Lesson 7: Comparing Common Libraries


File I/O, JSON, Regular Expressions, and Date/Time: Java APIs vs. the Python Standard Library

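Summary: Java and Python both provide solid support for the utilities used most often in day-to-day development, but their API design styles differ sharply.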


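Before We Begin

As a Java engineer, you know Files.readString(), ObjectMapper, Pattern/Matcher for regular expressions, and LocalDateTime for dates inside out. The Python equivalents are more concise, but they take some adjusting to because the API style is different.

This article helps you build the mapping quickly, so that when you see a Java API you can immediately think of its Python counterpart.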


1. File I/O

1.1 Reading Files

// Java - NIO (Java 11+)
String content = Files.readString(Path.of("file.txt"));

// Read all lines into a List
List<String> lines = Files.readAllLines(Path.of("file.txt"));

// Traditional line-by-line loop (Files.newBufferedReader exists since Java 7;
// Path.of needs Java 11, so use Paths.get on older versions)
try (BufferedReader reader = Files.newBufferedReader(Path.of("file.txt"))) {
    String line;
    while ((line = reader.readLine()) != null) {
        System.out.println(line);
    }
}

# Python - the simplest way
with open("file.txt", "r", encoding="utf-8") as f:
    content = f.read()

# Read all lines into a list (each line keeps its trailing "\n")
with open("file.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()

# The more Pythonic way
with open("file.txt", "r", encoding="utf-8") as f:
    for line in f:  # the file object is itself an iterator
        print(line, end="")  # each line already ends with "\n"

1.2 Writing Files

// Java - append mode
Files.writeString(Path.of("output.txt"), "Hello",
    StandardOpenOption.CREATE, StandardOpenOption.APPEND);

// Write multiple lines
Files.write(Path.of("output.txt"), List.of("line1", "line2"));

# Python - "w" overwrites the file; use "a" to append as in the Java example
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("Hello\n")

# Write multiple lines (writelines does not add newlines for you)
with open("output.txt", "w", encoding="utf-8") as f:
    f.writelines(["line1\n", "line2\n"])
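The Java snippet above appends to the file, while the Python snippet opens it with "w", which truncates any existing content. A minimal sketch of the difference, assuming a throwaway temporary directory (the file name demo.txt is made up for illustration):

```python
import tempfile
from pathlib import Path

# Work in a temporary directory so no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "demo.txt"  # hypothetical file name

    # "w" creates the file, or truncates it if it already exists.
    with open(target, "w", encoding="utf-8") as f:
        f.write("first\n")

    # "a" appends, roughly like StandardOpenOption.CREATE + APPEND in Java.
    with open(target, "a", encoding="utf-8") as f:
        f.write("second\n")

    appended = target.read_text(encoding="utf-8")

    # Opening with "w" again discards both earlier lines.
    with open(target, "w", encoding="utf-8") as f:
        f.write("third\n")

    overwritten = target.read_text(encoding="utf-8")

print(repr(appended))
print(repr(overwritten))
```

Like StandardOpenOption.CREATE, both "w" and "a" create the file when it does not exist, so no separate existence check is needed.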

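1.3 File Operations at a Glance

| Operation | Java | Python |
| --- | --- | --- |
| Read everything | Files.readString() | open().read() |
| Read lines | Files.readAllLines() (eager) / Files.lines() (lazy) | f.readlines() (eager) / for line in f: (lazy) |
| Write | Files.writeString() | open().write() |
| Copy | Files.copy(src, dst) | shutil.copy() |
| Move | Files.move(src, dst) | shutil.move() |
| Delete | Files.delete() | os.remove() |
| Existence check | Files.exists() | os.path.exists() |
| Create directories | Files.createDirectory() / Files.createDirectories() | os.mkdir() / os.makedirs() |
| List a directory | Files.list() | os.listdir() / Path.iterdir() / Path.glob() |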
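To make the table concrete, here is a small sketch that runs the Python column end to end inside a temporary directory. The directory layout and file names are invented for the example.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)

    # Create nested directories in one call (like Files.createDirectories).
    os.makedirs(base / "a" / "b", exist_ok=True)

    src = base / "a" / "src.txt"
    src.write_text("data", encoding="utf-8")

    # Copy the file, then move the copy (like Files.copy / Files.move).
    copied = shutil.copy(src, base / "a" / "b" / "copy.txt")
    moved = shutil.move(copied, base / "moved.txt")

    # Existence check and deletion (like Files.exists / Files.delete).
    existed_before = os.path.exists(moved)
    os.remove(moved)
    exists_after = os.path.exists(moved)

    # List a directory (like Files.list); sorted for a stable order.
    entries = sorted(os.listdir(base / "a"))

print(existed_before, exists_after, entries)
```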

1.4 Path Objects

// Java NIO Path
Path path = Path.of("dir", "subdir", "file.txt");
path.getParent();
path.getFileName();
path.getRoot();
path.resolve("other.txt");
path.relativize(otherPath);

# Python pathlib.Path (3.4+)
from pathlib import Path

path = Path("dir") / "subdir" / "file.txt"
path.parent
path.name
path.suffix
path.exists()
path.is_file()
path.is_dir()
path.iterdir()  # iterate over directory entries
path.glob("*.txt")  # wildcard matching
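The Java snippet also calls getRoot(), resolve(), and relativize(), which the Python snippet does not cover. A rough mapping is sketched below. It uses PurePosixPath only so the printed results are the same on every operating system; with real files you would normally use Path.

```python
from pathlib import PurePosixPath

path = PurePosixPath("dir") / "subdir" / "file.txt"

parent = path.parent                      # like path.getParent()
name = path.name                          # like path.getFileName()
stem, suffix = path.stem, path.suffix     # no direct Java equivalent

resolved = path / "other.txt"             # like path.resolve("other.txt")
sibling = path.with_name("other.txt")     # like path.resolveSibling("other.txt")
relative = path.relative_to("dir")        # like Path.of("dir").relativize(path)
root = PurePosixPath("/usr/lib").root     # like getRoot() on an absolute path

print(resolved, sibling, relative, root)
```

One difference to keep in mind: relative_to() raises ValueError when the path is not located under the given base, whereas Java's relativize() can answer with ".." segments.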

2. JSON Handling

2.1 Java: Jackson ObjectMapper

// Maven dependency
// <artifactId>jackson-databind</artifactId>

ObjectMapper mapper = new ObjectMapper();

// Java object -> JSON string
String json = mapper.writeValueAsString(user);

// JSON string -> Java object
User fromString = mapper.readValue(json, User.class);

// JSON file -> Java object (Jackson 2.x readValue accepts a File, not a Path)
User fromFile = mapper.readValue(new File("user.json"), User.class);

// JSON string -> Map
Map<String, Object> map = mapper.readValue(json,
    new TypeReference<Map<String, Object>>() {});

2.2 Python: the json Module

import json

# Python object -> JSON string
json_str = json.dumps(user)  # user must be a dict, list, or another JSON-compatible type

# JSON string -> Python object
user = json.loads(json_str)

# JSON file -> Python object
with open("user.json", encoding="utf-8") as f:
    user = json.load(f)

# Python object -> JSON file
with open("user.json", "w", encoding="utf-8") as f:
    json.dump(user, f, indent=2)

# Pretty-print and keep non-ASCII characters unescaped
json_str = json.dumps(user, indent=2, ensure_ascii=False)
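The ensure_ascii=False flag in the last call matters as soon as the data contains non-ASCII text. A short sketch with a made-up record shows what changes and what does not:

```python
import json

user = {"name": "张三", "age": 30}  # sample data invented for this example

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(user)

# ensure_ascii=False keeps the characters as they are.
readable = json.dumps(user, ensure_ascii=False)

# Both forms decode back to the same Python object.
restored_from_escaped = json.loads(escaped)
restored_from_readable = json.loads(readable)

print(escaped)
print(readable)
```

When writing the readable form to a file, open the file with an explicit encoding such as utf-8 so the result does not depend on the platform default.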

2.3 Serializing Data Classes

// Java - needs Jackson annotations, or Lombok for the boilerplate
@JsonInclude(JsonInclude.Include.NON_NULL)
public class User {
    private String name;
    private int age;

    // getters/setters, or use Lombok @Data
}

// Configure the mapper to skip null fields globally
ObjectMapper mapper = new ObjectMapper();
mapper.setSerializationInclusion(JsonInclude.Include.NON_NULL);

# Python - dataclass plus asdict()
import json
from dataclasses import asdict, dataclass

@dataclass
class User:
    name: str
    age: int = 0

user = User("Alice", 30)
json_str = json.dumps(asdict(user))
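The snippet above covers only the dataclass-to-JSON direction. For the reverse direction, which Jackson handles with readValue(json, User.class), the standard library has no single call. One common approach is to unpack the decoded dict into the constructor. The helper name user_from_json below is hypothetical, not part of any library.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class User:
    name: str
    age: int = 0


def user_from_json(text: str) -> User:
    """Hypothetical helper: decode a JSON object and unpack it into User."""
    data = json.loads(text)
    return User(**data)


original = User("Alice", 30)
text = json.dumps(asdict(original))
restored = user_from_json(text)

# Missing fields fall back to the dataclass defaults.
partial = user_from_json('{"name": "Bob"}')

print(text, restored, partial)
```

This sketch does no type validation, and an unexpected key in the JSON raises TypeError. That is stricter than Jackson with FAIL_ON_UNKNOWN_PROPERTIES disabled, and looser than Jackson when it comes to field types.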

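2.4 JSON Handling at a Glance

| Operation | Java (Jackson) | Python (json) |
| --- | --- | --- |
| Serialize | writeValueAsString(obj) | dumps(obj) |
| Deserialize | readValue(json, Class) | loads(json_str) |
| Pretty-print | writerWithDefaultPrettyPrinter() | dumps(obj, indent=2) |
| Date formatting | @JsonFormat | default= parameter |
| Skip nulls | setSerializationInclusion | no built-in option; filter out None before dumps() |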
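Date formatting and null skipping need a little extra code in Python. json.dumps() calls the default= hook only for objects it cannot serialize on its own, so the hook can format dates but is never invoked for None. Dropping None values therefore has to happen before the dump. The helper names encode_extra and drop_none below are made up for this sketch, and drop_none only handles the top level of a dict.

```python
import json
from datetime import date, datetime


def encode_extra(obj):
    """Fallback for json.dumps: called only for objects json cannot handle."""
    if isinstance(obj, (date, datetime)):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def drop_none(mapping: dict) -> dict:
    """Return a copy without None values (top level only)."""
    return {k: v for k, v in mapping.items() if v is not None}


record = {"name": "Alice", "nickname": None, "joined": date(2024, 1, 15)}

with_nulls = json.dumps(record, default=encode_extra)
without_nulls = json.dumps(drop_none(record), default=encode_extra)

print(with_nulls)
print(without_nulls)
```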

3. Regular Expressions

3.1 Basic Usage

// Java
Pattern pattern = Pattern.compile("^\\d{4}-\\d{2}-\\d{2}$");
Matcher matcher = pattern.matcher("2024-01-15");

if (matcher.matches()) {
    String date = matcher.group(0);
}

// Replace
String result = "hello world".replaceAll("world", "python");

// Split
String[] parts = "a,b,c".split(",");

# Python
import re

pattern = re.compile(r"^\d{4}-\d{2}-\d{2}$")
if pattern.match("2024-01-15"):
    print("matched")

# For simple cases you can skip compile()
if re.match(r"^\d{4}-\d{2}-\d{2}$", "2024-01-15"):
    print("matched")

# Replace
result = re.sub(r"world", "python", "hello world")

# Split
parts = re.split(r",", "a,b,c")

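3.2 Common Methods Compared

| Operation | Java | Python |
| --- | --- | --- |
| Compile | Pattern.compile() | re.compile() |
| Full match | matcher.matches() | re.fullmatch() |
| Match at start | matcher.lookingAt() | re.match() |
| Search | matcher.find() | re.search() |
| Find all | matcher.results() (Java 9+) or a while (matcher.find()) loop | re.findall() / re.finditer() |
| Replace | matcher.replaceAll() | re.sub() |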
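The distinction that trips up Java developers most often is that Python's match() behaves like lookingAt(), not like matches(). A small sketch with an unanchored date pattern and an invented sample string makes the differences visible:

```python
import re

date_re = re.compile(r"\d{4}-\d{2}-\d{2}")
text = "2024-01-15 and 2024-02-20"

# match(): anchored at the start only, like Matcher.lookingAt().
prefix = date_re.match(text)

# fullmatch(): the whole string must match, like Matcher.matches().
whole = date_re.fullmatch(text)

# search(): first match anywhere, like Matcher.find().
first = date_re.search("due 2024-02-20")

# findall() returns strings; finditer() yields match objects with positions.
all_dates = date_re.findall(text)
spans = [m.span() for m in date_re.finditer(text)]

print(prefix.group(), whole, first.group(), all_dates, spans)
```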

3.3 Capturing Groups

// Java - capturing groups
Pattern p = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
Matcher m = p.matcher("2024-01-15");
if (m.matches()) {
    String year = m.group(1);    // "2024"
    String month = m.group(2);   // "01"
    String day = m.group(3);     // "15"
}

// Named capturing groups
Pattern named = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})");
Matcher nm = named.matcher("2024-01");
if (nm.matches()) {
    String year = nm.group("year");
}

# Python - capturing groups
match = re.match(r"(\d{4})-(\d{2})-(\d{2})", "2024-01-15")
if match:
    year, month, day = match.groups()

# Named capturing groups (note the (?P<name>...) syntax)
match = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-01")
if match:
    year = match.group("year")
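With named groups, Python can also hand back every group at once through groupdict(), which saves a line per field compared with repeated group("...") calls. The function parse_date_parts is a hypothetical helper written for this sketch.

```python
import re

DATE_RE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def parse_date_parts(text):
    """Return the date fields as ints, or None if text is not YYYY-MM-DD."""
    m = DATE_RE.fullmatch(text)
    if m is None:
        return None
    # groupdict() maps each group name to the text it captured.
    return {key: int(value) for key, value in m.groupdict().items()}


parts = parse_date_parts("2024-01-15")
rejected = parse_date_parts("2024-1-15")

print(parts, rejected)
```

The None check is the Python counterpart of testing the boolean returned by matches() in Java. Calling group() on a failed match raises AttributeError, because the match object is None.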

3.4 Non-Capturing Groups

// Java - non-capturing groups
Pattern p = Pattern.compile("(?:\\d{4})-(?:\\d{2})");

# Python - non-capturing groups
pattern = re.compile(r"(?:\d{4})-(?:\d{2})")

4. Date and Time

4.1 Java: the Date/Time API

// Java 8+ Date/Time API
LocalDate date = LocalDate.now();
LocalTime time = LocalTime.now();
LocalDateTime datetime = LocalDateTime.now();
ZonedDateTime zdt = ZonedDateTime.now();

// Parsing
LocalDate parsed = LocalDate.parse("2024-01-15");
LocalDateTime parsed2 = LocalDateTime.parse("2024-01-15T10:30:00");

// Formatting
date.format(DateTimeFormatter.ofPattern("yyyy-MM-dd"));
datetime.format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));

// Arithmetic
date.plusDays(7);
date.minusMonths(1);
datetime.plusHours(2);

// Difference
long days = ChronoUnit.DAYS.between(start, end);

4.2 Python: datetime

from datetime import datetime, date, time, timedelta, timezone

# Current time
now = datetime.now()
today = date.today()

# Parsing
parsed = datetime.fromisoformat("2024-01-15")  # a datetime at midnight; use date.fromisoformat() for a plain date
parsed2 = datetime.strptime("2024-01-15 10:30:00", "%Y-%m-%d %H:%M:%S")

# Formatting
now.strftime("%Y-%m-%d")
now.strftime("%Y-%m-%d %H:%M:%S")

# Arithmetic
later = now + timedelta(days=7)
earlier = now - timedelta(hours=2)

# Time zones
utc = datetime.now(timezone.utc)

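4.3 Date and Time at a Glance

| Operation | Java | Python |
| --- | --- | --- |
| Current date | LocalDate.now() | date.today() |
| Current time | LocalTime.now() | datetime.now().time() |
| Current date and time | LocalDateTime.now() | datetime.now() |
| Parse a string | LocalDate.parse() | datetime.strptime() / fromisoformat() |
| Format | .format(formatter) | .strftime(format) |
| Add or subtract | .plusDays() | + timedelta() |
| Difference | ChronoUnit.DAYS.between(start, end) | (end - start).days |
| Time zones | ZonedDateTime | zoneinfo (3.9+) / pytz (third-party) |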
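Python has no direct counterpart to ChronoUnit.DAYS.between(). Subtracting two dates or datetimes yields a timedelta, and you read the unit you need from it. A short sketch with arbitrary sample dates:

```python
from datetime import date, datetime

start = date(2024, 1, 15)
end = date(2024, 3, 1)

# date - date gives a timedelta; .days is the whole-day count.
days_between = (end - start).days

t1 = datetime(2024, 1, 15, 10, 30)
t2 = datetime(2024, 1, 16, 9, 0)
elapsed = t2 - t1

# Less than 24 hours have passed, so .days is 0 even though the calendar
# date changed; ChronoUnit.DAYS.between on LocalDateTime behaves the same way.
whole_days = elapsed.days
hours = elapsed.total_seconds() / 3600

print(days_between, whole_days, hours)
```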

4.4 Time Zone Handling

// Java - ZonedDateTime
ZonedDateTime tokyo = ZonedDateTime.now(ZoneId.of("Asia/Tokyo"));
ZonedDateTime converted = tokyo.withZoneSameInstant(ZoneId.of("America/New_York"));

# Python - zoneinfo (3.9+)
from datetime import datetime
from zoneinfo import ZoneInfo

tokyo = datetime.now(ZoneInfo("Asia/Tokyo"))
converted = tokyo.astimezone(ZoneInfo("America/New_York"))
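astimezone() plays the role of withZoneSameInstant(): the wall-clock reading changes while the instant stays the same. The sketch below swaps ZoneInfo for fixed UTC offsets so that it runs without a time zone database and gives a deterministic result. That is a simplification: fixed offsets ignore daylight saving time, so real code should keep using ZoneInfo as in the snippet above.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for Asia/Tokyo and for America/New_York in
# winter. Assumption for this sketch only: they do not track DST.
JST = timezone(timedelta(hours=9), "JST")
EST = timezone(timedelta(hours=-5), "EST")

tokyo = datetime(2024, 1, 15, 9, 0, tzinfo=JST)
new_york = tokyo.astimezone(EST)

# Aware datetimes compare by instant, not by wall-clock fields.
same_instant = tokyo == new_york
iso = new_york.isoformat()

print(iso, same_instant)
```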

5. String Handling

5.1 Common Methods

// Java
String s = "  Hello, World!  ";

s.trim();           // "Hello, World!"
s.toLowerCase();    // "  hello, world!  "
s.toUpperCase();    // "  HELLO, WORLD!  "
s.contains("World"); // true
s.startsWith("  H"); // true
s.endsWith("!  ");   // true
s.indexOf("o");     // 6
s.substring(2, 7);  // "Hello"
s.replace("World", "Python");
s.split(",");       // ["  Hello", " World!  "]
s.isEmpty();        // false
s.isBlank();        // false (Java 11+)

# Python
s = "  Hello, World!  "

s.strip()       # "Hello, World!"
s.lower()       # "  hello, world!  "
s.upper()       # "  HELLO, WORLD!  "
"World" in s    # True
s.startswith("  H")  # True
s.endswith("!  ")    # True
s.find("o")     # 6 (returns -1 if not found)
s.index("o")    # 6 (raises ValueError if not found)
s[2:7]          # "Hello"
s.replace("World", "Python")
s.split(",")    # ["  Hello", " World!  "]
s.isdigit()     # False
s.isalnum()     # False

5.2 String Concatenation

// Java - several options
String s1 = "Hello" + " " + "World";
String s2 = String.join("-", "a", "b", "c");  // "a-b-c"
String s3 = String.format("Name: %s, Age: %d", "Alice", 30);

// Java 15+ text block
String json = """
    {
        "name": "Alice",
        "age": 30
    }
    """;

# Python - several options
s1 = "Hello" + " " + "World"
s2 = "-".join(["a", "b", "c"])  # "a-b-c"
s3 = "Name: %s, Age: %d" % ("Alice", 30)  # printf-style, closest to String.format

# Python 3.6+ f-string
name = "Alice"
age = 30
s4 = f"Name: {name}, Age: {age}"

# Multi-line string (named json_text so it does not shadow the json module)
json_text = """
{
    "name": "Alice",
    "age": 30
}
"""

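5.3 Format Specifiers Compared

| Format | Java (String.format) | Python (% operator) |
| --- | --- | --- |
| String | %s | %s |
| Integer | %d | %d |
| Float | %f | %f |
| Width | %10s | %10s |
| Decimal places | %.2f | %.2f |
| Thousands separator | %,d | {:,d} (str.format / f-string only) |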
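The printf-style specifiers carry over almost unchanged, but the thousands separator is only available through the format-spec mini-language used by f-strings and str.format(). A brief side-by-side sketch with made-up values:

```python
name, qty, price, count = "Alice", 42, 3.14159, 1234567

# printf-style, as in String.format: left-aligned string, width, precision.
percent_style = "%-10s|%5d|%.2f" % (name, qty, price)

# The same layout written as an f-string.
f_style = f"{name:<10}|{qty:>5}|{price:.2f}"

# %10s right-aligns; the f-string spelling is :>10.
right_percent = "%10s" % name
right_f = f"{name:>10}"

# Thousands separator: no % equivalent, use the format spec.
thousands = f"{count:,d}"

print(percent_style)
print(f_style)
print(thousands)
```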

6. Collection Transformations

6.1 Streams vs. List Comprehensions

// Java
List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

// Filter and transform
List<Integer> result = numbers.stream()
    .filter(n -> n % 2 == 0)
    .map(n -> n * n)
    .collect(Collectors.toList());

// Sum
int sum = numbers.stream().mapToInt(Integer::intValue).sum();

// Group
Map<String, List<Person>> byCity = people.stream()
    .collect(Collectors.groupingBy(Person::getCity));

# Python
numbers = [1, 2, 3, 4, 5]

# Filter and transform
result = [n**2 for n in numbers if n % 2 == 0]

# Sum
total = sum(numbers)

# Group (groupby only merges adjacent items, so sort by the same key first)
from itertools import groupby
by_city = {k: list(v) for k, v in groupby(sorted(people, key=lambda p: p.city), key=lambda p: p.city)}

# Or with pandas (third-party; here people must be a list of dicts, not objects)
import pandas as pd
df = pd.DataFrame(people)
by_city = df.groupby("city").apply(lambda x: x.to_dict("records"))
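For a plain in-memory grouping, a third option is collections.defaultdict, which needs neither sorting nor a third-party library and is probably the closest match to Collectors.groupingBy. The Person class and sample data below are invented for the sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Person:  # hypothetical stand-in for the Person type used above
    name: str
    city: str


def group_by_city(people):
    """Group people by city, keeping the input order within each group."""
    groups = defaultdict(list)
    for person in people:
        groups[person.city].append(person)
    return dict(groups)


people = [
    Person("Alice", "Paris"),
    Person("Bob", "Tokyo"),
    Person("Carol", "Paris"),
]
by_city = group_by_city(people)
paris_names = [p.name for p in by_city["Paris"]]

print(paris_names, list(by_city))
```

Unlike itertools.groupby, this makes a single pass and does not depend on the input being sorted.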

7. In Practice: Reading and Processing a JSON File

7.1 Java Implementation

ObjectMapper mapper = new ObjectMapper();
mapper.registerModule(new JavaTimeModule());

Path filePath = Path.of("data.json");
// Jackson 2.x readValue accepts a File, so convert the Path
List<User> users = mapper.readValue(filePath.toFile(),
    mapper.getTypeFactory().constructCollectionType(List.class, User.class));

List<String> names = users.stream()
    .filter(u -> u.getAge() > 20)
    .map(User::getName)
    .sorted()
    .collect(Collectors.toList());

String output = mapper.writeValueAsString(names);
Files.writeString(Path.of("output.json"), output);

7.2 Python Implementation

import json

# Read
with open("data.json", encoding="utf-8") as f:
    users = json.load(f)

# Process
names = sorted([
    u["name"] for u in users
    if u["age"] > 20
])

# Write
with open("output.json", "w", encoding="utf-8") as f:
    json.dump(names, f, indent=2)

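8. Summary

| Category | Java style | Python style |
| --- | --- | --- |
| File reading | Files.readString() | open().read() |
| JSON | ObjectMapper | json.dumps() / json.loads() |
| Regex | Pattern/Matcher | re.search() / re.sub() |
| Date/time | LocalDateTime | datetime |
| Formatting | String.format() | f-string |
| List processing | Stream API | list comprehensions |

The design philosophy of the Python standard library is "simple things should be simple to write". Java tends to provide complete but complex APIs, while Python provides concise APIs that you combine as needed. Master Python's four modules pathlib, json, re, and datetime, and you can handle about 80% of day-to-day development scenarios.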