Everyone knows Python, but using it day to day comes with plenty of pitfalls: messy version management, dependency conflicts, disappointing performance... This post walks through how to get the most out of Python development on openEuler.
1. Environment and Dependency Management
The most painful parts of Python development are version management and dependency conflicts. On openEuler you can install multiple Python versions side by side and switch between them with pyenv or the alternatives tool. Virtual environments are essential: venv is the simplest, virtualenv offers more features, conda suits scientific computing, and poetry is the first choice for modern projects. For dependency management, requirements.txt is the traditional approach, pip-tools can pin exact versions, and poetry is more elegant.
Managing multiple Python versions:
# Check the installed Python version
python3 --version
# Install additional versions
dnf install -y python3.9 python3.10 python3.11
# Switch versions
alternatives --config python3
# Manage versions with pyenv
curl https://pyenv.run | bash
pyenv install 3.11.5
pyenv global 3.11.5
Python virtual environment demo:
# venv (built into Python)
python3 -m venv myenv
source myenv/bin/activate
(myenv) $ python --version
Python 3.9.9
# Deactivate the virtual environment
deactivate
# virtualenv (more features)
pip3 install virtualenv
virtualenv -p python3.10 myenv
source myenv/bin/activate
# virtualenvwrapper (more convenient)
pip3 install virtualenvwrapper
export WORKON_HOME=~/.virtualenvs
source /usr/local/bin/virtualenvwrapper.sh
mkvirtualenv project1
workon project1
deactivate
rmvirtualenv project1
# conda (scientific computing)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
conda create -n myenv python=3.10
conda activate myenv
conda deactivate
Virtual environment comparison:
| Tool | Pros | Cons | Best for |
|---|---|---|---|
| venv | Built in, lightweight | Basic features only | Simple projects |
| virtualenv | More features | Needs to be installed | General projects |
| conda | Manages non-Python dependencies | Large footprint | Scientific computing |
| poetry | Excellent dependency management | Learning curve | Modern projects |
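Whichever tool you pick, it is worth double-checking that the activated environment is really the one in use. A quick sanity check that works for any of the environments above:
# Confirm which interpreter and pip the current shell is using
which python
python -c "import sys; print(sys.prefix)"
pip -V  # the path printed should point inside the environment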
Managing dependencies with pip:
# requirements.txt
cat > requirements.txt << 'EOF'
flask==2.3.0
requests>=2.28.0
numpy==1.24.3
pandas>=1.5.0,<2.0.0
EOF
# Install dependencies
pip install -r requirements.txt
# Export installed packages
pip freeze > requirements.txt
# pip-tools (version pinning)
pip install pip-tools
cat > requirements.in << 'EOF'
flask
requests
numpy
pandas
EOF
pip-compile requirements.in
# Generates requirements.txt with all transitive dependencies pinned
# poetry (modern dependency management)
pip install poetry
poetry init
poetry add flask requests numpy pandas
poetry install
poetry update
# pyproject.toml
cat pyproject.toml
[tool.poetry]
name = "myproject"
version = "0.1.0"
[tool.poetry.dependencies]
python = "^3.9"
flask = "^2.3.0"
requests = "^2.28.0"
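Two small additions often round out these workflows: pip-tools also ships pip-sync to make an environment match the compiled requirements exactly, and poetry records resolved versions in poetry.lock. A rough sketch (app.py here is just a placeholder script):
# pip-tools: install exactly what requirements.txt pins, remove everything else
pip-sync requirements.txt
# poetry: resolve and lock versions, inspect the dependency tree, run inside the env
poetry lock
poetry show --tree
poetry run python app.py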
Creating a Python package is not hard either: write a setup.py, or use pyproject.toml, and you can build and publish it.
# Project layout
mypackage/
├── mypackage/
│ ├── __init__.py
│ ├── core.py
│ └── utils.py
├── tests/
│ └── test_core.py
├── setup.py
├── README.md
└── LICENSE
# setup.py
cat > setup.py << 'EOF'
from setuptools import setup, find_packages
setup(
name="mypackage",
version="0.1.0",
author="Your Name",
author_email="your@email.com",
description="A short description",
long_description=open("README.md").read(),
long_description_content_type="text/markdown",
url="https://github.com/yourusername/mypackage",
packages=find_packages(),
classifiers=[
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
],
python_requires=">=3.9",
install_requires=[
"requests>=2.28.0",
"numpy>=1.24.0",
],
extras_require={
"dev": ["pytest", "black", "flake8"],
},
)
EOF
# Build the package
python setup.py sdist bdist_wheel
ls dist/
mypackage-0.1.0-py3-none-any.whl
mypackage-0.1.0.tar.gz
# Install locally in editable mode
pip install -e .
# Upload to PyPI
pip install twine
twine upload dist/*
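Before uploading to the real index, it is usually worth validating the artifacts and doing a dry run against TestPyPI. A hedged sketch (assumes accounts and tokens are already configured):
# Check metadata and README rendering
twine check dist/*
# Optional: upload to TestPyPI first (endpoint given explicitly, so no .pypirc setup is needed)
twine upload --repository-url https://test.pypi.org/legacy/ dist/*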
The Python ecosystem is gradually moving from the traditional setup.py toward a unified build configuration based on **pyproject.toml**. This approach is more standardized, has a clearer structure, and follows PEP 517/518, making it the recommended packaging method today. Here is what it looks like:
# Modern packaging approach
cat > pyproject.toml << 'EOF'
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "mypackage"
version = "0.1.0"
authors = [
{name = "Your Name", email = "your@email.com"},
]
description = "A short description"
readme = "README.md"
requires-python = ">=3.9"
classifiers = [
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
]
dependencies = [
"requests>=2.28.0",
"numpy>=1.24.0",
]
[project.optional-dependencies]
dev = ["pytest", "black", "flake8"]
[project.urls]
Homepage = "https://github.com/yourusername/mypackage"
EOF
# Build
pip install build
python -m build
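To verify the build output, one option is to install the freshly built wheel into a throwaway virtual environment (using the wheel name shown earlier):
python3 -m venv /tmp/pkgtest
source /tmp/pkgtest/bin/activate
pip install dist/mypackage-0.1.0-py3-none-any.whl
python -c "import mypackage; print(mypackage)"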
2. Performance Optimization
Python is famously slow, but there are ways to speed it up. Start by profiling to find the bottleneck: cProfile for overall timing, line_profiler for per-line timing, memory_profiler for memory usage. Then optimize where it matters: list comprehensions beat explicit loops, generators save memory, membership tests on a set are orders of magnitude faster than on a list, and built-in functions are implemented in C and very fast. NumPy can make numerical code tens of times faster, multiprocessing keeps every CPU core busy, and if that is still not enough, reach for Cython or a C library.
# cProfile (built in)
python -m cProfile -s cumulative script.py
ncalls tottime percall cumtime percall filename:lineno(function)
1000 0.234 0.000 1.234 0.001 module.py:10(slow_function)
5000 0.567 0.000 0.567 0.000 {built-in method builtins.sum}
# Using it from code
import cProfile
import pstats
profiler = cProfile.Profile()
profiler.enable()
# Code to profile
result = slow_function()
profiler.disable()
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative')
stats.print_stats(10)
# line_profiler (line-by-line profiling)
pip install line_profiler
# Add the decorator
@profile
def slow_function():
    result = []
    for i in range(1000000):
        result.append(i * 2)
    return result
# Run
kernprof -l -v script.py
Line # Hits Time Per Hit % Time Line Contents
3 1 234.5 234.5 15.2 result = []
4 1000000 1234.5 0.0 80.1 for i in range(1000000):
5 1000000 71.2 0.0 4.7 result.append(i * 2)
# memory_profiler (memory profiling)
pip install memory_profiler
@profile
def memory_intensive():
    big_list = [i for i in range(10000000)]
    return sum(big_list)
python -m memory_profiler script.py
Line # Mem usage Increment Line Contents
3 50.2 MiB 50.2 MiB big_list = [i for i in range(10000000)]
4 434.5 MiB 384.3 MiB return sum(big_list)
Benchmarking:
# timeit (micro-benchmarks)
python -m timeit "sum(range(1000))"
10000 loops, best of 5: 23.4 usec per loop
# Using it from code
import timeit
def test_function():
    return sum(range(1000))
time = timeit.timeit(test_function, number=10000)
print(f"Average time: {time/10000*1000:.3f} ms")
# pytest-benchmark
pip install pytest-benchmark
def test_sum(benchmark):
    result = benchmark(sum, range(1000))
    assert result == 499500
pytest test_benchmark.py
Name (time in us) Min Max Mean StdDev
test_sum 23.4 45.6 24.1 1.2
# Before: explicit loop
result = []
for i in range(1000000):
    result.append(i * 2)
# Time: 234ms
# After: list comprehension
result = [i * 2 for i in range(1000000)]
# Time: 123ms (47% faster)
# Generator (saves memory)
result = (i * 2 for i in range(1000000))
# Memory: from ~400MB down to 88 bytes
# map()
result = list(map(lambda x: x * 2, range(1000000)))
# Time: 145ms
# Before: manual summation
total = 0
for i in range(1000000):
    total += i
# Time: 67ms
# After: built-in sum
total = sum(range(1000000))
# Time: 12ms (82% faster)
# Before: manual lookup
found = False
for item in big_list:
    if item == target:
        found = True
        break
# Time: 45ms
# After: the in operator
found = target in big_list
# Time: 8ms (82% faster)
# Use a set to speed up lookups
big_set = set(big_list)
found = target in big_set
# Time: 0.001ms (99.9% faster)
Performance comparison:
| Operation | Before | After | Speedup |
|---|---|---|---|
| Summation | 67ms | 12ms | 82% |
| List lookup | 45ms | 8ms | 82% |
| set lookup | 45ms | 0.001ms | 99.9% |
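These timings are illustrative; absolute numbers depend on hardware and Python version. A small timeit script along these lines reproduces the relative differences on your own machine (a sketch, not the exact benchmark behind the table):
import timeit

big_list = list(range(1_000_000))
big_set = set(big_list)
target = 999_999

def manual_sum():
    total = 0
    for i in range(1_000_000):
        total += i
    return total

print("manual sum  :", timeit.timeit(manual_sum, number=10))
print("built-in sum:", timeit.timeit(lambda: sum(range(1_000_000)), number=10))
print("list lookup :", timeit.timeit(lambda: target in big_list, number=100))
print("set lookup  :", timeit.timeit(lambda: target in big_set, number=100))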
import numpy as np
# Before: pure-Python loop
data = list(range(1000000))
result = [x * 2 + 1 for x in data]
# Time: 234ms
# After: NumPy vectorization
data = np.arange(1000000)
result = data * 2 + 1
# Time: 5ms (98% faster)
# Matrix computation
# Python loop
matrix = [[i*j for j in range(1000)] for i in range(1000)]
# Time: 567ms
# NumPy
matrix = np.arange(1000).reshape(1000, 1) * np.arange(1000)
# Time: 12ms (98% faster)
from multiprocessing import Pool
import time
def cpu_intensive(n):
    return sum(i*i for i in range(n))
# Serial
start = time.time()
results = [cpu_intensive(10000000) for _ in range(8)]
print(f"Serial: {time.time() - start:.2f}s")
# Time: 12.5s
# Parallel
start = time.time()
with Pool(8) as pool:
    results = pool.map(cpu_intensive, [10000000]*8)
print(f"Parallel: {time.time() - start:.2f}s")
# Time: 1.8s (86% faster)
# Using concurrent.futures
from concurrent.futures import ProcessPoolExecutor
with ProcessPoolExecutor(max_workers=8) as executor:
    results = list(executor.map(cpu_intensive, [10000000]*8))
# pure_python.py
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)
# Time: fibonacci(35) = 3.2s
# cython_version.pyx
def fibonacci_cython(int n):
    if n <= 1:
        return n
    return fibonacci_cython(n-1) + fibonacci_cython(n-2)
# setup.py
from setuptools import setup
from Cython.Build import cythonize
setup(
    ext_modules=cythonize("cython_version.pyx")
)
# Compile
python setup.py build_ext --inplace
# Use it
import cython_version
result = cython_version.fibonacci_cython(35)
# Time: 0.8s (75% faster)
# Add static type declarations
cpdef long fibonacci_typed(long n):
    if n <= 1:
        return n
    return fibonacci_typed(n-1) + fibonacci_typed(n-2)
# Time: 0.3s (91% faster)
# mylib.c
#include <stdio.h>
int add(int a, int b) {
    return a + b;
}
void process_array(int *arr, int size) {
    for (int i = 0; i < size; i++) {
        arr[i] *= 2;
    }
}
# Compile
gcc -shared -fPIC -o mylib.so mylib.c
# Calling it from Python
import ctypes
import numpy as np
# Load the shared library
lib = ctypes.CDLL('./mylib.so')
# Simple function
lib.add.argtypes = [ctypes.c_int, ctypes.c_int]
lib.add.restype = ctypes.c_int
result = lib.add(3, 4)
# Array processing
lib.process_array.argtypes = [ctypes.POINTER(ctypes.c_int), ctypes.c_int]
arr = np.array([1, 2, 3, 4, 5], dtype=np.int32)
lib.process_array(arr.ctypes.data_as(ctypes.POINTER(ctypes.c_int)), len(arr))
# Performance comparison
# Python: 234ms
# C extension: 5ms (98% faster)
# Before: a regular class
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
points = [Point(i, i) for i in range(1000000)]
# Memory: 234MB
# After: using __slots__
class PointSlots:
    __slots__ = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y
points = [PointSlots(i, i) for i in range(1000000)]
# Memory: 156MB (33% less)
# Using namedtuple
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
points = [Point(i, i) for i in range(1000000)]
# Memory: 123MB (47% less)
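The memory figures above can be checked with the memory_profiler decorator introduced earlier; a rough sketch (point_mem.py is a hypothetical file name, and the numbers vary by Python version and platform):
# point_mem.py
from memory_profiler import profile

class PointSlots:
    __slots__ = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y

@profile
def build_points():
    # peak memory for one million slotted instances
    return [PointSlots(i, i) for i in range(1000000)]

if __name__ == "__main__":
    build_points()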
# Before: read the whole file into a list
def read_large_file(filename):
    with open(filename) as f:
        return f.readlines()  # reads everything into memory at once
lines = read_large_file('huge.txt')  # 10GB file
# Memory: 10GB
# After: a generator
def read_large_file_gen(filename):
    with open(filename) as f:
        for line in f:
            yield line.strip()
lines = read_large_file_gen('huge.txt')
# Memory: a few KB
# Process the data
for line in lines:
    process(line)  # one line at a time
3. Development Tools
Code that merely runs is not enough; it also has to be consistent and maintainable. black auto-formats code, isort orders imports, flake8 and pylint check code quality, and mypy does type checking. pytest runs the tests and coverage measures how much of the code they exercise. Keep the project layout clean, with source, tests, and docs in separate directories, and manage tool configuration centrally in pyproject.toml.
Code formatting:
# black (auto-formatter)
pip install black
black myproject/
reformatted myproject/main.py
All done! ✨ 🍰 ✨
# isort (import sorting)
pip install isort
isort myproject/
# autopep8 (PEP 8 formatting)
pip install autopep8
autopep8 --in-place --aggressive script.py
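All of these tools can read their settings from pyproject.toml, which fits the "one config file" idea mentioned at the start of this section. A minimal sketch (the values shown are common choices, not requirements):
# pyproject.toml
[tool.black]
line-length = 88
target-version = ["py39"]

[tool.isort]
profile = "black"  # keep isort's style compatible with black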
Static analysis:
# flake8 (linting)
pip install flake8
flake8 myproject/
myproject/main.py:10:1: E302 expected 2 blank lines, found 1
myproject/main.py:15:80: E501 line too long (85 > 79 characters)
# pylint (stricter)
pip install pylint
pylint myproject/
************* Module myproject.main
myproject/main.py:10:0: C0303: Trailing whitespace (trailing-whitespace)
myproject/main.py:15:0: C0301: Line too long (85/79) (line-too-long)
Your code has been rated at 8.5/10
# mypy (type checking)
pip install mypy
mypy myproject/
myproject/main.py:10: error: Argument 1 to "process" has incompatible type "str"; expected "int"
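mypy can only catch this kind of mistake when the code carries type hints. A tiny, hypothetical example of the mismatch reported above:
# typed code that mypy can check
def process(count: int) -> int:
    return count * 2

process("42")  # error: Argument 1 to "process" has incompatible type "str"; expected "int"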
Test coverage:
# pytest + coverage
pip install pytest pytest-cov
# Run the tests and generate a coverage report
pytest --cov=mypackage --cov-report=html tests/
# Coverage summary
Name Stmts Miss Cover
---------------------------------------
mypackage/__init__.py 5 0 100%
mypackage/core.py 123 12 90%
mypackage/utils.py 45 5 89%
---------------------------------------
TOTAL 173 17 90%
# HTML report
firefox htmlcov/index.html
myproject/
├── mypackage/ # Source code
│ ├── __init__.py
│ ├── core.py
│ └── utils.py
├── tests/ # Tests
│ ├── __init__.py
│ ├── test_core.py
│ └── test_utils.py
├── docs/ # Documentation
│ ├── conf.py
│ └── index.rst
├── .github/ # CI/CD
│ └── workflows/
│ └── test.yml
├── pyproject.toml # Project configuration
├── setup.py # Install script (optional with pyproject.toml)
├── README.md
├── LICENSE
└── .gitignore
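To go with this layout, a minimal test might look like the following (mypackage.core and the add function are hypothetical; adapt them to your real API):
# tests/test_core.py
import pytest
from mypackage.core import add

def test_add():
    assert add(2, 3) == 5

def test_add_rejects_mixed_types():
    with pytest.raises(TypeError):
        add("2", 3)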
The keys to Python development: virtual environments for isolation, poetry for dependency management, NumPy and multiprocessing for performance, black and pytest for code quality. openEuler offers solid Python support, from multi-version management to performance tooling, helping developers build high-quality applications efficiently. A quick optimization checklist:
- Use built-in functions and data structures
- Prefer list comprehensions over explicit loops
- Use generators to save memory
- Vectorize numerical work with NumPy
- Parallelize CPU-bound tasks with multiprocessing
- Optimize hot spots with Cython
- Use __slots__ to cut per-object memory
- Cache results of repeated computations (see the sketch after this list)
- Avoid global variables
- Prefer local variables in hot loops
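For the caching tip, the standard library's functools.lru_cache usually goes a long way; compare the recursive Fibonacci from the Cython section with and without it (a rough sketch, timings will vary):
from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# With the cache each value is computed only once, so fibonacci(35)
# returns almost instantly instead of taking seconds.
print(fibonacci(35))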
4. Summary
Overall, how smooth the Python development experience is comes down to disciplined environment isolation, reliable dependency management, and effective performance optimization. On openEuler, everything from multi-version management and virtual environments to modern dependency tools and a full set of tuning techniques helps this whole workflow run more reliably and faster, making Python development genuinely efficient and maintainable.
If you are looking for a future-oriented open-source operating system, take a look at openEuler, a fast riser in the DistroWatch rankings (distrowatch.com/table-mobil…): a Linux distribution incubated by the OpenAtom Foundation that supports "supernode" scenarios. Official openEuler site: www.openeuler.openatom.cn/zh/