Getting the Most out of Python on openEuler

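Everyone knows Python, but using it well comes with plenty of pitfalls: messy version management, dependency conflicts, sluggish performance... This article walks through how to get the most out of Python development on openEuler.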

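I. Environment and Dependency Management

The biggest headaches in Python development are version management and dependency conflicts. openEuler lets you install several Python versions side by side and switch between them with pyenv or the alternatives tool. Virtual environments are essential: venv is the simplest, virtualenv offers more features, conda suits scientific computing, and poetry is a popular choice for modern projects. For dependencies, requirements.txt is the traditional approach, pip-tools pins exact versions, and poetry handles the whole workflow more elegantly.

Managing multiple Python versions: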

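# Check the Python version
python3 --version

# Install multiple versions (package names depend on the openEuler release and the enabled repositories)
dnf install -y python3.9 python3.10 python3.11

# Switch versions (works only if the interpreters are registered with alternatives)
alternatives --config python3

# Manage versions with pyenv (follow the installer's instructions to add pyenv to your shell profile)
curl https://pyenv.run | bash
pyenv install 3.11.5
pyenv global 3.11.5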

Virtual environment demo:

# venv (built into Python)
python3 -m venv myenv
source myenv/bin/activate
(myenv) $ python --version
Python 3.9.9

# Leave the virtual environment
deactivate

# virtualenv (more features)
pip3 install virtualenv
virtualenv -p python3.10 myenv
source myenv/bin/activate

# virtualenvwrapper (more convenient)
pip3 install virtualenvwrapper
export WORKON_HOME=~/.virtualenvs
source /usr/local/bin/virtualenvwrapper.sh  # path may differ; check with: which virtualenvwrapper.sh

mkvirtualenv project1
workon project1
deactivate
rmvirtualenv project1

# conda (scientific computing)
# On aarch64 machines, download Miniconda3-latest-Linux-aarch64.sh instead
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
conda create -n myenv python=3.10
conda activate myenv
conda deactivate
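
After activating an environment, it can help to confirm from inside Python which interpreter is actually running. The following is a minimal sketch, not part of the original commands: the helper name `in_virtualenv` is made up for illustration, and the check only covers venv and virtualenv (conda environments are not detected this way).

```python
import sys

def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv/virtualenv environment."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix

print(f"Python {sys.version_info.major}.{sys.version_info.minor} at {sys.executable}")
print("virtual environment active:", in_virtualenv())
```

Running this before and after `source myenv/bin/activate` should show the interpreter path change and the flag flip to True.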

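Virtual environment tools compared:

| Tool | Pros | Cons | Best for |
| --- | --- | --- | --- |
| venv | Built in, lightweight | Basic features | Simple projects |
| virtualenv | Feature-rich | Requires installation | General-purpose projects |
| conda | Manages non-Python dependencies | Large footprint | Scientific computing |
| poetry | Excellent dependency management | Learning curve | Modern projects |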

Managing dependencies with pip:

# requirements.txt
cat > requirements.txt << 'EOF'
flask==2.3.0
requests>=2.28.0
numpy==1.24.3
pandas>=1.5.0,<2.0.0
EOF

# Install dependencies
pip install -r requirements.txt

# Export installed packages (this overwrites the hand-written requirements.txt above)
pip freeze > requirements.txt

# pip-tools (pins exact versions)
pip install pip-tools
cat > requirements.in << 'EOF'
flask
requests
numpy
pandas
EOF

pip-compile requirements.in
# Generates requirements.txt with every transitive dependency pinned

# poetry (modern dependency management)
pip install poetry
poetry init
poetry add flask requests numpy pandas
poetry install
poetry update

# pyproject.toml
cat pyproject.toml
[tool.poetry]
name = "myproject"
version = "0.1.0"

[tool.poetry.dependencies]
python = "^3.9"
flask = "^2.3.0"
requests = "^2.28.0"
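
Whichever tool installs the dependencies, the versions that ended up in the environment can be checked from Python itself. Here is a small sketch using the standard-library `importlib.metadata`; the helper name `installed_version` is an illustrative assumption, and the package names are simply the ones from the requirements example above.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package: str):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Package names taken from the requirements.txt example above
for name in ("flask", "requests", "numpy", "pandas"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

This is handy as a quick sanity check in a fresh virtual environment, before running anything that depends on a specific version.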

Building a Python package isn't hard either: write a setup.py or a pyproject.toml and you can build and publish it.

# Project layout
mypackage/
├── mypackage/
│   ├── __init__.py
│   ├── core.py
│   └── utils.py
├── tests/
│   └── test_core.py
├── setup.py
├── README.md
└── LICENSE

# setup.py
cat > setup.py << 'EOF'
from setuptools import setup, find_packages

setup(
    name="mypackage",
    version="0.1.0",
    author="Your Name",
    author_email="your@email.com",
    description="A short description",
    long_description=open("README.md").read(),
    long_description_content_type="text/markdown",
    url="https://github.com/yourusername/mypackage",
    packages=find_packages(),
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    python_requires=">=3.9",
    install_requires=[
        "requests>=2.28.0",
        "numpy>=1.24.0",
    ],
    extras_require={
        "dev": ["pytest", "black", "flake8"],
    },
)
EOF

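# Build the package (invoking setup.py directly is deprecated by setuptools; prefer python -m build)
python setup.py sdist bdist_wheel
ls dist/
mypackage-0.1.0-py3-none-any.whl
mypackage-0.1.0.tar.gz

# Install locally in editable mode
pip install -e .

# Upload to PyPI
pip install twine
twine upload dist/*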

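The Python ecosystem is gradually moving from the traditional setup.py to a unified build workflow based on **pyproject.toml**. This approach is more standardized and more clearly structured, follows PEP 517/518 (with the [project] metadata table defined by PEP 621), and is currently the recommended way to package. Here is a demonstration:

# Modern packaging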
cat > pyproject.toml << 'EOF'
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "mypackage"
version = "0.1.0"
authors = [
    {name = "Your Name", email = "your@email.com"},
]
description = "A short description"
readme = "README.md"
requires-python = ">=3.9"
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
]
dependencies = [
    "requests>=2.28.0",
    "numpy>=1.24.0",
]

[project.optional-dependencies]
dev = ["pytest", "black", "flake8"]

[project.urls]
Homepage = "https://github.com/yourusername/mypackage"
EOF

# Build
pip install build
python -m build

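II. Performance Optimization

Python has a reputation for being slow, but there are ways to speed it up. Start with profiling tools to locate the bottleneck: cProfile shows the time spent per function, line_profiler breaks it down line by line, and memory_profiler tracks memory. Then optimize where it matters: list comprehensions are usually faster than explicit loops, generators save memory, set membership tests are O(1) on average versus O(n) for a list (a difference of thousands of times on large collections), and built-in functions are implemented in C and are very fast. NumPy can speed up numerical work by tens of times, multiprocessing keeps every CPU core busy, and when that is still not enough, reach for Cython or call into a C library.

# cProfile (built in)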
python -m cProfile -s cumulative script.py
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1000    0.234    0.000    1.234    0.001 module.py:10(slow_function)
     5000    0.567    0.000    0.567    0.000 {built-in method builtins.sum}

# Use it from code
import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()

# The code to profile
result = slow_function()

profiler.disable()
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative')
stats.print_stats(10)

# line_profiler (line-by-line profiling)
pip install line_profiler

# Add the decorator (kernprof injects @profile at run time, so no import is needed)
@profile
def slow_function():
    result = []
    for i in range(1000000):
        result.append(i * 2)
    return result

# Run
kernprof -l -v script.py
Line #      Hits         Time  Per Hit   % Time  Line Contents
     3         1      234.5    234.5     15.2    result = []
     4   1000000   1234.5      0.0     80.1    for i in range(1000000):
     5   1000000     71.2      0.0      4.7        result.append(i * 2)

# memory_profiler (memory profiling)
pip install memory_profiler

@profile
def memory_intensive():
    big_list = [i for i in range(10000000)]
    return sum(big_list)

python -m memory_profiler script.py
Line #    Mem usage    Increment   Line Contents
     3     50.2 MiB     50.2 MiB   big_list = [i for i in range(10000000)]
     4    434.5 MiB    384.3 MiB   return sum(big_list)

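Benchmarking:

# timeit (micro-benchmarks)
python -m timeit "sum(range(1000))"
10000 loops, best of 5: 23.4 usec per loop

# Use it from code
import timeit

def test_function():
    return sum(range(1000))

# Avoid naming the result "time", which would shadow the time module
elapsed = timeit.timeit(test_function, number=10000)  # total seconds for 10000 calls
print(f"Average time: {elapsed/10000*1000:.3f} ms")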

# pytest-benchmark
pip install pytest-benchmark

def test_sum(benchmark):
    result = benchmark(sum, range(1000))
    assert result == 499500

pytest test_benchmark.py
Name (time in us)    Min      Max     Mean   StdDev
test_sum          23.4    45.6    24.1     1.2

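Loops, comprehensions, and built-ins:

# Before: explicit loop
result = []
for i in range(1000000):
    result.append(i * 2)
# Time: 234ms

# After: list comprehension
result = [i * 2 for i in range(1000000)]
# Time: 123ms (47% less time)

# Generator (saves memory)
result = (i * 2 for i in range(1000000))
# Memory: from 400MB down to 88 bytes (the generator object itself; its exact size varies by Python version)

# map function
result = list(map(lambda x: x * 2, range(1000000)))
# Time: 145ms

# Before: manual summation
total = 0
for i in range(1000000):
    total += i
# Time: 67ms

# After: built-in sum
total = sum(range(1000000))
# Time: 12ms (82% less time)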
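
To see the memory difference between a list and a generator for yourself, a rough sketch like the one below can help. It uses `sys.getsizeof`, which measures only the container object and not the integers it references, so the list figure is a lower bound; the function names are made up for this illustration.

```python
import sys

def doubled_list(n):
    """Materialize all n values at once."""
    return [i * 2 for i in range(n)]

def doubled_gen(n):
    """Produce the same values lazily, one at a time."""
    return (i * 2 for i in range(n))

n = 1_000_000
as_list = doubled_list(n)
as_gen = doubled_gen(n)

# getsizeof reports the container only, not the int objects it points to
list_bytes = sys.getsizeof(as_list)
gen_bytes = sys.getsizeof(as_gen)
print(f"list container: {list_bytes} bytes")
print(f"generator:      {gen_bytes} bytes")

# Both produce the same values, but the generator can be consumed only once
totals_match = sum(as_list) == sum(as_gen)
print("same total:", totals_match)
```

The trade-off is that a generator cannot be indexed or iterated twice, so it fits streaming-style processing rather than random access.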

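# Before: manual search
found = False
for item in big_list:
    if item == target:
        found = True
        break
# Time: 45ms

# After: the in operator
found = target in big_list
# Time: 8ms (82% less time)

# Use a set to speed up lookups (building the set is O(n), so this pays off for repeated lookups)
big_set = set(big_list)
found = target in big_set
# Time: 0.001ms (99.9% less time)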
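
The list-versus-set gap is easy to reproduce with `timeit`. This is only a sketch under stated assumptions: the data (100,000 integers) and the choice of the last element as the target are hypothetical, picked to show the worst case for a list scan, and absolute timings will differ by machine.

```python
import timeit

# Hypothetical test data: searching for the last element is the worst case for a list
big_list = list(range(100_000))
big_set = set(big_list)  # one-time O(n) conversion cost
target = big_list[-1]

list_time = timeit.timeit(lambda: target in big_list, number=100)
set_time = timeit.timeit(lambda: target in big_set, number=100)

print(f"list: {list_time:.6f}s for 100 lookups")
print(f"set:  {set_time:.6f}s for 100 lookups")
```

If a collection is searched only once, the cost of building the set can outweigh the saving; the conversion is worthwhile when many lookups follow.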

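Performance comparison:

| Operation | Manual implementation | Built-in | Time saved |
| --- | --- | --- | --- |
| Summation | 67ms | 12ms | 82% |
| Search | 45ms | 8ms | 82% |
| Set lookup | 45ms | 0.001ms | 99.9% |

NumPy vectorization:

import numpy as np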

# Before: Python loop
data = list(range(1000000))
result = [x * 2 + 1 for x in data]
# Time: 234ms

# After: NumPy vectorization
data = np.arange(1000000)
result = data * 2 + 1
# Time: 5ms (98% less time)

# Matrix operations
# Python loop
matrix = [[i*j for j in range(1000)] for i in range(1000)]
# Time: 567ms

# NumPy (broadcasting a column vector against a row vector)
matrix = np.arange(1000).reshape(1000, 1) * np.arange(1000)
# Time: 12ms (98% less time)

Multiprocessing for CPU-bound work:

from multiprocessing import Pool
import time

def cpu_intensive(n):
    return sum(i*i for i in range(n))

# The guard is required when worker processes are spawned rather than forked
if __name__ == "__main__":
    # Serial
    start = time.perf_counter()
    results = [cpu_intensive(10000000) for _ in range(8)]
    print(f"Serial: {time.perf_counter() - start:.2f}s")
    # Time: 12.5s

    # Parallel
    start = time.perf_counter()
    with Pool(8) as pool:
        results = pool.map(cpu_intensive, [10000000]*8)
    print(f"Parallel: {time.perf_counter() - start:.2f}s")
    # Time: 1.8s (86% less time)

# Using concurrent.futures (same cpu_intensive function as above)
from concurrent.futures import ProcessPoolExecutor

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=8) as executor:
        results = list(executor.map(cpu_intensive, [10000000]*8))

Cython for hot spots:

# pure_python.py
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

# Time: fibonacci(35) takes 3.2s

# cython_version.pyx
def fibonacci_cython(int n):
    if n <= 1:
        return n
    return fibonacci_cython(n-1) + fibonacci_cython(n-2)

# setup.py
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize("cython_version.pyx")
)

# Compile
python setup.py build_ext --inplace

# Use it
import cython_version
result = cython_version.fibonacci_cython(35)
# Time: 0.8s (75% less time)

# Add type declarations (in the .pyx file)
cpdef long fibonacci_typed(long n):
    if n <= 1:
        return n
    return fibonacci_typed(n-1) + fibonacci_typed(n-2)
# Time: 0.3s (91% less time)

Calling a C library with ctypes:

# mylib.c
#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

void process_array(int *arr, int size) {
    for (int i = 0; i < size; i++) {
        arr[i] *= 2;
    }
}

# Compile
gcc -shared -fPIC -o mylib.so mylib.c

# Call it from Python
import ctypes
import numpy as np

# Load the library
lib = ctypes.CDLL('./mylib.so')

# Simple function
lib.add.argtypes = [ctypes.c_int, ctypes.c_int]
lib.add.restype = ctypes.c_int
result = lib.add(3, 4)

# Array processing
lib.process_array.argtypes = [ctypes.POINTER(ctypes.c_int), ctypes.c_int]
lib.process_array.restype = None  # the C function returns void
arr = np.array([1, 2, 3, 4, 5], dtype=np.int32)
lib.process_array(arr.ctypes.data_as(ctypes.POINTER(ctypes.c_int)), len(arr))
# arr has now been doubled in place

# Performance comparison
# Python: 234ms
# C extension: 5ms (98% less time)

Reducing memory usage:

# Before: regular class
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

points = [Point(i, i) for i in range(1000000)]
# Memory: 234MB

# After: using __slots__
class PointSlots:
    __slots__ = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y

points = [PointSlots(i, i) for i in range(1000000)]
# Memory: 156MB (33% saved)

# Using namedtuple (instances are immutable)
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
points = [Point(i, i) for i in range(1000000)]
# Memory: 123MB (47% saved)
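
The memory figures above depend on the Python version and platform, so it is worth measuring on your own system. One possible way, sketched here with the standard-library `tracemalloc`, is to compare how much memory a batch of objects takes with and without `__slots__`. The helper `allocated_bytes` and the object count are assumptions made for this example.

```python
import tracemalloc

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointSlots:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

def allocated_bytes(factory, count=100_000):
    """Return the bytes still allocated after building `count` objects."""
    tracemalloc.start()
    objects = [factory(i, i) for i in range(count)]
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del objects
    return current

regular = allocated_bytes(Point)
slotted = allocated_bytes(PointSlots)
print(f"regular class: {regular / 1e6:.1f} MB")
print(f"__slots__:     {slotted / 1e6:.1f} MB")
```

The saving comes from dropping the per-instance attribute dictionary, at the cost of no longer being able to add arbitrary attributes to instances.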

# Before: read everything into a list
def read_large_file(filename):
    with open(filename) as f:
        return f.readlines()  # loads the whole file into memory at once

lines = read_large_file('huge.txt')  # 10GB file
# Memory: 10GB

# After: generator
def read_large_file_gen(filename):
    with open(filename) as f:
        for line in f:
            yield line.strip()

lines = read_large_file_gen('huge.txt')
# Memory: a few KB

# Process the data
for line in lines:
    process(line)  # handle one line at a time

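III. Development Tools

Code that merely runs isn't enough; it also has to be consistent and maintainable. black formats code automatically, isort sorts imports, flake8 and pylint check code quality, and mypy performs static type checking. pytest runs the tests and coverage reports how much of the code they exercise. Keep the project structure clear, with source, tests, and docs in separate directories, and use pyproject.toml to centralize configuration.

Formatting code:

# black (automatic formatting)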
pip install black
black myproject/
reformatted myproject/main.py
All done! ✨ 🍰 ✨

# isort (import sorting)
pip install isort
isort myproject/

# autopep8 (PEP 8 formatting)
pip install autopep8
autopep8 --in-place --aggressive script.py

Static analysis:

# flake8 (linting)
pip install flake8
flake8 myproject/
myproject/main.py:10:1: E302 expected 2 blank lines, found 1
myproject/main.py:15:80: E501 line too long (85 > 79 characters)

# pylint (stricter)
pip install pylint
pylint myproject/
************* Module myproject.main
myproject/main.py:10:0: C0303: Trailing whitespace (trailing-whitespace)
myproject/main.py:15:0: C0301: Line too long (85/79) (line-too-long)

Your code has been rated at 8.5/10

# mypy (type checking)
pip install mypy
mypy myproject/
myproject/main.py:10: error: Argument 1 to "process" has incompatible type "str"; expected "int"
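
An error like the mypy message above typically comes from passing a string where an annotated function expects an integer. The snippet below is a hypothetical reconstruction (the real `process` function is not shown in this article) illustrating the kind of code mypy flags and one way to satisfy the checker.

```python
def process(count: int) -> int:
    """Hypothetical function: double a count."""
    return count * 2

def parse_and_process(raw: str) -> int:
    # Calling process(raw) directly is what mypy would reject: a str is not an int.
    # At run time it would not even fail; "21" * 2 silently yields "2121".
    # Converting first satisfies the type checker and gives the intended result.
    return process(int(raw))

print(parse_and_process("21"))  # prints 42
```

This is the main value of static type checking here: the unconverted call would run without an exception and produce wrong data, which tests might not catch.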

Test coverage:

# pytest + coverage
pip install pytest pytest-cov

# Run the tests and produce both a terminal and an HTML coverage report
pytest --cov=mypackage --cov-report=term --cov-report=html tests/

# Terminal report
Name                Stmts   Miss  Cover
---------------------------------------
mypackage/__init__.py      5      0   100%
mypackage/core.py        123     12    90%
mypackage/utils.py        45      5    89%
---------------------------------------
TOTAL                    173     17    90%

# HTML report
firefox htmlcov/index.html
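
For readers who have not written pytest tests yet, here is a minimal sketch of what a file such as tests/test_core.py might contain. The function under test, `normalize_name`, is a made-up helper defined inline so the example is self-contained; in a real project it would be imported from the package.

```python
def normalize_name(name: str) -> str:
    """Hypothetical helper: trim whitespace, lowercase, and use hyphens."""
    return name.strip().lower().replace("_", "-")

# pytest collects functions whose names start with test_ and treats a failed
# assert as a test failure; plain assertions need no pytest import.
def test_strips_and_lowercases():
    assert normalize_name("  Flask ") == "flask"

def test_replaces_underscores():
    assert normalize_name("pytest_cov") == "pytest-cov"
```

Lines of `normalize_name` that no test executes would show up in the Miss column of the coverage report.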

Recommended project layout:

myproject/
├── mypackage/           # Source code
│   ├── __init__.py
│   ├── core.py
│   └── utils.py
├── tests/               # Tests
│   ├── __init__.py
│   ├── test_core.py
│   └── test_utils.py
├── docs/                # Documentation
│   ├── conf.py
│   └── index.rst
├── .github/             # CI/CD
│   └── workflows/
│       └── test.yml
├── pyproject.toml       # Project configuration
├── setup.py             # Install script
├── README.md
├── LICENSE
└── .gitignore

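The keys to Python development: use virtual environments for isolation, poetry for dependency management, NumPy and multiprocessing for performance, and black and pytest for code quality. openEuler offers solid Python support, from multi-version management to performance tooling, helping developers build high-quality applications efficiently.

Performance optimization checklist:

  1. Use built-in functions and data structures
  2. Prefer list comprehensions to explicit loops
  3. Use generators to save memory
  4. Vectorize computation with NumPy
  5. Run CPU-bound tasks in parallel with multiprocessing
  6. Optimize hot spots with Cython
  7. Use __slots__ to reduce memory
  8. Cache the results of repeated computations
  9. Avoid global variables in hot code paths
  10. Prefer local variables, which are faster to look up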
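
Item 8 in the list, caching repeated computations, can be sketched with the standard-library `functools.lru_cache`. The example reuses the recursive Fibonacci function from the Cython section; choosing `maxsize=None` (an unbounded cache) is an assumption that suits this small demo but not every workload.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    """Same recursion as the pure-Python version, but each n is computed only once."""
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(35))           # 9227465
print(fibonacci.cache_info())  # hits and misses show how much work was reused
```

With the cache, the exponential recursion collapses to one computation per distinct argument, which for this function is a far larger gain than compiling the uncached version.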

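IV. Summary

Overall, how smooth Python development feels depends on disciplined environment isolation, reliable dependency management, and well-targeted performance work. On openEuler, multi-version management, virtual environments, modern dependency tools, and a full set of tuning techniques come together to make this workflow steadier and faster, so that Python development is both efficient and maintainable.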

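If you are looking for a forward-looking open source operating system, take a look at openEuler, which is rising quickly in the DistroWatch rankings: distrowatch.com/table-mobil…, a Linux distribution incubated by the OpenAtom Foundation that supports "supernode" scenarios. openEuler website: www.openeuler.openatom.cn/zh/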