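AI Engineer Lesson 1 - Python

A tutorial on Python environment setup and basic syntax, covering pyenv version management, venv virtual environments, and pip packages.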

Installing `Python`

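Download the installer from the official Python website and install it. Then add the install path to your system environment variables so that Python is available globally (the steps differ by operating system, so look them up for yours).

On macOS you can also install it with Homebrew:

brew install python

Pay attention to the Python version.

Once it is installed and configured, run python3 --version in a terminal. If it prints a version number, the installation succeeded.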

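Version management with `pyenv`

You can skip this section for now and come back when you hit a compatibility problem. This walkthrough uses macOS with Homebrew; for other systems, look up the equivalent steps.

pyenv manages and switches between Python versions. Why install it? As the Python version figure above shows, some libraries do not support older versions and others do not yet support the newest ones, so to keep using such a library you have to run a Python version that it supports.

Install pyenv:

brew install pyenv

Add the environment variables: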

echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc
echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc
echo 'eval "$(pyenv init - zsh)"' >> ~/.zshrc

# Reload the shell so that the configuration takes effect
exec "$SHELL"

Test it. If a version number is printed, the installation succeeded.

pyenv -v

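Normally, when pyenv installs a Python version it does its best to fetch what the build needs, but the build sometimes fails because certain system dependencies are missing. Install those dependencies beforehand: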

brew install openssl@3 readline sqlite3 xz tcl-tk@8 libb2 zstd zlib pkgconfig

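Once those are installed without problems, install the specific version you want. Judging by the version information above, 3.12 is probably the most widely used.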

pyenv install 3.12

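Virtual environments with `venv`

Python's venv module creates an isolated virtual environment in which you install the libraries a project needs. Nothing is installed globally, which avoids version conflicts between projects.

# Create a virtual environment
python3 -m venv .venv

# Activate the virtual environment
source .venv/bin/activate

# Leave the virtual environment
deactivate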

Package manager `pip`

pip is the default package manager that ships with Python. Use it to install, upgrade, and uninstall packages.

pip install requests

Writing my `Hello World`

Create a Python project, add a main.py file, and put the following in it:

print("Hello World")

Run python3 main.py in the terminal and the output appears in the console.

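Basics review

Data types and variables

A variable name may contain only letters, digits, and underscores. It must not start with a digit or contain special characters.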

name = "hboot"
age = 18
user_name = 'admin'

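Five core data types: Number, String, Boolean, List, and Dictionary. The number 0, the empty string '', the empty list [], and similar empty values all count as False in a boolean context.

ages = []
ages.append(18)

# Named dict to match the later examples; in real code, avoid names that shadow built-ins
dict = {"name": "hboot", "age": 18}
dict["name"] = "hboot"  # update the value stored under an existing key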
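To make the truthiness rule concrete, here is a small sketch. The helper name describe and the sample values are assumptions introduced only for illustration.

```python
def describe(value):
    """Report whether a value counts as true or false in a boolean context."""
    return "truthy" if value else "falsy"

# Empty or zero values are falsy; a list holding a zero is non-empty, so it is truthy
samples = [0, "", [], {}, None, 1, "a", [0]]
results = {repr(v): describe(v) for v in samples}

for text, verdict in results.items():
    print(f"{text:>6} is {verdict}")
```

This is why `if items:` is the usual way to check that a list is not empty.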

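Functions

Declare a function with def. Parameters may have default values, but parameters with defaults must come after those without.

def get_name(name, age=18):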
    return f"hello, {name}! You are {age} years old."

print(get_name(dict["name"]))

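Variadic parameters: *args accepts any number of positional arguments and packs them into a tuple; **kwargs accepts any number of keyword arguments and packs them into a dictionary.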

def get_name(*args, **kwargs):
    print(f"hello, {args[0]}! You are {kwargs['age']} years old.")

get_name("hboot", age=18)

A lambda is an anonymous function, suited to short functions that do not need a name of their own.

get_name = lambda name, age=18: f"hello, {name}! You are {age} years old."
print(get_name(dict["name"]))

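Decorators: applying a decorator with the @ symbol passes the function below it to the decorator as an argument and rebinds the name to the new function the decorator returns. This adds behavior without modifying the function's own code.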

def decorator(func):
    def wrapper(*args, **kwargs):
        print("Before function execution.")
        result = func(*args, **kwargs)
        print("After function execution.")
        return result

    return wrapper

@decorator
def func():
    print("Function execution.")
    return "Function result."

print(func())
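One detail the example above leaves out: after decoration, the function's __name__ reports wrapper rather than its original name. A common remedy is functools.wraps from the standard library. The sketch below is a variation on the example above; the names log_calls and add are hypothetical.

```python
import functools

def log_calls(func):
    @functools.wraps(func)  # copy func's name and docstring onto wrapper
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

@log_calls
def add(a, b):
    """Return the sum of a and b."""
    return a + b

print(add(2, 3))
print(add.__name__)  # the original name survives decoration
```

Without the functools.wraps line, add.__name__ would be wrapper, which makes logs and debugging output harder to read.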

Classes (`Class`) and objects (`Object`)

A class defines the skeleton; an object is a concrete thing built from it.

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
        print(f"{self.name} is {self.age} years old.")

    def say_hello(self):
        print(f"Hello, my name is {self.name}.")
        print(f"I am {self.age} years old.")


person = Person("Alice", 25)
person.say_hello()

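__init__ is the class's initializer (commonly called its constructor) and is called when an object is created. self refers to the current object and is used to access its attributes and methods.

Encapsulation, inheritance, polymorphism: a class encapsulates one kind of thing, and instantiating it yields a concrete object.

Inheritance is standing on the shoulders of giants: you inherit someone else's attributes and methods, then extend them.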

class AdvancedPerson(Person):
    def __init__(self, name, age):
        super().__init__(name, age)
        self.skills = []

    def add_skill(self, skill):
        self.skills.append(skill)
        print(f"{self.name} has learned {skill}.")

advanced_person = AdvancedPerson("Bob", 30)
advanced_person.add_skill("Python")

You can also override Python's special methods to change an object's default behavior.

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        return f"{self.name} is {self.age} years old."
    def __len__(self):
        return len(self.name)


person = Person("Alice", 25)
print(person)
print(len(person))

Conditionals and loops

Use if for conditional checks, and combine conditions with and, or, and not.

age = 18

if age < 18:
    print("You are a minor.")
elif age >= 18 and age < 65:
    print("You are an adult.")
else:
    print("You are a senior.")

Use for to iterate over a sequence. range() generates a sequence of numbers.

fruits = ['apple', 'banana', 'orange']

for item in fruits:
    print(item)

for i in range(1, 6):
    print(i)

A while loop keeps running until its condition becomes false.

i = 1
while i <= 5:
    print(i)
    i += 1

Inside a loop, break and continue control the flow. break exits the loop; continue skips the rest of the current iteration and moves on to the next one.
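A minimal sketch of both statements in one loop. The variable name collected and the thresholds are chosen only for illustration.

```python
collected = []
for i in range(1, 10):
    if i % 2 == 0:
        continue  # skip even numbers and go straight to the next iteration
    if i > 7:
        break     # leave the loop entirely once i exceeds 7
    collected.append(i)

print(collected)  # [1, 3, 5, 7]
```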

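For brevity, a list comprehension can condense several lines of for and if statements into one line, and it usually runs a little faster too.

evens = []
# for and if statements
for i in range(1, 6):
    if i % 2 == 0:
        evens.append(i)

# List comprehension
evens = [x for x in range(1, 6) if x % 2 == 0]
print(evens)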

File operations (`I/O`)

Use the with open(...) statement to read and write files.

with open('log.txt','w',encoding='utf-8') as f:
    f.write('Hello World')

A new file, log.txt, now appears in the same directory, containing Hello World.

To read the file, change the open mode to r.

with open('log.txt','r',encoding='utf-8') as f:
    print(f.read())

Reading everything at once can exhaust memory when the file is large. Read one line at a time instead.

with open('log.txt','r',encoding='utf-8') as f:
    for line in f:
        print(line)
        print(line.strip())  # strip the trailing newline (and any surrounding whitespace)

Importing modules

Use import to bring in a module.

import math

print(math.sqrt(16))

Some library names are long. For convenience, import the module under a shorter alias.

import math as m

print(m.sqrt(16))

When you need only one thing from a module, import it by name.

from math import sqrt

print(sqrt(16))

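Error handling

Use try...except...finally to catch errors.

try:
    # print(1/0)
    # print(math.sqrt(-1))
    print(int('a'))

except ZeroDivisionError:
    print("Cannot divide by zero.")
except ValueError:
    print("Invalid value.")
except Exception:
    # Fallback for any other ordinary error; a bare except would also swallow KeyboardInterrupt
    print("An error occurred.")
finally:
    print("Finally block executed.")

Catch specific error types so that you can give an appropriate error message.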
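As a sketch of how catching a specific type helps tailor the message, the hypothetical helper parse_age below binds the exception with as and includes its text in the output.

```python
def parse_age(text):
    """Convert text to an int, or return None with a message when it is not a number."""
    try:
        return int(text)
    except ValueError as exc:
        # exc carries the interpreter's description of what went wrong
        print(f"Invalid age {text!r}: {exc}")
        return None

print(parse_age("18"))
print(parse_age("abc"))
```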

Core knowledge, next level

Type annotations

You can annotate variables, parameters, and return values with their types.

name: str = "hboot"

def get_name(name: str) -> str:
    return f"hello, {name}!"

print(get_name(name))

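Python does not enforce type annotations at runtime. They exist for tooling (type checkers and editor hints) and to make code easier to maintain.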
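The following sketch shows what "not enforced" means in practice. The function double is hypothetical; a static type checker such as mypy would flag the second call, but the interpreter runs it without complaint.

```python
def double(n: int) -> int:
    return n * 2

print(double(4))
print(double("ab"))  # runs without error even though a str is not an int
print(double.__annotations__)  # the hints are stored as plain metadata
```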

Generators

A generator evaluates lazily, producing values one at a time, which saves memory.

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

for i in fibonacci():
    if i > 100:
        break
    print(i)
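Because the generator is infinite, it cannot be turned into a list directly. One option, shown here as a sketch, is itertools.islice from the standard library, which takes only the first few values. The function is repeated so that the snippet runs on its own; the name first_ten is an assumption.

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice pulls just ten values; the rest of the sequence is never computed
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```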

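Context managers

Context managers make sure resources are closed and released. The @contextmanager decorator turns a generator function into a context manager so that it can be used in a with statement.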

from contextlib import contextmanager

@contextmanager
def open_file(filename,mode):
    f = open(filename,mode)
    try:
        yield f
    finally:
        f.close()

with open_file('log.txt','r') as f:
    print(f.read())

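Magic methods (`Pythonic`)

The class examples earlier already showed __init__ and __str__, and there are many more methods like them. Defining them lets objects of your class be used in a more elegant way.

Take the context manager just covered: a class that implements __enter__ and __exit__ can be used in a with statement.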

class OpenFile:
    def __init__(self,filename,mode='r'):
        self.filename = filename
        self.mode = mode   

    def __enter__(self):
        self.f = open(self.filename,self.mode)
        return self.f

    def __exit__(self,exc_type,exc_val,exc_tb):
        if self.f:
            self.f.close()

with OpenFile('log.txt') as f:
    print(f.read())

Asynchronous programming

As in front-end JavaScript, asynchronous code is written with the async and await keywords.

import asyncio

async def hello(name: str):
    print(f'Hello {name}!')
    await asyncio.sleep(1)
    print(f'Bye {name}!')
    
asyncio.run(hello('hboot'))
print("hello world")

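asyncio.sleep() stands in for an asynchronous call here: awaiting it hands control back to the event loop. asyncio.run() runs the coroutine and blocks the current thread until it completes, which is why hello world is printed last.

To run several tasks concurrently, use asyncio.gather(). The event loop switches between them whenever one of them is waiting.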

async def batch_hello():
    
    await asyncio.gather(
        hello('Alice'),
        hello('Bob'),
        hello('Charlie')
    )
    print('All done!')
    
asyncio.run(batch_hello())
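A rough way to confirm that the tasks really overlap is to time the batch. In the sketch below, work, main, and the 0.2 second delays are assumptions made for illustration: three tasks that each sleep 0.2 seconds finish in roughly 0.2 seconds in total rather than 0.6.

```python
import asyncio
import time

async def work(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stands in for a network call
    return name

async def main():
    start = time.perf_counter()
    # gather returns results in the order the awaitables were passed in
    results = await asyncio.gather(work("a", 0.2), work("b", 0.2), work("c", 0.2))
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(results)
print(f"{elapsed:.2f}s")
```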

Use a Semaphore to limit concurrency.

import asyncio

async def hello(name: str,sem: asyncio.Semaphore):
    
    async with sem:
        print(f'Hello {name}!')
        await asyncio.sleep(1)
        print(f'Bye {name}!')

async def batch_hello():
    # Allow at most 3 tasks to run at once
    sem = asyncio.Semaphore(3)
    
    tasks = [hello(f'name{i}',sem) for i in range(10)]
    await asyncio.gather(*tasks)
    print('All done!')
    
asyncio.run(batch_hello())

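asyncio.gather() returns only when the whole batch has finished, whereas asyncio.as_completed() lets you handle each result as soon as its task finishes, without waiting for the others.

Here each task takes a different, random amount of time. Running the code shows clearly that as soon as one task finishes, the semaphore lets a waiting task start.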

import asyncio
import random

async def hello(name: str,sem: asyncio.Semaphore):
    
    async with sem:
        print(f'Hello {name}!')
        await asyncio.sleep(random.randint(1,5))
        print(f'Bye {name}!')

async def batch_hello():
    # Allow at most 3 tasks to run at once
    sem = asyncio.Semaphore(3)
    
    tasks = [hello(f'name{i}',sem) for i in range(10)]
    # await asyncio.gather(*tasks)
    
    for task in asyncio.as_completed(tasks):
        await task
        
    print('All done!')
    
asyncio.run(batch_hello())

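For finer-grained control over tasks, use asyncio.wait. It accepts a timeout and returns the finished and the pending tasks separately, so you can, for example, cancel the ones that are still pending.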
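A minimal sketch of that pattern, assuming two hypothetical tasks named fast and slow and a 0.5 second deadline. Note that recent Python versions require asyncio.wait to be given Task objects rather than bare coroutines.

```python
import asyncio

async def work(name: str, delay: float) -> str:
    await asyncio.sleep(delay)
    return name

async def main():
    # asyncio.wait expects Task objects, so wrap each coroutine first
    tasks = [
        asyncio.create_task(work("fast", 0.05)),
        asyncio.create_task(work("slow", 5)),
    ]
    done, pending = await asyncio.wait(tasks, timeout=0.5)
    for task in pending:
        task.cancel()  # give up on anything that missed the deadline
    # Let the cancelled tasks finish unwinding before the loop closes
    await asyncio.gather(*pending, return_exceptions=True)
    return sorted(t.result() for t in done), len(pending)

finished, cancelled_count = asyncio.run(main())
print(finished, cancelled_count)
```

Unlike asyncio.wait_for, asyncio.wait does not cancel anything itself when the timeout expires; deciding what to do with the pending set is left to the caller.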

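Common standard library modules

The Python standard library is extensive. This section introduces some commonly used modules.

`json`

The json module encodes and decodes JSON.

import json

# Serialize to a str
print(json.dumps({"name": "hboot", "age": 18}))
# Deserialize to a dict
print(json.loads('{"name": "hboot", "age": 18}'))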

`os`

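The os module provides operating system functionality.

import os

# Create a directory
os.mkdir("test")
# Remove an empty directory
os.rmdir("test")

# Read an environment variable
print(os.environ.get("PATH"))

# Run a shell command
os.system("ls")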

`sys`

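The sys module provides functionality related to the Python interpreter.

import sys

# Command-line arguments
print(sys.argv)

# Interpreter version information
print(sys.version)

# Interpreter implementation details
print(sys.implementation)

# Platform identifier
print(sys.platform)

# Path to the interpreter executable
print(sys.executable)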

`pathlib`

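The pathlib module provides path objects for working with file and directory paths.

from pathlib import Path

# Current working directory
print(Path.cwd())

# Parent of the current working directory
print(Path.cwd().parent)

# Check whether a path exists
print(Path('.').exists())

# Join paths
print(Path('.') / 'test.py')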

`shutil`

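The shutil module provides high-level functions for copying, moving, and deleting files and directories.

import os
import shutil
from pathlib import Path

# Create a directory
os.mkdir('test_dir')

# Create a file inside it
Path('test_dir/test.py').touch()

# Copy the whole directory tree
shutil.copytree('test_dir', 'test_dir_copy')

# Move (rename) the directory
shutil.move('test_dir', 'test_dir_move')

# Delete the directory tree
shutil.rmtree('test_dir_move')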

`logging`

The logging module records log messages.

import logging

logging.basicConfig(level=logging.DEBUG)
logging.debug('This is a debug message')
logging.info('This is an info message')
logging.warning('This is a warning message')
logging.error('This is an error message')
logging.critical('This is a critical message')

`argparse`

The argparse module parses command-line arguments.

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--name', help='your name')
args = parser.parse_args()
print(f'Hello, {args.name}')

When running the script, pass the --name argument, for example: python3 main.py --name hboot

`re`

The re module provides regular expression matching.

import re

# Match at the start of a string
print(re.match('hello', 'hello world'))

# Extract email addresses
print(re.findall(r'[\w]+@[\w]+\.[\w]+', 'hello world, my email is <bobolity@163.com>'))

# Extract mobile phone numbers (11 digits starting with 1)
print(re.findall(r'1[3-9]\d{9}', 'hello world, my phone number is 13812345678'))

# Replace each run of digits
print(re.sub(r'[\d]+', '*', 'hello world, my phone number is 13812345678'))

`datetime`

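The datetime module handles dates and times.

from datetime import datetime

# Current time
print(datetime.now())

# Build an object from a specific date
print(datetime(2020, 1, 1))

# Build an object from a Unix timestamp (interpreted in local time)
print(datetime.fromtimestamp(1577836800))

# Parse a string into a datetime object
print(datetime.strptime('2020-01-01', '%Y-%m-%d'))

# Format a datetime as a string
print(datetime.now().strftime('%Y-%m-%d'))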

`collections`

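collections provides enhanced data structures. Counter counts how many times each element occurs; defaultdict is a dictionary that creates the value for a missing key by calling the factory function you give it (list yields an empty list, for example).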

from collections import Counter, defaultdict

print(Counter([1, 1, 2, 3, 3, 3, 4, 4, 4, 4]))
print(Counter('hello world'))

print(defaultdict(lambda: 'N/A'))
print(defaultdict(list))
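The examples above only print empty containers, so here is a sketch of the two classes doing actual work. The word list and the names by_letter and letters are made up for illustration.

```python
from collections import Counter, defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# defaultdict(list) calls list() to create an empty list for each new key,
# so there is no need to check whether the key exists before appending
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
print(dict(by_letter))

# Counter.most_common(n) returns the n most frequent elements with their counts
letters = Counter("hello world")
print(letters.most_common(2))  # [('l', 3), ('o', 2)]
```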

`random`

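The random module generates random numbers.

import random

# Random float in [0.0, 1.0)
print(random.random())

# Random integer in a range, both ends included
print(random.randint(0, 10))

# Pick one element at random
print(random.choice([1, 2, 3, 4, 5]))

# Pick a given number of distinct elements at random
print(random.sample([1, 2, 3, 4, 5], 3))

# Shuffle a list in place
numbers = [1, 2, 3, 4, 5]
random.shuffle(numbers)
print(numbers)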

`math`

The math module provides mathematical functions.

import math

# Pi
print(math.pi)

# Sine
print(math.sin(math.pi / 2))

# Natural logarithm
print(math.log(math.e))

# Logarithm with a given base
print(math.log(100, 10))

# e raised to a power
print(math.exp(1))

# Absolute value (as a float)
print(math.fabs(-100))

# Square root
print(math.sqrt(16))

`glob`

The glob module finds files whose names match a pattern.

import glob

# List all .py files in the current directory
print(glob.glob('*.py'))

`zipfile`

The zipfile module works with ZIP archives.

import zipfile
from pathlib import Path

# Create two empty files to archive
Path('test.py').touch()
Path('test.txt').touch()

# Create a ZIP file
with zipfile.ZipFile('test.zip', 'w') as zip_file:
    zip_file.write('test.py')
    zip_file.write('test.txt')

print(zipfile.is_zipfile('test.zip'))

# Read the ZIP file
with zipfile.ZipFile('test.zip', 'r') as zip_file:
    print(zip_file.namelist())

`itertools`

itertools provides efficient iteration tools for complex loops, permutations, and combinations.

from itertools import product, permutations

# Cartesian product
for i in product([1, 2], [3, 4]):
    print(i)

# Permutations of length 2
for i in permutations([1, 2, 3], 2):
    print(i)
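The example above shows permutations only. For comparison, this sketch puts itertools.combinations next to it; the variable names are chosen for illustration.

```python
from itertools import combinations, permutations

items = [1, 2, 3]

perms = list(permutations(items, 2))   # order matters: (1, 2) and (2, 1) both appear
combos = list(combinations(items, 2))  # order is ignored: only (1, 2) appears

print(len(perms), perms)
print(len(combos), combos)
```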

`Jupyter lab`

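JupyterLab is a development environment built on Jupyter Notebook. It offers features such as syntax highlighting, autocompletion, chart plotting, and data visualization.

# Install
pip install jupyterlab

# Launch
jupyter lab

After launch, an editing interface opens in the browser. Choose a kernel and a .ipynb file is created, in which you can type code and run it.

Shift + Enter runs the current cell.

To install a dependency, run !pip install xxx directly in a cell.

To load a Python file from the current project into a cell, use %load xxx.py. To execute a file, use %run xxx.py.

It is a great tool for learning Python, and often more convenient than writing and running scripts locally.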

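Building a CLI application

Write a CLI tool that reads a JSON file, calls the OpenAI API to translate the text, and writes the result back to the file.

The requirement is clear. First, get the arguments the CLI was run with.

import sys

# The first argument is the path to the JSON file
json_file = sys.argv[1]

With the JSON file path in hand, read the file.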

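import json

# Read the file
with open(json_file, "r", encoding="utf-8") as f:
    data = json.load(f)

Next, call the large language model. I use DeepSeek here, following the Python example in its official documentation.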

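import os
from openai import OpenAI

api_key = os.getenv("API_KEY")
base_url = os.getenv("BASE_URL")
client = OpenAI(api_key=api_key, base_url=base_url)

# Message content must be a string, so serialize parsed JSON that is not already one
text = data if isinstance(data, str) else json.dumps(data, ensure_ascii=False)

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant. Translate the text into English.",
        },
        {"role": "user", "content": text},
    ],
    stream=False,
    reasoning_effort="high",
    extra_body={"thinking": {"type": "enabled"}},
)

# Keep the result in a variable so that it can be written back to the file
translated_text = response.choices[0].message.content
print(translated_text)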

With the translation in hand, write it to the file.

import json

with open(json_file, "w", encoding="utf-8") as f:
    json.dump(
        {"original": data, "translated": translated_text},
        f,
        ensure_ascii=False,
        indent=4,
    )

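Use dotenv (installed with pip install python-dotenv) to parse the .env file that holds API_KEY and BASE_URL. Call load_dotenv() before os.getenv() reads them.

from dotenv import load_dotenv

# Load environment variables from .env
load_dotenv()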

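To test, run python3 main.py test.json and wait for the model to respond. When the run finishes, test.json contains the translation result.

Possible improvements:

Add error handling with try...except, for example for an invalid API key or network errors.
Validate arguments up front, for example missing or malformed arguments.
Write through a temporary file so that the file is written safely.
Guard the entry point with if __name__ == "__main__".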
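The sketch below shows one way these points might fit together. It is not the code used above: parse_args, write_json_atomic, and main are hypothetical names, and translate is a stand-in callable for the model request so that the snippet runs without an API key.

```python
import argparse
import json
import os
import tempfile
from pathlib import Path

def parse_args(argv):
    """Validate the command line up front; argparse exits with a usage message on error."""
    parser = argparse.ArgumentParser(description="Translate the text in a JSON file")
    parser.add_argument("json_file", type=Path, help="path to the JSON file")
    args = parser.parse_args(argv)
    if not args.json_file.is_file():
        parser.error(f"file not found: {args.json_file}")
    return args

def write_json_atomic(path, payload):
    """Write to a temporary file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(payload, f, ensure_ascii=False, indent=4)
        os.replace(tmp_name, path)  # the original stays intact until this succeeds
    except BaseException:
        os.unlink(tmp_name)  # clean up the partial temporary file
        raise

def main(argv, translate):
    """translate is a hypothetical callable standing in for the model request."""
    args = parse_args(argv)
    try:
        data = json.loads(args.json_file.read_text(encoding="utf-8"))
        translated = translate(data)
    except Exception as exc:  # bad JSON, invalid API key, network failure, ...
        print(f"Translation failed: {exc}")
        return 1
    write_json_atomic(args.json_file, {"original": data, "translated": translated})
    return 0
```

In a script, this would be called as raise SystemExit(main(sys.argv[1:], translate)) under an if __name__ == "__main__": guard, with translate wrapping the client.chat.completions.create call shown earlier.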
