
🌴 2022.05.07, afternoon

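Preface

🎬 This article belongs to the 【Python语言基础】 (Python Language Basics) column and consists mainly of in-class notes and exercises
🔗 Link to the Python column
💻 Note: the environment used here is a Python 3 development environment. Section numbers follow the order of the lesson (textbook), so they do not start at 1, 2, 3
📽 Main content of this section: modules make Python very powerful; this section introduces several commonly used modules and libraries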

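7 Modules, Packages, and Libraries

7.1 Introduction

Module: a Python file ending in .py that contains Python objects and statements

Package: a directory that holds Python module files. A regular package contains a file named __init__.py (two underscores on each side); since Python 3.3 a directory without this file can still be imported as a namespace package

Library: a collection of packages and modules with related functionality, such as the Python standard library or NumPy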

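7.2 Commonly Used Standard Library Modules

7.2.1 Turtle

Turtle is a graphics module bundled with Python for drawing lines, circles, and other shapes (including text)

1. Canvas. The canvas is the area that the Turtle module opens for drawing

turtle.screensize() sets the canvas; its general form is turtle.screensize(canvwidth, canvheight, bg). For example, turtle.screensize(600, 400, "black") sets a canvas 600 wide and 400 high with a black background
turtle.setup() sets the size and position of the main window; its general form is turtle.setup(width, height, startx, starty). For example, turtle.setup(width=800, height=600, startx=100, starty=100) opens a window 800 wide and 600 high whose top-left corner is 100 pixels from the left edge and 100 pixels from the top edge of the screen

2. Pen. (1) Pen state. The Turtle module describes the state of the pen by its position and heading. (2) Pen attributes. These include the color, width, and movement speed of the pen

turtle.pensize(width): sets the pen width; the larger the number, the thicker the line
turtle.pencolor(color): sets the pen color, given as a string such as "red" or "yellow", or as an RGB value
turtle.speed(speed): sets the pen speed as an integer in [0, 10]; from 1 to 10 the pen moves faster as the number grows, and 0 turns animation off, which is the fastest setting

(3) Drawing commands. The Turtle module is driven by many commands, each carried out by a function. They are usually grouped into three kinds: pen movement commands, pen control commands, and global control commands

Draw a circle and a filled square with the Turtle module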

import turtle                # import the module
turtle.penup()
turtle.goto(-150,0)
turtle.pendown()
turtle.pencolor('blue')      # pen color: blue
turtle.begin_fill()
turtle.fillcolor('blue')     # fill color: blue
for i in range(4):
  turtle.forward(100)        # move 100 units in the current direction
  turtle.left(90)            # turn 90° counterclockwise
turtle.end_fill()
# draw the circle
turtle.penup()
turtle.goto(100,0)           # move the pen to the absolute position (100,0)
turtle.pendown()
turtle.color('red')          # pen color and fill color: red
turtle.pensize(3)            # pen width: 3
turtle.circle(50)            # circle of radius 50
turtle.done()                # keep the drawing window open

Write text on the canvas with the Turtle module

import turtle
t = turtle.Turtle()              # create a turtle object
t.penup()
t.goto(-80,20)
t.write("望庐山瀑布",font=("微软雅黑",14,"normal"))      # font name, size, and style
t.sety(-10)                      # move the pen down to y = -10
t.write("日照香炉生紫烟",font=("微软雅黑",14,"normal"))
t.sety(-40)                      # move the pen down to y = -40
t.write("遥看瀑布挂前川",font=("微软雅黑",14,"normal"))
t.sety(-70)                      # move the pen down
t.write("飞流直下三千尺",font=("微软雅黑",14,"normal"))
t.sety(-100)                     # move the pen down
t.write("疑是银河落九天",font=("微软雅黑",14,"normal"))
t.hideturtle()
turtle.done()

7.2.2 Random

🚀 random.random()

random.random() returns a random floating-point number in [0, 1). Its general form is random.random()

Generate 5 random floating-point numbers in [0, 1) with random.random()

import random                                              
for x in range(1,6):                                                
  print(random.random())
'''
0.43511825592243447
0.7599585286688377
0.511071683639099
0.9829050908694336
0.07342341214429426
'''

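🚀 random.uniform()

random.uniform() returns a random floating-point number in a given range. Its general form is random.uniform(a, b); a may be larger than b, in which case the result lies between b and a

Generate random floating-point numbers in given ranges with random.uniform()

import random
print(random.uniform(3,6))      # a float between 3 and 6
print(random.uniform(8,6))      # a float between 6 and 8
print(random.uniform(-1,1))     # a float between -1 and 1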

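🚀 random.randrange()

random.randrange() returns a random integer from a range with a given step. Its general form is random.randrange([start], stop[, step])

Generate 10 random odd numbers between 1 and 100 with random.randrange() and append them to a list

import random
list1 = []
for x in range(1,11):
  list1.append(random.randrange(1,100,2))   # odd integer from 1, 3, ..., 99
print(list1)      # a list of 10 odd integers; the values differ on each run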

🚀 random.choice()

random.choice() returns one random element from a sequence. Its general form is random.choice(sequence)

Pick a random element from a list with random.choice()

import random                       
list1 = [1,2,3,4,5,6,7,8]         
for x in range(1,4):                  
  r = random.choice(list1) 		
  print("r =",r) 
'''
r = 4
r = 8
r = 3
'''

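🚀 random.shuffle()

random.shuffle() shuffles the elements of a mutable sequence in place and returns None. Its general form is random.shuffle(x). The optional second argument random was deprecated in Python 3.9 and removed in Python 3.11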

import random                    
list1 = [1,2,3,4,5,6,7,8]     
for x in range(1,4):               
  random.shuffle(list1)           	
  print(list1) 
'''
[7, 5, 1, 3, 8, 2, 6, 4]
[6, 3, 5, 7, 1, 4, 2, 8]
[2, 5, 4, 7, 1, 3, 6, 8]
'''

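🚀 random.sample()

random.sample() returns a list of k distinct elements chosen at random from a sequence. Its general form is random.sample(sequence, k)

Pick several random elements from a list with random.sample() to form a new list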

import random                     
list1 = [1,2,3,4,5,6,7,8]      
slice1 = random.sample(list1,4)	
print("slice1:",slice1) 		
'''
slice1: [4, 3, 5, 1]
'''
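
The outputs in this section change on every run. When repeatable results are needed, for example in tests, a seeded generator can be used. The following is a minimal sketch; the helper name draw and the seed 2022 are illustrative choices, not part of the original notes.

```python
import random

def draw(seed):
    """Return a few draws from a private generator seeded with `seed`."""
    rng = random.Random(seed)             # independent generator; the global state is untouched
    odd = rng.randrange(1, 100, 2)        # random odd integer in [1, 99]
    picks = rng.sample(range(1, 9), 4)    # 4 distinct elements from 1..8
    value = rng.uniform(8, 6)             # float between 6 and 8
    return odd, picks, value

first = draw(2022)
second = draw(2022)
print(first == second)   # True: the same seed reproduces the same sequence
```

Using random.Random(seed) instead of random.seed(seed) keeps the reproducible stream separate from any other code that calls the module-level functions.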

7.2.3 Time & Datetime

🚀 The time module

The time module is mainly used for accessing and converting time values and provides a variety of time-related functions

import time
print("Timestamp:",time.time())                          # Timestamp: 1537320859.5078118
print("struct_time:",time.localtime(time.time()))        # struct_time: time.struct_time(tm_year=2018, tm_mon=9, tm_mday=19,
                                                         #   tm_hour=9, tm_min=34, tm_sec=19, tm_wday=2, tm_yday=262, tm_isdst=0)
print("String:",time.ctime())                            # String: Wed Sep 19 09:34:19 2018
print("String:",time.asctime())                          # String: Wed Sep 19 09:34:19 2018
print(time.strftime('%Y-%m-%d %H:%M',time.localtime()))  # 2018-09-19 09:34
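
The conversions between the three representations can be checked with a fixed timestamp. The sketch below uses the integer part of the timestamp shown in the example and converts it in UTC, so the result does not depend on the local time zone (the sample output in the notes was produced at UTC+8).

```python
import time

ts = 1537320859                                  # integer part of the sample timestamp
utc = time.gmtime(ts)                            # struct_time in UTC
text = time.strftime('%Y-%m-%d %H:%M:%S', utc)   # struct_time -> string
print(text)                                      # 2018-09-19 01:34:19
parsed = time.strptime(text, '%Y-%m-%d %H:%M:%S')  # string -> struct_time
print(parsed.tm_wday, parsed.tm_yday)            # 2 262 (Wednesday, day 262 of the year)
```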

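🚀 The datetime module

The date class represents a date. The general form for creating a date object is d = datetime.date(year, month, day)

from datetime import date
d = date.today()
print("Current local date:",d)
print("Date: year %d, month %d, day %d."%(d.year,d.month,d.day))
print("Today is weekday %d."%d.isoweekday())     # 1 = Monday ... 7 = Sunday
'''
Current local date: 2022-05-07
Date: year 2022, month 5, day 7.
Today is weekday 6.
'''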

The time class represents a time of day. The general form for creating a time object is t = time(hour[, minute[, second[, microsecond[, tzinfo]]]])

from datetime import time
print("Largest time:",time.max)
print("Smallest time:",time.min)
t = time(20,30,50,8888)       # create a time object
print("Time: %d h %d min %d s %d microseconds."%(t.hour,t.minute,t.second,t.microsecond))
'''
Largest time: 23:59:59.999999
Smallest time: 00:00:00
Time: 20 h 30 min 50 s 8888 microseconds.
'''

The datetime class combines a date and a time. The general form for creating a datetime object is dt = datetime(year, month, day, hour, minute, second, microsecond, tzinfo)

from datetime import datetime
dt = datetime.now()
print("Current date:",dt.date())
print("Current time:",dt.time())
print("Year: %d, month: %d, day: %d."%(dt.year,dt.month,dt.day))
print("Time:",datetime(2018,9,16,12,20,36))
'''
Current date: 2022-05-07
Current time: 14:38:59.054000
Year: 2022, month: 5, day: 7.
Time: 2018-09-16 12:20:36
'''

A timedelta object represents the difference between two points in time. td = datetime.timedelta(days, seconds, microseconds, milliseconds, minutes, hours, weeks)

from datetime import datetime,timedelta
print("Seconds in 1 week:",timedelta(days=7).total_seconds())
d = datetime.now()
print("Current local time:",d)
print("1 day later:",d + timedelta(days=1))
print("1 day earlier:",d + timedelta(days=-1))
'''
Seconds in 1 week: 604800.0
Current local time: 2022-05-07 14:48:23.400000
1 day later: 2022-05-08 14:48:23.400000
1 day earlier: 2022-05-06 14:48:23.400000
'''
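
Because datetime.now() changes on every run, date arithmetic is easier to verify with fixed values. A small sketch, using the date of these notes (2022-05-07) as an assumed starting point:

```python
from datetime import date, datetime, timedelta

start = date(2022, 5, 7)
print(start + timedelta(days=1))       # 2022-05-08
print(start - timedelta(days=1))       # 2022-05-06

gap = date(2022, 12, 31) - start       # subtracting two dates yields a timedelta
print(gap.days)                        # 238

stamp = datetime.strptime('2022-05-07 14:48', '%Y-%m-%d %H:%M')   # string -> datetime
print(stamp.isoweekday())              # 6 (Saturday)
```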

7.2.4 Os

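🚀 Getting platform information

Some attributes and functions of the os module return information about the platform

os.getcwd(): returns the current working directory
os.sep: the path separator used by the operating system
os.linesep: the line terminator used on the current platform
os.pathsep: the character that separates entries in a search path such as PATH
os.name: the name of the current platform
os.environ: the environment variables of the current system

Get system information with the os module

import os
print("Separator:",os.sep)
print("Platform:",os.name)
print("Environment variable PATH:",os.getenv('PATH'))    # variable names are case-sensitive outside Windows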

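🚀 Directory and file operations

os.mkdir(newdir): creates the new directory newdir
os.rmdir(dir): removes the (empty) directory dir
os.listdir(path): lists all files and subdirectories in the directory path
os.chdir(path): changes the working directory of the current script to path
os.remove(file): deletes the file file
os.rename(oldname, newname): renames a file

Work with directories and files using the os module

import os
print("Current working directory:",os.getcwd())
print("Directories and files here:",os.listdir())
os.rename("test1.py","test2.py")
print("After renaming the file:",os.listdir())
os.mkdir("newDir")
print("After creating the new directory:",os.listdir())
os.chdir("newDir")
print("Working directory after chdir:",os.getcwd())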

Get file properties with the os.path module

import os
print("(path, file):",os.path.split(r"d:\Python\test\test1.py"))
print("Directory exists?:",os.path.exists(r"d:\Python\test"))
print("Is a file?:",os.path.isfile(r"d:\Python\test\test1.py"))
print("File size:",os.path.getsize(r"d:\Python\test\test1.py"))
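
The examples above assume that test1.py and d:\Python\test exist. The following sketch runs the same operations inside a temporary directory, so it works on any platform and leaves nothing behind; the file names are the ones used above, while the temporary directory is an addition for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:        # scratch directory, removed on exit
    old_path = os.path.join(root, "test1.py")      # join() inserts the right separator
    new_path = os.path.join(root, "test2.py")
    with open(old_path, "w", encoding="utf-8") as f:
        f.write("print('hello')\n")
    os.rename(old_path, new_path)
    os.mkdir(os.path.join(root, "newDir"))
    entries = sorted(os.listdir(root))             # listdir() order is arbitrary, so sort
    size = os.path.getsize(new_path)
    folder, filename = os.path.split(new_path)
    stem, ext = os.path.splitext(filename)

print(entries)      # ['newDir', 'test2.py']
print(stem, ext)    # test2 .py
```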

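🚀 Calling system commands

os.popen(cmd, mode='r', buffering=-1): opens a pipe to or from the command cmd
os.system(command): runs a shell command

Call system commands with functions from the os module (the commands below are Windows-specific)

import os
os.system("mkdir d:\\newDir")
os.popen(r"c:\windows\notepad.exe")        # open Notepad
print("Program ran successfully!")
'''
Program ran successfully!
'''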
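
For new code, the standard library's subprocess module is the usual alternative to os.system() and os.popen(), because it returns the exit code and the output in one object. A minimal sketch, which starts a child Python process so that it does not depend on any platform-specific command:

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    capture_output=True,   # collect stdout and stderr instead of printing them
    text=True,             # decode the output bytes to str
    check=True,            # raise CalledProcessError on a non-zero exit code
)
print(result.returncode)        # 0
print(result.stdout.strip())    # hello from child
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.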

7.2.5 Sys

Get system information with the sys module

import sys
print("Arguments:",sys.argv)           # Arguments: ['d:/pythonProjects/test1.py']
print("Python version:",sys.version)   # Python version: 3.7.2 (v3.7.2:f59c0932b4, Mar 28 2018, 17:00:18) [MSC v.1900 64 bit (AMD64)]
print("Platform:",sys.platform)        # Platform: win32
print("sys.maxsize:",sys.maxsize)      # sys.maxsize: 9223372036854775807 (largest container size; Python 3 integers have no upper limit)
sys.exit(0)

7.2.6 Timeit

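timeit is a timing module, commonly used to measure how long a piece of code takes to run

Its most commonly used functions are timeit() and repeat()

timeit() returns the time taken to execute the code, in seconds. Its general form is t = timeit(stmt='code', setup='code', timer=<default timer>, number=n)

repeat() takes one more parameter than timeit(), repeat, which says how many times the whole measurement is repeated; it returns a list with the time of each round. Its general form is t = repeat(stmt='code', setup='code', timer=<default timer>, repeat=m, number=n)

Measure the execution time of the code in myFun()

import timeit
def myFun():
  total = 0                          # named total so the built-in sum() is not shadowed
  for i in range(1,100):
    for j in range(1,100):
      total = total + i * j
t1 = timeit.timeit(stmt=myFun,number=1000)             # total time of 1000 calls
print("t1:",t1)
t2 = timeit.repeat(stmt=myFun,number=1000, repeat=6)   # 6 rounds of 1000 calls each
print("t2:",t2)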
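
Two details are worth a sketch: stmt and setup may also be given as strings, and when repeat() is used the smallest value is usually the most representative one, since larger values mostly reflect interference from other processes. The function name my_fun and the numbers of runs below are illustrative.

```python
import timeit

def my_fun():
    total = 0
    for i in range(1, 100):
        for j in range(1, 100):
            total += i * j
    return total

runs = timeit.repeat(stmt=my_fun, number=100, repeat=5)   # 5 measurements of 100 calls each
best = min(runs)                                          # least disturbed measurement
print(f"best of 5: {best / 100 * 1e6:.1f} microseconds per call")

# stmt as a string; the setup code runs once per measurement and is not timed
t = timeit.timeit(stmt="sorted(data)",
                  setup="data = list(range(1000, 0, -1))",
                  number=1000)
print(f"sorted(): {t:.4f} s for 1000 calls")
```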

7.2.7 Zlib

Compress and decompress a byte string with the zlib module

import zlib
data = b'What is your name? What is your name? What is your name?'   # zlib works on bytes, not str
print("Before compression: %s, %d bytes."%(data,len(data)))
data_com = zlib.compress(data)
print("After compression: %s, %d bytes."%(data_com,len(data_com)))
data_dec = zlib.decompress(data_com)
print("After decompression: %s, %d bytes."%(data_dec,len(data_dec)))
'''
Output:
Before compression: b'What is your name? What is your name? What is your name?', 56 bytes.
After compression: b'x\x9c...', about 30 bytes (the exact bytes depend on the zlib version and compression level).
After decompression: b'What is your name? What is your name? What is your name?', 56 bytes.
'''
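
zlib only accepts bytes, so ordinary strings have to be encoded first and decoded after decompression. A round-trip sketch under that assumption, with the compression level set explicitly (9 is the smallest and slowest setting):

```python
import zlib

text = "What is your name? " * 3
raw = text.encode("utf-8")                     # str -> bytes
packed = zlib.compress(raw, level=9)           # level ranges from 0 (none) to 9 (best)
restored = zlib.decompress(packed).decode("utf-8")

print(len(raw), len(packed))                   # the packed form is shorter because the text repeats
print(restored == text)                        # True: compression is lossless
```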

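7.3 Third-Party Libraries

7.3.1 NumPy

NumPy is an open-source third-party library for numerical computing in Python. It supports multidimensional array operations, large matrices, vectorized operations, linear algebra, random number generation, and more

🚀 Arrays

The ndarray in NumPy is a multidimensional array object. It consists of two parts: the actual data and the metadata that describes the data. As with Python lists and tuples, NumPy array indices start at 0

🚁 Creating arrays

Arrays are created with np.array(), whose general form is numpy.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)

object: an array or a nested sequence
dtype: the data type of the array elements
copy: whether the object is copied
order: the memory layout of the array: 'C' for row-major, 'F' for column-major, 'A' for either, and 'K' (the default) to keep the layout of the input
subok: if True, subclasses are passed through; by default the result is forced to be a base-class array
ndmin: the minimum number of dimensions of the resulting array

Creating arrays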

import numpy as np
np.array([1,2,3,4,5,6])                   # 1-D array
#array([1, 2, 3, 4, 5, 6])
np.array([1,2,3,4,5,6]).reshape(2,3)      # 2-D array
#array([[1, 2, 3],[4, 5, 6]])
np.array([[1,2,3],[4,5,6]])               # 2-D array
#array([[1, 2, 3],[4, 5, 6]])
np.array([1,3,5],dtype=complex)           # complex data type
#array([1.+0.j, 3.+0.j, 5.+0.j])
np.array([2,4,6],ndmin=2)                 # minimum number of dimensions
#array([[2, 4, 6]])

Creating special arrays

import numpy as np
np.arange(6)
#array([0, 1, 2, 3, 4, 5])
np.arange(6, dtype=float)
#array([0., 1., 2., 3., 4., 5.])
np.arange(1,10,2)
#array([1, 3, 5, 7, 9])
np.linspace(1,10,10)
#array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])
np.logspace(0,9,10,base=2)
#array([  1.,   2.,   4.,   8.,  16.,  32.,  64., 128., 256., 512.])
np.zeros((2,2))
#array([[0., 0.],[0., 0.]])
np.ones([2,3])
#array([[1., 1., 1.],[1., 1., 1.]])

🚁 Array indexing and slicing

import numpy as np
a = np.arange(10)
a[5]	#5
a[1:6:2]	#array([1, 3, 5])
b = np.array([[1,2,3],[4,5,6],[7,8,9]])
b[2,2]	#9
b[1:]	#array([[4, 5, 6],[7, 8, 9]])

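Inspecting array attributes

import numpy as np
a = np.arange(24).reshape(2,3,4)
a.ndim	# 3
a.shape	# (2, 3, 4)
a.size	# 24
a.dtype	# dtype('int32'); the default integer type depends on the platform and NumPy version and is often dtype('int64')
a.itemsize	# 4 (8 for int64)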

Array manipulation

import numpy as np
a = np.arange(8)
a.reshape(2,4)      # change the shape of the array
# array([[0, 1, 2, 3],
#       [4, 5, 6, 7]])
np.transpose(a.reshape(2,4))       
#array([[0, 4],
#       [1, 5],
#       [2, 6],
#       [3, 7]])
a.reshape(2,4).ravel()       # array([0, 1, 2, 3, 4, 5, 6, 7])
for element in a.flat: 
    print(element, end=" ")		# 0 1 2 3 4 5 6 7

Array arithmetic

import numpy as np
a = np.array([[1,2],[3,4]])
a * 2                               # multiply the array by a scalar
#array([[2, 4],
#       [6, 8]])
b = np.array([[5,6],[7,8]])
a + b                               # add two arrays
#array([[ 6,  8],
#       [10, 12]])
np.dot(a,b)                         # for 2-D arrays, dot() is the matrix product
#array([[19, 22],
#       [43, 50]])
np.matmul(a,b)                      # matrix product of two arrays
#array([[19, 22],
#       [43, 50]])
# computed as: 1*5+2*7=19, 1*6+2*8=22, 3*5+4*7=43, 3*6+4*8=50
np.vdot(a,b)                        # dot product of the flattened arrays
#70
# computed as: 1*5+2*6+3*7+4*8=70
np.inner(a,b)                       # inner product over the last axis of both arrays
#array([[17, 23],
#       [39, 53]])
# computed as: 1*5+2*6=17, 1*7+2*8=23, 3*5+4*6=39, 3*7+4*8=53
np.linalg.det(a)                    # determinant
#-2.0000000000000004
np.linalg.inv(a)                    # inverse matrix
#array([[-2. ,  1. ],
#       [ 1.5, -0.5]])

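🚀 Matrices

In NumPy, matrices are usually created with mat() or matrix(); new matrices can also be obtained from an existing one, for example by transposing or inverting it

Creating matrices

>>> import numpy as np
>>> A = np.mat("3 4;5 6")
>>> A
matrix([[3, 4],
        [5, 6]])
>>> A.T
matrix([[3, 5],
        [4, 6]])
>>> A.I
matrix([[-3. ,  2. ],
        [ 2.5, -1.5]])
>>> np.mat(np.arange(9).reshape(3,3))
matrix([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])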

Matrix arithmetic

import numpy as np
A = np.mat('1, 2; 3, 4')
A * 2                               # multiply the matrix by a scalar
#matrix([[2, 4],
#        [6, 8]])
B = np.mat('5, 6; 7, 8')
A + B                               # add two matrices
#matrix([[ 6,  8],
#        [10, 12]])
A.dot(B)                            # matrix product
#matrix([[19, 22],
#        [43, 50]])
np.matmul(A,B)                      # matrix product
#matrix([[19, 22],
#        [43, 50]])
np.inner(A,B)                       # inner product over the last axis
#matrix([[17, 23],
#        [39, 53]])
np.linalg.inv(A)                    # inverse matrix
#matrix([[-2. ,  1. ],
#        [ 1.5, -0.5]])
np.linalg.det(A)                    # determinant
#-2.0000000000000004
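
The NumPy documentation no longer recommends the matrix class, even for linear algebra, and suggests regular arrays instead. The same calculations can be sketched with ndarray and the @ operator; note that * on arrays is element-wise, whereas * on two matrix objects is the matrix product.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

product = A @ B                  # matrix product, same as np.matmul(A, B)
elementwise = A * B              # element-by-element product on ndarrays
A_inv = np.linalg.inv(A)

print(product.tolist())                      # [[19, 22], [43, 50]]
print(elementwise.tolist())                  # [[5, 12], [21, 32]]
print(np.allclose(A @ A_inv, np.eye(2)))     # True: A times its inverse is the identity
```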

7.3.2 Pandas

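Pandas is a data analysis library built on top of NumPy

Its main features include creating Series and DataFrame objects, index-based selection and filtering, arithmetic, data summaries and descriptive statistics, sorting and ranking, handling missing values, and hierarchical indexing

Pandas download page: pypi.org/project/pan…

🚀 Series

A Series is similar to a one-dimensional NumPy array and can hold strings, Boolean values, numbers, and other data types

The general form for creating a Series is pandas.Series(data, index, dtype, copy)

data: the data, in one of several forms such as an ndarray, a list, or a constant
index: the index labels; they must be hashable and are usually unique
dtype: the data type
copy: whether to copy the data; the default is False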

import pandas as pd                          
import numpy as np                         
data = np.array(['需求分析','概要设计','详细设计','编制代码','运行维护'])
s = pd.Series(data)      
print(s)
'''
0    需求分析
1    概要设计
2    详细设计
3    编制代码
4    运行维护
dtype: object
'''

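Create a Series from a dictionary

import pandas as pd
data = {'A':"优秀",'B':"良好",'C':"合格",'D':"不合格"}
s = pd.Series(data)
print(s)
print("s.iloc[0]:",s.iloc[0])       # positional access; s[0] on a label index is deprecated in recent pandas
'''
A     优秀
B     良好
C     合格
D    不合格
dtype: object
s.iloc[0]: 优秀
'''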

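🚀 DataFrame

A DataFrame is a two-dimensional tabular data structure: the data is arranged in rows and columns. DataFrames are used more widely than Series

The general form for creating a DataFrame is pandas.DataFrame(data, index, columns, dtype, copy)

data: the data, which can take many forms, such as an ndarray, Series, list, dict, constant, or another DataFrame
index, columns: the row labels and the column labels
dtype: the data type of the columns
copy: whether to copy the data; the default is False

Create a DataFrame from a list

import pandas as pd
data = [['Tom',3],['Jerry',1]]
df = pd.DataFrame(data,columns = ['Name','Age'])
print(df)
'''
    Name  Age
0    Tom    3
1  Jerry    1
'''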

Creating and accessing a DataFrame

import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(9).reshape((3,3)),index=['A','B','C'],
columns=['one','two','three'])
>>> df                    # the DataFrame
'''
   one  two  three
A    0    1      2
B    3    4      5
C    6    7      8
'''
>>> df[1:2]               # select rows by position
'''
   one  two  three
B    3    4      5
'''
>>> df[['three','one']]    # select columns
'''
   three  one
A      2    0
B      5    3
C      8    6
'''
>>> df[df['three'] > 5]    # filter rows with a Boolean condition
'''
   one  two  three
C    6    7      8
'''
>>> df.loc['A','two']      # select a single value by label with .loc[]
#1
>>> df.iloc[1,1]           # select a single value by position with .iloc[]
#4

7.3.3 SciPy

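SciPy is a convenient, easy-to-use library designed for science and engineering. It covers statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, solving ordinary differential equations, and more

🚀 Using SciPy

SciPy contains many modules whose functionality is largely independent:

scipy.constants (physical and mathematical constants)
scipy.fftpack (fast Fourier transform; a legacy module, with scipy.fft as its successor)
scipy.integrate (integration)
scipy.optimize (optimization algorithms)
scipy.stats (statistical functions)
scipy.special (special mathematical functions)
scipy.signal (signal processing)
scipy.ndimage (N-dimensional image processing)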
🚁 The constants module

Look up some common constants in the constants module

>>> from scipy import constants as con
>>> con.hour                        # seconds in 1 hour
3600.0
>>> con.c                           # speed of light in vacuum, in m/s
299792458.0
>>> con.inch                        # meters in 1 inch
0.0254
>>> con.degree                      # radians in 1 degree
0.017453292519943295
>>> con.golden                      # golden ratio
1.618033988749895

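🚁 The special module

Use the special module for special mathematical functions

>>> from scipy import special as sp
>>> sp.cbrt(27)                     # cube root
3.0
>>> sp.sindg(45)                    # sine of an angle given in degrees
0.7071067811865476
>>> sp.comb(6,3)                    # number of combinations of 3 out of 6
20.0
>>> sp.perm(5,3)                    # number of permutations of 3 out of 5
60.0
>>> sp.round(5.67)                  # round to the nearest integer
6.0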

🚁 The scipy.linalg module

Compute the determinant and the inverse of a square matrix

import numpy as np
from scipy import linalg
mat = np.array([[5,6],[7,8]])
print("Matrix:",mat)
print("Determinant:%6.2f."%linalg.det(mat))
print("Inverse:",linalg.inv(mat))

🚁 The signal processing module signal

import numpy as np
import scipy.signal
x = np.array([3,4,5])
h = np.array([6,7,8])
nn = scipy.signal.convolve(x,h)             # 1-D convolution
print("nn:",nn)
'''
nn: [18 45 82 67 40]
'''
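
To see where these five numbers come from, the full 1-D convolution can be written out by hand: every product x[i] * h[j] is added to output position i + j. This pure-Python sketch illustrates the definition and is not how SciPy implements it; the function name convolve_full is made up for this example.

```python
def convolve_full(x, h):
    """Full 1-D discrete convolution: out[n] = sum over k of x[k] * h[n - k]."""
    out = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj     # the product of x[i] and h[j] lands at index i + j
    return out

result = convolve_full([3, 4, 5], [6, 7, 8])
print(result)   # [18, 45, 82, 67, 40]
```

For instance, the middle value is 3*8 + 4*7 + 5*6 = 82.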

7.3.4 Matplotlib

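Matplotlib is a cross-platform, interactive 2D plotting library for Python that produces publication-quality figures in a variety of hardcopy formats

🚀 Using Matplotlib

Draw a plot with plot() and configure the axes

import matplotlib.pyplot as plt
x = [1,2,3,4,5,6,7,8]           # x-axis data
y = [3,5,6,9,13,6,32,111]       # y-axis data
plt.xlim((0,10))                # range of the x axis
plt.ylim((0,120))               # range of the y axis
plt.xlabel('x轴',fontproperties='SimHei',fontsize=16)    # x-axis label in Chinese, drawn with the SimHei font
plt.ylabel('y轴',fontproperties='SimHei',fontsize=16)    # y-axis label in Chinese, drawn with the SimHei font
plt.plot(x,y,'r',lw=2)          # (x, y): coordinates, 'r': red, lw: line width
plt.show()                      # show the figure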


Draw several figures with figure()

import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-1,1,50)        # 50 evenly spaced numbers from -1 to 1
#figure 1
y1 = 3 * x - 1                  # compute y1
plt.figure()
plt.plot(x,y1,'r')              # plot
#figure 2
y2 = x ** 2                     # compute y2
plt.figure()
plt.plot(x,y2,'b')              # plot
plt.show()


Plot with matplotlib.pyplot and add a legend

import matplotlib.pyplot as plt
import numpy as np
x = np.arange(1,20,1)
plt.plot(x,x ** 2 + 1,'red',lw=2)
plt.plot(x,x * 16,'b',linestyle='dashed',lw=2)
plt.legend(['x**2 + 1', '16*x'])             # legend entries, in plotting order
plt.show()


Draw a scatter plot with scatter()

import numpy as np, matplotlib.pyplot as plt
n = 512                             # number of points
x = np.random.normal(0,1,n)         # random numbers with mean 0 and standard deviation 1
y = np.random.normal(0,1,n)         # random numbers with mean 0 and standard deviation 1
color = np.arctan2(y,x)             # color value of each point
plt.scatter(x,y,s=75,c=color,alpha=0.6)      # draw the scatter plot
plt.xlim((-2.0,2.0))
plt.ylim((-2.0,2.0))
plt.show()


Draw several subplots with subplot()

import matplotlib.pyplot as plt
plt.figure()
plt.subplot(2,2,1)                    # subplot 1
plt.plot([0,1,2],[1,2,3],'r')
plt.subplot(2,2,2)                    # subplot 2
plt.plot([0,1,2],[1,1,4],'b')
plt.subplot(2,2,3)                    # subplot 3
plt.plot([0,1,2],[1,2,8],'g')
plt.subplot(2,2,4)                    # subplot 4
plt.plot([0,1,2],[1,3,16],'y')
plt.show()


7.3.5 Jieba

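Jieba supports three segmentation modes

Precise mode: cuts the text as accurately as possible, with no redundant words
Full mode: scans out every possible word in the text, with redundancy
Search engine mode: on top of precise mode, splits long words again, with redundancy
Jieba also supports traditional Chinese text and custom segmentation

Install the library by running pip install jieba on the command line

🚀 Using Jieba

🚁 Word segmentation

Chinese strings can be segmented with jieba.cut() and jieba.cut_for_search(). jieba.cut() and jieba.lcut() take three main parameters

string: the Chinese string to segment, as Unicode, UTF-8, or GBK text
cut_all: whether to use full mode; the default is False
HMM: whether to use the HMM model; the default is True

jieba.cut_for_search() and jieba.lcut_for_search() take two parameters

string: the Chinese string to segment, as Unicode, UTF-8, or GBK text
HMM: whether to use the HMM model; the default is True

import jieba
segList1 = jieba.cut("居里夫人1903年获诺贝尔奖时做了精彩演讲",cut_all=True)
print("Full mode:","/".join(segList1))
segList2 = jieba.cut("居里夫人1903年获诺贝尔奖时做了精彩演讲",cut_all=False)
print("Precise mode:","/".join(segList2))
segList3 = jieba.cut_for_search("居里夫人1903年获诺贝尔奖时做了精彩演讲")
print("Search engine mode:",".".join(segList3))
'''
Full mode: 居里/居里夫人/里夫/夫人/1903/年/获/诺贝/诺贝尔/诺贝尔奖/贝尔/奖/时/做/了/精彩/演讲
Precise mode: 居里夫人/1903/年/获/诺贝尔奖/时做/了/精彩/演讲
Search engine mode: the words of precise mode, with long words such as 居里夫人 and 诺贝尔奖 additionally split into shorter ones
'''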

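🚁 Keyword extraction

Jieba extracts keywords with the TF-IDF (term frequency-inverse document frequency) algorithm. jieba.analyse.extract_tags(sentence, topK=20, withWeight=False, allowPOS=())

sentence: the text to extract keywords from
topK: how many keywords with the highest TF-IDF weight to return; the default is 20
withWeight: whether to return the weight of each keyword as well; the default is False
allowPOS: restricts the result to words with the given parts of speech; empty by default, meaning no filtering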
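
The idea behind TF-IDF can be sketched in a few lines of plain Python: a word scores high when it is frequent in the document but rare in the rest of the corpus. This is only an illustration of the principle under stated assumptions; the toy corpus, the function name tf_idf, and the smoothed IDF formula are invented for the example, and Jieba itself relies on a precomputed IDF table rather than a corpus passed in by the caller.

```python
import math
from collections import Counter

def tf_idf(doc_tokens, corpus):
    """Rank the words of doc_tokens by term frequency * inverse document frequency."""
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    scores = {}
    for word, count in counts.items():
        tf = count / total                                   # share of the document taken by the word
        containing = sum(1 for doc in corpus if word in doc) # documents that contain the word
        idf = math.log(len(corpus) / (1 + containing)) + 1   # smoothed variant (an assumption)
        scores[word] = tf * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

corpus = [
    ["newton", "physics", "optics", "the"],
    ["the", "royal", "society", "the"],
    ["the", "apple", "tree"],
]
ranking = tf_idf(corpus[0], corpus)
print(ranking[-1][0])   # the  (it appears in every document, so it ranks last)
```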

Extract keywords from a Chinese string with Jieba

import jieba                              
import jieba.analyse                       
sentence = "艾萨克·牛顿(1643年1月4日—1727年3月31日)爵士，\
   英国皇家学会会长，英国著名的物理学家，百科全书式的“全才”，\
   著有《自然哲学的数学原理》《光学》。"
# extract keywords (nouns, person names, and place names only)
keywords = jieba.analyse.extract_tags(sentence,topK=20,withWeight=True,
allowPOS=('n','nr','ns'))
for item in keywords:
  print(item[0],item[1])
'''
艾萨克 1.5364049674375
数学原理 1.321059142725
爵士 1.13206132069
牛顿 1.03458251822375
会长 0.97365128905875
物理学家 0.97365128905875
光学 0.937137931755
英国 0.62829620167375
'''

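🚁 Part-of-speech tagging

Jieba supports creating a custom part-of-speech tokenizer with jieba.posseg.POSTokenizer(tokenizer=None), where tokenizer specifies the jieba.Tokenizer used internally; jieba.posseg.dt is the default part-of-speech tokenizer

import jieba.posseg as pseg
words = pseg.cut("中国人民是不可战胜的")
for word,flag in words:
  print('%s %s' % (word,flag))      # each word followed by its part-of-speech tag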

7.3.6 PyInstaller

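PyInstaller packages Python applications. During packaging it scans all files of the program, analyzes the code to find the modules it needs at run time, and then puts these modules together with the code into a single folder or a single executable file. Users then no longer need to install a runtime environment, such as a particular Python version and various packages; running the packaged executable is enough

Download and installation: run pip install pyinstaller on the command line

🚀 Packaging a Python program

Create a Python source file test1.py

import random
list1 = [1,2,3,4,5,6,7,8]
slice1 = random.sample(list1,4)
print("list1:",list1)
print("slice1:",slice1)
input()                            # wait for Enter so the output stays visible

Open a command line and change to the directory that contains test1.py

Run the command pyinstaller -F test1.py to package the source file

After the command succeeds, the generated executable test1.exe is in the dist folder next to test1.py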

7.4 Custom Modules


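🚀 [Scenario 1] Calling module A12 of package pack1 from source file A11.py

In this scenario, source file A11.py and module A12 are in the same directory. The steps are:

Add the file __init__.py to the pack1 folder
Write the code of source file A11.py and of module A12

# Code in module A12 (module A12 is simply the source file A12.py)
# define a function
def func_A12():
  return 'A12 in Pack1'

# Method 1: code in source file A11.py
import A12
print(A12.func_A12())           # call A12.func_A12()
# Output of A11.py:
# A12 in Pack1

# Method 2: code in source file A11.py
from A12 import *
print(func_A12())               # call func_A12()
# Output of A11.py:
# A12 in Pack1

# Method 3: code in source file A11.py
from A12 import func_A12
print(func_A12())               # call func_A12()
# Output of A11.py:
# A12 in Pack1

# Method 4: code in source file A11.py
import A12 as a                 # give module A12 the alias a
print(a.func_A12())             # call a.func_A12()
# Output of A11.py:
# A12 in Pack1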

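🚀 [Scenario 2] Calling module A2 of package pack2 from source file main.py

In this scenario, source file main.py and the package pack2 that contains module A2 are in the same directory. The steps are:

Add the file __init__.py to the pack2 folder
Write the code of module A2 and of source file main.py

# Code in module A2
# define a function
def func_A2():
  return 'A2 in Pack2'

# Method 1: code in source file main.py
from pack2.A2 import *      # or: from pack2.A2 import func_A2
print(func_A2())            # call func_A2()
# Output of main.py:
# A2 in Pack2

# Method 2: code in source file main.py
import pack2.A2                         # import the module pack2.A2
print(pack2.A2.func_A2())               # call pack2.A2.func_A2()
# Output of main.py:
# A2 in Pack2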

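🚀 [Scenario 3] Calling module A2 from source file A11.py

In this scenario, source file A11.py and module A2 are in two packages in different directories, pack1 and pack2. The steps are:

Add the file __init__.py to the pack2 folder
Write the code of source file A11.py and of module A2

# Code in module A2
# define a function
def func_A2():
  return 'A2 in Pack2'

# Method 1: code in source file A11.py
import sys
sys.path.append('d:\\PythonProjects\\p1\\pack\\pack2')    # directory that contains A2.py
import A2                       # import module A2
print(A2.func_A2())             # call A2.func_A2()
# Output of A11.py:
# A2 in Pack2

# Method 2: code in source file A11.py
import sys
sys.path.append('d:\\PythonProjects\\p1\\pack')    # directory that contains pack2
import pack2.A2                 # import the module pack2.A2
print(pack2.A2.func_A2())       # call pack2.A2.func_A2()
# Output of A11.py:
# A2 in Pack2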
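
The paths above are specific to one machine. The same mechanism can be tried anywhere with the sketch below, which builds pack2 in a temporary directory, adds the parent directory to sys.path, and imports pack2.A2. The temporary directory and the use of importlib are additions for the sake of a self-contained example.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    pkg_dir = os.path.join(root, "pack2")
    os.mkdir(pkg_dir)
    # An empty __init__.py marks pack2 as a regular package.
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, "A2.py"), "w", encoding="utf-8") as f:
        f.write("def func_A2():\n    return 'A2 in Pack2'\n")

    sys.path.insert(0, root)            # make the parent directory of pack2 importable
    try:
        A2 = importlib.import_module("pack2.A2")   # same effect as: import pack2.A2
        message = A2.func_A2()
    finally:
        sys.path.remove(root)           # undo the path change

print(message)   # A2 in Pack2
```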

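7.5 Typical Examples

7.5.1 Drawing a Cube with Filled Faces Using Turtle

Analysis:

From a typical viewpoint only three faces of a cube are visible: the front face, the top face, and the right face

It is enough to draw and fill these three faces (in red, green, and blue respectively)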

import turtle                   # import the module
# draw the front face of the cube
n = 100                         # edge length of the cube
turtle.penup()
turtle.goto(-100,-50)
turtle.pendown()
turtle.pencolor('red')          # pen color: red
turtle.begin_fill()
turtle.fillcolor('red')         # fill color: red
for i in range(4):
  turtle.forward(n)
  turtle.left(90)
turtle.end_fill()
# draw the top face of the cube
turtle.penup()
turtle.goto(-100,n-50)
turtle.pendown()
turtle.pencolor('green')        # pen color: green
turtle.begin_fill()
turtle.fillcolor('green')       # fill color: green
turtle.left(45)
turtle.forward(int(n * 0.6))    # slanted edge of length 60
turtle.right(45)
turtle.forward(n)
turtle.left(360 - 135)
turtle.forward(int(n * 0.6))    # slanted edge of length 60
turtle.end_fill()
# draw the right face of the cube
turtle.left(45)
turtle.penup()
turtle.goto(n-100,-50)
turtle.pendown()
turtle.pencolor('blue')         # pen color: blue
turtle.begin_fill()
turtle.fillcolor('blue')        # fill color: blue
turtle.left(135)
turtle.forward(int(n * 0.6))    # slanted edge of length 60
turtle.left(45)
turtle.forward(n)
turtle.left(135)
turtle.forward(int(n * 0.6))    # slanted edge of length 60
turtle.right(90)
turtle.end_fill()
turtle.done()

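7.5.2 Analyzing a Stock with NumPy and Matplotlib

Use NumPy and Matplotlib to analyze the July 2018 trading data of stock 000001 (Ping An Bank) and plot its closing prices

Analysis: the July 2018 trading data of stock 000001 (Ping An Bank) is stored in the file 000001_stock01.csv (downloadable from the site's resources). The columns are date, open (opening price), high (highest price), close (closing price), low (lowest price), and volume. The trading data for 2018-07-02 to 2018-07-06 is:

2018/7/2, 9.05, 9.05, 8.61, 8.55, 1315520.12
2018/7/3, 8.69, 8.7, 8.67, 8.45, 1274838.5
2018/7/4, 8.63, 8.75, 8.61, 8.61, 711153.38
2018/7/5, 8.62, 8.73, 8.6, 8.55, 835768.81
2018/7/6, 8.61, 8.78, 8.66, 8.45, 988282.75

To process the file with NumPy, first read the individual columns of 000001_stock01.csv into separate arrays
Use numpy.mean() to compute the arithmetic mean of the closing price and of the volume
Use numpy.average() to compute the weighted average closing price
Use numpy.max() and numpy.min() to compute the highest and the lowest price
Use numpy.ptp() to compute the range of the daily highs and the range of the daily lows
Use functions from matplotlib.pyplot to plot the closing price of stock 000001 in July 2018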

import numpy as np,os
import matplotlib.pyplot as plt
# read column 4 (closing price) and column 6 (volume) of 000001_stock01.csv into the arrays close and volume
close,volume=np.loadtxt(os.getcwd()+"\\resource\\000001_stock01.csv",
delimiter=',',usecols=(3,5),unpack=True)
print("Mean closing price:%6.2f yuan."%np.mean(close))
print("Mean volume:%10.2f lots."%np.mean(volume))
# weighted average closing price (the closer to the present, the larger the weight; the first day has weight 0)
t = np.arange(len(close))
print("Weighted average closing price:%6.2f yuan."%(np.average(close,weights=t)))
# read column 3 (highest price) and column 5 (lowest price) of 000001_stock01.csv into the arrays high and low
high,low = np.loadtxt(os.getcwd()+"\\resource\\000001_stock01.csv",delimiter=
',',usecols=(2,4),unpack=True)
print("Highest price:%6.2f yuan."%np.max(high))
print("Lowest price:%6.2f yuan."%np.min(low))
print("Range of daily highs:%6.2f yuan."%np.ptp(high))
print("Range of daily lows:%6.2f yuan."%np.ptp(low))
"""----------plot the closing price of stock 000001 in July 2018--------------"""
# read column 1 (date) of 000001_stock01.csv into the array date
date = np.loadtxt(os.getcwd()+"\\resource\\000001_stock01.csv",dtype=str,delimiter
=',',usecols=(0,),unpack=True)
plt.plot(date,close,'r',lw=2)
plt.rcParams['font.sans-serif']=['SimHei']      # font that can display the Chinese labels below
plt.xlabel('x轴-日期',fontsize=14)
plt.ylabel('y轴-收盘价(元)',fontsize=14)
plt.legend(['收盘价(元)'])
plt.gcf().autofmt_xdate()                       # rotate the date labels so they do not overlap
plt.show()
'''
Mean closing price:  8.96 yuan.
Mean volume: 928649.01 lots.
Weighted average closing price:  9.11 yuan.
Highest price:  9.59 yuan.
Lowest price:  8.45 yuan.
Range of daily highs:  0.89 yuan.
Range of daily lows:  0.88 yuan.
'''
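
What np.mean(), np.average() with weights, and np.ptp() compute can be checked by hand on the five sample rows listed above. The sketch uses plain Python on the closing prices of 2018-07-02 to 2018-07-06 only, so its results differ from the full-month figures in the output above.

```python
closes = [8.61, 8.67, 8.61, 8.60, 8.66]     # closing prices of the five sample days
weights = range(len(closes))                # 0, 1, 2, 3, 4: later days weigh more

mean = sum(closes) / len(closes)                                        # arithmetic mean
weighted = sum(w * c for w, c in zip(weights, closes)) / sum(weights)   # weighted average
spread = max(closes) - min(closes)                                      # peak-to-peak range

print(f"{mean:.3f} {weighted:.3f} {spread:.2f}")   # 8.630 8.633 0.07
```

With weights 0 to 4 the first day does not contribute at all; np.arange(1, len(close) + 1) would be one way to give every day a positive weight.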


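7.5.3 Analyzing Stock Trading Data with Pandas

Use Pandas for a statistical analysis of the July 2018 trading data of stock 000001 (Ping An Bank)

Analysis: the file is named 000001_stock02.csv (downloadable from the site's resources). To suit Pandas, a header row with the column names has been added to the file:

date, open, high, low, close, volume
2018/7/2, 9.05, 9.05, 8.61, 8.55, 1315520.12
2018/7/3, 8.69, 8.7, 8.67, 8.45, 1274838.5
2018/7/4, 8.63, 8.75, 8.61, 8.61, 711153.38
2018/7/5, 8.62, 8.73, 8.6, 8.55, 835768.81
2018/7/6, 8.61, 8.78, 8.66, 8.45, 988282.75

Use DataFrame.loc and count() to filter and count the stock data in 000001_stock02.csv
Use np.where() from NumPy together with a column obtained from Pandas to group the stock data
Call DataFrame.describe() for descriptive statistics of the stock data
Call DataFrame.corr() for a correlation analysis of the stock data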

import pandas as pd
import numpy as np
import os
data = pd.read_csv(os.getcwd()+"\\resource\\000001_stock02.csv")
print("1. Days with a high of at least 9.00 yuan:",(data.loc[(data['high']>=9.00),
['date']].count()).iloc[0])
print("2. Closing price groups:",np.where(data['close']>=9.00,'high','low'))
print("3. Descriptive statistics:")
print(data.describe())
print("4. Correlation analysis:")
print(data.corr(numeric_only=True))     # skip the text column date; numeric_only requires pandas 1.5 or later
'''
1. Days with a high of at least 9.00 yuan: 11
2. Closing price groups: ['low' 'low' 'low' 'low' 'low' 'low' 'low' 'low' 'low' 'low' 'low' 'low' 'low'
 'low' 'low' 'high' 'high' 'high' 'high' 'high' 'high' 'high']
3. Descriptive statistics:
            open       high        low      close        volume
count  22.000000  22.000000  22.000000  22.000000  2.200000e+01
mean    8.939545   9.074545   8.963636   8.825455  9.286490e+05
std     0.300531   0.306404   0.308507   0.300630  4.077300e+05
min     8.600000   8.700000   8.600000   8.450000  3.753563e+05
25%     8.700000   8.815000   8.705000   8.610000  6.330535e+05
50%     8.805000   8.995000   8.880000   8.690000  8.345635e+05
75%     9.237500   9.427500   9.250000   9.135000  1.241252e+06
max     9.440000   9.590000   9.420000   9.330000  1.781688e+06
4. Correlation analysis:
            open      high       low     close    volume
open    1.000000  0.893724  0.814231  0.940618 -0.090078
high    0.893724  1.000000  0.954085  0.901292  0.237201
low     0.814231  0.954085  1.000000  0.896178  0.218021
close   0.940618  0.901292  0.896178  1.000000 -0.140180
volume -0.090078  0.237201  0.218021 -0.140180  1.000000
'''

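7.5.4 Processing and Displaying Images with Image Libraries

Analysis:

Read the image file with imread() from the imageio library

Get the data type and the size of the image

Change the colors, resize, and crop the image, saving each result with imwrite() from the imageio library

Display the original and the processed images with functions from matplotlib.pyplot and matplotlib.image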

import imageio,os,numpy
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from PIL import Image
tiger_jpg=imageio.imread(os.getcwd()+"\\resource\\tiger.jpg")      # read the image
print("Image data type:", tiger_jpg.dtype)                         # data type of the image
img_shape = tiger_jpg.shape                                        # size of the image
print("(image size, channels):", tiger_jpg.shape)
# change the colors and save; the product is a float array, so convert it back to uint8
imageio.imwrite("tiger_mc.jpg", (tiger_jpg * [1, 0.5, 0.5]).astype(numpy.uint8))
# resize the image and save
imageio.imwrite("timg_ms.jpg",numpy.array(Image.fromarray(tiger_jpg).resize((120,70))))
# crop the image and save
imageio.imwrite("timg_mi.jpg", tiger_jpg[50:130, 100:240])
"""-------------------------display the images--------------------------"""
plt.figure()
plt.subplot(2, 2, 1)
tiger_jpg1 = mpimg.imread(os.getcwd()+"\\resource\\tiger.jpg")     # read the image
plt.imshow(tiger_jpg1)                                             # show the image
plt.axis('off')
plt.subplot(2, 2, 2)
tiger_jpg2 = mpimg.imread("tiger_mc.jpg")
plt.imshow(tiger_jpg2)
plt.axis('off')
plt.subplot(2, 2, 3)
tiger_jpg3 = mpimg.imread("timg_ms.jpg")
plt.imshow(tiger_jpg3)
plt.axis('off')
plt.subplot(2, 2, 4)
tiger_jpg4 = mpimg.imread("timg_mi.jpg")
plt.imshow(tiger_jpg4)
plt.axis('off')
plt.show()
'''
Image data type: uint8
(image size, channels): (220, 352, 3)
'''
