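File types
Text files: a text file stores ordinary character text and can be opened with a program such as Notepad.
Binary files: a binary file stores its data as raw bytes. It cannot be read meaningfully in Notepad and must be decoded by dedicated software. Common examples: MP4 video files, MP3 audio files, JPG images, doc documents, and so on.
(The file operations described below work for both kinds of file. The difference lies in the open mode: binary files add a b.)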
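Creating a file object with open()
The open() function creates a file object: open(file_path[, mode])
The open modes are as follows:
r : read mode (the default)
w : write mode; creates the file if it does not exist and overwrites its content if it does
a : append mode; creates the file if it does not exist and adds new content at the end if it does
b : binary mode (can be combined with the other modes)
+ : read and write mode (can be combined with the other modes)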
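The effect of the most common modes can be seen in a small experiment. This is only a sketch: the scratch file name modes_demo.txt and the temporary directory are assumptions made for the demonstration, not part of the original notes.

```python
import os
import tempfile

# Hypothetical scratch file, created in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# "w": create the file, or truncate it if it already exists.
with open(path, "w", encoding="utf-8") as f:
    f.write("first\n")

# "a": append to the end, keeping the existing content.
with open(path, "a", encoding="utf-8") as f:
    f.write("second\n")

# "r": read as text (this is the default mode).
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

# "rb": read the same file as raw bytes.
with open(path, "rb") as f:
    raw = f.read()

print(repr(text))          # 'first\nsecond\n'
print(type(raw).__name__)  # bytes
```

Text mode returns str and binary mode returns bytes, which is why the two kinds of file differ only in the b of the mode string.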
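Writing files
# Writing a text file
## Create the file object
f = open(r"b.txt", "a")
s = "hhc\nlxq\n"
## Write the data
f.write(s)
## Close the file
f.close()
## The same write using with, which closes the file automatically
with open(r"b.txt", "a") as f:
    s = "hhc\nlxq\n"
    f.write(s)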
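Writing data with write()/writelines()
write(a) : writes the string a to the file
writelines(b) : writes a list of strings to the file without adding newline characters
f = open(r"bb.txt", "w", encoding="utf-8")
s = ["hhc", "lxq\n", "gq\n"]  # "hhc" has no "\n", so the first line of the file is "hhclxq"
f.writelines(s)
f.close()
with open(r"bb.txt", "w", encoding="utf-8") as f:
    s = ["hhc\n", "lxq\n", "gq\n"]
    f.writelines(s)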
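Because writelines() never adds line endings, a list of plain strings has to be given its "\n" explicitly. The following is a minimal sketch of one way to do that; the file name names.txt and the temporary directory are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "names.txt")
names = ["hhc", "lxq", "gq"]  # strings without trailing newlines

with open(path, "w", encoding="utf-8") as f:
    # writelines() accepts any iterable of strings and writes them back to back,
    # so the newline is appended to each item here.
    f.writelines(name + "\n" for name in names)

with open(path, "r", encoding="utf-8") as f:
    content = f.read()

print(repr(content))  # 'hhc\nlxq\ngq\n'
```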
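Closing the file stream with close()
Because files are ultimately managed by the operating system, every file object we open should be closed explicitly by calling close(). When close() is called, any buffered data is first written to the file (you can also trigger this yourself with flush()), then the file is closed and the file object's resources are released. To make sure an open file object is always closed, close() is usually placed in the finally clause of exception handling, or the with keyword is used, so that the file is closed no matter what happens.
f = None
try:
    f = open(r"bb.txt", "a")
    s = "xssjhfs"
    f.write(s)
except BaseException as e:
    print(e)
finally:
    # f is still None if open() itself failed, so only close a file that was opened
    if f is not None:
        f.close()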
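The relationship between flush(), close() and the closed attribute can be checked directly. This is a small sketch under stated assumptions: the file name flush_demo.txt and the temporary directory are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "flush_demo.txt")

f = open(path, "w", encoding="utf-8")
try:
    f.write("buffered text")  # 13 ASCII characters, may still sit in the buffer
    f.flush()                 # push the buffered data to the OS without closing
    size_after_flush = os.path.getsize(path)
finally:
    f.close()

closed_flag = f.closed  # True once close() has run

# Writing to a closed file object raises ValueError.
try:
    f.write("more")
    write_error = None
except ValueError as e:
    write_error = e

print(size_after_flush, closed_flag, type(write_error).__name__)  # 13 True ValueError
```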
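The with statement
The with keyword (a context manager) manages resources automatically. No matter how the with block is exited, the file is guaranteed to be closed properly, and the state that existed on entry to the block is restored once the block finishes.
s = ["zhangsan\n", "lisi\n", "wangwu\n"]
with open(r"bb.txt", "a", encoding="utf-8") as f:
    f.writelines(s)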
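To see that with really closes the file when the block is left through an exception, the failure can be simulated. This is an illustrative sketch; the file name with_demo.txt and the RuntimeError are assumptions chosen for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

try:
    with open(path, "w", encoding="utf-8") as f:
        f.write("partial\n")
        raise RuntimeError("simulated failure")  # leave the block abnormally
except RuntimeError:
    pass

was_closed = f.closed  # the context manager closed the file despite the error

# The data written before the error was flushed when the file was closed.
with open(path, "r", encoding="utf-8") as check:
    saved = check.read()

print(was_closed, repr(saved))  # True 'partial\n'
```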
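Reading files
read([size]) : reads size characters from the file and returns them. If size is omitted, the whole file is read. At the end of the file, an empty string is returned.
with open(r"bb.txt", "r", encoding="utf-8") as f:
    print(f.read(4))  # hhc (the fourth character read is the newline)
with open(r"bb.txt", "r", encoding="utf-8") as f:
    print(f.read())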
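The empty string returned at the end of the file is what makes a read-in-chunks loop possible. Below is a minimal sketch; the file name chunks.txt, its content and the chunk size of 4 are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file holding ten characters.
path = os.path.join(tempfile.mkdtemp(), "chunks.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("abcdefghij")

chunks = []
with open(path, "r", encoding="utf-8") as f:
    while True:
        chunk = f.read(4)  # at most 4 characters per call
        if not chunk:      # "" signals the end of the file
            break
        chunks.append(chunk)

print(chunks)  # ['abcd', 'efgh', 'ij']
```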
readline() : reads one line and returns it. At the end of the file, an empty string is returned.
with open(r"bb.txt", "r", encoding="utf-8") as f:
    print(f.readline())
with open(r"bb.txt", "r", encoding="utf-8") as f:
    s = f.readline()
    while s:  # the loop ends when readline() returns ""
        print(s)
        s = f.readline()
readlines() : for a text file, stores each line as a string in a list and returns that list
with open(r"bb.txt", "r", encoding="utf-8") as f:
    print(f.readlines())  # ['hhc\n', 'lxq\n', 'gq\n', 'xssjhfszhangsan\n', 'lisi\n', 'wangwu\n']
Reading a text file with an iterator (one line per iteration)
with open(r"bb.txt", "r", encoding="utf-8") as f:
    for a in f:
        print(a, end="")  # each line already ends with "\n"
"""
hhc
lxq
gq
xssjhfszhangsan
lisi
wangwu
"""
Appending a line number to the end of every line of a text file
with open(r"bb.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
    # strip the old line ending, then append " #<number>" and a new "\n"
    lines = [line.rstrip() + " #" + str(index) + "\n" for index, line in enumerate(lines, start=1)]
with open(r"bb.txt", "w", encoding="utf-8") as f:
    f.writelines(lines)
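Binary files
Binary files are handled with the same workflow as text files. You still create a file object first, but you must specify a binary mode. Once the binary file object exists, write() and read() are used for reading and writing as before.
Reading an image file and copying it:
with open("bqb.jpg", "rb") as srcFile, open("bqb2.jpg", "wb") as destFile:
    # iterating a binary file yields the bytes between b"\n" separators, so the pieces vary in size
    for line in srcFile:
        destFile.write(line)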
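Since binary data has no real notion of lines, a copy is often written with fixed-size chunks instead. The sketch below is one possible variant, not the method used in these notes: the helper name copy_binary, the 64 KiB chunk size and the generated test data are all assumptions for illustration.

```python
import os
import tempfile

def copy_binary(src_path, dest_path, chunk_size=64 * 1024):
    """Copy a file in fixed-size chunks (hypothetical helper)."""
    with open(src_path, "rb") as src, open(dest_path, "wb") as dest:
        while True:
            chunk = src.read(chunk_size)  # bytes object, at most chunk_size long
            if not chunk:                 # b"" signals the end of the file
                break
            dest.write(chunk)

# Demonstration with generated bytes standing in for an image file.
folder = tempfile.mkdtemp()
src_path = os.path.join(folder, "source.bin")
dest_path = os.path.join(folder, "copy.bin")
with open(src_path, "wb") as f:
    f.write(bytes(range(256)) * 1000)

copy_binary(src_path, dest_path)

with open(src_path, "rb") as a, open(dest_path, "rb") as b:
    identical = a.read() == b.read()
print(identical)  # True
```

Reading in chunks keeps memory use bounded regardless of how the newline bytes happen to be distributed in the file.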
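Common attributes and methods of file objects
Attributes of file objects:
Open modes of file objects:
Common methods of file objects:
with open(r"e.txt", "r", encoding="utf-8") as f:
    print(f"File name: {f.name}")  # File name: e.txt
    print(f.tell())      # 0
    print(f.readline())  # abcdefghijklmnopqrstuvwxyz
    print(f.tell())      # 26
    f.seek(3, 0)         # move to offset 3, counted from the start of the file
    print(f.readline())  # defghijklmnopqrstuvwxyz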
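The attributes name, mode and closed and the methods seek() and tell() can be explored together. This sketch uses binary mode because text-mode files only support seeking relative to the start; the file name alphabet.bin and the temporary directory are assumptions for the example.

```python
import os
import tempfile

# Hypothetical scratch file containing the 26 lowercase letters.
path = os.path.join(tempfile.mkdtemp(), "alphabet.bin")
with open(path, "wb") as f:
    f.write(b"abcdefghijklmnopqrstuvwxyz")

with open(path, "rb") as f:
    info = (os.path.basename(f.name), f.mode, f.closed)
    f.seek(3, 0)            # whence=0: offset counted from the start
    from_start = f.read(3)  # b'def'
    f.seek(-3, 2)           # whence=2: offset counted from the end
    from_end = f.read()     # b'xyz'
    end_pos = f.tell()      # 26, the position after the last byte

print(info)                           # ('alphabet.bin', 'rb', False)
print(from_start, from_end, end_pos)  # b'def' b'xyz' 26
print(f.closed)                       # True
```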
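Serialization
Serialization means converting an object into a serialized (byte-stream) form so that it can be stored on disk or sent over a network to another location. Deserialization is the reverse process: converting serialized data that has been read back into an object.
pickle.dump(obj, file) : obj is the object to serialize, file is the file in which it is stored
pickle.load(file) : reads data from file and deserializes it into an object
import pickle
print("=========serialization")
with open("person.dat", "wb") as f:
    name = 'hhc'
    age = 34
    score = [90, 95, 100]
    resume = {"name": name, "age": age, "score": score}
    pickle.dump(resume, f)
print("=========deserialization")
with open(r"person.dat", "rb") as f:
    resume = pickle.load(f)
    print(resume)  # {'name': 'hhc', 'age': 34, 'score': [90, 95, 100]}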
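The same round trip can be done in memory with pickle.dumps() and pickle.loads(), which makes it easy to see that deserialization builds a new, equal object. This is a minimal sketch reusing the resume data from the example above; note that pickle data should only be loaded from sources you trust.

```python
import pickle

resume = {"name": "hhc", "age": 34, "score": [90, 95, 100]}

data = pickle.dumps(resume)    # serialize to a bytes object instead of a file
restored = pickle.loads(data)  # deserialize back into a Python object

print(type(data).__name__)  # bytes
print(restored == resume)   # True: same content
print(restored is resume)   # False: a separate object was created
```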
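Working with CSV files
CSV is a comma-separated text format commonly used for data exchange and for importing and exporting Excel files and database data.
A csv.reader object is used to read data from a CSV file
import csv
with open(r"d:\a.csv", newline="") as a:
    a_csv = csv.reader(a)  # create the reader: an iterator that yields each row as a list of strings
    headers = next(a_csv)  # read the first row, which holds the column headers
    print(headers)
    for row in a_csv:  # print the remaining rows one by one
        print(row)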
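A csv.writer object is used to write a CSV file
import csv
headers = ["Employee ID", "Name", "Age", "Address", "Monthly salary"]
rows = [("1001", "hhc", 18, "Xisanqi Courtyard 1", "50000"), ("1002", "lxq", 19, "Xisanqi Courtyard 1", "30000")]
# newline="" lets the csv module control line endings and avoids blank rows on Windows
with open(r"b.csv", "w", newline="", encoding="utf-8") as b:
    b_csv = csv.writer(b)     # create the writer object
    b_csv.writerow(headers)   # write one row (the header)
    b_csv.writerows(rows)     # write several rows (the data)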
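Writing a file and reading it back shows how csv.writer and csv.reader fit together. The sketch below is illustrative only: the file name staff.csv, the temporary directory and the shortened column set are assumptions.

```python
import csv
import os
import tempfile

# Hypothetical scratch file and a reduced set of columns.
path = os.path.join(tempfile.mkdtemp(), "staff.csv")
headers = ["id", "name", "age"]
rows = [("1001", "hhc", 18), ("1002", "lxq", 19)]

# newline="" is what the csv module expects for files it reads or writes.
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(rows)

with open(path, "r", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    read_headers = next(reader)  # first row: the header
    read_rows = list(reader)     # remaining rows

print(read_headers)  # ['id', 'name', 'age']
print(read_rows)     # [['1001', 'hhc', '18'], ['1002', 'lxq', '19']]
```

Every value comes back as a string, so numeric columns such as age need an explicit int() conversion after reading.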