Compressed files
tar
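The bundling tool on Linux systems. It only bundles files and does not compress them; this is an archiving operation.
- Creating a tar file
import tarfile
# fname is assumed to hold the archive's base name
tar = tarfile.open(fname + ".tar.gz", "w:gz") # open a gzip-compressed tar archive for writing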
'''
open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs) method of builtins.type instance
    Open a tar archive for reading, writing or appending. Return
    an appropriate TarFile class.

    mode:
    'r' or 'r:*' open for reading with transparent compression
    'r:'         open for reading exclusively uncompressed
    'r:gz'       open for reading with gzip compression
    'r:bz2'      open for reading with bzip2 compression
    'r:xz'       open for reading with lzma compression
    'a' or 'a:'  open for appending, creating the file if necessary
    'w' or 'w:'  open for writing without compression
    'w:gz'       open for writing with gzip compression
    'w:bz2'      open for writing with bzip2 compression
    'w:xz'       open for writing with lzma compression

    'x' or 'x:'  create a tarfile exclusively without compression, raise
                 an exception if the file is already created
    'x:gz'       create a gzip compressed tarfile, raise an exception
                 if the file is already created
    'x:bz2'      create a bzip2 compressed tarfile, raise an exception
                 if the file is already created
    'x:xz'       create an lzma compressed tarfile, raise an exception
                 if the file is already created

    'r|*'        open a stream of tar blocks with transparent compression
    'r|'         open an uncompressed stream of tar blocks for reading
    'r|gz'       open a gzip compressed stream of tar blocks
    'r|bz2'      open a bzip2 compressed stream of tar blocks
    'r|xz'       open an lzma compressed stream of tar blocks
    'w|'         open an uncompressed stream for writing
    'w|gz'       open a gzip compressed stream for writing
    'w|bz2'      open a bzip2 compressed stream for writing
    'w|xz'       open an lzma compressed stream for writing
'''
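The mode strings listed above can be tried out in a few lines. The following is a minimal sketch, assuming a scratch directory from `tempfile`; the helper name `demo_modes` and the file names are made up for illustration.

```python
import os
import tarfile
import tempfile


def demo_modes(workdir):
    """Write with 'w:gz', read back with 'r:*', then try 'x:gz' on an existing file."""
    src = os.path.join(workdir, "hello.txt")  # hypothetical sample file
    with open(src, "w", encoding="utf-8") as fp:
        fp.write("hello tar\n")

    archive = os.path.join(workdir, "demo.tar.gz")

    # 'w:gz' creates (or truncates) a gzip-compressed archive.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname="hello.txt")

    # 'r:*' (the same as plain 'r') detects the compression automatically.
    with tarfile.open(archive, "r:*") as tar:
        names = tar.getnames()

    # 'x:gz' is exclusive creation: it fails because the archive already exists.
    try:
        tarfile.open(archive, "x:gz").close()
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

    return names, exclusive_failed


with tempfile.TemporaryDirectory() as tmp:
    print(demo_modes(tmp))
```

The `x` modes are useful when overwriting an existing archive by accident would be costly; `w` silently truncates.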
- Adding files
tar.add(filepath)
Help on method add in module tarfile:
add(name, arcname=None, recursive=True, exclude=None, *, filter=None) method of tarfile.TarFile instance
    Add the file `name' to the archive. `name' may be any type of file
    (directory, fifo, symbolic link, etc.). If given, `arcname'
    specifies an alternative name for the file in the archive.
    Directories are added recursively by default. This can be avoided by
    setting `recursive' to False. `exclude' is a function that should
    return True for each filename to be excluded. `filter' is a function
    that expects a TarInfo object argument and returns the changed
    TarInfo object, if it returns None the TarInfo object will be
    excluded from the archive.
Note: this help text comes from an older Python release. The `exclude` parameter was removed in Python 3.7; use `filter` instead.
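As an illustration of `arcname` and `filter`, here is a minimal sketch. The function names `skip_logs` and `archive_dir`, the `.log` rule, and the sample files are assumptions chosen for the example, not part of `tarfile`.

```python
import os
import tarfile
import tempfile


def skip_logs(tarinfo):
    """Filter for TarFile.add: drop *.log files and reset ownership on the rest."""
    if tarinfo.name.endswith(".log"):
        return None  # returning None excludes this member
    tarinfo.uid = tarinfo.gid = 0
    tarinfo.uname = tarinfo.gname = "root"
    return tarinfo


def archive_dir(src_dir, archive_path):
    """Add src_dir recursively under the archive name 'data', applying skip_logs."""
    with tarfile.open(archive_path, "w") as tar:
        tar.add(src_dir, arcname="data", filter=skip_logs)
    with tarfile.open(archive_path, "r") as tar:
        return sorted(tar.getnames())


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.mkdir(src)
    for name in ("a.csv", "b.log"):  # made-up sample files
        with open(os.path.join(src, name), "w") as fp:
            fp.write("x\n")
    print(archive_dir(src, os.path.join(tmp, "out.tar")))
```

The filter sees names as they will appear in the archive (after `arcname` is applied), so the check runs against `data/b.log` rather than the path on disk.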
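- A demo that creates an archive (mode 'w' only bundles; it does not compress)
import tarfile
import os
BASE_DIR = '/Users/lienze/Desktop/'
# the with-block closes the archive even if add() raises
with tarfile.open(os.path.join(BASE_DIR, '1.tar'), 'w') as tar:
    tar.add(os.path.join(BASE_DIR, '1.csv'), arcname='1.csv') # add the file to the archive under the name 1.csv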
- A demo that extracts an archive
import tarfile
import os
BASE_DIR = '/Users/lienze/Desktop/'
with tarfile.open(os.path.join(BASE_DIR, '1.tar'), 'r') as tar:
    # extract every member into BASE_DIR; only do this with archives you trust
    for file_name in tar.getnames():
        tar.extract(file_name, BASE_DIR)
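Extraction is not the only way to get at the contents. As a sketch, a single member can be read straight from the archive with `getmember` and `extractfile`; the helper names and the in-memory sample archive below are assumptions made so the example is self-contained.

```python
import io
import tarfile


def build_sample_archive():
    """Build a tiny uncompressed tar archive in memory (sample data only)."""
    payload = b"id,name\n1,alice\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="1.csv")
        info.size = len(payload)  # addfile() reads exactly info.size bytes
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_member(archive_bytes, member_name):
    """Return the bytes of one regular file without writing anything to disk."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        member = tar.getmember(member_name)  # raises KeyError if absent
        if not member.isfile():
            raise ValueError(f"{member_name} is not a regular file")
        with tar.extractfile(member) as fp:
            return fp.read()


print(read_member(build_sample_archive(), "1.csv"))
```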
gz
That is, gzip, which normally compresses only a single file. Combined with tar, it lets you bundle first and then compress.
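The "bundle first, then compress" idea can be sketched as two explicit steps. The helper names `tar_then_gzip` and `list_tar_gz` are made up for this example, and everything happens in memory to keep it self-contained.

```python
import gzip
import io
import tarfile


def tar_then_gzip(files):
    """Step 1: bundle `files` (name -> bytes) into a tar. Step 2: gzip the tar bytes."""
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w") as tar:  # 'w': no compression
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return gzip.compress(tar_buf.getvalue())


def list_tar_gz(blob):
    """Open the result as an ordinary .tar.gz and list its members."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return tar.getnames()


blob = tar_then_gzip({"a.txt": b"first file\n", "b.txt": b"second file\n"})
print(list_tar_gz(blob))
```

In practice `tarfile.open(..., "w:gz")` does both steps at once; the two-step version only makes the division of labour visible.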
- Creating a compressed file
import gzip
gz = gzip.open('1.gz', 'wb')
open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)
Open a gzip-compressed file in binary or text mode.
The filename argument can be an actual filename (a str or bytes object), or
an existing file object to read from or write to.
The mode argument can be "r", "rb", "w", "wb", "x", "xb", "a" or "ab" for
binary mode, or "rt", "wt", "xt" or "at" for text mode. The default mode is
"rb", and the default compresslevel is 9.
For binary mode, this function is equivalent to the GzipFile constructor:
GzipFile(filename, mode, compresslevel). In this case, the encoding, errors
and newline arguments must not be provided.
For text mode, a GzipFile object is created, and wrapped in an
io.TextIOWrapper instance with the specified encoding, error handling
behavior, and line ending(s).
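The text modes mentioned in the help text can be sketched as follows. The function name `roundtrip_text` and the sample lines are assumptions for illustration.

```python
import gzip
import os
import tempfile


def roundtrip_text(path, lines):
    """Write lines with mode 'wt', then read them back with mode 'rt'."""
    with gzip.open(path, "wt", encoding="utf-8") as gz:
        gz.writelines(line + "\n" for line in lines)
    with gzip.open(path, "rt", encoding="utf-8") as gz:
        return [line.rstrip("\n") for line in gz]


with tempfile.TemporaryDirectory() as tmp:
    print(roundtrip_text(os.path.join(tmp, "notes.txt.gz"), ["first line", "second line"]))
```

In text mode the object accepts and returns `str`; in the binary modes (`'rb'`, `'wb'`) it works with `bytes`, and `encoding` must not be passed.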
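- Writing content to the compressed file (fp is a source file opened with 'rb'; use either call)
gz.writelines(fp) # option 1: copy the source line by line
gz.write(fp.read()) # option 2: read the whole source and write it in one call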
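- A small example of compressing to a .gz file. Keep the file's original extension in the archive name (e.g. 1.xls.gz), because decompressing a .gz file simply strips the .gz suffix
import gzip
help(gzip.open) # prints the documentation shown above
gz = gzip.open(
    '/Users/lienze/Desktop/1.xls.gz',
    'wb',
)
with open('/Users/lienze/Desktop/1.xls', 'rb') as fp:
    gz.write(fp.read())
gz.close()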
- Decompressing a .gz file: just open the compressed file and read from it
import gzip
gz = gzip.open(
    '/Users/lienze/Desktop/1.xls.gz',
    'rb',
)
with open('/Users/lienze/Desktop/1.xls', 'wb') as fp:
    fp.write(gz.read())
gz.close()
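Both gzip demos load the whole file into memory with `read()`. For large files, a chunked copy may be preferable; the sketch below uses `shutil.copyfileobj` for that. The function names `gzip_file` and `gunzip_file` and the sample data are made up for illustration.

```python
import gzip
import os
import shutil
import tempfile


def gzip_file(src_path):
    """Compress src_path to src_path + '.gz' in chunks and return the new path."""
    dst_path = src_path + ".gz"  # keeps the original extension, e.g. 1.xls.gz
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


def gunzip_file(gz_path, dst_path):
    """Decompress gz_path into dst_path in chunks and return dst_path."""
    with gzip.open(gz_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "1.xls")
    with open(original, "wb") as fp:
        fp.write(b"sample bytes " * 1000)  # stand-in for a real spreadsheet
    packed = gzip_file(original)
    restored = gunzip_file(packed, os.path.join(tmp, "restored.xls"))
    print(os.path.getsize(original), os.path.getsize(packed), os.path.getsize(restored))
```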
zip
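Unlike gzip, and although it uses a similar algorithm, zip can bundle and compress multiple files. Each file is compressed on its own, so the compression ratio is typically lower than that of tar.gz and rar.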
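The size difference can be observed with a small experiment. This is only a sketch on synthetic data (many small files with identical content, which is the case that favours tar.gz most); the helper names are made up, and real-world results depend on the input.

```python
import io
import tarfile
import zipfile


def make_files(count=50):
    """Generate many small files with identical content (synthetic sample data)."""
    payload = ("timestamp,level,message\n"
               + "2020-01-01,INFO,service started\n" * 10).encode()
    return {f"log_{i:03d}.csv": payload for i in range(count)}


def zip_size(files):
    """Size of a DEFLATE-compressed zip: every member is compressed separately."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as z:
        for name, data in files.items():
            z.writestr(name, data)
    return len(buf.getvalue())


def tar_gz_size(files):
    """Size of a tar.gz: the whole bundle is compressed as one stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return len(buf.getvalue())


files = make_files()
print("zip:", zip_size(files), "bytes; tar.gz:", tar_gz_size(files), "bytes")
```

The flip side is that zip can read any single member without decompressing the others, whereas a tar.gz has to be decompressed from the start.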
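- Creating a zip archive
import zipfile
# the default is ZIP_STORED (no compression); pass compression=zipfile.ZIP_DEFLATED to compress
z = zipfile.ZipFile(filename, 'w')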
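- Writing the files to be compressed
z.write(filename, arcname=None, compress_type=None)
'''
filename: the file to add
arcname: the file's name inside the archive
compress_type: compression method for this entry; overrides the archive's default
'''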
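A minimal sketch of these parameters in use, assuming a throwaway file in a temporary directory; the function name `zip_one_file` is made up for the example.

```python
import os
import tempfile
import zipfile


def zip_one_file(src_path, zip_path):
    """Store src_path under its base name, compressed with DEFLATE."""
    with zipfile.ZipFile(zip_path, "w") as z:
        z.write(
            src_path,
            arcname=os.path.basename(src_path),  # drop the directory part
            compress_type=zipfile.ZIP_DEFLATED,  # compress this entry
        )
    # Reopen the archive to confirm what was stored.
    with zipfile.ZipFile(zip_path, "r") as z:
        info = z.infolist()[0]
        return info.filename, info.compress_type == zipfile.ZIP_DEFLATED


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "1.csv")  # made-up sample file
    with open(src, "w") as fp:
        fp.write("id,name\n" * 100)
    print(zip_one_file(src, os.path.join(tmp, "1.zip")))
```

Without `arcname`, the entry keeps the path passed in `filename` (minus any leading slash or drive letter), which is rarely what you want when passing absolute paths.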
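- A small example: compress the files in a directory
import zipfile
import os
BASE_DIR = '/Users/lienze/Desktop'
ZIP_PATH = os.path.join(BASE_DIR, '1.zip')
with zipfile.ZipFile(ZIP_PATH, 'w', compression=zipfile.ZIP_DEFLATED) as z:
    for file in os.listdir(BASE_DIR):
        path = os.path.join(BASE_DIR, file)
        # skip subdirectories and the archive that is being written
        if os.path.isfile(path) and path != ZIP_PATH:
            z.write(path, arcname=file) # store under the bare file name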
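The example above only handles the top level of a directory. To include subdirectories as well, one possible approach is to walk the tree with `os.walk` and store paths relative to the root. This is a sketch; `zip_tree` and the sample layout are assumptions, and the archive is assumed to live outside the directory being compressed.

```python
import os
import tempfile
import zipfile


def zip_tree(src_dir, zip_path):
    """Zip every file under src_dir, keeping paths relative to src_dir."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as z:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                z.write(full, arcname=os.path.relpath(full, src_dir))
    with zipfile.ZipFile(zip_path, "r") as z:
        return sorted(z.namelist())


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):  # made-up sample files
        with open(os.path.join(src, rel), "w") as fp:
            fp.write("content\n")
    print(zip_tree(src, os.path.join(tmp, "tree.zip")))
```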
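- Extracting files
import zipfile
import os
BASE_DIR = '/Users/lienze/Desktop'
with zipfile.ZipFile('/Users/lienze/Desktop/1.zip', 'r') as z:
    for file in z.namelist():
        if file.endswith('/'):
            continue # directory entry, nothing to write
        target = os.path.join(BASE_DIR, file)
        os.makedirs(os.path.dirname(target), exist_ok=True) # create parent directories for nested names
        with open(target, 'wb') as fp:
            fp.write(z.read(file))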
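The manual loop can usually be replaced by `ZipFile.extractall`, which also creates nested directories. The following sketch adds an integrity check with `testzip` first; the function name `unzip_all` and the sample archive are made up for illustration.

```python
import os
import tempfile
import zipfile


def unzip_all(zip_path, dest_dir):
    """Check the archive, extract everything into dest_dir, return the member names."""
    with zipfile.ZipFile(zip_path, "r") as z:
        bad = z.testzip()  # name of the first corrupt member, or None
        if bad is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad}")
        z.extractall(dest_dir)  # creates directories as needed
        return sorted(z.namelist())


with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, "1.zip")
    with zipfile.ZipFile(zip_path, "w") as z:  # made-up sample archive
        z.writestr("a.txt", "top level\n")
        z.writestr("sub/b.txt", "nested\n")
    out_dir = os.path.join(tmp, "out")
    print(unzip_all(zip_path, out_dir))
    print(os.path.exists(os.path.join(out_dir, "sub", "b.txt")))
```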
rar
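A format that bundles and compresses files. It was originally used on DOS and is based on the Windows operating system. Its compression ratio is higher than zip's, but it is slower, and random access is slow as well.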