import glob
import os
import cv2

### Loop through all jpg files in the current folder
### Resize each one to size 600x600
for image_filename in glob.glob("*.jpg"):
    ### Read in the image data
    img = cv2.imread(image_filename)

    ### Resize the image
    img = cv2.resize(img, (600, 600))
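The program above follows a simple pattern that you will often see in data processing scripts:
First, start with a list of files (or other data) that you want to process.
Then, use a for loop to process each piece of data one at a time, running the preprocessing on each loop iteration.
Let's test this program on a folder containing 1000 jpeg files and see how long it takes to run: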
time python standard_res_conversion.py
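The shell's `time` command measures the whole process, including interpreter start-up and imports. Where that command is not available, or when only the loop itself should be measured, the same kind of measurement can be taken from inside Python. The following is a minimal sketch, not the method used for the figures in this article: `time_call` and `process_all` are hypothetical names, and `process_all` is a stand-in for the resize loop so that the example runs without OpenCV.

```python
import time


def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def process_all(items):
    """Hypothetical stand-in for the resize loop: handles items one at a time."""
    return [item * 2 for item in items]


result, seconds = time_call(process_all, range(1000))
print(f"Processed {len(result)} items in {seconds:.4f} seconds")
```

`time.perf_counter()` is used here because it is a monotonic, high-resolution clock intended for measuring short durations.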
On my Core i7-8700K 6-core CPU, the run time was 7.9864 seconds! That seems unacceptably slow for such a high-end CPU, so let's see what we can do.
import glob
import os
import cv2
import concurrent.futures


def load_and_resize(image_filename):
    ### Read in the image data
    img = cv2.imread(image_filename)

    ### Resize the image
    img = cv2.resize(img, (600, 600))


### Create a pool of processes. By default, one is created for each CPU in your machine.
with concurrent.futures.ProcessPoolExecutor() as executor:
    ### Get a list of files to process
    image_files = glob.glob("*.jpg")

    ### Process the list of files, but split the work across the process pool to use all CPUs
    ### Loop through all jpg files in the current folder
    ### Resize each one to size 600x600
    executor.map(load_and_resize, image_files)
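One detail of `executor.map` is worth a sketch. It returns an iterator over the results, in the same order as the inputs, and an exception raised inside a worker only surfaces when the corresponding result is read from that iterator. In a script that never reads the results, a failed file can therefore go unnoticed. The example below is a sketch under stated assumptions rather than the article's own code: `scale_factors` and `run_in_pool` are hypothetical names, and `scale_factors` replaces the OpenCV calls with plain arithmetic so that there is a return value to collect.

```python
import concurrent.futures


def scale_factors(size, target=600):
    """Hypothetical stand-in for load_and_resize that returns a value.

    Takes a (width, height) pair and returns the factors needed to reach
    target x target, so there is a result to collect from the pool.
    """
    width, height = size
    if width <= 0 or height <= 0:
        raise ValueError(f"invalid image size: {size!r}")
    return (target / width, target / height)


def run_in_pool(sizes, max_workers=None):
    """Map scale_factors over sizes using a pool of worker processes."""
    # max_workers=None keeps the executor's default worker count
    with concurrent.futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
        # list() consumes the iterator returned by map(): results come back in
        # input order, and an exception raised in a worker is re-raised here
        return list(executor.map(scale_factors, sizes))
```

In a script, a call such as `run_in_pool([(1200, 600), (300, 300)], max_workers=2)` would normally sit under an `if __name__ == "__main__":` guard, because start methods that re-import the main module (the default on Windows and macOS) would otherwise try to create the pool again in every worker.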
Let's pull one line out of the script above:
with concurrent.futures.ProcessPoolExecutor() as executor:
The more CPU cores you have, the more Python processes are started by default; my CPU has 6 cores. The actual processing code is as follows: