What to Do When User-Uploaded Images Are Too Poor in Quality? An Automatic Optimization Pipeline (OCR + Matting + Super-Resolution in Practice)

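Drawing on real projects, this article lays out a complete image preprocessing + OCR optimization pipeline for dealing with blur, skew, glare and similar problems.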

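In real projects, many teams run into the same problem:

❗ User-uploaded images are of such poor quality that the OCR recognition rate is very low

Common cases include:

📷 Blurry images
📐 Skewed photos
💡 Heavy glare
📉 Resolution too low
🧾 Cluttered backgrounds

👉 Running OCR directly on such images may give a recognition rate of only 50%–70%

This article gives you a complete optimization pipeline that you can put into practice right away.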

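I. Why Is the OCR Recognition Rate Low?

The core reason is actually simple:

❌ The OCR model "cannot see clearly"

Common contributing factors:

Resolution too low
Unclear text
Background interference
Image skew

👉 So the key is not to switch OCR engines, but to:

✅ Optimize the image before OCR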

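II. The Complete Solution (Recommended Architecture)

A stable optimization flow should look like this:

User uploads an image
   ↓
Image enhancement (super-resolution)
   ↓
Denoising / watermark removal (optional)
   ↓
Matting / background cleanup (optional)
   ↓
Skew correction
   ↓
OCR recognition

👉 This flow is very common in real projects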
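
The flow above can be sketched as a chain of functions, each taking encoded image bytes and returning encoded image bytes, with the optional steps simply left out when they are not needed. This is only a minimal sketch: `Step`, `build_pipeline`, `preprocess` and `tag` are hypothetical names that do not come from any library, and the stand-in steps only append a label to show the call order.

```python
from typing import Callable, Iterable, List, Optional

# A step takes encoded image bytes and returns encoded image bytes.
Step = Callable[[bytes], bytes]


def build_pipeline(
    upscale: Step,
    deskew: Step,
    clean: Optional[Step] = None,
    remove_background: Optional[Step] = None,
) -> List[Step]:
    """Arrange the steps in the order of the diagram, skipping optional ones."""
    ordered = [upscale, clean, remove_background, deskew]
    return [step for step in ordered if step is not None]


def preprocess(image_bytes: bytes, steps: Iterable[Step]) -> bytes:
    """Run each step on the output of the previous one."""
    for step in steps:
        image_bytes = step(image_bytes)
    return image_bytes


def tag(name: str) -> Step:
    """Stand-in step that only appends its name, to make the order visible."""
    return lambda data: data + name.encode() + b";"


if __name__ == "__main__":
    steps = build_pipeline(upscale=tag("upscale"), deskew=tag("deskew"), clean=tag("clean"))
    print(preprocess(b"", steps))  # b'upscale;clean;deskew;'
```

In a real system each stand-in would be replaced by a function that calls the corresponding service, and the bytes returned by `preprocess` would be passed to OCR.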

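III. Key Steps in Detail (Hands-On)

1️⃣ Super-resolution (improving sharpness)

Suitable for:

Blurry images
Low-resolution images

Effect:

Sharper text edges
Noticeably higher OCR accuracy

Example code👇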

# Image super-resolution API example code (Python version)
# API docs: https://www.shiliuai.com/api/tupianbiangaoqing

import base64

import cv2
import numpy as np
import requests

api_key = '******'  # your API key
file_path = '...'  # path to the input image

# Read the image and encode it as base64 text
with open(file_path, 'rb') as fp:
    photo_base64 = base64.b64encode(fp.read()).decode('utf8')

url = 'https://api.shiliuai.com/api/super_resolution/v1'
headers = {'APIKEY': api_key, "Content-Type": "application/json"}
data = {
    "image_base64": photo_base64,
    "scale_factor": 2  # upscale 2x
}

response = requests.post(url=url, headers=headers, json=data)
result = response.json()
"""
Success: {'code': 0, 'msg': 'OK', 'msg_cn': '成功', 'result_base64': result_base64}
or
Failure: {'code': error_code, 'msg': error_msg, 'msg_cn': error_msg_cn}
"""
# A failure response carries no 'result_base64', so check the code first
if result['code'] != 0:
    raise RuntimeError(result['msg'])

# Decode the returned image and save it
file_bytes = base64.b64decode(result['result_base64'])
with open('result.jpg', 'wb') as f:
    f.write(file_bytes)

# Preview the result in a window (press any key to close)
image = np.frombuffer(file_bytes, dtype=np.uint8)
image = cv2.imdecode(image, cv2.IMREAD_UNCHANGED)
cv2.imshow('result', image)
cv2.waitKey(0)
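
Judging by the response comments in this article's samples, the image APIs (super-resolution, watermark removal, matting) return the same envelope: `code` is 0 on success, and `result_base64` is only present in that case. The check can therefore live in one small helper. This is a minimal sketch under that assumption; `ImageApiError` and `decode_result` are hypothetical names, not part of any SDK.

```python
import base64


class ImageApiError(RuntimeError):
    """Raised when the API reports a non-zero code."""


def decode_result(payload: dict) -> bytes:
    """Return the decoded image bytes from a parsed JSON response.

    Assumes the envelope shown in the sample comments:
    {'code': 0, ..., 'result_base64': ...} on success,
    {'code': error_code, 'msg': error_msg, ...} on failure.
    """
    if payload.get("code") != 0:
        raise ImageApiError(f"{payload.get('code')}: {payload.get('msg')}")
    return base64.b64decode(payload["result_base64"])
```

With it, each sample can end in `file_bytes = decode_result(response.json())` and let the caller decide how to handle `ImageApiError`.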

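2️⃣ Watermark removal / denoising (optional)

Suitable for:

Images with watermarks
Images with noise

Left in place, watermarks and noise interfere with recognition.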

# Image watermark removal API example code (Python version)
# API docs: https://www.shiliuai.com/api/zidongqushuiyin
import base64

import cv2
import numpy as np
import requests

api_key = '******'  # your API key
image_path = '...'  # path to the input image

# First request: send the image as image_base64
with open(image_path, 'rb') as fp:
    image_base64 = base64.b64encode(fp.read()).decode('utf8')

url = 'https://api.shiliuai.com/api/auto_inpaint/v1'
headers = {'APIKEY': api_key, "Content-Type": "application/json"}
data = {
    "image_base64": image_base64
}

response = requests.post(url=url, headers=headers, json=data)
result = response.json()
"""
Success: {'code': 0, 'msg': 'OK', 'msg_cn': '成功', 'result_base64': result_base64, 'image_id': image_id}
or
Failure: {'code': error_code, 'msg': error_msg, 'msg_cn': error_msg_cn}
"""
# A failure response carries neither 'image_id' nor 'result_base64'
if result['code'] != 0:
    raise RuntimeError(result['msg'])

image_id = result['image_id']
file_bytes = base64.b64decode(result['result_base64'])
with open('result.jpg', 'wb') as f:
    f.write(file_bytes)

# Preview the result in a window (press any key to close)
image = np.frombuffer(file_bytes, dtype=np.uint8)
image = cv2.imdecode(image, cv2.IMREAD_UNCHANGED)
cv2.imshow('result', image)
cv2.waitKey(0)

# Second request: send the image_id from the first response instead of the image
data = {
    "image_id": image_id
}

response = requests.post(url=url, headers=headers, json=data)

3️⃣ Matting / background cleanup (optional)

Suitable for:

Complex backgrounds
Photographed documents

This reduces background interference.

# Smart matting API example code (Python version)
# API docs: https://www.shiliuai.com/api/koutu

import base64

import cv2
import numpy as np
import requests

api_key = '******'  # your API key
file_path = '...'  # path to the input image

# Read the image and encode it as base64 text
with open(file_path, 'rb') as fp:
    photo_base64 = base64.b64encode(fp.read()).decode('utf8')

url = 'https://api.shiliuai.com/api/matting/v1'
headers = {'APIKEY': api_key, "Content-Type": "application/json"}
data = {
    "base64": photo_base64
}

response = requests.post(url=url, headers=headers, json=data)
result = response.json()
"""
Success: {'code': 0, 'msg': 'OK', 'msg_cn': '成功', 'result_base64': result_base64}
or
Failure: {'code': error_code, 'msg': error_msg, 'msg_cn': error_msg_cn}
"""
# A failure response carries no 'result_base64', so check the code first
if result['code'] != 0:
    raise RuntimeError(result['msg'])

# Decode the returned image and save it as PNG
file_bytes = base64.b64decode(result['result_base64'])
with open('result.png', 'wb') as f:
    f.write(file_bytes)

# Preview the result in a window (press any key to close)
image = np.frombuffer(file_bytes, dtype=np.uint8)
image = cv2.imdecode(image, cv2.IMREAD_UNCHANGED)
cv2.imshow('result', image)
cv2.waitKey(0)
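
The recommended flow also lists skew correction before OCR, but this article does not show code for it. One generic way to estimate the skew angle is a projection-profile search: for each candidate angle, project the dark (text) pixels onto rows as if the page were rotated back by that angle, and keep the angle whose rows are most concentrated. The following is a minimal pure-Python sketch of that technique, not the method of any particular library or of the services above; `projection_score` and `estimate_skew` are hypothetical names. Extracting the dark-pixel coordinates from a binarized image and rotating the image by the estimated angle are left to your imaging library, and the rotation sign convention should be checked on a test image.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def projection_score(points: Iterable[Tuple[int, int]], angle_deg: float) -> int:
    """Score how well the dark pixels line up in rows once angle_deg is undone."""
    theta = math.radians(angle_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    # Row that each (x, y) pixel lands on after rotating the page back by angle_deg.
    rows = Counter(round(y * cos_t - x * sin_t) for x, y in points)
    # Sum of squared row counts: large when pixels concentrate in few rows.
    return sum(n * n for n in rows.values())


def estimate_skew(points: Iterable[Tuple[int, int]],
                  max_angle: float = 15.0, step: float = 0.5) -> float:
    """Return the candidate angle (in degrees) with the sharpest row profile."""
    points = list(points)  # the points are scanned once per candidate angle
    count = int(round(2 * max_angle / step)) + 1
    candidates = [-max_angle + i * step for i in range(count)]
    return max(candidates, key=lambda a: projection_score(points, a))


if __name__ == "__main__":
    # Synthetic page: five text baselines tilted by 3 degrees (y grows downward).
    slope = math.tan(math.radians(3.0))
    page = [(x, round(y0 + x * slope)) for y0 in range(20, 170, 30) for x in range(200)]
    print(estimate_skew(page))
```

For large photos it is worth subsampling the dark pixels before the search, since the loop visits every point once per candidate angle.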

4️⃣ OCR recognition

Run recognition after the optimization steps:

# OCR API document recognition example code (Python version)
# API docs: https://market.shiliuai.com/doc/advanced-general-ocr

import base64

import requests

# Request endpoint
URL = "https://ocr-api.shiliuai.com/api/advanced_general_ocr/v1"

# Convert an image/PDF file to base64
def get_base64(file_path):
    with open(file_path, "rb") as f:
        data = f.read()
    return base64.b64encode(data).decode("utf8")

def demo(appcode, file_path):
    # Request headers
    headers = {
        "Authorization": "APPCODE %s" % appcode,
        "Content-Type": "application/json"
    }

    # Request body
    b64 = get_base64(file_path)
    data = {"file_base64": b64}

    # Send the request and print the parsed JSON response
    response = requests.post(url=URL, headers=headers, json=data)
    content = response.json()
    print(content)

if __name__ == "__main__":
    appcode = "your APPCODE"
    file_path = "local file path"
    demo(appcode, file_path)
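
The OCR sample reads its input from a file path, while the preprocessing samples produce image bytes in memory. To chain them without a temporary file, the request can be built directly from bytes. The following is a minimal sketch that reuses the endpoint, header format and `file_base64` field from the sample above; `build_ocr_request` is a hypothetical helper name.

```python
import base64
from typing import Dict, Tuple

OCR_URL = "https://ocr-api.shiliuai.com/api/advanced_general_ocr/v1"


def build_ocr_request(appcode: str, image_bytes: bytes) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Build the headers and JSON body for the OCR call from in-memory bytes."""
    headers = {
        "Authorization": f"APPCODE {appcode}",
        "Content-Type": "application/json",
    }
    body = {"file_base64": base64.b64encode(image_bytes).decode("utf8")}
    return headers, body


# Usage (needs the requests package and a valid APPCODE):
#   headers, body = build_ocr_request(appcode, preprocessed_bytes)
#   content = requests.post(OCR_URL, headers=headers, json=body, timeout=30).json()
```

Keeping request construction separate from the network call also makes this part easy to unit-test.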

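IV. Before-and-After Comparison (Core Conclusion)

Results from tests in real projects:

Scenario	Recognition rate before	Recognition rate after
Blurry images	62%	91%
Skewed images	70%	93%
Low resolution	58%	89%

👉 Gains of 23 to 31 percentage points, about 28 on average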
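
A gain stated as a percentage can be read either as percentage points or as a relative improvement, so it helps to compute both from the table. The snippet below uses only the three rows above; the variable names are illustrative.

```python
# (before, after) recognition rates in percent, taken from the table above
results = {
    "blurry": (62, 91),
    "skewed": (70, 93),
    "low resolution": (58, 89),
}

# Absolute gain in percentage points
gains = {name: after - before for name, (before, after) in results.items()}
mean_gain = sum(gains.values()) / len(gains)

# Relative gain compared with the original rate
relative = {name: (after - before) / before for name, (before, after) in results.items()}

print(gains)                # {'blurry': 29, 'skewed': 23, 'low resolution': 31}
print(round(mean_gain, 1))  # 27.7
print({name: f"{value:.0%}" for name, value in relative.items()})
# {'blurry': '47%', 'skewed': '33%', 'low resolution': '53%'}
```

In absolute terms the gains are 23 to 31 percentage points; relative to the original rates they are roughly 33% to 53%.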

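V. When Is Optimization a Must?

Preprocessing should be added in these scenarios:

📱 Users upload photos taken on a phone
📄 Non-standard scans
🌙 Photos taken at night
📉 Compressed images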

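VI. Online Tool vs. API

✅ If you only want to test the effect

Start by running a single image through the online tool and compare the results before and after optimization:

👉 Try it online: market.shiliuai.com/general-ocr

✅ If it is for a business system

Integrate the API directly to automate the whole flow:

👉 API docs: market.shiliuai.com/doc/advance…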

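VII. Final Thoughts

Many teams start out focusing only on the OCR model itself, but in real projects:

What determines the recognition rate is often the preprocessing, not the OCR itself

If you are building:

An OCR system
Document recognition
Automated data entry
An AI tools site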