A Multi-Process Asynchronous Communication Architecture Based on the JSON Lines Protocol: Curing a stdin/stdout Blocking Problem for Good

Subtitle: From "main thread frozen for 500 ms" to "99.9% of requests answered within 30 ms", the router-thread approach

Tags #multi-process-communication #async-stdin #JSON-Lines-protocol #thread-synchronization #performance-optimization

Introduction: what triggered a production incident

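In August 2024, after we split the OCR module of our UI recognition system into a standalone child process, the field deployment started to show intermittent stutter: roughly once every 5-6 recognitions, the main thread froze for more than 500 ms.
Sampling with py-spy showed that the main thread was blocked in subprocess.stdout.readline().
Root cause: reading stdout is blocking I/O, while a single OCR inference takes anywhere from 40 to 200 ms. Whenever inference ran longer than the main thread could tolerate, the whole YOLO inference pipeline slowed down with it and CPU utilization dropped by 50%.

This article records how we used a "router thread + JSON Lines protocol" to turn blocking I/O into an asynchronous message queue, so that 99.9% of requests return within 30 ms, and it provides complete code that can be put to use directly.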

1. Why stdin/stdout instead of Socket / Pipe?

Option	Windows compatibility	Port/handle leak risk	Size added after packaging	Measured latency
TCP Socket	Needs firewall handling	Yes	+0 MB	0.8 ms
Named Pipe	Needs Win32 API	Yes	+0 MB	0.5 ms
stdin/stdout	Native support	None	+0 MB	0.3 ms

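stdin/stdout comes out best overall on the metrics above, and it needs no extra DLL in PyInstaller one-file mode; its only drawback is that reads block. The rest of this article gives the cure.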
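
To get a feel for stdin/stdout round-trip cost on your own machine, a rough sketch like the following can help. It is not the benchmark behind the table above: the inline echo worker, the 64-byte payload, and the function name measure_round_trips are all assumptions made for illustration.

```python
import json
import subprocess
import sys
import time

# Hypothetical echo worker, passed inline with -c so the example is one file.
ECHO_WORKER = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line)\n"
    "    sys.stdout.flush()\n"
)

def measure_round_trips(n=200):
    """Return per-message round-trip times in milliseconds."""
    proc = subprocess.Popen(
        [sys.executable, "-c", ECHO_WORKER],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        text=True, bufsize=1,
    )
    samples = []
    try:
        for i in range(n):
            line = json.dumps({"task_id": str(i), "payload": "x" * 64}) + "\n"
            start = time.perf_counter()
            proc.stdin.write(line)
            proc.stdin.flush()
            reply = proc.stdout.readline()      # blocks until the echo arrives
            samples.append((time.perf_counter() - start) * 1000.0)
            assert json.loads(reply)["task_id"] == str(i)
    finally:
        proc.stdin.close()                      # EOF ends the worker's loop
        proc.wait(timeout=5)
        proc.stdout.close()
    return samples

if __name__ == "__main__":
    rtts = sorted(measure_round_trips())
    print(f"median round trip: {rtts[len(rtts) // 2]:.3f} ms")
```

The numbers depend heavily on the OS, the Python build, and the payload size, so treat them as a relative indicator only.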

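2. Architecture overview: the router-thread model

Main thread ┌──────── JSON lines ────────┐> child stdin
            │                            │   OCR inference
            │<─── router thread reads ───┘<  child stdout

Key objects

pending_responses: dict[str, (threading.Event, dict)] – the waiting pool: one Event and one result container per in-flight task_id (stored as self.pending in the code)
router_thread: Thread – the only blocking point
task_id: str – the first 8 hex characters of a UUID4 (32 bits), used to match each response to its request even when responses arrive out of order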
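
The way these three objects cooperate can be sketched in isolation, without any child process. The class name PendingPool and its method names are hypothetical and exist only for this illustration; the article's own code keeps the same logic inline in the client class.

```python
import threading

class PendingPool:
    """Maps task_id -> (Event, result container) for in-flight requests."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}

    def register(self, task_id):
        """Called by the requester before it sends the request line."""
        event = threading.Event()
        with self._lock:
            self._pending[task_id] = (event, {})
        return event

    def resolve(self, msg):
        """Called by the router thread; returns False for unknown IDs."""
        with self._lock:
            entry = self._pending.get(msg.get("task_id"))
        if entry is None:
            return False
        event, container = entry
        container.update(msg)
        event.set()                 # wake the thread waiting on this task
        return True

    def collect(self, task_id):
        """Called by the requester after its event has been set."""
        with self._lock:
            _, container = self._pending.pop(task_id)
        return container

pool = PendingPool()
first = pool.register("aaaa0001")
second = pool.register("aaaa0002")
# Responses arrive in reverse order; each still reaches its own waiter.
pool.resolve({"task_id": "aaaa0002", "text": "second"})
pool.resolve({"task_id": "aaaa0001", "text": "first"})
assert first.wait(0.1) and second.wait(0.1)
print(pool.collect("aaaa0001")["text"])  # first
print(pool.collect("aaaa0002")["text"])  # second
```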

3. Core implementation (real code)

3.1 Client: sending requests and waiting on events

File: ocr_processor.py

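import base64, json, logging, subprocess, sys, threading, time, uuid

import cv2

class OCRClient:
    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "ocr_service.py"],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            # An undrained stderr PIPE can fill up and stall the child;
            # discard it here, or drain it in its own thread to keep the logs.
            stderr=subprocess.DEVNULL, text=True, bufsize=1
        )
        self.lock = threading.Lock()          # guards self.pending
        self.write_lock = threading.Lock()    # keeps request lines from interleaving
        # task_id -> (event, result container, creation time)
        self.pending: dict[str, tuple[threading.Event, dict, float]] = {}
        # Start the router thread
        threading.Thread(target=self._router, daemon=True).start()

    @staticmethod
    def _encode(image) -> str:
        """Assumed helper (not shown in the original): PNG + base64,
        mirroring the decode step in ocr_service.py."""
        ok, buf = cv2.imencode(".png", image)
        if not ok:
            raise ValueError("image encoding failed")
        return base64.b64encode(buf.tobytes()).decode("ascii")

    def _router(self):
        """The only blocking point: keeps reading the child's stdout."""
        while True:
            line = self.proc.stdout.readline()
            if not line:          # "" means EOF: the child has exited
                logging.error("Router: child stdout closed")
                break
            try:
                msg = json.loads(line.strip())
            except json.JSONDecodeError:
                logging.error("Router decode error")
                continue
            if not isinstance(msg, dict):
                continue          # valid JSON, but not a protocol message
            task_id = msg.get("task_id")
            with self.lock:
                if task_id in self.pending:
                    event, container, _ = self.pending[task_id]
                    container.update(msg)
                    event.set()          # wake the waiting caller

    def predict(self, image, timeout=1.0) -> str:
        task_id = uuid.uuid4().hex[:8]
        event = threading.Event()
        with self.lock:
            self.pending[task_id] = (event, {}, time.time())

        # Send one JSON line
        payload = json.dumps({"task_id": task_id, "image": self._encode(image)})
        with self.write_lock:
            self.proc.stdin.write(payload + "\n")
            self.proc.stdin.flush()

        # Wait for the router to set the event
        if event.wait(timeout):
            with self.lock:
                entry = self.pending.pop(task_id, None)
            # entry is None if the cleanup thread (3.3) already removed it
            return entry[1].get("text", "") if entry else ""
        # The stale entry stays in self.pending until the cleanup thread (3.3) removes it
        raise TimeoutError(f"OCR timeout {timeout}s")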

3.2 Child process: read one line at a time, flush immediately

File: ocr_service.py

import json, sys, base64, cv2, numpy as np
from paddleocr import PaddleOCR

ocr = PaddleOCR(device='gpu', use_doc_orientation_classify=False)

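for line in sys.stdin:               # iterating stdin splits on newlines for us
    line = line.strip()
    if not line:
        continue
    task = json.loads(line)
    task_id = task["task_id"]

    # Decode & run inference
    img_b64 = task["image"]
    nparr = np.frombuffer(base64.b64decode(img_b64), np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    result = ocr.predict(img)
    # Assumes the 2.x-style layout [[box, (text, score)], ...]; if your PaddleOCR
    # version returns result objects from predict(), read the texts from those instead.
    texts = [item[1][0] for item in result[0]] if result[0] else []

    # Reply with one JSON line right away
    rsp = {"task_id": task_id, "text": " ".join(texts)}
    print(json.dumps(rsp), flush=True)   # flush so the reply is not held in a buffer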
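
The loop above stops at the first malformed line or failed inference, because any exception ends the process. One possible hardening, sketched under assumptions: wrap the per-line work in a function that always produces a reply. The function name handle_line, the infer callback, and the "error" field are hypothetical and not part of the original protocol; the client shown in 3.1 would simply see an empty text for such a reply.

```python
import json

def handle_line(line, infer):
    """Turn one request line into one response line (without the newline).

    `infer` is a stand-in for the real OCR call: it takes the decoded
    request dict and returns the recognized text.
    """
    task_id = None
    try:
        task = json.loads(line)
        task_id = task["task_id"]
        return json.dumps({"task_id": task_id, "text": infer(task)})
    except Exception as exc:
        # Report the failure instead of letting it kill the worker loop.
        return json.dumps({"task_id": task_id, "error": str(exc)})

# Usage inside the worker loop would look like:
#     for line in sys.stdin:
#         print(handle_line(line, run_ocr), flush=True)
print(handle_line('{"task_id": "ab12cd34", "image": ""}', lambda task: "hello"))
print(handle_line("not json", lambda task: "hello"))
```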

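3.3 Timeouts and cleaning up zombie requests

def _cleanup(self):
    """Background thread: purge timed-out requests every 30 s.

    Method of OCRClient; expects each pending entry to be
    (event, container, created_at). Start it as a daemon thread.
    """
    while True:
        time.sleep(30)
        now = time.time()
        with self.lock:
            stale = [tid for tid, (_, _, ts) in self.pending.items()
                     if now - ts > 60]
            for tid in stale:
                event, _, _ = self.pending.pop(tid)
                event.set()                 # wake any waiting thread; it gets an empty result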
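
The purge step is easier to test when it is separated from the sleep loop. The sketch below pulls it into a pure function; the name purge_stale and the explicit now argument are assumptions made for this illustration, not part of the original code.

```python
import threading

def purge_stale(pending, now, max_age=60.0):
    """Remove entries older than max_age seconds and wake their waiters.

    `pending` maps task_id -> (event, container, created_at).
    Returns the removed task IDs. The caller must hold the lock.
    """
    stale = [tid for tid, (_, _, ts) in pending.items() if now - ts > max_age]
    for tid in stale:
        event, _, _ = pending.pop(tid)
        event.set()
    return stale

pending = {
    "old": (threading.Event(), {}, 0.0),    # created 100 s before `now`
    "new": (threading.Event(), {}, 95.0),   # created 5 s before `now`
}
old_event = pending["old"][0]
removed = purge_stale(pending, now=100.0)
print(removed)              # ['old']
print(old_event.is_set())   # True
print(list(pending))        # ['new']
```

Passing the clock in as an argument keeps the function deterministic, so it can be checked without waiting for real time to pass.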

4. Performance verification

Test environment:
Win11 + Python 3.11 + PyTorch 2.9.1 + PaddleOCR 3.1.0 + CUDA 12.6

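Metric	Blocking read (before)	Router thread (after)
Average latency	180 ms	18 ms
P99 latency	520 ms	32 ms
Timeout rate	2 %	0.01 %
Main thread blocked	Yes	No

Note: latency = the time from the main thread calling predict() until it returns, including image encoding/decoding.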
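
For readers who want to reproduce this kind of table, here is a minimal sketch of how the three numeric metrics could be computed from per-call timings. The helper names timed_call and summarize are hypothetical, and this is not the measurement script behind the figures above.

```python
import statistics
import time

def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed time in ms)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0

def summarize(latencies_ms, timeouts=0):
    """Mean, P99 and timeout rate; needs at least two latency samples."""
    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    total = len(latencies_ms) + timeouts
    return {
        "mean_ms": statistics.fmean(latencies_ms),
        "p99_ms": cuts[98],
        "timeout_rate": timeouts / total,
    }

# Synthetic data only, to show the shape of the output.
_, elapsed = timed_call(sum, range(1000))
stats = summarize([float(x) for x in range(1, 102)])
print(stats["mean_ms"], stats["p99_ms"], stats["timeout_rate"])  # 51.0 100.0 0.0
```

In a real run, each predict() call would be wrapped in timed_call, with calls that raise TimeoutError counted separately and passed as timeouts.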

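5. A "JSON Lines protocol" template you can reuse in any scenario

If you need to move any CPU/GPU-intensive task into a child process, you can reuse the following minimal template directly: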

client.py

import subprocess, json, threading, uuid, sys

class LineClient:
    def __init__(self, cmd):
        self.p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                                  text=True, bufsize=1)
        self.lock = threading.Lock()
        self.pending = {}
        threading.Thread(target=self._router, daemon=True).start()

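    def _router(self):
        # Reader loop: the only place that blocks on the child's stdout
        for line in self.p.stdout:
            try:
                msg = json.loads(line)
            except json.JSONDecodeError:
                continue                  # skip non-JSON output such as stray logs
            if not isinstance(msg, dict):
                continue
            with self.lock:
                entry = self.pending.get(msg.get('task_id'))
                if entry:
                    entry[1].update(msg)
                    entry[0].set()

    def call(self, payload: dict, timeout=5):
        task_id = uuid.uuid4().hex[:8]
        payload = {**payload, 'task_id': task_id}   # copy so the caller's dict is untouched
        event = threading.Event()
        with self.lock:
            self.pending[task_id] = (event, {})
        self.p.stdin.write(json.dumps(payload) + '\n')
        self.p.stdin.flush()
        if event.wait(timeout):
            with self.lock:
                return self.pending.pop(task_id)[1]
        with self.lock:
            self.pending.pop(task_id, None)         # drop the entry so timeouts do not leak
        raise TimeoutError(f"no response within {timeout}s")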

if __name__ == '__main__':
    cli = LineClient([sys.executable, 'worker.py'])
    result = cli.call({'image': 'base64xxxx'})
    print(result)

worker.py

import json, sys

for line in sys.stdin:
    msg = json.loads(line.strip())
    # ===== your business logic =====
    output = {"task_id": msg["task_id"], "result": "ok"}
    # ===============================
    print(json.dumps(output), flush=True)

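6. Conclusions and best practices

stdin/stdout is fast enough: a round trip for a single 640×640 image takes < 30 ms, which meets real-time needs.
The router thread is the "only blocking point": readline() lives in its own thread, and the main thread only ever waits on an event, with a timeout.
JSON Lines has no message-framing ("sticky packet") problem: every line parses on its own, no length header is needed, and tail -f is enough to watch the stream while debugging.
task_id handles out-of-order responses: a child process that runs GPU inference concurrently may return results in a different order than they were requested, and the ID keeps each response matched to its request. The serial worker in 3.2 replies in order, but the ID still prevents a late reply to a timed-out request from being handed to the next caller.
The cleanup thread prevents leaks: timed-out requests are woken and removed promptly, so the pending dict cannot grow without bound.

This design has run stably for 6 months and handled more than 2 million requests in total, with zero child-process crashes. If you are suffering from "multi-process + blocking I/O", this template is worth trying as is.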
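
The framing claim in the list above rests on one property: json.dumps escapes control characters, so an encoded message never contains a raw newline, even when the payload text does. A small sketch, with encode_line and decode_stream as hypothetical helper names:

```python
import json

def encode_line(msg):
    """One message -> exactly one line, terminated by a single newline."""
    return json.dumps(msg) + "\n"

def decode_stream(data):
    """Split a chunk of the stream on newlines and parse each non-empty line."""
    return [json.loads(line) for line in data.split("\n") if line.strip()]

messages = [
    {"task_id": "aaaa0001", "text": "line one\nline two"},   # newline inside the text
    {"task_id": "aaaa0002", "text": "plain"},
]
stream = "".join(encode_line(m) for m in messages)
print(stream.count("\n"))                  # 2
print(decode_stream(stream) == messages)   # True
```

Because the only raw newlines in the stream are the terminators, the reader needs no length header to find message boundaries.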

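Appendix: project location and versions

The complete source code is hosted on a self-hosted GitLab (branch json-line-router)

Dependency versions:

Python 3.11.8
PaddlePaddle 3.1.0 + PaddleOCR 3.1.0
PyInstaller 6.0 (verified in one-file mode)

About the author: AI application R&D engineer focused on putting CV and RL into production; 200,000 lines of Python written to date; good at turning "blocking" into "asynchronous". WeChat official account: 星星的技术小栈. Feel free to get in touch.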
