The gevent monkey.patch_all pitfall



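Code

This one cost me a whole morning plus half an afternoon. The conclusion first: it is a problem with where monkey.patch_all() sits in the import order. Import gevent and apply the monkey patch before anything else.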

import gevent as gevent
from gevent.pool import Pool
from tqdm import tqdm  # imported before monkey.patch_all(): this order triggers the problem
from gevent import monkey;monkey.patch_all()

# search_question, contents and clustering_contents are defined elsewhere in the project
def search_by_gevent(content):
    res = search_question(content[1])
    if not res:
        clustering_contents.append(content)

pool = Pool(2)
threads = [pool.spawn(search_by_gevent, c) for c in contents]
gevent.joinall(threads, timeout=10)

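The code is simple: for every item in contents I need to query an internal API. Each query is an HTTP POST request with some latency, so I originally used multithreading. I then compared Python threads with coroutines, found the results about the same, and rewrote it with gevent. Everything worked fine when I finished it yesterday, and then the problem described below showed up.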
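For comparison, here is a minimal sketch of what the earlier thread-based version might have looked like, using the standard library's ThreadPoolExecutor. The data and the body of search_question are hypothetical stand-ins (the real function posts to an internal API), and search_by_thread is a name made up for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical stand-ins for the project's own data.
contents = [(0, "known question"), (1, "new question"), (2, "another new one")]
known = {"known question"}
clustering_contents = []


def search_question(text):
    """Placeholder for the internal API call; truthy when a match exists."""
    return text in known


def search_by_thread(content):
    # Keep only the items the API could not match.
    if not search_question(content[1]):
        clustering_contents.append(content)


# Two workers, mirroring Pool(2) in the gevent version.
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(search_by_thread, c) for c in contents]
    done, not_done = wait(futures, timeout=10)
```

Note that the order of items in clustering_contents depends on which worker finishes first, in the thread version as well as in the gevent version.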

I searched everywhere, from Stack Overflow and GitHub down to CSDN and Jianshu, and none of the suggested fixes solved it. Here is the problem I ran into.

Problem

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "D:\Anaconda\envs\Custermer_Question_Answer_Clustering\lib\site-packages\gevent\thread.py", line 121, in acquire
    acquired = BoundedSemaphore.acquire(self, blocking, timeout)
  File "src/gevent/_semaphore.py", line 180, in gevent._gevent_c_semaphore.Semaphore.acquire
  File "src/gevent/_semaphore.py", line 259, in gevent._gevent_c_semaphore.Semaphore.acquire
  File "src/gevent/_semaphore.py", line 249, in gevent._gevent_c_semaphore.Semaphore.acquire
  File "src/gevent/_abstract_linkable.py", line 521, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait
  File "src/gevent/_abstract_linkable.py", line 487, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait_core
  File "src/gevent/_abstract_linkable.py", line 490, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait_core
  File "src/gevent/_abstract_linkable.py", line 442, in gevent._gevent_c_abstract_linkable.AbstractLinkable._AbstractLinkable__wait_to_be_notified
  File "src/gevent/_abstract_linkable.py", line 451, in gevent._gevent_c_abstract_linkable.AbstractLinkable._switch_to_hub
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 65, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_gevent_c_greenlet_primitives.pxd", line 35, in gevent._gevent_c_greenlet_primitives._greenlet_switch
gevent.exceptions.LoopExit: This operation would block forever
	Hub: <Hub '' at 0x1cd437e81c8 backend=default ptr=<cdata 'struct uv_loop_s *' 0x00007FF94D32DE70> default pending=0 ref=0 callbacks=0 resolver=<gevent.resolver.thread.Resolver at 0x1cd5e058f88 pool=<ThreadPool at 0x1cd5e051dd8 tasks=0 size=2 maxsize=10 hub=<Hub at 0x1cd437e81c8 thread_ident=0x6a0>>> threadpool=<ThreadPool at 0x1cd5e051dd8 tasks=0 size=2 maxsize=10 hub=<Hub at 0x1cd437e81c8 thread_ident=0x6a0>> thread_ident=0x6a0>
	Handles:
[HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001CD3DB062A8>, type=b'check', watcher=<gevent.libuv.loop.loop at 0x1cd437f47c8 backend=default ptr=<cdata 'struct uv_loop_s *' 0x00007FF94D32DE70> default pending=0 ref=0 callbacks=0>, ref=0, active=1, closing=0),
 HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001CD3D305838>, type=b'timer', watcher=<gevent.libuv.loop.loop at 0x1cd437f47c8 backend=default ptr=<cdata 'struct uv_loop_s *' 0x00007FF94D32DE70> default pending=0 ref=0 callbacks=0>, ref=0, active=1, closing=0),
 HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001CD3DB06A38>, type=b'prepare', watcher=<gevent.libuv.loop.loop at 0x1cd437f47c8 backend=default ptr=<cdata 'struct uv_loop_s *' 0x00007FF94D32DE70> default pending=0 ref=0 callbacks=0>, ref=0, active=1, closing=0),
 HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001CD3DB068D8>, type=b'check', watcher=<gevent.libuv.loop.loop at 0x1cd437f47c8 backend=default ptr=<cdata 'struct uv_loop_s *' 0x00007FF94D32DE70> default pending=0 ref=0 callbacks=0>, ref=1, active=0, closing=0),
 HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001CD3DEB9898>, type=b'async', watcher=<async_ at 0x1cd4388b508 callback=<function AbstractLoop._init_loop_and_aux_watchers.<locals>.<lambda> at 0x000001CD438840D8> args=() watcher=<cdata 'struct uv_async_s *' owning 224 bytes> handle=<cdata 'void *' 0x000001CD43887778> ref=False>, ref=0, active=1, closing=0),
 HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001CD567781E8>, type=b'async', watcher=<async_ at 0x1cd5e058c88 callback=<bound method ThreadPool._on_fork of <ThreadPool at 0x1cd5e051dd8 tasks=0 size=2 maxsize=10 hub=<Hub at 0x1cd437e81c8 thread_ident=0x6a0>>> args=() watcher=<cdata 'struct uv_async_s *' owning 224 bytes> handle=<cdata 'void *' 0x000001CD587E74F8> ref=False>, ref=0, active=1, closing=0)]

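Solution

The problem lies in where monkey.patch_all() appears in the import order; changing the imports to the form below makes it go away. I do not know exactly why this fixes it. Quite possibly tqdm gets called inside search_by_gevent, and something inside tqdm that monkey could have patched was left unpatched (this is only my guess). If anyone knows the real reason, please leave a comment.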

import gevent as gevent
from gevent import monkey;monkey.patch_all()
from gevent.pool import Pool
from tqdm import tqdm

This order works too: it makes little difference whether the monkey patch or the coroutine pool Pool comes first.

import gevent as gevent
from gevent.pool import Pool
from gevent import monkey;monkey.patch_all()
from tqdm import tqdm
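The guess above (that something was bound before the patch could reach it) can be illustrated without gevent at all. The sketch below uses a hand-written stand-in for a monkey patch on time.sleep; patched_sleep, early_sleep and late_sleep are names invented for this example. It shows only the general mechanism by which import order can matter. gevent's real patch_all() does considerably more than this, and whether this is what actually happens inside tqdm is not confirmed.

```python
import time

original_sleep = time.sleep

# A module imported "too early" grabs its own reference to the function.
from time import sleep as early_sleep

calls = []


def patched_sleep(seconds):
    """Stand-in for a cooperative sleep: records the call and returns at once."""
    calls.append(seconds)


# The "monkey patch": rebind the attribute on the module object.
time.sleep = patched_sleep
try:
    # A module imported after the patch sees the replacement.
    from time import sleep as late_sleep

    late_is_patched = late_sleep is patched_sleep    # True
    early_is_patched = early_sleep is patched_sleep  # False
    late_sleep(5)  # handled by the stand-in, so it returns immediately
finally:
    time.sleep = original_sleep  # undo the patch
```

The name bound before the patch keeps pointing at the original blocking function, which is why patching as early as possible is the safe default.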

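Verification experiment

While writing this up I had an idea and went to verify the conclusion above. Sure enough, the cause is where monkey.patch_all() sits in the import order. Here is the verification code. The working version: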

import gevent as gevent
from gevent import monkey;monkey.patch_all()
from gevent.pool import Pool
from tqdm import tqdm
import time



def test(num):
	res=0
	for i in tqdm(range(num)):
		res+=i
	time.sleep(1)
	print(num,res)

pool = Pool(2)
threads = [pool.spawn(test, i) for i in range(5)]
gevent.joinall(threads,timeout=10)

The version that reproduces the problem:

from tqdm import tqdm
import gevent as gevent
from gevent import monkey;monkey.patch_all()
from gevent.pool import Pool
import time

def test(num):
	res=0
	for i in tqdm(range(num)):
		res+=i
	time.sleep(1)
	print(num,res)

pool = Pool(2)
threads = [pool.spawn(test, i) for i in range(5)]
gevent.joinall(threads,timeout=10)

The problem version hangs. After stopping it with Ctrl+C, it reports the errors below, which reproduces the problem above.

 File "D:\Anaconda\lib\site-packages\gevent\_ffi\loop.py", line 270, in python_check_callback
    def python_check_callback(self, watcher_ptr): # pylint:disable=unused-argument
KeyboardInterrupt
2022-02-09T07:59:34Z
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "D:\Anaconda\lib\site-packages\gevent\thread.py", line 121, in acquire
    acquired = BoundedSemaphore.acquire(self, blocking, timeout)
  File "src/gevent/_semaphore.py", line 180, in gevent._gevent_c_semaphore.Semaphore.acquire
  File "src/gevent/_semaphore.py", line 249, in gevent._gevent_c_semaphore.Semaphore.acquire
  File "src/gevent/_abstract_linkable.py", line 521, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait
  File "src/gevent/_abstract_linkable.py", line 487, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait_core
  File "src/gevent/_abstract_linkable.py", line 490, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait_core
  File "src/gevent/_abstract_linkable.py", line 442, in gevent._gevent_c_abstract_linkable.AbstractLinkable._AbstractLinkable__wait_to_be_notified
  File "src/gevent/_abstract_linkable.py", line 451, in gevent._gevent_c_abstract_linkable.AbstractLinkable._switch_to_hub
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 65, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_gevent_c_greenlet_primitives.pxd", line 35, in gevent._gevent_c_greenlet_primitives._greenlet_switch
  File "D:\Anaconda\lib\site-packages\gevent\_ffi\loop.py", line 270, in python_check_callback
    def python_check_callback(self, watcher_ptr): # pylint:disable=unused-argument
KeyboardInterrupt
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "D:\Anaconda\lib\site-packages\gevent\thread.py", line 121, in acquire
    acquired = BoundedSemaphore.acquire(self, blocking, timeout)
  File "src/gevent/_semaphore.py", line 180, in gevent._gevent_c_semaphore.Semaphore.acquire
  File "src/gevent/_semaphore.py", line 259, in gevent._gevent_c_semaphore.Semaphore.acquire
  File "src/gevent/_semaphore.py", line 249, in gevent._gevent_c_semaphore.Semaphore.acquire
  File "src/gevent/_abstract_linkable.py", line 521, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait
  File "src/gevent/_abstract_linkable.py", line 487, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait_core
  File "src/gevent/_abstract_linkable.py", line 490, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait_core
  File "src/gevent/_abstract_linkable.py", line 442, in gevent._gevent_c_abstract_linkable.AbstractLinkable._AbstractLinkable__wait_to_be_notified
  File "src/gevent/_abstract_linkable.py", line 451, in gevent._gevent_c_abstract_linkable.AbstractLinkable._switch_to_hub
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 65, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_gevent_c_greenlet_primitives.pxd", line 35, in gevent._gevent_c_greenlet_primitives._greenlet_switch
gevent.exceptions.LoopExit: This operation would block forever
        Hub: <Hub '' at 0x1c9dad94a00 backend=default ptr=<cdata 'struct uv_loop_s *' 0x00007FF96B29DE90> default pending=0 ref=0 callbacks=0 thread_ident=0x1c90>
        Handles:
[HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001C9D7E73DB8>, type=b'check', watcher=<gevent.libuv.loop.loop at 0x1c9dadcee50 backend=default ptr=<cdata 'struct uv_loop_s *' 0x00007FF96B29DE90> default pending=0 ref=0 callbacks=0>, ref=0, active=1, closing=0),
 HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001C9DA03E068>, type=b'timer', watcher=<gevent.libuv.loop.loop at 0x1c9dadcee50 backend=default ptr=<cdata 'struct uv_loop_s *' 0x00007FF96B29DE90> default pending=0 ref=0 callbacks=0>, ref=0, active=1, closing=0),
 HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001C9D7E73628>, type=b'prepare', watcher=<gevent.libuv.loop.loop at 0x1c9dadcee50 backend=default ptr=<cdata 'struct uv_loop_s *' 0x00007FF96B29DE90> default pending=0 ref=0 callbacks=0>, ref=0, active=1, closing=0),
 HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001C9D7E73788>, type=b'check', watcher=<gevent.libuv.loop.loop at 0x1c9dadcee50 backend=default ptr=<cdata 'struct uv_loop_s *' 0x00007FF96B29DE90> default pending=0 ref=0 callbacks=0>, ref=1, active=0, closing=0),
 HandleState(handle=<cdata 'struct uv_handle_s *' 0x000001C9DA1118C8>, type=b'async', watcher=<async_ at 0x1c9dae96fd0 callback=<function AbstractLoop._init_loop_and_aux_watchers.<locals>.<lambda> at 0x000001C9DAE8DF70> args=() watcher=<cdata 'struct uv_async_s *' owning 224 bytes> handle=<cdata 'void *' 0x000001C9DAE98A00> ref=False>, ref=0, active=1, closing=0)]

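I then rearranged the imports further and found that only the relative order of tqdm and monkey.patch_all() matters: whenever tqdm is imported before monkey, the problem appears. Both orders below import tqdm after the patch: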

import time
import gevent as gevent
from gevent import monkey;monkey.patch_all()
from gevent.pool import Pool
from tqdm import tqdm

import time
import gevent as gevent
from gevent import monkey;monkey.patch_all()
from tqdm import tqdm
from gevent.pool import Pool

My recommendation: import gevent and monkey first.
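One way to catch this kind of ordering mistake early is to check sys.modules right before applying the patch. The helper below is a minimal sketch using only the standard library; the function names and the list of modules to watch are assumptions made for this example and are not part of gevent.

```python
import sys
import warnings


def imported_too_early(names, loaded=None):
    """Return the names from `names` that are already imported.

    Meant to be called right before monkey.patch_all(): anything
    returned here was imported before the patch was applied.
    """
    loaded = sys.modules if loaded is None else loaded
    return [name for name in names if name in loaded]


def warn_if_imported_too_early(names, loaded=None):
    """Emit a warning listing the modules that were imported too early."""
    early = imported_too_early(names, loaded)
    if early:
        warnings.warn("imported before monkey.patch_all(): " + ", ".join(early))
    return early


# Hypothetical usage at the very top of the entry script:
#
#   warn_if_imported_too_early(["tqdm"])
#   from gevent import monkey; monkey.patch_all()
#   from tqdm import tqdm
```

The check only reports the order; it does not fix anything, so the patch still has to be moved to the top of the entry script.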