Part 2
Article address: sdiehl.github.io/gevent-tuto…
Data Structures
Events
Events are a form of asynchronous communication between Greenlets.
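One detail worth knowing is that wait() accepts an optional timeout and reports whether the event was actually set. The following is a minimal sketch, assuming gevent 1.0 or later; the names ready, results, impatient_waiter and patient_waiter are illustrative only.

```python
import gevent
from gevent.event import Event

ready = Event()
results = []

def impatient_waiter():
    # wait() returns False when the timeout expires before set() is called
    results.append(ready.wait(timeout=0.01))

def patient_waiter():
    # Without a timeout, wait() blocks until set() and then returns True
    results.append(ready.wait())

def setter():
    gevent.sleep(0.05)
    ready.set()

gevent.joinall([
    gevent.spawn(impatient_waiter),
    gevent.spawn(patient_waiter),
    gevent.spawn(setter),
])
print(results)  # [False, True]
```

Checking the return value lets a greenlet tell a real wakeup apart from a timeout.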
import gevent
from gevent.event import Event

'''
Illustrates the use of events
'''

evt = Event()

def setter():
    '''After 3 seconds, wake all threads waiting on the value of evt'''
    print('A: Hey wait for me, I have to do something')
    gevent.sleep(3)
    print("Ok, I'm done")
    evt.set()

def waiter():
    '''After 3 seconds the wait call will unblock'''
    print("I'll wait for you")
    evt.wait()  # blocking
    print("It's about time")

def main():
    gevent.joinall([
        gevent.spawn(setter),
        gevent.spawn(waiter),
        gevent.spawn(waiter),
        gevent.spawn(waiter),
        gevent.spawn(waiter),
        gevent.spawn(waiter)
    ])

if __name__ == '__main__':
    main()
"""
A: Hey wait for me, I have to do something
I'll wait for you
I'll wait for you
I'll wait for you
I'll wait for you
I'll wait for you
Ok, I'm done
It's about time
It's about time
It's about time
It's about time
It's about time
"""
An extension of the Event object is the AsyncResult, which allows you to send a value along with the wakeup call. This is sometimes called a future or a deferred, since it holds a reference to a future value that can be set on an arbitrary time schedule.
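Besides a value, an AsyncResult can also deliver an exception to whoever is waiting on it. The sketch below assumes gevent 1.0 or later; failing_setter, outcome and the error message are hypothetical names chosen for illustration.

```python
import gevent
from gevent.event import AsyncResult

result = AsyncResult()

def failing_setter():
    gevent.sleep(0.01)
    # Store an exception instead of a value
    result.set_exception(ValueError('no value available'))

def waiter():
    try:
        return result.get()  # re-raises the stored exception
    except ValueError as exc:
        return 'failed: %s' % exc

setter_g = gevent.spawn(failing_setter)
waiter_g = gevent.spawn(waiter)
gevent.joinall([setter_g, waiter_g])

outcome = waiter_g.value
print(outcome)  # failed: no value available
```

This is what makes the object behave like a future: the waiter sees either the value or the failure, whichever the producer recorded.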
import gevent
from gevent.event import AsyncResult

a = AsyncResult()

def setter():
    """
    After 3 seconds set the result of a.
    """
    gevent.sleep(3)
    a.set('Hello!')

def waiter():
    """
    After 3 seconds the get call will unblock after the setter
    puts a value into the AsyncResult.
    """
    print(a.get())

gevent.joinall([
    gevent.spawn(setter),
    gevent.spawn(waiter),
])
"""
Hello!
"""
Queues
Queues are ordered sets of data that have the usual put / get operations but are written in a way such that they can be safely manipulated across Greenlets.
For example, if one Greenlet grabs an item off of the queue, the same item will not be grabbed by another Greenlet executing simultaneously.
import gevent
from gevent.queue import Queue

tasks = Queue()

def worker(n):
    while not tasks.empty():
        task = tasks.get()
        print('Worker %s got task %s' % (n, task))
        gevent.sleep(0)
    print('Quitting time!')

def boss():
    # range replaces the Python 2-only xrange
    for i in range(1, 25):
        tasks.put_nowait(i)

gevent.spawn(boss).join()
gevent.joinall([
    gevent.spawn(worker, 'steve'),
    gevent.spawn(worker, 'john'),
    gevent.spawn(worker, 'nancy'),
])
"""
Worker steve got task 1
Worker john got task 2
Worker nancy got task 3
Worker steve got task 4
Worker nancy got task 5
Worker john got task 6
Worker steve got task 7
Worker john got task 8
Worker nancy got task 9
Worker steve got task 10
Worker nancy got task 11
Worker john got task 12
Worker steve got task 13
Worker john got task 14
Worker nancy got task 15
Worker steve got task 16
Worker nancy got task 17
Worker john got task 18
Worker steve got task 19
Worker john got task 20
Worker nancy got task 21
Worker steve got task 22
Worker nancy got task 23
Worker john got task 24
Quitting time!
Quitting time!
Quitting time!
"""
Queues can also block on either put or get as the need arises.
Each of the put and get operations has a non-blocking counterpart, put_nowait and get_nowait, which will not block but instead raise gevent.queue.Full or gevent.queue.Empty, respectively, if the operation is not possible.
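The non-blocking calls can be sketched in a few lines. This is a minimal illustration, assuming a queue capped at one item; the names q and events are hypothetical.

```python
from gevent.queue import Queue, Empty, Full

q = Queue(maxsize=1)
events = []

q.put_nowait('first')
try:
    q.put_nowait('second')  # the queue already holds maxsize items
except Full:
    events.append('full')

events.append(q.get_nowait())
try:
    q.get_nowait()  # nothing left to fetch
except Empty:
    events.append('empty')

print(events)  # ['full', 'first', 'empty']
```

Catching Full and Empty lets the caller decide what to do instead of waiting, for example dropping the item or retrying later.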
In the following example we have the boss running simultaneously with the workers, and the Queue is restricted to contain no more than three elements. This restriction means that the put operation will block until there is space on the queue. Conversely, the get operation will block if there are no elements on the queue to fetch; it also takes a timeout argument, so that the call raises gevent.queue.Empty if no work can be found within the time frame of the timeout.
import gevent
from gevent.queue import Queue, Empty

tasks = Queue(maxsize=3)

def worker(n):
    try:
        while True:
            task = tasks.get(timeout=1)  # decrements queue size by 1
            print('Worker %s got task %s' % (n, task))
            gevent.sleep(0)
    except Empty:
        print('Quitting time!')

def boss():
    """
    Boss will wait to hand out work until an individual worker is
    free since the maxsize of the task queue is 3.
    """
    # range replaces the Python 2-only xrange
    for i in range(1, 10):
        tasks.put(i)
    print('Assigned all work in iteration 1')
    for i in range(10, 20):
        tasks.put(i)
    print('Assigned all work in iteration 2')

gevent.joinall([
    gevent.spawn(boss),
    gevent.spawn(worker, 'steve'),
    gevent.spawn(worker, 'john'),
    gevent.spawn(worker, 'bob'),
])
"""
Worker steve got task 1
Worker john got task 2
Worker bob got task 3
Worker steve got task 4
Worker bob got task 5
Worker john got task 6
Assigned all work in iteration 1
Worker steve got task 7
Worker john got task 8
Worker bob got task 9
Worker steve got task 10
Worker bob got task 11
Worker john got task 12
Worker steve got task 13
Worker john got task 14
Worker bob got task 15
Worker steve got task 16
Worker bob got task 17
Worker john got task 18
Assigned all work in iteration 2
Worker steve got task 19
Quitting time!
Quitting time!
Quitting time!
"""
Groups and Pools
A group is a collection of running greenlets which are managed and scheduled together as a group. It also doubles as a parallel dispatcher that mirrors the Python multiprocessing library.
import gevent
from gevent.pool import Group

def talk(msg):
    # range replaces the Python 2-only xrange
    for i in range(3):
        print(msg)

g1 = gevent.spawn(talk, 'bar')
g2 = gevent.spawn(talk, 'foo')
g3 = gevent.spawn(talk, 'fizz')

group = Group()
group.add(g1)
group.add(g2)
group.join()

group.add(g3)
group.join()
"""
bar
bar
bar
foo
foo
foo
fizz
fizz
fizz
"""
This is very useful for managing groups of asynchronous tasks.
As mentioned above, Group also provides an API for dispatching jobs to grouped greenlets and collecting their results in various ways.
import gevent
from gevent import getcurrent
from gevent.pool import Group

group = Group()

def hello_from(n):
    print('Size of group %s' % len(group))
    print('Hello from Greenlet %s' % id(getcurrent()))

group.map(hello_from, range(3))

def intensive(n):
    gevent.sleep(3 - n)
    return 'task', n

print('Ordered')
ogroup = Group()
for i in ogroup.imap(intensive, range(3)):
    print(i)

print('Unordered')
igroup = Group()
for i in igroup.imap_unordered(intensive, range(3)):
    print(i)
"""
Size of group 3
Hello from Greenlet 4340152592
Size of group 3
Hello from Greenlet 4340928912
Size of group 3
Hello from Greenlet 4340928592
Ordered
('task', 0)
('task', 1)
('task', 2)
Unordered
('task', 2)
('task', 1)
('task', 0)
"""
A pool is a structure designed for handling dynamic numbers of greenlets which need to be concurrency-limited. This is often desirable in cases where one wants to do many network or IO-bound tasks in parallel.
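To make the concurrency limit concrete, here is a minimal sketch that records how many greenlets run at once. It assumes a hypothetical fetch function that merely sleeps in place of real network IO; active and peak are illustrative counters, not part of gevent.

```python
import gevent
from gevent.pool import Pool

pool = Pool(2)  # at most two greenlets run at any time
active = 0
peak = 0

def fetch(n):
    global active, peak
    active += 1
    peak = max(peak, active)
    gevent.sleep(0.01)  # stand-in for a network call
    active -= 1
    return n * 2

# map() returns the results in input order
results = pool.map(fetch, range(6))
print(results)  # [0, 2, 4, 6, 8, 10]
print(peak)     # 2
```

Although six tasks are submitted, the counter never rises above the pool size.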
import gevent
from gevent.pool import Pool

pool = Pool(2)

def hello_from(n):
    print('Size of pool %s' % len(pool))

pool.map(hello_from, range(3))
"""
Size of pool 2
Size of pool 2
Size of pool 1
"""
Often when building gevent-driven services one will center the entire service around a pool structure. An example might be a class which polls on various sockets.
from gevent.pool import Pool

class SocketPool(object):

    def __init__(self):
        # No explicit start call: greenlets begin running when spawned
        self.pool = Pool(1000)

    def listen(self, socket):
        while True:
            socket.recv(1024)  # buffer size chosen for illustration

    def add_handler(self, socket):
        if self.pool.full():
            raise Exception("At maximum pool size")
        else:
            self.pool.spawn(self.listen, socket)

    def shutdown(self):
        self.pool.kill()
Locks and Semaphores
A semaphore is a low-level synchronization primitive that allows greenlets to coordinate and limit concurrent access or execution. A semaphore exposes two methods, acquire and release. The difference between the number of times a semaphore has been acquired and released is called the bound of the semaphore. If a semaphore's bound reaches 0, acquire will block until another greenlet releases its acquisition.
from gevent import sleep
from gevent.pool import Pool
# gevent.lock is the current home of BoundedSemaphore;
# gevent.coros is the older, deprecated module name
from gevent.lock import BoundedSemaphore

sem = BoundedSemaphore(2)

def worker1(n):
    sem.acquire()
    print('Worker %i acquired semaphore' % n)
    sleep(0)
    sem.release()
    print('Worker %i released semaphore' % n)

def worker2(n):
    with sem:
        print('Worker %i acquired semaphore' % n)
        sleep(0)
    print('Worker %i released semaphore' % n)

pool = Pool()
pool.map(worker1, range(0, 2))
pool.map(worker2, range(3, 6))
"""
Worker 0 acquired semaphore
Worker 1 acquired semaphore
Worker 0 released semaphore
Worker 1 released semaphore
Worker 3 acquired semaphore
Worker 4 acquired semaphore
Worker 3 released semaphore
Worker 4 released semaphore
Worker 5 acquired semaphore
Worker 5 released semaphore
"""
A semaphore with a bound of 1 is known as a Lock. It provides exclusive execution to one greenlet. Locks are often used to ensure that a resource is in use by only one greenlet at a time.
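A minimal sketch of a lock guarding a critical section follows. It assumes gevent 1.0 or later, where BoundedSemaphore lives in gevent.lock; the names critical, inside and overlaps are illustrative only.

```python
import gevent
from gevent.lock import BoundedSemaphore

lock = BoundedSemaphore(1)  # a bound of 1 behaves as a lock
inside = 0    # greenlets currently in the critical section
overlaps = 0  # times more than one greenlet was inside at once

def critical(n):
    global inside, overlaps
    with lock:
        inside += 1
        if inside > 1:
            overlaps += 1
        gevent.sleep(0.01)  # yields to other greenlets while holding the lock
        inside -= 1

gevent.joinall([gevent.spawn(critical, i) for i in range(5)])
print(overlaps)  # 0
```

Even though each greenlet yields while inside the with block, the others stay blocked on the lock until it is released.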
Thread Locals
Gevent also allows you to specify data which is local to the greenlet context. Internally, this is implemented as a global lookup which addresses a private namespace keyed by the greenlet's getcurrent() value.
import gevent
from gevent.local import local

stash = local()

def f1():
    stash.x = 1
    print(stash.x)

def f2():
    stash.y = 2
    print(stash.y)

    try:
        stash.x
    except AttributeError:
        print("x is not local to f2")

g1 = gevent.spawn(f1)
g2 = gevent.spawn(f2)

gevent.joinall([g1, g2])
"""
1
2
x is not local to f2
"""
Many web frameworks that use gevent store HTTP session objects inside gevent thread locals. For example, using the Werkzeug utility library and its proxy object we can create Flask-style request objects.
from gevent.local import local
from werkzeug.local import LocalProxy
from werkzeug.wrappers import Request
from contextlib import contextmanager

# gevent.pywsgi provides WSGIServer; the older gevent.wsgi module
# is not available in recent gevent releases
from gevent.pywsgi import WSGIServer

_requests = local()
request = LocalProxy(lambda: _requests.request)

@contextmanager
def sessionmanager(environ):
    _requests.request = Request(environ)
    yield
    _requests.request = None

def logic():
    return "Hello " + request.remote_addr

def application(environ, start_response):
    status = '200 OK'

    with sessionmanager(environ):
        body = logic()

    headers = [
        ('Content-Type', 'text/html')
    ]

    start_response(status, headers)
    # WSGI response bodies must be bytes on Python 3
    return [body.encode('utf-8')]

WSGIServer(('', 8000), application).serve_forever()
Flask's system is a bit more sophisticated than this example, but the idea of using thread locals as local session storage is nonetheless the same.
Subprocess
As of gevent 1.0, gevent.subprocess, a patched version of Python's subprocess module, has been added. It supports cooperative waiting on subprocesses.
import gevent
from gevent.subprocess import Popen, PIPE

def cron():
    while True:
        print("cron")
        gevent.sleep(0.2)

g = gevent.spawn(cron)
sub = Popen(['sleep 1; uname'], stdout=PIPE, shell=True)
out, err = sub.communicate()
g.kill()
print(out.rstrip())
"""
cron
cron
cron
cron
cron
Linux
"""
Many people also want to use gevent and multiprocessing together. One of the most obvious challenges is that inter-process communication provided by multiprocessing is not cooperative by default. Since multiprocessing.Connection-based objects (such as Pipe) expose their underlying file descriptors, gevent.socket.wait_read and wait_write can be used to cooperatively wait for ready-to-read/ready-to-write events before actually reading/writing:
import gevent
from multiprocessing import Process, Pipe
from gevent.socket import wait_read, wait_write

# To Process
a, b = Pipe()

# From Process
c, d = Pipe()

def relay():
    for i in range(10):
        msg = b.recv()
        c.send(msg + " in " + str(i))

def put_msg():
    for i in range(10):
        wait_write(a.fileno())
        a.send('hi')

def get_msg():
    for i in range(10):
        wait_read(d.fileno())
        print(d.recv())

if __name__ == '__main__':
    proc = Process(target=relay)
    proc.start()

    g1 = gevent.spawn(get_msg)
    g2 = gevent.spawn(put_msg)
    gevent.joinall([g1, g2], timeout=1)
Note, however, that the combination of multiprocessing and gevent brings along certain OS-dependent pitfalls, among others:
- After forking on POSIX-compliant systems, gevent's state in the child is ill-posed. One side effect is that greenlets spawned before multiprocessing.Process creation run in both the parent and the child process.
- a.send() in put_msg() above might still block the calling thread non-cooperatively: a ready-to-write event only ensures that one byte can be written. The underlying buffer might be full before the attempted write is complete.
- The wait_write() / wait_read()-based approach as indicated above does not work on Windows (IOError: 3 is not a socket (files are not supported)), because Windows cannot watch pipes for events.
The Python package gipc overcomes these challenges for you in a largely transparent fashion on both POSIX-compliant and Windows systems. It provides gevent-aware multiprocessing.Process-based child processes and gevent-cooperative inter-process communication based on pipes.
Actors
The actor model is a higher-level concurrency model popularized by the language Erlang. In short, the main idea is that you have a collection of independent Actors, each with an inbox from which it receives messages from other Actors. The main loop inside the Actor iterates through its messages and takes action according to its desired behavior.
Gevent does not have a primitive Actor type, but we can define one very simply using a Queue inside of a subclassed Greenlet.
import gevent
from gevent.queue import Queue

class Actor(gevent.Greenlet):

    def __init__(self):
        self.inbox = Queue()
        gevent.Greenlet.__init__(self)

    def receive(self, message):
        """
        Define in your subclass.
        """
        # NotImplementedError is the exception; NotImplemented is not callable
        raise NotImplementedError()

    def _run(self):
        self.running = True

        while self.running:
            message = self.inbox.get()
            self.receive(message)
In a use case:
import gevent
from gevent.queue import Queue
from gevent import Greenlet

class Pinger(Actor):
    def receive(self, message):
        print(message)
        pong.inbox.put('ping')
        gevent.sleep(0)

class Ponger(Actor):
    def receive(self, message):
        print(message)
        ping.inbox.put('pong')
        gevent.sleep(0)

ping = Pinger()
pong = Ponger()

ping.start()
pong.start()

ping.inbox.put('start')
gevent.joinall([ping, pong])
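The ping/pong actors above exchange messages until interrupted. As a hedged variant that terminates on its own, the sketch below restates the Actor class with a stop sentinel. STOP and Counter are hypothetical names introduced here for illustration; they are not part of gevent.

```python
import gevent
from gevent import Greenlet
from gevent.queue import Queue

STOP = object()  # hypothetical sentinel that tells an actor to exit

class Actor(Greenlet):
    def __init__(self):
        Greenlet.__init__(self)
        self.inbox = Queue()

    def receive(self, message):
        raise NotImplementedError()

    def _run(self):
        # Process messages one at a time until the sentinel arrives
        while True:
            message = self.inbox.get()
            if message is STOP:
                break
            self.receive(message)

class Counter(Actor):
    def __init__(self):
        Actor.__init__(self)
        self.total = 0

    def receive(self, message):
        self.total += message

counter = Counter()
counter.start()
for n in (1, 2, 3):
    counter.inbox.put(n)
counter.inbox.put(STOP)
counter.join()
print(counter.total)  # 6
```

Sending the sentinel through the inbox keeps shutdown ordered: every message queued before it is handled first.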