The purpose of caching
Caching is a technique that stores a bounded amount of data so that future requests for it can be served faster; its purpose is to speed up data retrieval.
A simple cache class of our own
import datetime
import pprint
import random


class MyCache(object):
    def __init__(self):
        self.cache = {}
        self.max_cache_size = 10

    def __contains__(self, key):
        """
        Return True if the key exists in the cache.
        Implementing this magic method lets us check `key in cache`
        on an instance after construction.
        """
        return key in self.cache

    def update(self, key, value):
        """
        Update the cache dictionary, optionally evicting the oldest entry.
        """
        if key not in self.cache and len(self.cache) >= self.max_cache_size:
            self.remove_oldest()
        self.cache[key] = {"date_accessed": datetime.datetime.now(), "value": value}

    def remove_oldest(self):
        """
        Remove the entry with the oldest access time.
        """
        oldest_entry = None
        for key in self.cache:
            if not oldest_entry:
                oldest_entry = key
            elif self.cache[key]["date_accessed"] < self.cache[oldest_entry]["date_accessed"]:
                oldest_entry = key
        self.cache.pop(oldest_entry)

    @property
    def size(self):
        """
        Number of entries currently in the cache.
        """
        return len(self.cache)
- `__contains__`: not strictly required here, but the idea is that it lets us check an instance of the class to see whether it contains the key we are looking for.
- The update method updates the cache dictionary with a new key/value pair, and once the cache reaches or exceeds its maximum size it also deletes the entry with the oldest date.
- The remove_oldest method does the actual removal of the oldest entry from the dictionary.
- Finally, the size property returns the current number of cached entries.

After running this code you will notice that once the cache is full, the oldest entry gets evicted. Note, however, that the example never updates the access date: reading an entry does not refresh its timestamp.
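A minimal sketch of that missing piece: a variant of the class whose get() refreshes date_accessed on every read, so frequently-read entries become the newest and are evicted last. The class name RefreshingCache and the max_cache_size parameter are additions for illustration:

```python
import datetime


class RefreshingCache:
    """Same dict-of-dicts layout as MyCache, plus a get() that refreshes
    the access time so reads keep an entry 'fresh'."""

    def __init__(self, max_cache_size=10):
        self.cache = {}
        self.max_cache_size = max_cache_size

    def update(self, key, value):
        # evict the stalest entry before adding a brand-new key to a full cache
        if key not in self.cache and len(self.cache) >= self.max_cache_size:
            oldest = min(self.cache, key=lambda k: self.cache[k]["date_accessed"])
            self.cache.pop(oldest)
        self.cache[key] = {"date_accessed": datetime.datetime.now(), "value": value}

    def get(self, key):
        entry = self.cache.get(key)
        if entry is None:
            return None
        entry["date_accessed"] = datetime.datetime.now()  # refresh on read
        return entry["value"]
```

With this variant, a key that is read often keeps getting a fresh timestamp, so the remove-oldest eviction behaves like an LRU policy.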
Testing it:
if __name__ == "__main__":
    keys = ["test", "red", "fox", "fence", "junk",
            "other", "alpha", "bravo", "cal", "devo",
            "ele"]
    s = "abcdefghijklmnop"
    cache = MyCache()
    for i, key in enumerate(keys):
        if key in cache:
            continue
        else:
            value = "".join(random.choice(s) for j in range(20))
            cache.update(key, value)
        print(f"{i + 1} iterations, {cache.size} cached entries")
    print()
    print(pprint.pformat(cache.cache))
    print("test" in cache)  # what __contains__ gives us
    print("cal" in cache)
Using the lru_cache decorator
import time
import urllib.error
import urllib.request
from functools import lru_cache


@lru_cache(maxsize=24)
def get_webpage(module):
    """
    Fetch the documentation page for a given Python module.
    """
    webpage = "https://docs.python.org/3/library/{}.html".format(module)
    try:
        with urllib.request.urlopen(webpage) as request:
            return request.read()
    except urllib.error.HTTPError:
        return None


if __name__ == '__main__':
    t1 = time.time()
    modules = ['functools', 'collections', 'os', 'sys']
    for module in modules:
        page = get_webpage(module)
        if page:
            print("{} module page found".format(module))
    t2 = time.time()
    for m in modules:
        page = get_webpage(m)
        if page:
            print(f"{m} get again ...")
    t3 = time.time()
    print(t2 - t1)
    print(t3 - t2)
    print((t2 - t1) / (t3 - t2))
We decorate the get_webpage function with lru_cache and set its maximum size to 24 calls. The function builds the page URL string from the module name we pass in and fetches that page, so we can run it in a loop several times. The first time you run this code, the output appears relatively slowly; run it again in the same session and it is dramatically faster, which means lru_cache has cached the calls correctly.
We can also pass a typed argument to the decorator. It is a Boolean that, when set to True, tells the decorator to cache arguments of different types separately.
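A small sketch of the typed effect (the square function here is made up for illustration; cache_info() is the hit/miss counter that lru_cache attaches to the wrapped function):

```python
from functools import lru_cache


@lru_cache(maxsize=24, typed=True)
def square(x):
    return x * x


square(2)    # miss: first time this argument is seen
square(2)    # hit: served from the cache
square(2.0)  # miss again: with typed=True, int 2 and float 2.0 cache separately
info = square.cache_info()
print(info.hits, info.misses)  # 1 2
```

With typed=False (the default), square(2) and square(2.0) would share a single cache slot and the third call would be a hit.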
Using the cachetools module
Code source: www.thepythoncorner.com/2018/04/how…
The original article explains how to use caching to speed up your Python programs, giving the following two examples. Without caching:
import time
import datetime


def get_candy_price(candy_id):
    # let's use a sleep to simulate the time your function spends trying to connect to
    # the web service, 5 seconds will be enough.
    time.sleep(5)
    # let's pretend that the price returned by the web service is $1 for candies with an
    # odd candy_id and $1.5 for candies with an even candy_id
    price = 1.5 if candy_id % 2 == 0 else 1
    return (datetime.datetime.now().strftime("%c"), price)


# now, let's simulate 20 customers in your shop.
# They are asking for candy with id 2 and candy with id 3...
for i in range(0, 20):
    print(get_candy_price(2))
    print(get_candy_price(3))
And with caching applied:
import time
import datetime
from cachetools import cached, TTLCache  # 1 - let's import the "cached" decorator and the "TTLCache" object from cachetools

cache = TTLCache(maxsize=100, ttl=300)  # 2 - let's create the cache object.


@cached(cache)  # 3 - it's time to decorate the method to use our cache system!
def get_candy_price(candy_id):
    # let's use a sleep to simulate the time your function spends trying to connect to
    # the web service, 5 seconds will be enough.
    time.sleep(5)
    # let's pretend that the price returned by the web service is $1 for candies with an
    # odd candy_id and $1.5 for candies with an even candy_id
    price = 1.5 if candy_id % 2 == 0 else 1
    return (datetime.datetime.now().strftime("%c"), price)


# now, let's simulate 20 customers in your shop.
# They are asking for candy with id 2 and candy with id 3...
for i in range(0, 20):
    print(get_candy_price(2))
    print(get_candy_price(3))
The output is not shown here; you can copy and run the code yourself.
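If you want to verify the effect without sitting through forty 5-second calls, the same pattern works with a much shorter delay. A hedged sketch (the function name and the 0.2 s delay are stand-ins for the article's web-service call):

```python
import time
from cachetools import cached, TTLCache

cache = TTLCache(maxsize=100, ttl=300)


@cached(cache)
def slow_lookup(key):
    time.sleep(0.2)  # stand-in for a slow web-service call
    return key * 2


t0 = time.time()
first_result = slow_lookup(7)   # first call pays the full 0.2 s
first_elapsed = time.time() - t0

t0 = time.time()
second_result = slow_lookup(7)  # second call comes straight from the cache
second_elapsed = time.time() - t0

print(first_result == second_result)   # True
print(second_elapsed < first_elapsed)  # True
```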
Multi-level caching
The caching approaches above are all much alike, but none of them solves my problem. I want to set and look up cached values by multiple conditions, treating the cache as a tiny queryable database rather than a plain key/value store. I found the cacheout module and tried to build what I need with it.
Using cacheout
Links
github.com/dgilland/ca… cacheout.readthedocs.io/en/latest/m…
Overview
cacheout is a caching library for Python.
Features
- In-memory caching using dictionary backend
- Cache manager for easily accessing multiple cache objects
- Reconfigurable cache settings for runtime setup when using module-level cache objects
- Maximum cache size enforcement
- Default cache TTL (time-to-live) as well as custom TTLs per cache entry
- Bulk set, get, and delete operations
- Bulk get and delete operations filtered by string, regex, or function
- Memoization decorators
- Thread safe
- Multiple cache implementations:
- FIFO (First In, First Out)
- LIFO (Last In, First Out)
- LRU (Least Recently Used)
- MRU (Most Recently Used)
- LFU (Least Frequently Used)
- RR (Random Replacement)
Roadmap
- Layered caching (multi-level caching)
- Cache event listener support (e.g. on-get, on-set, on-delete)
- Cache statistics (e.g. cache hits/misses, cache frequency, etc)
Installation
pip install cacheout
Requirements
Python >= 3.4
Basic usage
Create a cache object:
# start with some basic caching by creating a cache object:
from cacheout import Cache
cache = Cache()
By default the cache holds at most 256 entries with TTL expiration turned off; cache = Cache() is equivalent to:
# By default the cache object will have a maximum size of 256 and default TTL expiration turned off. These values can be set with:
cache = Cache(maxsize=256, ttl=0, timer=time.time, default=None) # defaults
Set a value:
# Set a cache key using cache.set():
cache.set(1, 'foobar')
Get a value:
# Get the value of a cache key with cache.get():
assert cache.get(1) == 'foobar'
Provide a default value to return when a key is not set:
# Get a default value when cache key isn't set:
assert cache.get(2) is None
assert cache.get(2, default=False) is False
assert 2 not in cache
Note that the default value is only returned; it is not stored in the cache.
Set a global default value:
# Provide a global default:
cache2 = Cache(default=True)
assert cache2.get('missing') is True
assert 'missing' not in cache2
cache3 = Cache(default=lambda key: key)
assert cache3.get('missing') == 'missing'
# with a callable default, the computed value for 'missing' IS stored in the cache
assert 'missing' in cache3
Set a per-entry expiration time:
# Set the TTL (time-to-live) expiration per entry:
cache.set(3, {'data': {}}, ttl=1)
assert cache.get(3) == {'data': {}}
time.sleep(1)
assert cache.get(3) is None
Cache function results (memoization):
# Memoize a function where cache keys are generated from the called function parameters:
@cache.memoize()
def func(a, b):
    return a + b


# Provide a TTL for the memoized function and incorporate argument types into generated cache keys:
@cache.memoize(ttl=5, typed=True)
def func(a, b):
    print("--- into --- func ---")
    return a + b


# func(1, 2) has different cache key than func(1.0, 2.0), whereas,
# with "typed=False" (the default), they would have the same key
print(func(1, 2))
print(func(1, 2))
print(func.uncached(1, 2))  # call the original, uncached function
print(func(1, 2))
Get a copy of the cache
# Get a copy of the entire cache with cache.copy():
assert cache.copy() == {1: 'foobar', 2: ('foo', 'bar', 'baz')}
Delete a value from the cache
# Delete a cache key with cache.delete():
cache.delete(1)
assert cache.get(1) is None
Clear the entire cache
# Clear the entire cache with cache.clear():
cache.clear()
assert len(cache) == 0
Bulk set, get, and delete
# Perform bulk operations with cache.set_many(), cache.get_many(), and cache.delete_many():
cache.set_many({'a': 1, 'b': 2, 'c': 3})
assert cache.get_many(['a', 'b', 'c']) == {'a': 1, 'b': 2, 'c': 3}
cache.delete_many(['a', 'b', 'c'])
assert cache.count() == 0
Filtering in bulk get and delete
# Use complex filtering in cache.get_many() and cache.delete_many():
import re
cache.set_many({'a_1': 1, 'a_2': 2, '123': 3, 'b': 4})
cache.get_many('a_*') == {'a_1': 1, 'a_2': 2}
cache.get_many(re.compile(r'\d')) == {'123': 3}
cache.get_many(lambda key: '2' in key) == {'a_2': 2, '123': 3}
cache.delete_many('a_*')
assert dict(cache.items()) == {'123': 3, 'b': 4}
Reconfigure the cache object after creation
# Reconfigure the cache object after creation with cache.configure():
cache.configure(maxsize=1000, ttl=5 * 60)
Get the cache's keys, values, and key/value pairs like a dictionary
# Get keys, values, and items from the cache with cache.keys() cache.values(), and cache.items():
cache.set_many({'a': 1, 'b': 2, 'c': 3})
assert list(cache.keys()) == ['a', 'b', 'c']
assert list(cache.values()) == [1, 2, 3]
assert list(cache.items()) == [('a', 1), ('b', 2), ('c', 3)]
Iterate over the cache
# Iterate over cache keys:
for key in cache:
    print(key, cache.get(key))
# 'a' 1
# 'b' 2
# 'c' 3
Check whether a cached key exists
# Check if key exists with cache.has() and key in cache:
assert cache.has('a')
assert 'a' in cache
Managing multiple caches with CacheManager
from cacheout import CacheManager
cacheman = CacheManager({'a': {'maxsize': 100},
                         'b': {'maxsize': 200, 'ttl': 900},
                         'c': {}})
cacheman['a'].set('key1', 'value1')
value = cacheman['a'].get('key1')
cacheman['b'].set('key2', 'value2')
assert cacheman['b'].maxsize == 200
assert cacheman['b'].ttl == 900
cacheman['c'].set('key3', 'value3')
cacheman.clear_all()
for name, cache in cacheman:
    assert name in cacheman
    assert len(cache) == 0
Among these, the multi-level caching mentioned at the end should solve my problem. As shown in the figure, if my API has two independent variables, stock type and time, I can put the stock type in the first-level cache and the time in the second level:
The code would look roughly like this: [image]
My earlier approaches were (1) keeping the cache in a class variable, and (2) using a Redis cache.