LangChain (Part 8): Advanced Usage, Caching for Better Efficiency

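Introduction: This article shows how to cache the results of LLM (large language model) calls. When many requests repeat, caching improves efficiency, because some LLM requests take a long time to process. The available backends include an in-memory cache, a SQLite cache, a Redis cache, and a custom SQLAlchemy cache. The sections below cover each in turn.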

In-memory cache

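An in-memory cache suits short-lived caching needs: entries live only in the memory of the current process and are gone once it exits. Its advantage is speed, since a cached result can be fetched almost immediately. The example below uses InMemoryCache.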

import langchain
from langchain.cache import InMemoryCache
langchain.llm_cache = InMemoryCache()

SQLite cache

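When the cache grows too large, or the cached data needs to persist for longer, SQLite is a better choice, since it stores entries in a database file on disk. The example below uses SQLiteCache.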

# We can do the same thing with a SQLite cache
from langchain.cache import SQLiteCache
langchain.llm_cache = SQLiteCache(database_path=".langchain.db")
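
The persistence benefit can be sketched with the standard-library sqlite3 module. This is an illustration under assumptions, not how SQLiteCache is implemented: the table layout, the file name toy_llm_cache.db, and the helpers open_cache, update, and lookup are all made up for this example.

```python
import os
import sqlite3
import tempfile
from typing import Optional

# Hypothetical database file in a fresh temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "toy_llm_cache.db")


def open_cache(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS llm_cache ("
        "prompt TEXT, llm TEXT, response TEXT, PRIMARY KEY (prompt, llm))"
    )
    return conn


def update(conn: sqlite3.Connection, prompt: str, llm: str, response: str) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO llm_cache VALUES (?, ?, ?)", (prompt, llm, response)
    )
    conn.commit()


def lookup(conn: sqlite3.Connection, prompt: str, llm: str) -> Optional[str]:
    row = conn.execute(
        "SELECT response FROM llm_cache WHERE prompt = ? AND llm = ?", (prompt, llm)
    ).fetchone()
    return row[0] if row else None


# First "session": write an entry, then close the connection.
conn = open_cache(db_path)
update(conn, "Tell me a joke", "fake-model", "Why did the chicken cross the road?")
conn.close()

# Second "session": a fresh connection still finds the entry on disk.
conn = open_cache(db_path)
cached = lookup(conn, "Tell me a joke", "fake-model")
missing = lookup(conn, "Tell me a poem", "fake-model")
conn.close()
```

Unlike the in-memory dict, the entry survives closing and reopening the connection, which is the property that makes a file-backed cache useful across program runs.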

Redis cache

Redis is an in-memory data store that can also persist its data to disk. It starts up quickly and can hold a large amount of data. The example below uses RedisCache.

# We can do the same thing with a Redis cache
# (make sure your local Redis instance is running first before running this example)
from redis import Redis
from langchain.cache import RedisCache

langchain.llm_cache = RedisCache(redis_=Redis())

Semantic cache

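A semantic cache compares the meaning of a new prompt with the prompts already cached and, when they are similar enough, reuses the cached result instead of calling the LLM again. The example below uses RedisSemanticCache.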

from langchain.embeddings import OpenAIEmbeddings
from langchain.cache import RedisSemanticCache
langchain.llm_cache = RedisSemanticCache(
    redis_url="redis://localhost:6379",
    embedding=OpenAIEmbeddings()
)
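
The matching step can be sketched without Redis or an embedding model. The version below is a toy under stated assumptions: word counts stand in for real embeddings, cosine similarity is the comparison, and the 0.8 threshold is arbitrary. ToySemanticCache, embed, and cosine are hypothetical names, not part of LangChain.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real setup would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToySemanticCache:
    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # assumed cut-off for "similar enough"
        self.entries: List[Tuple[Counter, str]] = []

    def update(self, prompt: str, response: str) -> None:
        self.entries.append((embed(prompt), response))

    def lookup(self, prompt: str) -> Optional[str]:
        # Find the most similar cached prompt; reuse it only above the threshold.
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = ToySemanticCache()
cache.update("tell me a joke about cats", "Cat joke")
near = cache.lookup("tell me a joke about cats please")  # close wording: hit
far = cache.lookup("what is the capital of France")      # unrelated: miss
```

The threshold is the main design choice: set it too low and unrelated prompts share answers, too high and the cache degenerates into exact matching.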

GPTCache

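GPTCache is a standalone library for caching LLM responses, which LangChain can use as a cache backend. It supports both exact-match caching and semantic-similarity caching. The examples below show how to set up each.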

# Let's start with an exact-match example
from gptcache import Cache
from gptcache.manager.factory import manager_factory
from gptcache.processor.pre import get_prompt
from langchain.cache import GPTCache

# Avoid multiple caches using the same file, causing different llm model caches to affect each other

def init_gptcache(cache_obj: Cache, llm: str):
    cache_obj.init(
        pre_embedding_func=get_prompt,
        data_manager=manager_factory(manager="map", data_dir=f"map_cache_{llm}"),
    )

langchain.llm_cache = GPTCache(init_gptcache)

# Now let's show a similarity-cache example
from gptcache import Cache
from gptcache.adapter.api import init_similar_cache
from langchain.cache import GPTCache

# Avoid multiple caches using the same file, causing different llm model caches to affect each other

def init_gptcache(cache_obj: Cache, llm: str):
    init_similar_cache(cache_obj=cache_obj, data_dir=f"similar_cache_{llm}")

langchain.llm_cache = GPTCache(init_gptcache)

SQLAlchemy cache

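SQLAlchemy is a SQL toolkit and object-relational mapper for Python that can talk to many database systems. The example below uses SQLAlchemyCache, which can cache to any SQL database that SQLAlchemy supports.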

from langchain.cache import SQLAlchemyCache
from sqlalchemy import create_engine

engine = create_engine("postgresql://postgres:postgres@localhost:5432/postgres")
langchain.llm_cache = SQLAlchemyCache(engine)

Disabling the cache

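In some cases you may want to turn caching off for a specific LLM. You can do this by passing "cache=False" when instantiating it; that LLM instance will then not use any cache.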

from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-002", n=2, best_of=2, cache=False)

Disabling the cache within a chain

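Within a chain, you sometimes need to turn caching off for particular steps. In that case, you can construct the chain first and then edit the LLM used by the step in question.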
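
The effect of caching one step but not another can be sketched with plain functions. This is a toy under stated assumptions rather than LangChain code: make_llm and map_reduce are hypothetical helpers, and the use_cache flag only mimics a per-instance cache switch such as cache=False.

```python
from typing import Callable, Dict, List

_cache: Dict[str, str] = {}   # shared toy cache, keyed by prompt
call_log: List[str] = []      # records every real (uncached) model call


def make_llm(name: str, use_cache: bool = True) -> Callable[[str], str]:
    """Build a hypothetical model callable with its own cache switch."""

    def llm(prompt: str) -> str:
        if use_cache and prompt in _cache:
            return _cache[prompt]
        call_log.append(name)  # the "model" actually ran
        result = f"[{name}] {prompt.upper()}"
        if use_cache:
            _cache[prompt] = result
        return result

    return llm


def map_reduce(chunks: List[str], map_llm, reduce_llm) -> str:
    # Map step: process each chunk; reduce step: combine the partial results.
    partials = [map_llm(chunk) for chunk in chunks]
    return reduce_llm(" | ".join(partials))


map_llm = make_llm("map")                          # cached step
reduce_llm = make_llm("reduce", use_cache=False)   # always recomputed

chunks = ["first part", "second part"]
run1 = map_reduce(chunks, map_llm, reduce_llm)
run2 = map_reduce(chunks, map_llm, reduce_llm)
```

On the second run the map step is answered from the cache, while the reduce step runs again, so call_log gains only one extra "reduce" entry.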

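In short, caching can noticeably improve the efficiency of a program that repeats LLM requests. The in-memory cache suits short-lived caching needs; the SQLite and Redis caches suit data that must persist longer; the semantic cache and GPTCache can compare a new prompt with cached prompts by meaning, so that near-duplicate requests reuse an existing result. In addition, SQLAlchemyCache can cache to any SQL database that SQLAlchemy supports.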

That's all for today; the series continues tomorrow.