LangChain-02 JsonOutputParser 将LLM输出的结果自动转换为 JSON 格式 astream流式输出

279 阅读2分钟

安装依赖

pip install --upgrade --quiet  langchain-core langchain-community langchain-openai

环境变量

此为我的环境变量,你也需要配置你的 OPEN_API_KEY

export OPENAI_API_KEY="sk-UkyBxxxx"
export OPENAI_API_BASE="https://wzk....."

编写代码

from langchain_core.output_parsers import JsonOutputParser
from langchain_openai.chat_models import ChatOpenAI

async def main():
    model = ChatOpenAI(
            model="gpt-3.5-turbo",
        )
    chain = (
        model | JsonOutputParser()
    )  # Due to a bug in older versions of Langchain, JsonOutputParser did not stream results from some models
    async for text in chain.astream(
        'output a list of the countries france, spain and japan and their populations in JSON format. Use a dict with an outer key of "countries" which contains a list of countries. Each country should have the key `name` and `population`'
    ):
        print(text, flush=True)


if __name__ == '__main__':
    import asyncio
    asyncio.run(main())

测试代码

➜ python3 test04.py
{}
{'countries': []}
{'countries': [{}]}
{'countries': [{'name': ''}]}
{'countries': [{'name': 'France'}]}
{'countries': [{'name': 'France', 'population': 670}]}
{'countries': [{'name': 'France', 'population': 670760}]}
{'countries': [{'name': 'France', 'population': 67076000}]}
{'countries': [{'name': 'France', 'population': 67076000}, {}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': ''}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': 'Spain'}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': 'Spain', 'population': 467}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': 'Spain', 'population': 467330}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': 'Spain', 'population': 46733038}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': 'Spain', 'population': 46733038}, {}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': 'Spain', 'population': 46733038}, {'name': ''}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': 'Spain', 'population': 46733038}, {'name': 'Japan'}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': 'Spain', 'population': 46733038}, {'name': 'Japan', 'population': 126}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': 'Spain', 'population': 46733038}, {'name': 'Japan', 'population': 126476}]}
{'countries': [{'name': 'France', 'population': 67076000}, {'name': 'Spain', 'population': 46733038}, {'name': 'Japan', 'population': 126476458}]}

image.png

提取JSON

def _extract_country_names(inputs):
    """A function that does not operates on input streams and breaks streaming."""
    if not isinstance(inputs, dict):
        return ""

    if "countries" not in inputs:
        return ""

    countries = inputs["countries"]

    if not isinstance(countries, list):
        return ""

    country_names = [
        country.get("name") for country in countries if isinstance(country, dict)
    ]
    return country_names

Chain链接

使用Chain将内容连起来

# 上文写的是:model | JsonOutputParser(),这里修改为:
chain = model | JsonOutputParser() | _extract_country_names

运行结果

image.png