使用自定义大模型运行一个简单的Graph RAG Demo

说明：

首次发表日期：2024-07-15
官方教程： microsoft.github.io/graphrag/po…
官方Github仓库： github.com/microsoft/g…

准备环境和配置

首先创建Python运行环境

conda create --name graphrag python=3.10
conda activcate graphrag
pip install graphrag

然后初始化一个名为ragtest的graph rag项目

python -m graphrag.index --init --root ./ragtest

此命令在当前目录下创建了一个ragtest文件夹（项目）

将文本文件，比如xxx.txt放入ragtext/input文件夹中，作为提供给Graph RAG的输入文本。

修改settings.yaml配置，以使用自己的和open ai兼容的大模型服务，以下是关于大模型和embedding模型的配置部分：

llm:
  api_key: sk-xxxxxxxxxxxxxxxxxx
  type: openai_chat # or azure_openai_chat
  model: Qwen2-72B-Instruct
  api_base: https://xxxxxxxxxxxxxxx.com/v1
  model_supports_json: false # recommended if this is available for your model.
  concurrent_requests: 25 # the number of parallel inflight requests that may be made

embeddings:
  llm:
    api_key: sk-xxxxxxxxxxxxxxx
    type: openai_embedding # or azure_openai_embedding
    model: bge-m3
    api_base: https://xxxxxxxxxxxxxxxxxxxxxxxx.com/v1
    concurrent_requests: 25 # the number of parallel inflight requests that may be made

构建索引并查询

然后构建索引：

python -m graphrag.index -v --root ./ragtest

最后，查询Graph RAG一个问题：

python -m graphrag.query --root ./ragtest --method global "大语言模型将如何改变这个世界？"

总结

个人感觉Graph RAG对大模型能力的要求比较高，建议使用和OPEN AI对标的模型