Step-by-Step: Local Deployment of GraphRAG (Latest Version) + Ollama, with English and Chinese Examples


As GraphRAG keeps being updated, the earlier GraphRAG+Ollama deployment tutorial on CSDN (GraphRAG+Ollama实现本地部署(最全,非常详细,保姆教程)_graphrag ollama-CSDN博客) is no longer a good fit: it was based on hand-modified source code of an old GraphRAG release and lacks many features of the newer versions (such as automatic prompt generation). On the other hand, when using the official GraphRAG directly, that tutorial misses a few crucial details of the Ollama setup, which makes GraphRAG and Ollama incompatible. Hence this tutorial; I hope it helps anyone deploying GraphRAG locally.

1. GraphRAG Environment Setup

1.1 Download and Configure Anaconda

Download Anaconda from the official website and install it. One key point: remember the environments directory you choose during installation, since the virtual environment created below will be placed there.

1.2 Install CUDA

I am not sure whether this step is strictly necessary, but I do have a CUDA environment; download and install it yourself if you want one.

Go for a minimal installation: under CUDA, select only Development, Runtime, and Documentation; we do not need anything else (if you also want to update your GPU driver, you can select Driver Components as well).

1.3 Create a Virtual Environment and Install GraphRAG

Open the Anaconda Prompt that the installer added to the Start menu and run:

(base) C:\Users\Antares>conda create -n graphrag python=3.10

This creates a Python 3.10 environment named graphrag under the environments directory you specified earlier.

# activate the environment
(base) C:\Users\Antares>conda activate graphrag
# install graphrag 0.5.0 (later versions may have compatibility issues with Ollama;
# I have used 0.5.0 long-term and it works reliably)
(graphrag) C:\Users\Antares>pip install graphrag==0.5.0
# get a quick overview of the graphrag commands
(graphrag) C:\Users\Antares>graphrag --help
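As a quick sanity check that the package landed in the environment you just activated, you can print the installed version from Python; this uses only the standard library:

```python
# Confirm graphrag is importable in this environment and pinned at 0.5.0.
from importlib.metadata import version

print(version("graphrag"))  # should print: 0.5.0
```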

2. Ollama Environment Setup

2.1 Install Ollama

Nothing special here: download it from the official website and install.

(Optional) Change the model storage directory; by default, models are saved to the C: drive. For details see 【ollama】linux、window系统更改模型存放位置,全网首发2024!_ollama修改模型位置-CSDN博客. Note that Ollama must be restarted for the change to take effect; until you restart, models are still saved to C:.

(Ollama commands are executed in PowerShell; from here on, the PS C:\Users\Antares\Desktop> prompt is omitted.)

# familiarize yourself with the ollama commands; they are similar to docker's
PS C:\Users\Antares\Desktop>ollama --help

2.2 Download an Open-Source LLM and Embedding Model
ollama pull mistral
ollama pull nomic-embed-text
# check that both models were downloaded
ollama list
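Before GraphRAG enters the picture, it is worth smoke-testing Ollama's OpenAI-compatible endpoint at http://localhost:11434/v1, since that is exactly the interface GraphRAG will talk to later. Here is a minimal sketch using the openai Python package (pip install openai if you don't have it; Ollama ignores the API key, so any placeholder string works):

```python
# Smoke-test Ollama's OpenAI-compatible API with both models.
# Ollama must be running (the desktop app or `ollama serve`).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # key is ignored

# Chat completion with the LLM
chat = client.chat.completions.create(
    model="mistral",
    messages=[{"role": "user", "content": "Reply with a single word: hello"}],
)
print(chat.choices[0].message.content)

# Embedding with the embedding model
emb = client.embeddings.create(model="nomic-embed-text", input="hello world")
print(len(emb.data[0].embedding))  # nomic-embed-text produces 768-dim vectors
```

If both calls return without errors, the endpoint side is ready.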

3. Using GraphRAG

Create a directory for the project; for example, I keep the GraphRAG demo project at E:\Workplace\GraphPro\graphrag. Open PowerShell:

PS C:\Users\Antares> cd E:\Workplace\GraphPro\graphrag
PS E:\Workplace\GraphPro\graphrag> mkdir -p ./ragtest/input

All of the following tests run in the ./ragtest directory; the text we want to process goes under ./ragtest/input.

3.1 Adjust the Model Configuration

Before running any tests, the model configuration needs one change. By default, Ollama sets the model's context length to 2048 tokens, a very small value. GraphRAG will very likely exceed it, the overflow gets silently truncated, and all sorts of strange problems follow. So we increase the model's context length first:

# dump mistral's default Modelfile
PS E:\Workplace\GraphPro\graphrag> ollama show --modelfile mistral:latest > Modelfile

Take a look at this Modelfile:

# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM mistral:latest

FROM D:\Ollama\blobs\sha256-ff82381e2bea77d91c1b824c7afb83f6fb73e9f7de9dda631bcdbca564aa5435
TEMPLATE """{{- if .Messages }}
{{- range $index, $_ := .Messages }}
{{- if eq .Role "user" }}
{{- if and (eq (len (slice $.Messages $index)) 1) $.Tools }}[AVAILABLE_TOOLS] {{ $.Tools }}[/AVAILABLE_TOOLS]
{{- end }}[INST] {{ if and $.System (eq (len (slice $.Messages $index)) 1) }}{{ $.System }}{{ end }}{{ .Content }}[/INST]
{{- else if eq .Role "assistant" }}
{{- if .Content }} {{ .Content }}
{{- else if .ToolCalls }}[TOOL_CALLS] [
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{- end }}]
{{- end }}</s>
{{- else if eq .Role "tool" }}[TOOL_RESULTS] {"content": {{ .Content }}} [/TOOL_RESULTS]
{{- end }}
{{- end }}
{{- else }}[INST] {{ if .System }}{{ .System }}

{{ end }}{{ .Prompt }}[/INST]
{{- end }} {{ .Response }}
{{- if .Response }}</s>
{{- end }}"""
PARAMETER stop [INST]
PARAMETER stop [/INST]
# ...

We need to add one more line next to the existing PARAMETER entries (and remember to save the file). Pick a value that suits your setup; a larger num_ctx may mean higher VRAM usage:

PARAMETER num_ctx 10000

Then create a new model from it:

PS E:\Workplace\GraphPro\graphrag> ollama create mistral:10k -f Modelfile

# ollama list should now show an additional mistral:10k model
PS E:\Workplace\GraphPro\graphrag> ollama list

# start ollama (if it is not already running; running it in a console prints the logs there,
# which makes debugging easier; otherwise they go to a log file)
PS E:\Workplace\GraphPro\graphrag> ollama serve
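If you expect to rebuild the model with different context sizes, the manual steps above can be scripted. The sketch below simply shells out to the same ollama commands; the model names and the num_ctx value are the ones used in this tutorial:

```python
# Rebuild mistral with a larger context window, mirroring the manual steps:
# dump the Modelfile, append a num_ctx parameter, and create a new model tag.
import subprocess

modelfile = subprocess.run(
    ["ollama", "show", "--modelfile", "mistral:latest"],
    capture_output=True, text=True, check=True,
).stdout

# Append the context-length override; larger values may increase VRAM usage.
modelfile += "\nPARAMETER num_ctx 10000\n"

with open("Modelfile", "w", encoding="utf-8") as f:
    f.write(modelfile)

subprocess.run(["ollama", "create", "mistral:10k", "-f", "Modelfile"], check=True)
```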
3.2 English Example

All the examples are run from the Anaconda Prompt with the graphrag environment activated, in the directory E:\Workplace\GraphPro\graphrag; that is, your console should look like this:

(graphrag) E:\Workplace\GraphPro\graphrag>
3.2.1 Prepare the English Text
Introduction to Transformer Neural Networks
Transformer neural networks represent a revolutionary architecture in the field of deep learning, particularly for natural language processing (NLP) tasks. Introduced in the seminal paper "Attention is All You Need" by Vaswani et al. in 2017, transformers have since become the backbone of numerous state-of-the-art models due to their ability to handle long-range dependencies and parallelize training processes. Unlike traditional recurrent neural networks (RNNs) and convolutional neural networks (CNNs), transformers rely entirely on a mechanism called self-attention to process input data. This mechanism allows transformers to weigh the importance of different words in a sentence or elements in a sequence simultaneously, thus capturing context more effectively and efficiently.
Architecture of Transformers
The core component of the transformer architecture is the self-attention mechanism, which enables the model to focus on different parts of the input sequence when producing an output. The transformer consists of an encoder and a decoder, each made up of a stack of identical layers. The encoder processes the input sequence and generates a set of attention-weighted vectors, while the decoder uses these vectors, along with the previously generated outputs, to produce the final sequence. Each layer in the encoder and decoder contains sub-layers, including multi-head self-attention mechanisms and position-wise fully connected feed-forward networks, followed by layer normalization and residual connections. This design allows the transformer to process entire sequences at once rather than step-by-step, making it highly parallelizable and efficient for training on large datasets.
Applications of Transformer Neural Networks
Transformers have revolutionized various applications across different domains. In NLP, they power models like BERT (Bidirectional Encoder Representations from Transformers), GPT (Generative Pre-trained Transformer), and T5 (Text-to-Text Transfer Transformer), which excel in tasks such as text classification, machine translation, question answering, and text generation. Beyond NLP, transformers have also shown remarkable performance in computer vision with models like Vision Transformer (ViT), which treats images as sequences of patches, similar to words in a sentence. Additionally, transformers are being explored in areas such as speech recognition, protein folding, and reinforcement learning, demonstrating their versatility and robustness in handling diverse types of data. The ability to process long-range dependencies and capture intricate patterns has made transformers indispensable in advancing the state of the art in many machine learning tasks.
Challenges and Limitations
Despite their success, transformer neural networks come with several challenges and limitations. One of the primary concerns is their computational and memory requirements, which are significantly higher compared to traditional models. The quadratic complexity of the self-attention mechanism with respect to the input sequence length can lead to inefficiencies, especially when dealing with very long sequences. To mitigate this, various approaches like sparse attention and efficient transformers have been proposed. Another challenge is the interpretability of transformers, as the attention mechanisms, though providing some insights, do not fully explain the model's decisions. Furthermore, transformers require large amounts of data and computational resources for training, which can be a barrier for smaller organizations or those with limited resources. Addressing these challenges is crucial for making transformers more accessible and scalable for a broader range of applications.
Future Directions
The future of transformer neural networks is bright, with ongoing research focused on enhancing their efficiency, scalability, and applicability. One promising direction is the development of more efficient transformer architectures that reduce computational complexity and memory usage, such as the Reformer, Linformer, and Longformer. These models aim to make transformers feasible for longer sequences and real-time applications. Another important area is improving the interpretability of transformers, with efforts to develop methods that provide clearer explanations of their decision-making processes. Additionally, integrating transformers with other neural network architectures, such as combining them with convolutional networks for multimodal tasks, holds significant potential. The application of transformers beyond traditional domains, like in time-series forecasting, healthcare, and finance, is also expected to grow. As advancements continue, transformers are set to remain at the forefront of AI and machine learning, driving innovation and breakthroughs across various fields.
3.2.2 Initialize the Project and Edit the Configuration File

Save the English sample text above as Transformers_intro.txt under ./ragtest/input (the file name is arbitrary; any .txt file will do). First run the initialization:

graphrag init --root ./ragtest

This step generates the default project files under ./ragtest, including settings.yaml, a .env file, and a prompts directory.

For a local model deployment you need to edit settings.yaml. (You could instead call the OpenAI API, which is more convenient and offers stronger models, but it is paid and requires a foreign phone number; see Getting Started - GraphRAG (microsoft.github.io) and OpenAI API | OpenAI.) The rest of this tutorial covers only the local-model setup.

encoding_model: cl100k_base # this needs to be matched to your model!

llm:
  api_key: ${GRAPHRAG_API_KEY} # set this in the generated .env file
  type: openai_chat # or azure_openai_chat
  model: mistral:10k
  model_supports_json: true # recommended if this is available for your model.
  # audience: "https://cognitiveservices.azure.com/.default"
  api_base: http://localhost:11434/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>

parallelization:
  stagger: 0.3
  # num_threads: 50

async_mode: threaded # or asyncio

embeddings:
  async_mode: threaded # or asyncio
  vector_store:
    type: lancedb
    db_uri: 'output\lancedb'
    container_name: default
    overwrite: true
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic-embed-text
    api_base: http://localhost:11434/v1
    # api_version: 2024-02-15-preview
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>

You need to change the llm and embeddings settings:

llm.model: mistral:10k
llm.api_base: http://localhost:11434/v1
embeddings.llm.model: nomic-embed-text
embeddings.llm.api_base: http://localhost:11434/v1

(Note that both endpoints end in /v1, not /api; the latter does not conform to the official OpenAI embedding API standard.)
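If you would rather script these four edits than make them by hand, here is a minimal sketch using PyYAML (assuming it is available in the environment; pip install pyyaml otherwise). One caveat: a PyYAML round trip drops the comments that graphrag init generated, so treat this purely as a convenience:

```python
# Point GraphRAG's llm and embeddings sections at the local Ollama endpoint.
# Caveat: yaml.safe_load/safe_dump discards the comments in the generated file.
import yaml

with open("ragtest/settings.yaml", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

cfg["llm"]["model"] = "mistral:10k"
cfg["llm"]["api_base"] = "http://localhost:11434/v1"            # /v1, not /api
cfg["embeddings"]["llm"]["model"] = "nomic-embed-text"
cfg["embeddings"]["llm"]["api_base"] = "http://localhost:11434/v1"

with open("ragtest/settings.yaml", "w", encoding="utf-8") as f:
    yaml.safe_dump(cfg, f, sort_keys=False, allow_unicode=True)
```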


3.2.3 Build the Index
graphrag index --root ./ragtest


When you see All workflows completed successfully, everything worked! Expect this step to take around 2-4 minutes.

The results are stored under ./ragtest/output. The parquet files there can be post-processed and imported into Neo4j to visualize the knowledge graph, but that is beyond the scope of this tutorial.
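That said, if you only want a quick look at the extracted graph, the parquet files open directly with pandas (pyarrow must be installed). The file names below are the ones GraphRAG 0.5.x writes; adjust them if your output layout differs:

```python
# Peek at the entities and relationships GraphRAG extracted.
import pandas as pd

entities = pd.read_parquet("ragtest/output/create_final_entities.parquet")
relationships = pd.read_parquet("ragtest/output/create_final_relationships.parquet")

print(entities.columns.tolist())   # column names vary slightly across versions
print(entities.head())             # extracted entities
print(relationships.head())        # extracted edges between entities
```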


3.2.4 Ask Questions

First, a global query:

graphrag query --root ./ragtest --method global --query "What is the main idea of this text?"


Next, a local query:

graphrag query --root ./ragtest --method local --query "What is the relationship between transformers and traditional RNNs?"


3.3 Chinese Example
3.3.1 Prepare the Chinese Text

The Chinese passage below, an overview of machine learning, serves as the input text for this example:
机器学习是通过数据学习模式和规律,从而使计算机能够在没有显式编程的情况下完成特定任务的一门技术。其核心思想是利用数据构建模型,使其能够对新数据进行预测或决策。机器学习根据不同的任务目标和学习方式可以分为监督学习、无监督学习、半监督学习、强化学习以及自监督学习等几个主要类别。监督学习是最常见的一种,它依赖于带标签的数据集,通过输入和输出的映射关系训练模型,典型任务包括回归问题和分类问题。回归问题的目标是预测连续数值,例如通过历史房价数据预测未来房价;而分类问题则用于预测离散类别,例如垃圾邮件检测。无监督学习则使用没有标签的数据集,其目标是挖掘数据的内在结构。无监督学习的主要方法有聚类和降维,其中聚类是将数据分组,例如在客户分析中对不同类型客户进行分群;降维则是通过降低数据维度的方式进行特征提取或数据可视化,如主成分分析(PCA)常用于这种场景。半监督学习结合了监督学习和无监督学习的优势,通过少量有标签的数据和大量无标签的数据来训练模型,适用于获取标签数据成本较高的情况。强化学习通过与环境的交互不断优化策略,其关键在于通过奖励和惩罚信号来学习最优的行为策略,常见的应用包括游戏AI和机器人控制。自监督学习是一种近年来备受关注的技术,利用数据本身生成伪标签,从而在无标签数据上进行预训练,广泛应用于自然语言处理和计算机视觉任务。
在具体方法上,机器学习包含从基础到先进的一系列技术。在线性回归中,模型通过拟合输入特征和目标变量之间的线性关系来进行预测,通常使用最小二乘法来优化参数。逻辑回归则是用于分类任务的一种技术,它通过Sigmoid函数将输入特征映射到概率空间,从而实现二分类问题的解决。支持向量机是一种既可以用于分类也可以用于回归的技术,通过寻找最优超平面将数据分类,并最大化不同类别之间的间隔,其核心是对数据点进行映射以使非线性问题变得线性可分。决策树方法通过递归地划分特征空间来构建树形结构,从而完成分类或回归任务;然而,由于决策树容易过拟合,因此随机森林通过组合多棵决策树进行集成,从而提高了模型的稳定性和准确性。梯度提升方法如XGBoost和LightGBM在建模时通过逐步修正残差来优化模型,成为许多比赛和生产环境中的首选算法。
神经网络是机器学习中的一类重要技术,特别是深度学习的兴起使其成为研究的热点。神经网络的基本单元是神经元,多个神经元通过层的形式组合成网络结构。卷积神经网络(CNN)专注于提取图像中的局部特征,适合图像分类、目标检测等任务;循环神经网络(RNN)则专注于处理时间序列数据,通过其隐状态捕获序列中的上下文关系,用于自然语言处理和语音识别。然而,传统的RNN存在梯度消失问题,因此改进的长短期记忆网络(LSTM)和门控循环单元(GRU)得以广泛应用。近年来,Transformer架构由于其并行计算能力和对长距离依赖建模的优势,取代了RNN在许多任务中的地位,成为自然语言处理和其他领域的主流技术,例如BERT和GPT模型的成功就基于这种架构。此外,图神经网络(GNN)是处理图结构数据的一种强大工具,它通过节点间的消息传递机制学习图中节点的特征表示,被应用于社交网络分析、推荐系统和生物信息学等领域。
机器学习方法的成功还依赖于优化算法和特征工程。优化算法是模型训练的关键,通过不断调整参数以最小化损失函数。例如,梯度下降法及其变种如Adam优化器在现代机器学习中被广泛使用。特征工程则包括特征选择和特征提取,前者通过统计方法或算法选择最具信息量的特征,后者通过方法如PCA或自动编码器将数据转换为更易于学习的形式。
机器学习在许多实际场景中发挥着重要作用。在自然语言处理领域,文本分类技术被用于垃圾邮件过滤和情感分析,机器翻译技术支撑了多语言交流,聊天机器人则改进了客户服务效率。在计算机视觉领域,图像分类和目标检测已经应用于人脸识别、自动驾驶和医疗影像分析等任务。推荐系统结合协同过滤和基于内容的推荐模型,能够在电商、视频和音乐平台上提供个性化推荐。在金融科技中,机器学习被用于信用评分、欺诈检测和市场预测,显著提高了金融服务的效率与安全性。在医疗健康领域,机器学习通过电子病历分析和医学影像处理帮助医生进行疾病诊断和预测。工业与物联网应用中,机器学习被用于设备的故障预测和优化控制,例如通过预测性维护延长机器寿命。强化学习也在游戏AI、机器人路径规划和自动化系统中展现了强大的能力。此外,在科学研究中,机器学习加速了新药开发、基因分析和天文学研究的进展。
通过技术的不断演进,机器学习的应用范围还在持续扩展,其对各行业的深刻影响仍在进一步深化。

Clear out the files generated under ./ragtest, keeping only input, prompts, .env, and settings.yaml. This prevents the cache from the earlier English example from affecting the Chinese one.

3.3.2 Generate the Prompts

The default prompts target English, so the first step is to adapt them. Here we use GraphRAG's automated prompt tuning, graphrag prompt-tune (its other parameters can be listed with graphrag prompt-tune --help):

graphrag prompt-tune --root .\ragtest --domain 计算机 --language 中文 --chunk-size 500 --output .\prompt-cs

This generates prompts tailored to Chinese text in the computer domain under .\prompt-cs. Next, overwrite the default prompts under .\ragtest\prompts with the newly generated ones, then build the index and run queries exactly as in the English example.
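If you want to script that overwrite step, here is a small sketch that copies everything from .\prompt-cs into .\ragtest\prompts. It assumes the tuned files carry the same names as the defaults; if your GraphRAG version emits different file names or extensions, rename them to match first:

```python
# Overwrite the default prompts with the auto-tuned Chinese ones.
import shutil
from pathlib import Path

src = Path("prompt-cs")
dst = Path("ragtest/prompts")

for prompt_file in src.iterdir():
    if prompt_file.is_file():
        shutil.copy(prompt_file, dst / prompt_file.name)
        print(f"copied {prompt_file.name} -> {dst / prompt_file.name}")
```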