Debug

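"Networking Tool"

Project goal: help marketing staff find Weibo influencers (大V, verified accounts with large followings) who are a good fit for flower promotion, and produce a concrete outreach plan for each.

Technical implementation details

Step 1: Use LangChain's search tool to help operations staff find Weibo influencers who are likely to be interested in promoting a given flower (for example, influencers who like roses), and return their UIDs.

Step 2: Given a Weibo UID, use a scraper to fetch the influencer's public Weibo information and return the data as JSON.

Step 3: Call an LLM through LangChain and use its summarization and generation abilities to write a warm introductory article based on the influencer's profile. The core purpose of the article is to propose a collaboration with that influencer.

Step 4: Use a LangChain output parser to produce a formatted data structure that can be embedded in the prompt template.

Step 5: Add HTML and CSS, build an app with Flask, and deploy and publish this flower e-commerce networking tool on the web for the marketing department to use.

The LangChain techniques involved include prompt engineering, models, chains, agents, and output parsing.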

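Deploying a Networking Tool for an Online Flower Shop (Part 1)

Step 1: Finding influencers

Step 1: Use LangChain's search tool, in fuzzy-search fashion, to help operations staff find Weibo influencers who are likely to be interested in promoting a given flower (for example, influencers who like roses), and return their UIDs.

Error

Analysis: the search tool returned no result containing a Weibo UID, so the regex findall match that follows raised an error.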
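The failure described above, indexing into an empty findall result, can be guarded against. The following is a minimal sketch, assuming a UID is the run of digits after weibo.com/u/ in a link. The helper name extract_uid and the pattern are illustrative assumptions, not code from the original project.

```python
import re
from typing import Optional

# Assumption: a Weibo profile link looks like "weibo.com/u/<digits>".
UID_PATTERN = re.compile(r"weibo\.com/u/(\d+)")


def extract_uid(text: str) -> Optional[str]:
    """Return the first UID found in text, or None when nothing matches."""
    matches = UID_PATTERN.findall(text)
    # findall returns an empty list when there is no match, so indexing [0]
    # unconditionally raises IndexError; check before indexing.
    return matches[0] if matches else None
```

Returning None instead of raising lets the caller decide whether to retry with a different query or report that no profile was found.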

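Set a breakpoint in get_UID in search_tools.py and debug. Several runs revealed the problem. (Fortunately, repeating the same search does not count extra against SerpApi's free quota.)

Inspect the search results using 牡丹 (peony) as the query.

# Function that gets the Weibo UID related to a given flower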
def get_UID(flower: str):
    # search = SerpAPIWrapper()
    search = CustomSerpAPIWrapper()
    res = search.run(f"{flower}")
    return res

if __name__ == '__main__':
    print(get_UID("牡丹"))

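The dictionary res that stores the SerpApi search results (not the res we defined in get_UID) has these fields:

The CustomSerpAPIWrapper class stores search results through a series of if checks. The current results contain both the knowledge_graph and organic_results fields, and only element 0 of organic_results is used: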

        if "knowledge_graph" in res.keys():
            knowledge_graph = res["knowledge_graph"]
            title = knowledge_graph["title"] if "title" in knowledge_graph else ""
            if "description" in knowledge_graph.keys():
                snippets.append(knowledge_graph["description"])
            for key, value in knowledge_graph.items():
                if (
                    isinstance(key, str)
                    and isinstance(value, str)
                    and key not in ["title", "description"]
                    and not key.endswith("_stick")
                    and not key.endswith("_link")
                    and not value.startswith("http")
                ):
                    snippets.append(f"{title} {key}: {value}.")
        if "organic_results" in res.keys():
            first_organic_result = res["organic_results"][0]
            if "snippet" in first_organic_result.keys():
                # snippets.append(first_organic_result["snippet"])
                snippets.append(first_organic_result["link"])
            elif...

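Here, res['organic_results'] holds the search results we need.
- However, the top results have almost nothing to do with Weibo:
- Result [0]
- The search result finally returned after filtering through the if checks

This matches the observation from the agent run.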
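Because only element 0 of organic_results is inspected, one alternative is to scan the whole list for the first Weibo link. This is a sketch only, assuming the response dictionary has the organic_results / link shape shown above. first_weibo_link is a hypothetical helper, and the sample data is made up for illustration.

```python
from typing import Optional


def first_weibo_link(res: dict) -> Optional[str]:
    """Return the first organic result link that points at weibo.com, if any."""
    for result in res.get("organic_results", []):
        link = result.get("link", "")
        if "weibo.com" in link:
            return link
    return None


# Made-up response, for illustration only.
sample = {
    "organic_results": [
        {"link": "https://example.com/peony"},
        {"link": "https://weibo.com/u/0000000000"},
    ]
}
print(first_weibo_link(sample))
```

This avoids depending on the ranking of the first result, at the cost of possibly picking a lower-ranked page.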

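Solution

Change the query to restrict the search scope so that the top results come from the weibo site. Element 0 should then be a Weibo page related to peonies.

Modify get_UID as follows:

# Function that gets the Weibo UID related to a given flower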
def get_UID(flower: str):
    # search = SerpAPIWrapper()
    search = CustomSerpAPIWrapper()
    # res = search.run(f"{flower}")
    res = search.run(f"{flower} site:weibo.com")
    return res

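Debug test
- Top results:
- Result 0: there is no knowledge_graph field, so the filtering step yields only the link
- Final result

  ['weibo.com/ttarticle/p…'] This is not a user page: the link is not of the form https://weibo.com/u/, so the user still cannot be found.

Further modification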

res = search.run(f"{flower} site:weibo.com/u")

Final search result:

['weibo.com/u/306795307…']
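The two query variants tried above differ only in the operand of the site: operator. A small sketch that makes the restriction a parameter; build_site_query is a hypothetical helper name, and the default value simply mirrors the final query used above.

```python
def build_site_query(flower: str, site: str = "weibo.com/u") -> str:
    """Combine a search term with a site: restriction."""
    return f"{flower} site:{site}"


# Usage: the string that would be passed to search.run(...)
query = build_site_query("牡丹")
```

Keeping the site as a parameter makes it easy to compare how different restrictions change the top results while debugging.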

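Closing notes

Visiting this link shows that the account is not an influencer: it has few followers. With this query, the top Google result is not the candidate we would ideally want.

One option is to extract follower counts from the result fields and, in the custom CustomSerpAPIWrapper class (which inherits from SerpAPIWrapper), prefer results with more followers when processing the response.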
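A sketch of that selection step, assuming follower counts have already been obtained for each candidate by some means (for example, from the scraper in step 2). The Candidate type, its fields, and the min_followers threshold are all hypothetical and not part of the original code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    uid: str        # Weibo UID of the account
    followers: int  # follower count, assumed to be known already


def pick_most_followed(
    candidates: List[Candidate], min_followers: int = 0
) -> Optional[Candidate]:
    """Return the candidate with the most followers at or above the threshold."""
    eligible = [c for c in candidates if c.followers >= min_followers]
    return max(eligible, key=lambda c: c.followers) if eligible else None
```

A threshold lets the tool report that no suitable influencer was found instead of silently returning an ordinary account.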

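Deploying a Networking Tool for an Online Flower Shop (Part 2)

Prerequisite fixes: carry the fixes for the problems found in Part 1 over into the Part 2 code.
Copy generation raised an error; fix it by changing the model call in textgen_tool.py:

    # # Initialize the LLM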
    # llm = ChatOpenAI(model_name="gpt-3.5-turbo")
    import os
    llm = ChatOpenAI(
        model=os.environ["LLM_MODELEND"],
        temperature=0,
    )
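Note that os.environ["LLM_MODELEND"] raises KeyError when the variable is not set. A sketch of a more forgiving lookup, under the assumption that falling back to the previously hard-coded model name is acceptable in your environment; resolve_model_name is a hypothetical helper.

```python
import os
from typing import Mapping


def resolve_model_name(
    env: Mapping[str, str] = os.environ, default: str = "gpt-3.5-turbo"
) -> str:
    """Read the model name from LLM_MODELEND, falling back to a default."""
    # Treat an unset or empty variable the same way.
    return env.get("LLM_MODELEND") or default
```

The result could then be passed as the model argument when constructing ChatOpenAI.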

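app.py fails with the following error

Printing result shows that the fields are wrapped in an extra 'properties' layer:

{'properties': {'summary':
Solution: change the statement response = json.loads(response_str) so that it indexes into the 'properties' key: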

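# Route that handles the request; only POST is allowed
@app.route("/process", methods=["POST"])
def process():
    # Get the submitted flower name
    flower = request.form["flower"]
    # Fetch the related data with find_bigV
    response_str = find_bigV(flower=flower)
    # Parse the string into a dict with json.loads and unwrap the extra 'properties' layer
    response = json.loads(response_str)['properties']

    # Return the data as a JSON response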
    return jsonify(
        {
            "summary": response["summary"],
            "facts": response["facts"],
            "interest": response["interest"],
            "letter": response["letter"],
        }
    )
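Indexing ['properties'] unconditionally will fail with KeyError whenever the model returns the flat structure instead. A sketch that accepts both shapes, assuming the payload is always a JSON object; parse_response is a hypothetical helper, not part of the original app.py.

```python
import json


def parse_response(response_str: str) -> dict:
    """Parse the JSON string and unwrap a 'properties' layer when present."""
    data = json.loads(response_str)
    inner = data.get("properties")
    # Only unwrap when 'properties' actually holds the nested payload.
    return inner if isinstance(inner, dict) else data
```

In the route, response = parse_response(response_str) would then work whether or not the extra layer appears.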

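With that, the whole networking tool pipeline is complete.

Note: the choice of influencer has not been optimized, and information on ordinary bloggers is fairly sparse.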
