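# The Complete Guide to Calling the Ollama API: Copy-Ready Python/Node.js Examples

> Master the Ollama API and let your applications talk to a local AI
> 
> **This article is Chapter 3 of the "Ollama in Practice" tutorial, free to read** 🎁
> 
> Previous: [Ollama Model Selection Guide](https://juejin.cn/spost/7614110147108716584)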

---

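## 1. Ollama API basics

### API address

Ollama exposes a compact REST API so that your own applications can call local models.

**Default address:** `http://localhost:11434`

### Main endpoints

| Endpoint | Method | Description |
|------|------|------|
| `/api/generate` | POST | Generate text (single completion) |
| `/api/chat` | POST | Chat (supports multi-turn conversations) |
| `/api/tags` | GET | List installed models |
| `/api/show` | POST | Show model details |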
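
The endpoint table can be wrapped in one small helper. This is a minimal sketch using only the Python standard library; `build_request` and `call_ollama` are illustrative names (not part of Ollama), and the sketch assumes the default address plus the convention that a call without a body is a GET and a call with a JSON body is a POST.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434"  # default address from this article


def build_request(endpoint, payload=None, base_url=BASE_URL):
    """Build a GET request when there is no payload, otherwise a JSON POST."""
    url = f"{base_url}{endpoint}"
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def call_ollama(endpoint, payload=None, timeout=60):
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_request(endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage, with `ollama serve` running locally:
#   call_ollama("/api/tags")
#   call_ollama("/api/generate",
#               {"model": "qwen2.5:7b", "prompt": "Hi", "stream": False})
```

Keeping request construction separate from sending makes it possible to inspect the method, URL and body without a server.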

---

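### Quick test: calling the API with curl

#### Test 1: List installed models

```bash
curl http://localhost:11434/api/tags
```

Response:

```json
{
  "models": [
    {
      "name": "qwen2.5:7b",
      "size": 4700000000,
      "modified_at": "2026-03-08T18:00:00.000Z"
    }
  ]
}
```

#### Test 2: Generate text

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:7b",
  "prompt": "Hello, please introduce yourself in one sentence",
  "stream": false
}'
```

Response:

```json
{
  "model": "qwen2.5:7b",
  "response": "Hello! I'm an AI assistant that can help you answer questions, write, code, and more.",
  "done": true
}
```

#### Test 3: Streaming output (shown in real time)

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:7b",
  "prompt": "Write Hello World in Python",
  "stream": true
}'
```

Response (multiple lines of JSON, one fragment per line):

```text
{"model":"qwen2.5:7b","response":"```python","done":false}
{"model":"qwen2.5:7b","response":"print","done":false}
{"model":"qwen2.5:7b","response":"(","done":false}
...
{"model":"qwen2.5:7b","response":"","done":true}
```

💡 **Tip:** `stream: true` suits chat interfaces, where the answer can be displayed while it is being generated; `stream: false` suits cases where you want the complete result in a single response.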
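
To make the streamed format concrete, here is a small sketch that joins the `response` fragments of such a reply back into one string. `join_stream` is a hypothetical helper name; it assumes each line is one complete JSON object in the shape shown in Test 3.

```python
import json


def join_stream(lines):
    """Concatenate the `response` fragments of a streamed /api/generate reply.

    `lines` is an iterable of JSON strings, one object per line.
    Reading stops at the first object whose `done` field is true.
    """
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


sample = [
    '{"model":"qwen2.5:7b","response":"print","done":false}',
    '{"model":"qwen2.5:7b","response":"(","done":false}',
    '{"model":"qwen2.5:7b","response":"","done":true}',
]
print(join_stream(sample))  # prints: print(
```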

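## 2. Python examples

### Setup

```bash
# Install the requests library
pip install requests
```

### Example 1: Basic call (synchronous)

```python
import requests

def ask_ollama(prompt, model="qwen2.5:7b"):
    """
    Send a prompt to Ollama and return the answer.
    """
    url = "http://localhost:11434/api/generate"
    
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False  # no streaming: wait for the complete result
    }
    
    response = requests.post(url, json=payload, timeout=120)
    response.raise_for_status()  # surface HTTP errors instead of a KeyError below
    result = response.json()
    
    return result["response"]

# Test
if __name__ == "__main__":
    answer = ask_ollama("Hello, please introduce yourself in one sentence")
    print(answer)
```

Sample output:

```text
Hello! I'm an AI assistant that can help you answer questions, write, code, and more. How can I help you?
```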

### Example 2: Streaming output (shown in real time)

```python
import requests
import json

def ask_ollama_stream(prompt, model="qwen2.5:7b"):
    """
    Stream the answer and print it as it arrives.
    """
    url = "http://localhost:11434/api/generate"
    
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": True
    }
    
    response = requests.post(url, json=payload, stream=True)
    response.raise_for_status()
    
    for line in response.iter_lines():
        if line:
            result = json.loads(line)
            # Print each fragment as soon as it arrives
            print(result.get("response", ""), end="", flush=True)
            
            # The final chunk carries the statistics
            if result.get("done", False):
                print("\n")
                print(f"Generation time: {result.get('total_duration', 0) / 1e9:.2f} s")
                print(f"Output tokens: {result.get('eval_count', 0)}")

# Test
if __name__ == "__main__":
    ask_ollama_stream("Write a quicksort algorithm in Python")
```

Sample output:

```text
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)

Generation time: 3.45 s
Output tokens: 128
```
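
The statistics in the final chunk can also be turned into a throughput figure. The sketch below assumes the final chunk carries `eval_duration` alongside `total_duration` and `eval_count`, with durations in nanoseconds; `generation_stats` is an illustrative name and the numbers in `example` are made up.

```python
def generation_stats(final_chunk):
    """Summarise the timing fields of the final streamed chunk.

    Durations are expected in nanoseconds and converted to seconds.
    """
    total_s = final_chunk.get("total_duration", 0) / 1e9
    eval_count = final_chunk.get("eval_count", 0)
    eval_s = final_chunk.get("eval_duration", 0) / 1e9
    # Guard against division by zero when the duration field is missing.
    tokens_per_s = eval_count / eval_s if eval_s > 0 else 0.0
    return {
        "total_seconds": total_s,
        "tokens": eval_count,
        "tokens_per_second": tokens_per_s,
    }


# Hypothetical final chunk; real values depend on your hardware and model.
example = {
    "done": True,
    "total_duration": 3_450_000_000,
    "eval_count": 128,
    "eval_duration": 2_000_000_000,
}
print(generation_stats(example))
```

Tokens per second is usually a more useful number than total time when comparing models, because total time also includes model loading and prompt processing.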

### Example 3: Multi-turn conversation (chatbot)

```python
import requests

class OllamaChat:
    """
    Multi-turn chatbot.
    """
    
    def __init__(self, model="qwen2.5:7b"):
        self.model = model
        self.history = []  # conversation history
    
    def chat(self, user_input):
        """
        Send a message and return the reply.
        """
        url = "http://localhost:11434/api/chat"
        
        # Add the user message to the history
        self.history.append({
            "role": "user",
            "content": user_input
        })
        
        payload = {
            "model": self.model,
            "messages": self.history,
            "stream": False
        }
        
        response = requests.post(url, json=payload, timeout=120)
        response.raise_for_status()
        result = response.json()
        
        # Extract the AI reply
        ai_reply = result["message"]["content"]
        
        # Add the AI reply to the history
        self.history.append({
            "role": "assistant",
            "content": ai_reply
        })
        
        return ai_reply
    
    def clear_history(self):
        """
        Clear the conversation history.
        """
        self.history = []
        print("Conversation history cleared")

# Test
if __name__ == "__main__":
    chat = OllamaChat()
    
    print("🤖 Chatbot started! Type 'quit' to exit, 'clear' to clear the history\n")
    
    while True:
        user_input = input("You: ")
        
        if user_input.lower() == "quit":
            print("Goodbye!")
            break
        elif user_input.lower() == "clear":
            chat.clear_history()
            continue
        
        ai_reply = chat.chat(user_input)
        print(f"AI: {ai_reply}\n")
```

Sample session:

```text
🤖 Chatbot started! Type 'quit' to exit, 'clear' to clear the history

You: Hello
AI: Hello! How can I help you?

You: I want to learn Python, any advice?
AI: Learning Python is a great choice! Here are some suggestions:
1. Start with the basic syntax...

You: clear
Conversation history cleared

You: quit
Goodbye!
```
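
Because the whole history is sent with every request, a long session keeps growing until it runs into the model's context limit. One simple mitigation, sketched here under the assumption that capping the number of messages is good enough (counting tokens would be more precise), is to trim the history before sending it. `trim_history` is a hypothetical helper.

```python
def trim_history(history, max_messages=20):
    """Return a copy of `history` limited to the most recent messages.

    A leading system message, if present, is always kept so the
    assistant's instructions survive trimming.
    """
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    # Keep the first message only if it is a system message.
    system = [m for m in history[:1] if m.get("role") == "system"]
    rest = history[len(system):]
    return system + rest[-max_messages:]


history = [{"role": "system", "content": "Be brief."}]
history += [{"role": "user", "content": f"message {i}"} for i in range(5)]
trimmed = trim_history(history, max_messages=2)
print([m["content"] for m in trimmed])  # ['Be brief.', 'message 3', 'message 4']
```

In a chat class, the trimmed list would be passed as `messages` in the payload while the full history stays available for export.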

### Example 4: Chat with a system prompt

```python
import requests

def ask_with_system(prompt, system_prompt, model="qwen2.5:7b"):
    """
    Chat with a system prompt.
    """
    url = "http://localhost:11434/api/chat"
    
    payload = {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": system_prompt
            },
            {
                "role": "user",
                "content": prompt
            }
        ],
        "stream": False
    }
    
    response = requests.post(url, json=payload, timeout=120)
    response.raise_for_status()
    result = response.json()
    
    return result["message"]["content"]

# Test: code review assistant
if __name__ == "__main__":
    code = """
def calculate_sum(numbers):
    total = 0
    for i in range(len(numbers)):
        total += numbers[i]
    return total
"""
    
    system_prompt = """You are a professional code review assistant.
Review the code, point out problems, and suggest improvements."""
    
    review = ask_with_system(
        f"Please review the following code:\n{code}",
        system_prompt
    )
    
    print(review)
```

Sample output:

```text
Code review result:

This code can be improved:

1. You can use enumerate() instead of range(len())
2. You can simplify it with the built-in sum() function

Improved version:
def calculate_sum(numbers):
    return sum(numbers)
```

## 3. Node.js examples

### Setup

```bash
# Create the project
mkdir ollama-node-demo
cd ollama-node-demo
npm init -y

# Install axios
npm install axios
```

### Example 1: Basic call

```javascript
// index.js
const axios = require('axios');

async function askOllama(prompt, model = 'qwen2.5:7b') {
    const url = 'http://localhost:11434/api/generate';
    
    const payload = {
        model: model,
        prompt: prompt,
        stream: false
    };
    
    const response = await axios.post(url, payload);
    return response.data.response;
}

// Test
(async () => {
    const answer = await askOllama('Hello, please introduce yourself in one sentence');
    console.log(answer);
})();
```

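### Example 2: Streaming output

```javascript
// stream.js
const axios = require('axios');

async function askOllamaStream(prompt, model = 'qwen2.5:7b') {
    const url = 'http://localhost:11434/api/generate';
    
    const payload = {
        model: model,
        prompt: prompt,
        stream: true
    };
    
    const response = await axios.post(url, payload, {
        responseType: 'stream'
    });
    
    // Decode as UTF-8 so multi-byte characters split across chunks stay intact
    response.data.setEncoding('utf8');
    
    return new Promise((resolve, reject) => {
        let fullResponse = '';
        let buffer = '';  // holds a partial line until the rest arrives
        
        response.data.on('data', (chunk) => {
            buffer += chunk;
            const lines = buffer.split('\n');
            buffer = lines.pop();  // the last element may be an incomplete line
            
            for (const line of lines) {
                if (line.trim()) {
                    try {
                        const data = JSON.parse(line);
                        const piece = data.response || '';
                        fullResponse += piece;  // accumulate the full answer
                        process.stdout.write(piece);
                        
                        if (data.done) {
                            console.log('\n');
                            resolve(fullResponse);
                        }
                    } catch (e) {
                        // Skip lines that are not valid JSON
                    }
                }
            }
        });
        
        // Resolve even if the stream ends without a `done` chunk
        response.data.on('end', () => resolve(fullResponse));
        response.data.on('error', reject);
    });
}

// Test
(async () => {
    await askOllamaStream('Write a quicksort algorithm in JavaScript');
})();
```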
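
A streamed HTTP body arrives in chunks that need not end on line boundaries, so a single JSON object can be split across two `data` events. The following Python sketch shows the buffering idea in isolation; `NDJSONBuffer` is an illustrative class name, and the sample chunks are invented to force a split in the middle of an object.

```python
import json


class NDJSONBuffer:
    """Reassemble newline-delimited JSON objects from arbitrary byte chunks."""

    def __init__(self):
        self._pending = b""  # bytes of a line that has not ended yet

    def feed(self, chunk):
        """Add raw bytes and return the objects completed by this chunk."""
        self._pending += chunk
        # Everything before the last newline is complete; the tail is kept.
        *complete, self._pending = self._pending.split(b"\n")
        objects = []
        for raw in complete:
            if raw.strip():
                # Decode per full line, so split multi-byte characters are safe.
                objects.append(json.loads(raw.decode("utf-8")))
        return objects


buf = NDJSONBuffer()
first = buf.feed(b'{"response":"Hel","done":false}\n{"respon')
second = buf.feed(b'se":"lo","done":true}\n')
print(first)   # [{'response': 'Hel', 'done': False}]
print(second)  # [{'response': 'lo', 'done': True}]
```

Parsing each chunk on its own would raise a JSON error on the half-received object; holding the tail until its newline arrives avoids losing that fragment.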

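## 4. Error handling

### Common errors and fixes

| Error | Cause | Fix |
|------|------|------|
| Connection Refused | The Ollama service is not running | Run `ollama serve` |
| Model Not Found | The model has not been downloaded | Run `ollama pull <model>` |
| Out of Memory | Not enough memory | Close other programs or use a smaller model |
| Context Too Long | The input exceeds the context limit | Shorten the input |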

### Complete error-handling example (Python)

```python
import json
import time

import requests
from requests.exceptions import ConnectionError, Timeout

class OllamaClient:
    """
    Ollama client with error handling.
    """
    
    def __init__(self, model="qwen2.5:7b", base_url="http://localhost:11434"):
        self.model = model
        self.base_url = base_url
        self.max_retries = 3
        self.retry_delay = 1  # seconds
    
    def ask(self, prompt, stream=False):
        """
        Send a prompt, retrying on connection errors and timeouts.
        """
        url = f"{self.base_url}/api/generate"
        
        payload = {
            "model": self.model,
            "prompt": prompt,
            "stream": stream
        }
        
        for attempt in range(self.max_retries):
            try:
                # stream=stream lets requests read the body incrementally
                response = requests.post(url, json=payload, timeout=60, stream=stream)
                
                # Check the HTTP status code
                if response.status_code == 404:
                    raise Exception(f"Model '{self.model}' not found")
                elif response.status_code != 200:
                    raise Exception(f"HTTP error: {response.status_code}")
                
                if stream:
                    return self._parse_stream(response)
                else:
                    return response.json()["response"]
                    
            except ConnectionError:
                if attempt == self.max_retries - 1:
                    raise Exception("Cannot connect to the Ollama service")
                print(f"Connection failed, retrying in {self.retry_delay} s...")
                time.sleep(self.retry_delay)
                
            except Timeout:
                if attempt == self.max_retries - 1:
                    raise Exception("Request timed out")
                print(f"Request timed out, retrying in {self.retry_delay} s...")
                time.sleep(self.retry_delay)
        
        return None
    
    def _parse_stream(self, response):
        """
        Parse a streamed response into one string.
        """
        full_response = ""
        
        for line in response.iter_lines():
            if line:
                try:
                    data = json.loads(line)
                    full_response += data.get("response", "")
                    
                    if data.get("done", False):
                        return full_response
                except json.JSONDecodeError:
                    continue
        
        return full_response
    
    def check_health(self):
        """
        Check whether the Ollama service is available.
        """
        try:
            response = requests.get(f"{self.base_url}/api/tags", timeout=5)
            if response.status_code == 200:
                models = response.json().get("models", [])
                print(f"✅ Ollama service is up, {len(models)} model(s) installed")
                return True
            else:
                print(f"❌ HTTP error: {response.status_code}")
                return False
        except Exception as e:
            print(f"❌ Cannot connect to the Ollama service: {e}")
            return False

# Test
if __name__ == "__main__":
    client = OllamaClient()
    
    # Check the service status
    if not client.check_health():
        print("\nPlease start the Ollama service first: ollama serve")
        raise SystemExit(1)
    
    # Send a prompt
    try:
        answer = client.ask("Hello, please introduce yourself in one sentence")
        print(f"\nAI: {answer}")
    except Exception as e:
        print(f"Error: {e}")
```

Sample output:

```text
✅ Ollama service is up, 1 model(s) installed

AI: Hello! I'm an AI assistant that can help you answer questions, write, code, and more.
```
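
The client above waits a fixed delay between attempts. A common variation, shown here as a sketch rather than as part of the original client, is exponential backoff, where the delay doubles after each failure. `retry_with_backoff` is a hypothetical helper; its `sleep` parameter exists only so the waiting can be replaced in tests.

```python
import time


def retry_with_backoff(func, max_retries=3, base_delay=1.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call `func()` and retry on the given exceptions with doubling delays."""
    for attempt in range(max_retries):
        try:
            return func()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * (2 ** attempt))  # 1 s, 2 s, 4 s, ...


# Demo with a function that fails twice before succeeding.
attempts = {"count": 0}
waited = []


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("service not ready")
    return "ok"


print(retry_with_backoff(flaky, sleep=waited.append))  # ok
print(waited)  # [1.0, 2.0]
```

With `requests`, the built-in exception types would be swapped for `retry_on=(requests.exceptions.ConnectionError, requests.exceptions.Timeout)`.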

## 5. Hands-on project: a command-line chat tool

### Full code

```python
# ollama-cli.py
import requests
from datetime import datetime

class OllamaCLI:
    """
    Command-line chat tool.
    """
    
    def __init__(self, model="qwen2.5:7b"):
        self.model = model
        self.base_url = "http://localhost:11434"
        self.history = []
        self.stats = {
            "total_requests": 0,
            "total_tokens": 0,
            "start_time": datetime.now()
        }
    
    def chat(self, user_input):
        """Send a message and return the reply."""
        url = f"{self.base_url}/api/chat"
        
        self.history.append({
            "role": "user",
            "content": user_input
        })
        
        payload = {
            "model": self.model,
            "messages": self.history,
            "stream": False
        }
        
        try:
            response = requests.post(url, json=payload, timeout=120)
            response.raise_for_status()
            result = response.json()
            
            ai_reply = result["message"]["content"]
            
            self.history.append({
                "role": "assistant",
                "content": ai_reply
            })
            
            self.stats["total_requests"] += 1
            self.stats["total_tokens"] += result.get("eval_count", 0)
            
            return ai_reply
            
        except Exception as e:
            # Drop the unanswered user message so the history stays consistent
            self.history.pop()
            return f"❌ Error: {e}"
    
    def show_stats(self):
        """Show usage statistics."""
        duration = datetime.now() - self.stats["start_time"]
        print("\n📊 Usage statistics")
        print(f"   Conversation turns: {self.stats['total_requests']}")
        print(f"   Generated tokens: {self.stats['total_tokens']}")
        print(f"   Running time: {duration}")
        print(f"   Current model: {self.model}\n")
    
    def export_history(self, filename="chat_history.md"):
        """Export the conversation history as Markdown."""
        with open(filename, 'w', encoding='utf-8') as f:
            f.write("# Chat history\n\n")
            f.write(f"Model: {self.model}\n")
            f.write(f"Time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n")
            
            for msg in self.history:
                role = "👤 User" if msg["role"] == "user" else "🤖 AI"
                f.write(f"### {role}\n\n{msg['content']}\n\n---\n\n")
        
        print(f"✅ Chat history exported to: {filename}")
    
    def clear_history(self):
        """Clear the conversation history."""
        self.history = []
        print("✅ Chat history cleared\n")

def main():
    """Main program."""
    client = OllamaCLI()
    
    print("=" * 50)
    print("🦙 Ollama command-line chat tool")
    print("=" * 50)
    print(f"Current model: {client.model}")
    print("Commands: /stats, /clear, /export, /model [name], /quit\n")
    
    while True:
        try:
            user_input = input("👤 You: ").strip()
            
            if not user_input:
                continue
            
            # Handle commands
            if user_input.startswith("/"):
                cmd = user_input.lower()
                
                if cmd in ["/quit", "/exit", "/q"]:
                    print("👋 Goodbye!")
                    break
                elif cmd == "/stats":
                    client.show_stats()
                elif cmd == "/clear":
                    client.clear_history()
                elif cmd == "/export":
                    client.export_history()
                elif cmd.startswith("/model"):
                    parts = user_input.split(maxsplit=1)
                    if len(parts) == 2:
                        client.model = parts[1]
                        print(f"✅ Switched to model: {client.model}\n")
                    else:
                        print(f"Current model: {client.model}\n")
                else:
                    print(f"❌ Unknown command: {user_input}\n")
            else:
                # Normal conversation
                ai_reply = client.chat(user_input)
                print(f"\n🤖 AI: {ai_reply}\n")
                
        except KeyboardInterrupt:
            print("\n👋 Goodbye!")
            break

if __name__ == "__main__":
    main()
```

### Usage

```bash
# Run
python ollama-cli.py
```

Sample session:

```text
👤 You: Hello
🤖 AI: Hello! How can I help you?

👤 You: /stats
📊 Usage statistics
   Conversation turns: 1
   Generated tokens: 32
   Running time: 0:01:23
   Current model: qwen2.5:7b

👤 You: /export
✅ Chat history exported to: chat_history.md

👤 You: /quit
👋 Goodbye!
```
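
The command handling can be made a little stricter by splitting the command name from its argument before matching. This is a sketch of one way to do it; `parse_command` is a hypothetical helper and is not part of the tool above. Matching on the exact name avoids treating an input such as `/models` as `/model`.

```python
def parse_command(line):
    """Split a slash command into (name, argument).

    The name is lower-cased so `/Model` and `/model` match, while the
    argument keeps its original case. Normal chat input is returned
    as (None, text).
    """
    text = line.strip()
    if not text.startswith("/"):
        return None, text
    parts = text.split(maxsplit=1)
    name = parts[0].lower()
    arg = parts[1].strip() if len(parts) == 2 else ""
    return name, arg


print(parse_command("/model Llama3:8B"))  # ('/model', 'Llama3:8B')
print(parse_command("/QUIT"))             # ('/quit', '')
print(parse_command("hello"))             # (None, 'hello')
```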

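## 6. Chapter summary

Congratulations! You have now learned:

- ✅ The basics of the Ollama API
- ✅ Python examples (synchronous, streaming, multi-turn chat)
- ✅ Node.js examples
- ✅ Error handling and retries
- ✅ A hands-on project: the command-line chat tool

## 7. Exercises

- **Basic:** write a simple question-and-answer program in Python or Node.js
- **Intermediate:** build a chatbot that keeps a conversation history
- **Challenge:** add an export feature that supports Markdown and PDF

When you are done, feel free to share your work in the comments! 💬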

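## About the author

I'm Xie Xuan (谢轩), an office worker who enjoys tinkering with AI.

I'm currently building Ollama Tools, an enhancement tool for Ollama with 15+ prompt templates, conversation history management, multi-model comparison, and more.

- GitHub: github.com/954215110/o…
- Live demo: 12wc0wo892531.vicp.fun

If you find it useful, a Star would be appreciated! ⭐

## Series articles

- Chapter 1: Build your first local AI assistant in 30 minutes
- Chapter 2: Ollama Model Selection Guide
- Chapter 3: The Complete Guide to Calling the Ollama API (this article)
- Chapter 4: Web UI development (planned)

Want to see updates as soon as they are out? Follow me! 🔔

## Paid course preview

🎉 The paid "Ollama in Practice" course is launching soon!

| Tier | Price | Contents |
|------|------|------|
| Early bird | ¥29 | Full course + source code + Q&A group |
| Regular | ¥49 | Everything + private community |
| VIP | ¥99 | Everything + 1-on-1 consulting |

Early-bird pricing is limited to the first 50 buyers!

How to buy:

- Afdian: afdian.net/@谢轩 (verification in progress, coming…
- WeChat/Alipay: contact the author (954215110@qq.com)

Tags: #AI #LLM #Ollama #Python #Node.js #API #Tutorial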



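## Support the author

If this series has helped you, you can support it in any of the following ways:

### 1️⃣ WeChat/Alipay tips

*Any amount is welcome. Every contribution motivates me to keep writing!*

### 2️⃣ Business cooperation

**Technical services:**

- **Ollama deployment consulting**: ¥500/hour
- **Custom enterprise development**: negotiable (quoted according to requirements)
- **Remote technical support**: ¥300/hour

**Contact:**

- WeChat: (your WeChat ID)
- Email: (your email)

### 3️⃣ Open-source project on GitHub

**Project URL:** https://github.com/954215110/954215110ollama-tools

If it helps you, don't forget to give it a ⭐ **Star**!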

---

## Ollama Tools tutorial series

| Chapter | Title | Link |
|------|------|------|
| Chapter 1 | Getting started with Ollama | https://juejin.cn/spost/7614644305726423075 |
| Chapter 2 | API in depth | https://juejin.cn/spost/7614110147108716584 |
| Chapter 3 | API calls in practice | https://juejin.cn/post/7614747451153727540 |
| Chapter 4 | Web UI development | https://juejin.cn/post/7614451900677849103 |
| Chapter 5 | Advanced feature development | https://juejin.cn/post/7614708335367815183 |
| Chapter 6 | Deployment and optimization | https://juejin.cn/post/7614884374551756835 |

---

**Your support keeps me writing. Thanks for reading!**

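## 🌟 Paid community: Ollama Tools Practice Circle

How to join: scan the QR code below.

Member benefits:

- ✅ Full source code download (including the Web UI)
- ✅ One-on-one deployment Q&A
- ✅ Sharing of the latest AI tools
- ✅ Peer networking and job referral opportunities
- ✅ Early access to future tutorials

Price: ¥199/year (early-bird price)

Community link: wx.zsxq.com/group/48885…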