Chat Endpoint (POST /api/chat)

The chat endpoint is designed specifically for conversational use. It supports multi-turn conversations and role separation, which makes it better suited than the generate endpoint for building chat applications.

Basic Usage

The simplest chat request:

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {"role": "user", "content": "你好"}
  ]
}'

Response:

{
  "model": "llama3.2",
  "created_at": "2024-01-15T10:00:00Z",
  "message": {
    "role": "assistant",
    "content": "你好!有什么可以帮你的吗?"
  },
  "done": true
}
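The same request can be issued programmatically. A minimal Python sketch: it only builds the request body; actually sending it assumes a local Ollama server listening on localhost:11434.

```python
import json

def build_chat_payload(model, user_content):
    """Build the minimal request body for POST /api/chat."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }

payload = build_chat_payload("llama3.2", "你好")
print(json.dumps(payload, ensure_ascii=False, indent=2))

# To send it against a running server:
#   import requests
#   requests.post("http://localhost:11434/api/chat", json=payload)
```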

Message Format

Each message contains a role and a content field:

{
  "role": "user",
  "content": "消息内容"
}

Role Types

Role        Description
system      System prompt; defines the AI's behavior
user        User message
assistant   AI reply

system Messages

Place a system message at the start of the message list to define the AI's behavior:

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {"role": "system", "content": "你是一个专业的 Python 开发者,回答简洁准确"},
    {"role": "user", "content": "什么是装饰器?"}
  ]
}'

Unlike the generate endpoint (/api/generate), the chat endpoint has no separate top-level system parameter; the system prompt must be sent as a message with role "system", as in the example above.

Multi-turn Conversations

The chat endpoint supports multi-turn conversations natively; just pass in the conversation history:

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {"role": "user", "content": "我叫小明"},
    {"role": "assistant", "content": "你好小明!很高兴认识你。"},
    {"role": "user", "content": "我叫什么名字?"}
  ]
}'

Response:

{
  "message": {
    "role": "assistant",
    "content": "你叫小明。"
  }
}
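The only client-side state is the message list itself: each turn appends to it, and the whole list is resent on every request. A sketch of that bookkeeping (the HTTP call is omitted):

```python
def add_exchange(history, user_content, assistant_content):
    """Record one completed user/assistant exchange in the history."""
    history.append({"role": "user", "content": user_content})
    history.append({"role": "assistant", "content": assistant_content})
    return history

history = []
add_exchange(history, "我叫小明", "你好小明!很高兴认识你。")
history.append({"role": "user", "content": "我叫什么名字?"})
# `history` now matches the messages array in the curl example above.
print(len(history))  # 3
```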

Request Parameters

Parameter   Type    Required  Description
model       string  yes       Model name
messages    array   yes       List of messages
stream      bool    no        Stream the response (default: true)
format      string  no        Output format (e.g. "json")
options     object  no        Model parameters (temperature, num_ctx, ...)
keep_alive  string  no        How long to keep the model loaded after the request
tools       array   no        Tool definitions for function calling

The options Parameter

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [{"role": "user", "content": "写一首诗"}],
  "options": {
    "temperature": 0.8,
    "num_ctx": 4096,
    "top_p": 0.9
  }
}'

The format Parameter

Force JSON output:

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {"role": "user", "content": "生成一个包含姓名和年龄的用户信息"}
  ],
  "format": "json",
  "stream": false
}'
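With "format": "json", the assistant's content is itself a JSON string that the client parses. A minimal sketch; the sample string below stands in for a live model reply, since actual output varies:

```python
import json

# Hypothetical content returned in message.content for the request above.
sample_content = '{"name": "小明", "age": 18}'
user = json.loads(sample_content)
print(user["name"], user["age"])  # 小明 18
```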

Streaming Responses

Streaming output is the default:

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [{"role": "user", "content": "写一首诗"}]
}'

The server returns a sequence of JSON objects, one per line:

{"model":"llama3.2","created_at":"2024-01-15T10:00:00Z","message":{"role":"assistant","content":"春"},"done":false}
{"model":"llama3.2","created_at":"2024-01-15T10:00:00Z","message":{"role":"assistant","content":"风"},"done":false}
...
{"model":"llama3.2","created_at":"2024-01-15T10:00:00Z","message":{"role":"assistant","content":""},"done":true,"total_duration":5000000000}
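Reassembling the reply means concatenating message.content from each line until done is true. A sketch, fed with the sample chunks above in place of a live response.iter_lines():

```python
import json

def collect_stream(lines):
    """Concatenate message.content from NDJSON chunks until done is true."""
    full = ""
    for line in lines:
        data = json.loads(line)
        full += data.get("message", {}).get("content", "")
        if data.get("done"):
            break
    return full

sample = [
    '{"message":{"role":"assistant","content":"春"},"done":false}',
    '{"message":{"role":"assistant","content":"风"},"done":false}',
    '{"message":{"role":"assistant","content":""},"done":true}',
]
print(collect_stream(sample))  # 春风
```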

Response Fields

Field            Description
model            Model name
created_at       Timestamp of the response
message          Message object
message.role     Role (always assistant)
message.content  Message content
done             Whether generation has finished
total_duration   Total time taken, in nanoseconds
eval_count       Number of tokens generated

Image Messages

Multimodal models (such as llava) are supported:

curl http://localhost:11434/api/chat -d '{
  "model": "llava",
  "messages": [
    {
      "role": "user",
      "content": "这张图片里有什么?",
      "images": ["iVBORw0KGgoAAAANSUhEUgAA..."]
    }
  ]
}'

Images must be Base64-encoded.

Python Example

import base64
import requests

def chat_with_image(image_path, question, model="llava"):
    with open(image_path, "rb") as f:
        image_data = base64.b64encode(f.read()).decode()
    
    response = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": model,
            "messages": [
                {
                    "role": "user",
                    "content": question,
                    "images": [image_data]
                }
            ],
            "stream": False
        }
    )
    
    return response.json()["message"]["content"]

result = chat_with_image("photo.jpg", "描述这张图片")
print(result)

Tool Calling

The chat endpoint supports function calling:

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {"role": "user", "content": "北京今天天气怎么样?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "获取城市天气",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string", "description": "城市名称"}
          },
          "required": ["city"]
        }
      }
    }
  ]
}'

Response:

{
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {
        "function": {
          "name": "get_weather",
          "arguments": {"city": "北京"}
        }
      }
    ]
  }
}
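The reply only carries the call request; the client executes the function itself and sends the result back as a role-"tool" message in the next request. A sketch with a hypothetical local get_weather implementation (name and return format are illustrative):

```python
def get_weather(city):
    """Hypothetical local implementation of the declared tool."""
    return f"{city}: 晴, 25°C"

TOOLS = {"get_weather": get_weather}

def handle_tool_calls(messages, assistant_message):
    """Run each requested tool and append results as role-"tool" messages."""
    messages.append(assistant_message)
    for call in assistant_message.get("tool_calls", []):
        fn = call["function"]
        result = TOOLS[fn["name"]](**fn["arguments"])
        messages.append({"role": "tool", "content": result})
    return messages

reply = {
    "role": "assistant",
    "content": "",
    "tool_calls": [{"function": {"name": "get_weather",
                                 "arguments": {"city": "北京"}}}],
}
messages = handle_tool_calls([], reply)
print(messages[-1]["content"])  # 北京: 晴, 25°C
# Send `messages` back to /api/chat so the model can phrase the final answer.
```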

Code Examples

Python Chat Class

import json
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"

class ChatSession:
    def __init__(self, model="llama3.2", system=None):
        self.model = model
        self.messages = []
        if system:
            self.messages.append({"role": "system", "content": system})

    def send(self, content):
        """Send a message and return the full reply (non-streaming)."""
        self.messages.append({"role": "user", "content": content})
        response = requests.post(
            OLLAMA_URL,
            json={"model": self.model, "messages": self.messages, "stream": False},
        )
        reply = response.json()["message"]["content"]
        self.messages.append({"role": "assistant", "content": reply})
        return reply

    def send_stream(self, content):
        """Send a message and yield the reply chunk by chunk (streaming).

        Streaming and non-streaming must be separate methods: a function
        body containing `yield` is always a generator, so a combined
        send(stream=...) would return a generator even for stream=False.
        """
        self.messages.append({"role": "user", "content": content})
        response = requests.post(
            OLLAMA_URL,
            json={"model": self.model, "messages": self.messages, "stream": True},
            stream=True,
        )
        full_response = ""
        for line in response.iter_lines():
            if line:
                data = json.loads(line)
                text = data.get("message", {}).get("content", "")
                if text:
                    full_response += text
                    yield text
        self.messages.append({"role": "assistant", "content": full_response})

# Usage
chat = ChatSession(system="你是一个友好的助手")

# Non-streaming
reply = chat.send("你好")
print(reply)

# Streaming
for text in chat.send_stream("写一首诗"):
    print(text, end="", flush=True)

JavaScript Chat Class

class ChatSession {
    constructor(model = 'llama3.2', system = null) {
        this.model = model;
        this.messages = [];
        if (system) {
            this.messages.push({ role: 'system', content: system });
        }
    }
    
    async send(content) {
        this.messages.push({ role: 'user', content });
        
        const response = await fetch('http://localhost:11434/api/chat', {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                model: this.model,
                messages: this.messages,
                stream: false
            })
        });
        
        const data = await response.json();
        this.messages.push({ role: 'assistant', content: data.message.content });
        return data.message.content;
    }
    
    async *sendStream(content) {
        this.messages.push({ role: 'user', content });
        
        const response = await fetch('http://localhost:11434/api/chat', {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                model: this.model,
                messages: this.messages,
                stream: true
            })
        });
        
        const reader = response.body.getReader();
        const decoder = new TextDecoder();
        let fullResponse = '';
        let buffer = '';
        
        while (true) {
            const { done, value } = await reader.read();
            if (done) break;
            
            // A chunk can end mid-line, so buffer until a full NDJSON line arrives.
            buffer += decoder.decode(value, { stream: true });
            const lines = buffer.split('\n');
            buffer = lines.pop();  // keep the trailing partial line
            for (const line of lines) {
                if (!line) continue;
                const data = JSON.parse(line);
                if (data.message?.content) {
                    fullResponse += data.message.content;
                    yield data.message.content;
                }
            }
        }
        
        this.messages.push({ role: 'assistant', content: fullResponse });
    }
}

// Usage
const chat = new ChatSession('llama3.2', '你是一个友好的助手');
const reply = await chat.send('你好');
console.log(reply);

Go Chat Implementation

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

type Message struct {
    Role    string `json:"role"`
    Content string `json:"content"`
}

type ChatRequest struct {
    Model    string    `json:"model"`
    Messages []Message `json:"messages"`
    Stream   bool      `json:"stream"`
}

type ChatResponse struct {
    Message Message `json:"message"`
    Done    bool    `json:"done"`
}

type ChatSession struct {
    Model    string
    Messages []Message
}

func NewChatSession(model string, system string) *ChatSession {
    session := &ChatSession{
        Model:    model,
        Messages: []Message{},
    }
    if system != "" {
        session.Messages = append(session.Messages, Message{
            Role: "system", Content: system,
        })
    }
    return session
}

func (s *ChatSession) Send(content string) (string, error) {
    s.Messages = append(s.Messages, Message{
        Role: "user", Content: content,
    })
    
    req := ChatRequest{
        Model:    s.Model,
        Messages: s.Messages,
        Stream:   false,
    }
    
    body, err := json.Marshal(req)
    if err != nil {
        return "", err
    }
    resp, err := http.Post(
        "http://localhost:11434/api/chat",
        "application/json",
        bytes.NewReader(body),
    )
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()
    
    data, err := io.ReadAll(resp.Body)
    if err != nil {
        return "", err
    }
    var result ChatResponse
    if err := json.Unmarshal(data, &result); err != nil {
        return "", err
    }
    
    s.Messages = append(s.Messages, result.Message)
    return result.Message.Content, nil
}

func main() {
    chat := NewChatSession("llama3.2", "你是一个友好的助手")
    reply, err := chat.Send("你好")
    if err != nil {
        fmt.Println("request failed:", err)
        return
    }
    fmt.Println(reply)
}

Differences from the Generate Endpoint

Feature          Chat endpoint (/api/chat)  Generate endpoint (/api/generate)
Input format     messages array             prompt string
Multi-turn       native support             requires passing context back
Role separation  supported                  not supported
Tool calling     supported                  not supported
Image input      supported                  supported (images parameter)
Typical use      chat applications          one-shot text generation