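from anthropic import Anthropic

# The Anthropic SDK appends /v1/messages to base_url itself,
# so point it at the Ollama server root, without a trailing /v1.
client = Anthropic(
    base_url="http://localhost:11434",
    api_key="ollama"  # placeholder value; the SDK requires a non-empty key
)

message = client.messages.create(
    model="llama3.2",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello"}
    ]
)

# content is a list of blocks; the first block holds the reply text
print(message.content[0].text)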

Streaming Output

from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:11434",  # server root; the SDK adds /v1/messages
    api_key="ollama"
)

# stream() is a context manager; text_stream yields text chunks as they arrive
with client.messages.stream(
    model="llama3.2",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a poem"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Message Format

Anthropic's message format differs slightly from OpenAI's:

Basic Format

messages = [
    {"role": "user", "content": "Hello"}
]

Multi-part Content

# image_base64 is assumed to hold the base64-encoded bytes of a PNG file
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": image_base64}}
        ]
    }
]
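
The multi-part layout above can be assembled programmatically. The following is a minimal sketch, not part of the anthropic SDK: `user_message` is a hypothetical helper name, and it assumes each image is supplied as raw bytes together with its media type.

```python
import base64


def user_message(text, images=()):
    """Build a user message with one text block followed by image blocks.

    `images` is an iterable of (raw_bytes, media_type) pairs.
    Hypothetical helper for illustration only.
    """
    content = [{"type": "text", "text": text}]
    for raw, media_type in images:
        content.append({
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": media_type,
                # The API expects base64 text, not raw bytes
                "data": base64.b64encode(raw).decode("ascii"),
            },
        })
    return {"role": "user", "content": content}
```

The returned dict can be placed directly in the `messages` list passed to `client.messages.create`.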

System Prompt

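Anthropic takes the system prompt as a separate top-level system parameter rather than as a message with the system role:

message = client.messages.create(
    model="llama3.2",
    max_tokens=1024,
    system="You are a friendly assistant",
    messages=[
        {"role": "user", "content": "Hello"}
    ]
)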
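
When porting code that keeps the system prompt inside the message list (the OpenAI convention), the system entries have to be pulled out first. A minimal sketch of that conversion follows; `split_system` is a hypothetical helper, and it assumes every message has plain-string content.

```python
def split_system(openai_messages):
    """Separate "system" entries from an OpenAI-style message list.

    Returns (system_text, remaining_messages). Multiple system entries
    are joined with a blank line. Hypothetical helper for illustration.
    """
    system_parts = []
    messages = []
    for msg in openai_messages:
        if msg["role"] == "system":
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})
    return "\n\n".join(system_parts), messages
```

The two return values map onto the `system=` and `messages=` arguments of `client.messages.create`.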

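Supported Parameters

Parameter	Description
model	Model name
max_tokens	Maximum number of tokens to generate
system	System prompt
messages	List of messages
temperature	Sampling randomness
top_p	Nucleus sampling threshold
top_k	Number of candidate tokens considered at each step
stop_sequences	Stop sequences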
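
As a sketch of how the table might be used in practice, the snippet below builds a request dict from the listed parameters and checks it against the table before sending. `SUPPORTED_PARAMS` and `unsupported_params` are hypothetical names, and the sampling values are arbitrary examples. Note that `tools`, used later in this document, is not in the table, so this check would flag it unless the set is extended.

```python
# Parameter names copied from the table above.
SUPPORTED_PARAMS = frozenset({
    "model", "max_tokens", "system", "messages",
    "temperature", "top_p", "top_k", "stop_sequences",
})


def unsupported_params(request):
    """Return the request keys that the table does not list, sorted."""
    return sorted(set(request) - SUPPORTED_PARAMS)


# Example request; the sampling values are illustrative, not recommendations.
request = {
    "model": "llama3.2",
    "max_tokens": 256,
    "temperature": 0.2,
    "top_p": 0.9,
    "top_k": 40,
    "stop_sequences": ["\n\n"],
    "messages": [{"role": "user", "content": "Hello"}],
}
```

If `unsupported_params(request)` returns an empty list, the dict can be passed on as `client.messages.create(**request)`.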

Tool Use

Tools are declared in Anthropic's format, with a JSON Schema under input_schema:

from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:11434",  # server root; the SDK adds /v1/messages
    api_key="ollama"
)

tools = [
    {
        "name": "get_weather",
        "description": "Get the weather for a given city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["city"]
        }
    }
]

response = client.messages.create(
    model="llama3.2",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What is the weather like in Beijing today?"}]
)

# A response can mix text blocks and tool_use blocks; pick out the tool calls
for block in response.content:
    if block.type == "tool_use":
        print(f"Tool called: {block.name}")
        print(f"Arguments: {block.input}")
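
The loop above only prints the requested call. In the Anthropic message format, the result is sent back as a `tool_result` block in a follow-up user message that references the `tool_use` block's `id`. The sketch below shows that step under some assumptions: `get_weather` returns canned data, `TOOL_HANDLERS` and `run_tools` are hypothetical names, and `fake_block` stands in for a real SDK content block so the example runs without a server.

```python
import json
from types import SimpleNamespace


def get_weather(city):
    """Stand-in implementation that returns canned data."""
    return json.dumps({"city": city, "forecast": "sunny"})


# Maps tool names (as declared in `tools`) to local Python functions.
TOOL_HANDLERS = {"get_weather": get_weather}


def run_tools(content_blocks):
    """Run every tool_use block; return a user message with the results.

    Returns None when the response contains no tool calls.
    """
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue
        handler = TOOL_HANDLERS[block.name]
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,  # ties the result to the request
            "content": handler(**block.input),
        })
    return {"role": "user", "content": results} if results else None


# Stand-in for a block from response.content
fake_block = SimpleNamespace(
    type="tool_use", id="toolu_01", name="get_weather", input={"city": "Beijing"}
)
follow_up = run_tools([fake_block])
```

To continue the conversation, append `{"role": "assistant", "content": response.content}` and then `follow_up` to the message list and call `client.messages.create` again with the same `tools`. Whether a given local model handles the round trip well depends on the model.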

Multimodal Support

Use a multimodal model such as llava:

import base64
from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:11434",  # server root; the SDK adds /v1/messages
    api_key="ollama"
)

# Read the image and encode it as base64 text
with open("image.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

message = client.messages.create(
    model="llava",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image"},
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                }
            ]
        }
    ]
)

print(message.content[0].text)

Compatibility Differences

Model Mapping

Anthropic uses model names such as claude-3-opus-20240229, while Ollama uses the names of its own local models:

# Anthropic
model="claude-3-opus-20240229"

# Ollama
model="llama3.2"
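
When existing code is pointed at Ollama, the model name has to change. One way to keep call sites untouched is a small lookup table. The mapping below is purely illustrative (which local model stands in for which Claude model is your own choice), and `MODEL_MAP` and `resolve_model` are hypothetical names.

```python
# Illustrative mapping only; pick local models that suit your hardware.
MODEL_MAP = {
    "claude-3-opus-20240229": "llama3.2",
}


def resolve_model(name):
    """Translate an Anthropic model name to a local one.

    Names that are not in the map are passed through unchanged,
    so Ollama model names keep working as-is.
    """
    return MODEL_MAP.get(name, name)
```

A call site would then use `model=resolve_model("claude-3-opus-20240229")`.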

Response Format

# Response object fields
message.id          # message ID
message.model       # model name
message.role        # role
message.content     # list of content blocks
message.stop_reason # why generation stopped
message.usage       # token usage statistics
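
Because `content` is a list of blocks rather than a string, indexing `content[0]` only works when the first block is text. A slightly more defensive sketch is shown below. `response_text` and `summarize` are hypothetical helpers, and `fake` imitates the shape of an SDK response (including the `input_tokens` and `output_tokens` fields of `usage`) so the example runs without a server.

```python
from types import SimpleNamespace


def response_text(message):
    """Concatenate the text of every text block, skipping other block types."""
    return "".join(b.text for b in message.content if b.type == "text")


def summarize(message):
    """Collect the fields most callers need into a plain dict."""
    return {
        "text": response_text(message),
        "stop_reason": message.stop_reason,
        "total_tokens": message.usage.input_tokens + message.usage.output_tokens,
    }


# Stand-in object with the same attribute layout as a response
fake = SimpleNamespace(
    content=[
        SimpleNamespace(type="text", text="Hello"),
        SimpleNamespace(type="tool_use", id="toolu_01", name="get_weather", input={}),
        SimpleNamespace(type="text", text=" there"),
    ],
    stop_reason="end_turn",
    usage=SimpleNamespace(input_tokens=12, output_tokens=5),
)
summary = summarize(fake)
```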

Unsupported Features

thinking mode (extended thinking)
citations
cache_control (prompt caching)
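
Code written for Anthropic's hosted API may already set some of these fields. A minimal sketch of removing them before a request is sent follows, assuming the list above is accurate for your Ollama version. `strip_unsupported` is a hypothetical helper and it works on a plain request dict, not on SDK objects.

```python
import copy


def _clean_blocks(blocks):
    """Remove cache_control and citations keys from dict content blocks."""
    for block in blocks:
        if isinstance(block, dict):
            block.pop("cache_control", None)
            block.pop("citations", None)


def strip_unsupported(request):
    """Return a copy of `request` without the fields listed above.

    The input dict is left untouched.
    """
    cleaned = copy.deepcopy(request)
    cleaned.pop("thinking", None)
    # system may be a plain string or a list of content blocks
    if isinstance(cleaned.get("system"), list):
        _clean_blocks(cleaned["system"])
    for msg in cleaned.get("messages", []):
        if isinstance(msg.get("content"), list):
            _clean_blocks(msg["content"])
    return cleaned
```

The cleaned dict can then be passed as `client.messages.create(**cleaned)`.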

Practical Examples

Building a Chat System

from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:11434",  # server root; the SDK adds /v1/messages
    api_key="ollama"
)

def chat(messages, user_input):
    # Append the new user turn to the shared history
    messages.append({"role": "user", "content": user_input})
    
    response = client.messages.create(
        model="llama3.2",
        max_tokens=1024,
        system="You are a friendly assistant",
        messages=messages
    )
    
    # Store the reply so the next turn sees the full conversation
    assistant_message = response.content[0].text
    messages.append({"role": "assistant", "content": assistant_message})
    
    return assistant_message

conversation = []

while True:
    user_input = input("You: ")
    if user_input.lower() in ["exit", "quit"]:
        break
    
    response = chat(conversation, user_input)
    print(f"Assistant: {response}")

Streaming Chat

from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:11434",  # server root; the SDK adds /v1/messages
    api_key="ollama"
)

def stream_chat(messages, user_input):
    messages.append({"role": "user", "content": user_input})
    
    print("Assistant: ", end="", flush=True)
    
    # Print chunks as they arrive and accumulate them for the history
    full_response = ""
    with client.messages.stream(
        model="llama3.2",
        max_tokens=1024,
        messages=messages
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)
            full_response += text
    
    print()
    messages.append({"role": "assistant", "content": full_response})

conversation = []

while True:
    user_input = input("You: ")
    if user_input.lower() in ["exit", "quit"]:
        break
    
    stream_chat(conversation, user_input)

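Which Compatibility Layer Should You Choose?

Scenario	Recommendation
Existing OpenAI code	OpenAI compatibility
Existing Anthropic code	Anthropic compatibility
New project	OpenAI compatibility (richer ecosystem)
Tool calling required	Both support it
Multimodal	Both support it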

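Summary

Ollama's compatibility layers let you:

Call local models with an SDK you already know
Migrate existing code with few changes
Integrate with a wide range of AI frameworks

The next part covers each API endpoint in detail.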
