API calls will not succeed every time. In this chapter we look at the errors the Ollama API can return and how to handle them.

The Ollama API uses standard HTTP status codes:
| Status code | Meaning | Common causes |
|---|---|---|
| 200 | OK | Request processed normally |
| 400 | Bad Request | Malformed parameters, missing required parameters |
| 404 | Not Found | Model does not exist, wrong endpoint path |
| 500 | Internal Server Error | Internal Ollama error |
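The table above can be folded into a small dispatch helper. A minimal sketch; the category names are our own labels, not anything defined by the Ollama API:

```python
def classify_status(code: int) -> str:
    """Map an HTTP status code from the table to a rough category."""
    if code == 200:
        return "success"
    if code == 400:
        return "bad_request"   # malformed or missing parameters
    if code == 404:
        return "not_found"     # unknown model or wrong endpoint path
    if code >= 500:
        return "server_error"  # internal Ollama error
    return "unexpected"

print(classify_status(200))  # success
print(classify_status(404))  # not_found
```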
Bad request parameters:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2"
}'
```

Response:

```json
{
  "error": "prompt is required"
}
```

The required `prompt` parameter is missing.
Model does not exist:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "not-exist-model",
  "prompt": "Hello"
}'
```

Response:

```json
{
  "error": "model 'not-exist-model' not found"
}
```
Internal server error:

```json
{
  "error": "an error was encountered while running the model"
}
```

Possible causes include running out of GPU memory or a corrupted model file.
All error responses share the same JSON shape:

```json
{
  "error": "description of the error"
}
```

The format is deliberately simple: a single `error` field.
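Because every error body has this one-field shape, a tiny parser with a fallback for non-JSON bodies covers all cases. A sketch; the fallback behavior (returning the raw body) is our own choice:

```python
import json

def extract_error(body: str) -> str:
    """Pull the 'error' field out of an Ollama error response,
    falling back to the raw body if it is not valid JSON."""
    try:
        return json.loads(body).get("error", body)
    except json.JSONDecodeError:
        return body

print(extract_error('{"error": "prompt is required"}'))  # prompt is required
```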
Model not found

```json
{
  "error": "model 'xxx' not found, try pulling it first"
}
```

Fix: pull the model first:

```bash
ollama pull xxx
```
Model failed to load

```json
{
  "error": "failed to load model"
}
```

Possible causes: insufficient memory, or a corrupted model file.
Model currently in use

Some operations require the model to be idle:

```json
{
  "error": "model is currently in use"
}
```
Missing required parameter

```json
{
  "error": "prompt is required"
}
```
Invalid parameter type

```json
{
  "error": "invalid parameter type"
}
```
Invalid parameter value

```json
{
  "error": "temperature must be between 0 and 2"
}
```
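The three parameter errors above can be caught client-side before a request is ever sent. A pre-flight check sketch; the rules mirror the error messages shown here and are not an official validation list:

```python
def validate_request(payload: dict) -> list[str]:
    """Return a list of problems found; empty means the payload looks OK."""
    problems = []
    if not payload.get("prompt"):
        problems.append("prompt is required")
    temperature = payload.get("options", {}).get("temperature")
    if temperature is not None and not isinstance(temperature, (int, float)):
        problems.append("invalid parameter type: temperature")
    elif temperature is not None and not (0 <= temperature <= 2):
        problems.append("temperature must be between 0 and 2")
    return problems

print(validate_request({"model": "llama3.2"}))  # ['prompt is required']
```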
Out of GPU memory

```json
{
  "error": "CUDA out of memory"
}
```

or:

```json
{
  "error": "not enough memory to load model"
}
```
Out of disk space

```json
{
  "error": "no space left on device"
}
```
Service not running

```
curl: (7) Failed to connect to localhost port 11434
```
Connection timeout

```
curl: (28) Connection timed out
```
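In code, these two failures surface as different exception types. A sketch that maps the standard-library exceptions to the two curl errors above; the advice strings are our own:

```python
def diagnose(exc: OSError) -> str:
    """Translate a low-level connection failure into actionable advice."""
    if isinstance(exc, ConnectionRefusedError):
        return "connection refused: is the Ollama service running on port 11434?"
    if isinstance(exc, TimeoutError):
        return "connection timed out: check the host address and firewall rules"
    return f"network error: {exc}"

print(diagnose(ConnectionRefusedError()))
```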
Handling these errors in Python:

```python
import requests
from requests.exceptions import RequestException

def call_ollama(prompt, model="llama3.2"):
    try:
        response = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": prompt, "stream": False},
            timeout=60
        )
        if response.status_code == 200:
            return response.json()["response"]
        elif response.status_code == 400:
            error = response.json().get("error", "Unknown error")
            print(f"Bad request parameters: {error}")
        elif response.status_code == 404:
            error = response.json().get("error", "Unknown error")
            print(f"Resource not found: {error}")
            if "not found" in error:
                print(f"Try pulling the model: ollama pull {model}")
        elif response.status_code == 500:
            print("Internal server error, check the Ollama logs")
        else:
            print(f"Unexpected error: HTTP {response.status_code}")
        return None
    except RequestException as e:
        print(f"Network request failed: {e}")
        return None

result = call_ollama("Hello")
if result:
    print(result)
```
The same handling in JavaScript, using an AbortController for the timeout:

```javascript
async function callOllama(prompt, model = 'llama3.2') {
  try {
    const controller = new AbortController();
    const timeoutId = setTimeout(() => controller.abort(), 60000);
    const response = await fetch('http://localhost:11434/api/generate', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, prompt, stream: false }),
      signal: controller.signal
    });
    clearTimeout(timeoutId);
    if (!response.ok) {
      const error = await response.json();
      switch (response.status) {
        case 400:
          console.error('Bad request parameters:', error.error);
          break;
        case 404:
          console.error('Resource not found:', error.error);
          break;
        case 500:
          console.error('Internal server error');
          break;
        default:
          console.error(`HTTP ${response.status}:`, error.error);
      }
      return null;
    }
    const data = await response.json();
    return data.response;
  } catch (error) {
    if (error.name === 'AbortError') {
      console.error('Request timed out');
    } else {
      console.error('Network request failed:', error.message);
    }
    return null;
  }
}
```
And in Go:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

type ErrorResponse struct {
	Error string `json:"error"`
}

type GenerateResponse struct {
	Response string `json:"response"`
}

func callOllama(prompt string) (string, error) {
	client := &http.Client{Timeout: 60 * time.Second}
	// Build the request body before sending it.
	reqBody, _ := json.Marshal(map[string]any{
		"model":  "llama3.2",
		"prompt": prompt,
		"stream": false,
	})
	resp, err := client.Post(
		"http://localhost:11434/api/generate",
		"application/json",
		bytes.NewReader(reqBody),
	)
	if err != nil {
		return "", fmt.Errorf("network request failed: %w", err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	if resp.StatusCode != 200 {
		var errResp ErrorResponse
		json.Unmarshal(body, &errResp)
		switch resp.StatusCode {
		case 400:
			return "", fmt.Errorf("bad request parameters: %s", errResp.Error)
		case 404:
			return "", fmt.Errorf("resource not found: %s", errResp.Error)
		case 500:
			return "", fmt.Errorf("internal server error: %s", errResp.Error)
		default:
			return "", fmt.Errorf("HTTP %d: %s", resp.StatusCode, errResp.Error)
		}
	}
	var result GenerateResponse
	json.Unmarshal(body, &result)
	return result.Response, nil
}
```
For transient errors, retrying is worth a try:

```python
import time
import requests

def call_with_retry(prompt, max_retries=3, retry_delay=2):
    for attempt in range(max_retries):
        try:
            response = requests.post(
                "http://localhost:11434/api/generate",
                json={"model": "llama3.2", "prompt": prompt, "stream": False},
                timeout=60
            )
            if response.status_code == 200:
                return response.json()["response"]
            if response.status_code >= 500:
                print(f"Server error, retry {attempt + 1}...")
                time.sleep(retry_delay * (attempt + 1))
                continue
            # 4xx errors are not transient; retrying will not help
            return None
        except requests.exceptions.RequestException as e:
            print(f"Request failed ({e}), retry {attempt + 1}...")
            time.sleep(retry_delay * (attempt + 1))
    print("Retries exhausted")
    return None
```
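The linear backoff above (2 s, 4 s, 6 s) is fine for brief outages; for longer ones, exponential backoff with jitter spreads retries out further and avoids clients retrying in lockstep. A sketch of the delay schedule; the base and cap values are arbitrary choices:

```python
import random

def backoff_delays(retries: int, base: float = 2.0, cap: float = 60.0):
    """Exponential backoff with full jitter: each delay is drawn
    uniformly from [0, min(cap, base * 2**attempt)]."""
    return [random.uniform(0, min(cap, base * 2 ** attempt))
            for attempt in range(retries)]

# Upper bounds of the first five delays: 2, 4, 8, 16, 32 seconds
print(backoff_delays(5))
```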
When a problem is hard to pin down, check the Ollama service logs:

```bash
# Linux (systemd)
journalctl -u ollama -f

# or run the server in the foreground and watch its output
ollama serve
```

The logs contain detailed error messages that help locate the problem.