API calls will not succeed every time. In this chapter we look at the errors the Ollama API can return and how to handle them.

The Ollama API uses standard HTTP status codes:
| Status code | Meaning | Common causes |
|---|---|---|
| 200 | OK | Request processed normally |
| 400 | Bad Request | Malformed parameters, missing required parameters |
| 404 | Not Found | Model does not exist, wrong endpoint path |
| 500 | Internal Server Error | Internal Ollama error |
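The table above can be folded into a small dispatch helper. A minimal sketch; the category names are our own labels, not anything defined by the Ollama API:

```python
def classify_status(code: int) -> str:
    """Map an HTTP status code from the table to a rough category."""
    if code == 200:
        return "success"
    if code == 400:
        return "bad_request"   # malformed or missing parameters
    if code == 404:
        return "not_found"     # unknown model or wrong endpoint path
    if code >= 500:
        return "server_error"  # internal Ollama error
    return "unexpected"

print(classify_status(200))  # success
print(classify_status(404))  # not_found
```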
Bad request parameters:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2"
}'
```

Response:

```json
{
  "error": "prompt is required"
}
```

The required `prompt` parameter is missing.
Model does not exist:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "not-exist-model",
  "prompt": "Hello"
}'
```

Response:

```json
{
  "error": "model 'not-exist-model' not found"
}
```
Internal server error:

```json
{
  "error": "an error was encountered while running the model"
}
```

Possible causes include running out of GPU memory or a corrupted model file.
All error responses share the same JSON shape:

```json
{
  "error": "description of the error"
}
```

The format is deliberately simple: a single `error` field.
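Because every error body has this one-field shape, a tiny parser with a fallback for non-JSON bodies covers all cases. A sketch; the fallback behavior (returning the raw body) is our own choice:

```python
import json

def extract_error(body: str) -> str:
    """Pull the 'error' field out of an Ollama error response,
    falling back to the raw body if it is not valid JSON."""
    try:
        return json.loads(body).get("error", body)
    except json.JSONDecodeError:
        return body

print(extract_error('{"error": "prompt is required"}'))  # prompt is required
```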
Model not found

```json
{
  "error": "model 'xxx' not found, try pulling it first"
}
```

Fix: pull the model first:

```bash
ollama pull xxx
```
Model failed to load

```json
{
  "error": "failed to load model"
}
```

Possible causes: insufficient memory, or a corrupted model file.
Model currently in use

Some operations require the model to be idle:

```json
{
  "error": "model is currently in use"
}
```
Missing required parameter

```json
{
  "error": "prompt is required"
}
```
Invalid parameter type

```json
{
  "error": "invalid parameter type"
}
```
Invalid parameter value

```json
{
  "error": "temperature must be between 0 and 2"
}
```
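The three parameter errors above can be caught client-side before a request is ever sent. A pre-flight check sketch; the rules mirror the error messages shown here and are not an official validation list:

```python
def validate_request(payload: dict) -> list[str]:
    """Return a list of problems found; empty means the payload looks OK."""
    problems = []
    if not payload.get("prompt"):
        problems.append("prompt is required")
    temperature = payload.get("options", {}).get("temperature")
    if temperature is not None and not isinstance(temperature, (int, float)):
        problems.append("invalid parameter type: temperature")
    elif temperature is not None and not (0 <= temperature <= 2):
        problems.append("temperature must be between 0 and 2")
    return problems

print(validate_request({"model": "llama3.2"}))  # ['prompt is required']
```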
Out of GPU memory

```json
{
  "error": "CUDA out of memory"
}
```

or:

```json
{
  "error": "not enough memory to load model"
}
```
Out of disk space

```json
{
  "error": "no space left on device"
}
```
Service not running

```
curl: (7) Failed to connect to localhost port 11434
```
Connection timeout

```
curl: (28) Connection timed out
```
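In code, these two failures surface as different exception types. A sketch that maps the standard-library exceptions to the two curl errors above; the advice strings are our own:

```python
def diagnose(exc: OSError) -> str:
    """Translate a low-level connection failure into actionable advice."""
    if isinstance(exc, ConnectionRefusedError):
        return "connection refused: is the Ollama service running on port 11434?"
    if isinstance(exc, TimeoutError):
        return "connection timed out: check the host address and firewall rules"
    return f"network error: {exc}"

print(diagnose(ConnectionRefusedError()))
```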
Handling these errors in Python:

```python
import requests
from requests.exceptions import RequestException

def call_ollama(prompt, model="llama3.2"):
    try:
        response = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": prompt, "stream": False},
            timeout=60
        )
        if response.status_code == 200:
            return response.json()["response"]
        elif response.status_code == 400:
            error = response.json().get("error", "Unknown error")
            print(f"Bad request parameters: {error}")
        elif response.status_code == 404:
            error = response.json().get("error", "Unknown error")
            print(f"Resource not found: {error}")
            if "not found" in error:
                print(f"Try pulling the model: ollama pull {model}")
        elif response.status_code == 500:
            print("Internal server error, check the Ollama logs")
        else:
            print(f"Unexpected error: HTTP {response.status_code}")
        return None
    except RequestException as e:
        print(f"Network request failed: {e}")
        return None

result = call_ollama("Hello")
if result:
    print(result)
```
The same handling in JavaScript, using an AbortController for the timeout:

```javascript
async function callOllama(prompt, model = 'llama3.2') {
  try {
    const controller = new AbortController();
    const timeoutId = setTimeout(() => controller.abort(), 60000);
    const response = await fetch('http://localhost:11434/api/generate', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, prompt, stream: false }),
      signal: controller.signal
    });
    clearTimeout(timeoutId);
    if (!response.ok) {
      const error = await response.json();
      switch (response.status) {
        case 400:
          console.error('Bad request parameters:', error.error);
          break;
        case 404:
          console.error('Resource not found:', error.error);
          break;
        case 500:
          console.error('Internal server error');
          break;
        default:
          console.error(`HTTP ${response.status}:`, error.error);
      }
      return null;
    }
    const data = await response.json();
    return data.response;
  } catch (error) {
    if (error.name === 'AbortError') {
      console.error('Request timed out');
    } else {
      console.error('Network request failed:', error.message);
    }
    return null;
  }
}
```
And in Go:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

type ErrorResponse struct {
	Error string `json:"error"`
}

type GenerateResponse struct {
	Response string `json:"response"`
}

func callOllama(prompt string) (string, error) {
	client := &http.Client{Timeout: 60 * time.Second}
	// Build the request body before sending it.
	reqBody, _ := json.Marshal(map[string]any{
		"model":  "llama3.2",
		"prompt": prompt,
		"stream": false,
	})
	resp, err := client.Post(
		"http://localhost:11434/api/generate",
		"application/json",
		bytes.NewReader(reqBody),
	)
	if err != nil {
		return "", fmt.Errorf("network request failed: %w", err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	if resp.StatusCode != 200 {
		var errResp ErrorResponse
		json.Unmarshal(body, &errResp)
		switch resp.StatusCode {
		case 400:
			return "", fmt.Errorf("bad request parameters: %s", errResp.Error)
		case 404:
			return "", fmt.Errorf("resource not found: %s", errResp.Error)
		case 500:
			return "", fmt.Errorf("internal server error: %s", errResp.Error)
		default:
			return "", fmt.Errorf("HTTP %d: %s", resp.StatusCode, errResp.Error)
		}
	}
	var result GenerateResponse
	json.Unmarshal(body, &result)
	return result.Response, nil
}
```
For transient errors, retrying is worth a try:

```python
import time
import requests

def call_with_retry(prompt, max_retries=3, retry_delay=2):
    for attempt in range(max_retries):
        try:
            response = requests.post(
                "http://localhost:11434/api/generate",
                json={"model": "llama3.2", "prompt": prompt, "stream": False},
                timeout=60
            )
            if response.status_code == 200:
                return response.json()["response"]
            if response.status_code >= 500:
                print(f"Server error, retry {attempt + 1}...")
                time.sleep(retry_delay * (attempt + 1))
                continue
            # 4xx errors are not transient; retrying will not help
            return None
        except requests.exceptions.RequestException as e:
            print(f"Request failed ({e}), retry {attempt + 1}...")
            time.sleep(retry_delay * (attempt + 1))
    print("Retries exhausted")
    return None
```
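The linear backoff above (2 s, 4 s, 6 s) is fine for brief outages; for longer ones, exponential backoff with jitter spreads retries out further and avoids clients retrying in lockstep. A sketch of the delay schedule; the base and cap values are arbitrary choices:

```python
import random

def backoff_delays(retries: int, base: float = 2.0, cap: float = 60.0):
    """Exponential backoff with full jitter: each delay is drawn
    uniformly from [0, min(cap, base * 2**attempt)]."""
    return [random.uniform(0, min(cap, base * 2 ** attempt))
            for attempt in range(retries)]

# Upper bounds of the first five delays: 2, 4, 8, 16, 32 seconds
print(backoff_delays(5))
```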
When a problem is hard to pin down, check the Ollama service logs:

```bash
# Linux (systemd)
journalctl -u ollama -f

# or run the server in the foreground and watch its output
ollama serve
```

The logs contain detailed error messages that help locate the problem.