Pull a Model (POST /api/pull)

The pull endpoint downloads a model from the Ollama model library to the local machine.

Basic Usage

curl http://localhost:11434/api/pull -d '{
  "name": "llama3.2"
}'

Response (streamed):

{"status":"pulling manifest"}
{"status":"downloading digest","digest":"abc123...","total":4661224676,"completed":0}
{"status":"downloading digest","digest":"abc123...","total":4661224676,"completed":1000000000}
...
{"status":"verifying sha256 digest"}
{"status":"writing manifest"}
{"status":"removing any unused layers"}
{"status":"success"}

Request Parameters

Parameter   Type     Required   Description
name        string   yes        name of the model to pull
insecure    bool     no         allow insecure connections to the registry
stream      bool     no         stream the response; defaults to true
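The parameters above go into a single JSON request body. As a sketch (assuming the default local endpoint), setting "stream": false makes the server reply with one final status object instead of a stream:

```python
import requests

def build_pull_payload(model_name, insecure=False, stream=True):
    """Build the JSON body for POST /api/pull."""
    payload = {"name": model_name, "stream": stream}
    if insecure:
        payload["insecure"] = True  # allow insecure registry connections
    return payload

def pull_model_blocking(model_name):
    """Pull without streaming: the server replies with one final status object."""
    response = requests.post(
        "http://localhost:11434/api/pull",
        json=build_pull_payload(model_name, stream=False),
    )
    response.raise_for_status()
    return response.json()

# pull_model_blocking("llama3.2")  # blocks until the pull finishes
```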

Specifying a Tag

curl http://localhost:11434/api/pull -d '{
  "name": "llama3.2:3b"
}'

If no tag is specified, the latest tag is used by default.

Response Statuses

Status                       Description
pulling manifest             fetching the model manifest
downloading digest           downloading model layers
verifying sha256 digest      verifying file integrity
writing manifest             writing the model manifest
removing any unused layers   cleaning up unused layers
success                      pull completed successfully
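These statuses can be handled uniformly when consuming the stream. A small helper (a sketch, matching the sample response shown earlier) that turns one streamed line into a display string:

```python
import json

def format_status(line):
    """Turn one streamed JSON line from /api/pull into a display string."""
    data = json.loads(line)
    status = data.get("status", "")
    total = data.get("total", 0)
    if status == "downloading digest" and total > 0:
        percent = data.get("completed", 0) / total * 100
        return f"downloading {percent:.1f}%"
    return status

print(format_status('{"status":"pulling manifest"}'))
# → pulling manifest
print(format_status('{"status":"downloading digest","total":200,"completed":50}'))
# → downloading 25.0%
```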

Code Examples

Python

import json
import requests

def pull_model(model_name):
    """Pull a model, printing streamed status updates."""
    response = requests.post(
        "http://localhost:11434/api/pull",
        json={"name": model_name},
        stream=True
    )

    for line in response.iter_lines():
        if not line:
            continue
        data = json.loads(line)
        status = data.get("status", "")

        if "downloading" in status:
            total = data.get("total", 0)
            completed = data.get("completed", 0)
            if total > 0:
                percent = (completed / total) * 100
                print(f"\rProgress: {percent:.1f}%", end="", flush=True)
        else:
            print(status)

    print("\nDownload complete")

pull_model("llama3.2")

With a Progress Bar

import json
import requests

def pull_with_progress(model_name):
    """Pull a model, rendering a progress bar during the download phase."""
    response = requests.post(
        "http://localhost:11434/api/pull",
        json={"name": model_name},
        stream=True
    )

    for line in response.iter_lines():
        if not line:
            continue
        data = json.loads(line)
        status = data.get("status", "")

        if status == "downloading digest":
            total = data.get("total", 0)
            completed = data.get("completed", 0)
            if total > 0:
                bar_length = 40
                filled = int(bar_length * completed / total)
                bar = "█" * filled + "░" * (bar_length - filled)
                percent = (completed / total) * 100
                print(f"\r[{bar}] {percent:.1f}%", end="", flush=True)
        elif status == "success":
            print("\n✓ Download complete")
        else:
            print(status)

pull_with_progress("mistral:7b")

JavaScript

async function pullModel(modelName) {
    const response = await fetch('http://localhost:11434/api/pull', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ name: modelName })
    });

    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';

    while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        // A chunk may end mid-line, so buffer partial lines between reads
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop();  // keep the trailing partial line

        for (const line of lines.filter(Boolean)) {
            const data = JSON.parse(line);

            if (data.status === 'downloading digest') {
                const total = data.total || 0;
                const completed = data.completed || 0;
                if (total > 0) {
                    const percent = ((completed / total) * 100).toFixed(1);
                    process.stdout.write(`\rProgress: ${percent}%`);
                }
            } else {
                console.log(data.status);
            }
        }
    }

    console.log('\nDownload complete');
}

await pullModel('llama3.2');

Go

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "strings"
)

type PullRequest struct {
    Name string `json:"name"`
}

func pullModel(modelName string) error {
    req := PullRequest{Name: modelName}
    body, _ := json.Marshal(req)
    
    resp, err := http.Post(
        "http://localhost:11434/api/pull",
        "application/json",
        bytes.NewReader(body),
    )
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    
    decoder := json.NewDecoder(resp.Body)
    for {
        var data map[string]interface{}
        if err := decoder.Decode(&data); err != nil {
            if err == io.EOF {
                break
            }
            return err
        }
        
        status, _ := data["status"].(string)
        if strings.Contains(status, "downloading") {
            total, _ := data["total"].(float64)
            completed, _ := data["completed"].(float64)
            if total > 0 {
                percent := (completed / total) * 100
                fmt.Printf("\rProgress: %.1f%%", percent)
            }
        } else {
            fmt.Println(status)
        }
    }
    
    fmt.Println("\nDownload complete")
    return nil
}

func main() {
    if err := pullModel("llama3.2"); err != nil {
        fmt.Println("error:", err)
    }
}

Practical Applications

Pulling Models in Batch

def pull_models(model_list):
    for model in model_list:
        print(f"\nPulling: {model}")
        pull_model(model)

pull_models(["llama3.2", "mistral:7b", "codellama"])

Check, Then Pull

def ensure_model(model_name):
    models = requests.get("http://localhost:11434/api/tags").json()["models"]
    
    for model in models:
        if model["name"].startswith(model_name):
            print(f"Model {model_name} is already installed")
            return
    
    print(f"Model {model_name} not found, pulling...")
    pull_model(model_name)

ensure_model("llama3.2")

Common Models

Model              Size      Description
llama3.2           2-4 GB    latest small Llama model
llama3.1:8b        4.7 GB    Llama 3.1 8B
mistral:7b         4.1 GB    Mistral 7B
codellama          4-7 GB    code-specialized model
llava              4.5 GB    multimodal model
nomic-embed-text   274 MB    text embedding model
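Given the sizes above, it can be worth checking free disk space before pulling a large model. A sketch using only the standard library (the 1.2 safety margin is an arbitrary assumption):

```python
import shutil

def has_space_for(size_bytes, path=".", margin=1.2):
    """Return True if free disk space at path covers size_bytes plus a margin."""
    free = shutil.disk_usage(path).free
    return free >= size_bytes * margin

# e.g. llama3.1:8b is listed at roughly 4.7 GB:
# if has_space_for(int(4.7 * 1024**3)):
#     pull_model("llama3.1:8b")
```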

Notes

  1. Network: requires access to ollama.ai
  2. Disk space: make sure there is enough free space for the model
  3. Download time: large models can take a long time to download
  4. Resuming: an interrupted pull picks up where it left off when re-run
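Because an interrupted pull resumes from already-downloaded layers (note 4), wrapping the pull in a simple retry loop is a reasonable pattern. A sketch, with pull_model standing in for any of the helpers above:

```python
import time

def with_retry(fn, attempts=3, delay=5):
    """Call fn(), retrying on exceptions; re-running a pull reuses layers
    that already finished downloading."""
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:  # e.g. a dropped connection mid-download
            last_exc = exc
            time.sleep(delay)
    raise last_exc

# with_retry(lambda: pull_model("llama3.2"))
```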