
vLLM DeepSeek-V3.1 Usage Guide

Posted by NWI on 2025-8-25 09:23:10


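Key point: DeepSeek-V3.1 is a hybrid model that can switch between thinking mode and non-thinking mode. This article shows how to switch between the two modes dynamically in vLLM.
Introduction


DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. This guide explains in detail how to switch between the two modes dynamically, on a per-request basis, when the model is served with vLLM.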
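Installing vLLM


First, create a virtual environment and install vLLM:
uv venv
source .venv/bin/activate
uv pip install -U vllm --torch-backend auto

Notes:
    Make sure the uv package manager is installed.
    Using a virtual environment to isolate dependencies is recommended.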
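As a quick sanity check after the install step, the sketch below reports whether vLLM is visible inside the active virtual environment. It uses only the Python standard library; the helper name installed_version is hypothetical and is not part of vLLM or uv.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this with the virtual environment activated.
    version = installed_version("vllm")
    print("vllm version:", version if version else "not installed")
```

If this prints "not installed", the most likely cause is that the script ran outside the activated .venv.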
Launching DeepSeek-V3.1

Deploy on 8×H200 (or H20) GPUs (141 GB × 8):

vllm serve deepseek-ai/DeepSeek-V3.1 \
  --enable-expert-parallel \
  --tensor-parallel-size 8 \
  --served-model-name ds31

Parameters:
    --enable-expert-parallel: enables expert parallelism
    --tensor-parallel-size 8: sets the tensor-parallel size to 8
    --served-model-name ds31: sets the name under which the model is served
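If the server is started from a script rather than typed by hand, the same flags can be assembled programmatically. The following is a minimal sketch: build_serve_command is a hypothetical helper, and it only uses the three flags shown in this guide. It builds the argument list and prints it; it does not start the server.

```python
import shlex
from typing import List


def build_serve_command(
    model: str,
    tensor_parallel_size: int,
    served_model_name: str,
    expert_parallel: bool = True,
) -> List[str]:
    """Assemble the `vllm serve` argument list used in this guide."""
    cmd = ["vllm", "serve", model]
    if expert_parallel:
        cmd.append("--enable-expert-parallel")
    cmd += [
        "--tensor-parallel-size", str(tensor_parallel_size),
        "--served-model-name", served_model_name,
    ]
    return cmd


cmd = build_serve_command("deepseek-ai/DeepSeek-V3.1", 8, "ds31")
print(shlex.join(cmd))
# vllm serve deepseek-ai/DeepSeek-V3.1 --enable-expert-parallel --tensor-parallel-size 8 --served-model-name ds31
```

The resulting list could be passed to subprocess.Popen(cmd) on a machine where vLLM is installed and the GPUs are available.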
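Using the Model

OpenAI Client Example


You can call the model with the OpenAI client. Pass extra_body={"chat_template_kwargs": {"thinking": False}} to control whether thinking mode is used: True enables thinking mode, and False disables it (non-thinking mode).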
from openai import OpenAI

# Placeholder key: the server above was launched without an API key.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

# Use the first model the server reports (ds31 with the launch command above).
models = client.models.list()
model = models.data[0].id

messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "Hmm I am DeepSeek"},
    {"role": "user", "content": "9.11 and 9.8, which is greater?"},
]

# Set thinking mode: True enables it, False disables it.
extra_body = {"chat_template_kwargs": {"thinking": False}}
response = client.chat.completions.create(
    model=model, messages=messages, extra_body=extra_body
)
content = response.choices[0].message.content
print("content:\n", content)
Example Output

thinking=True (thinking mode enabled)


With thinking mode enabled, the output includes the model's reasoning process:

Hmm, the user is asking which number is greater between 9.11 and 9.8. This seems straightforward, but I should be careful because decimals can sometimes confuse people.

I recall that comparing decimals involves looking at each digit from left to right. Both numbers have the same whole number part (9), so I need to compare the decimal parts. 0.11 is greater than 0.8 because 0.11 is equivalent to 0.110 and 0.8 is 0.800, so 110 thousandths is greater than 800 thousandths? Wait no, that's wrong.

Actually, 0.8 is the same as 0.80, and 0.11 is less than 0.80. So 9.11 is actually less than 9.8. I should double-check that. Yes, 9.8 is larger because 0.8 > 0.11.

I'll explain it clearly by comparing the tenths place: 9.8 has 8 tenths, while 9.11 has 1 tenth and 1 hundredth, so 8 tenths is indeed larger.

The answer is 9.8 is greater. I'll state it confidently and offer further help if needed.

9.8 is greater than 9.11.

To compare them:
    9.8 is equivalent to 9.80
    9.80 has 8 tenths, while 9.11 has only 1 tenth
    Since 8 tenths (0.8) is greater than 1 tenth (0.1), 9.8 > 9.11

Let me know if you need further clarification! 😊
thinking=False (thinking mode disabled)


With thinking mode disabled, the output is more direct:

The number 9.11 is greater than 9.8.

To compare them:
    9.11 = 9 + 11/100
    9.8 = 9 + 80/100

Since 11/100 (0.11) is less than 80/100 (0.80), 9.11 is actually smaller than 9.8. Wait, let me correct that:

Actually, 9.8 is greater than 9.11.
    9.8 can be thought of as 9.80
    Comparing 9.80 and 9.11: 80 hundredths is greater than 11 hundredths.

So, 9.8 > 9.11.

Apologies for the initial confusion! 😅
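To reproduce both outputs side by side, the same request can be sent once per mode. The sketch below assumes an OpenAI-style client object such as the one created in the OpenAI client example; the helper names ask and compare_modes are hypothetical and not part of vLLM or the openai package.

```python
from typing import Any, Dict, List


def ask(client: Any, model: str, messages: List[Dict[str, str]], thinking: bool) -> str:
    """Send one chat request with thinking mode switched on or off."""
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        # The per-request switch described in this guide.
        extra_body={"chat_template_kwargs": {"thinking": thinking}},
    )
    return response.choices[0].message.content


def compare_modes(client: Any, model: str, messages: List[Dict[str, str]]) -> Dict[bool, str]:
    """Return the model's reply for thinking=True and thinking=False."""
    return {flag: ask(client, model, messages, flag) for flag in (True, False)}
```

With the client, model, and messages from the OpenAI client example, compare_modes(client, model, messages)[True] would hold the thinking-mode reply and the [False] entry the non-thinking reply. Replies are sampled, so the exact text will likely differ from the examples shown here.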
Calling the Model with curl


You can also call the model with curl:
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "ds31",
        "messages": [
            {
                "role": "user",
                "content": "9.11 and 9.8, which is greater?"
            }
        ],
        "chat_template_kwargs": {
            "thinking": true
        }
    }'

Notes:
    Make sure the model server is running locally.
    Adjust the value of the thinking parameter as needed.
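The curl call places chat_template_kwargs at the top level of the JSON body, which is also where the OpenAI client puts the keys passed through extra_body. For environments without curl or the openai package, here is a minimal standard-library sketch of the same request. The function names build_chat_request and send are hypothetical; the URL and model name are the ones used in this guide.

```python
import json
import urllib.request


def build_chat_request(
    question: str,
    thinking: bool,
    base_url: str = "http://localhost:8000/v1",
    model: str = "ds31",
) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        # Top-level field, exactly as in the curl body.
        "chat_template_kwargs": {"thinking": thinking},
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling send(build_chat_request("9.11 and 9.8, which is greater?", thinking=True)) requires the server to be running locally; building the request alone does not touch the network.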


Source: vLLM DeepSeek-V3.1 Usage Guide


This technical document was compiled with the help of AI. Please credit the source when reposting.