Static Graph Model Download Support#
Static graph models now support DeepSeek series, Qwen series, llama series, etc. The detailed supported list is as follows:
DeepSeekV2#
Model Name | Static Graph Download model_name |
---|---|
deepseek-ai/DeepSeek-V2-Chat | π§ |
deepseek-ai/DeepSeek-V2-Lite-Chat | π§ |
DeepSeekV3#
Model Name | Static Graph Download model_name |
---|---|
deepseek-ai/DeepSeek-V3 | π§ |
DeepSeekR1#
Deployment Hardware Requirements:
Except MTP models and Fp8 models, the minimum supported architecture is SM80 (Machines: A100/A800) requiring CUDA 11.8+
DeepSeek-R1-MTP and Fp8 models require SM90 (Machines: H800) and CUDA 12.4+
Model Name | Precision | MTP | Nodes | Static Graph Download model_name |
---|---|---|---|---|
deepseek-ai/DeepSeek-R1 | weight_only_int4 | No | 1 | deepseek-ai/DeepSeek-R1/weight_only_int4 |
deepseek-ai/DeepSeek-R1 | weight_only_int4 | Yes | 1 | deepseek-ai/DeepSeek-R1-MTP/weight_only_int4 |
deepseek-ai/DeepSeek-R1 | weight_only_int8 | No | 2 | deepseek-ai/DeepSeek-R1-2nodes/weight_only_int8 |
deepseek-ai/DeepSeek-R1 | weight_only_int8 | Yes | 2 | deepseek-ai/DeepSeek-R1-MTP-2nodes/weight_only_int8 |
deepseek-ai/DeepSeek-R1 | a8w8_fp8 | No | 2 | deepseek-ai/DeepSeek-R1-2nodes/a8w8_fp8 |
deepseek-ai/DeepSeek-R1 | a8w8_fp8 | Yes | 2 | deepseek-ai/DeepSeek-R1-MTP-2nodes/a8w8_fp8 |
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | weight_only_int8 | - | - | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/weight_only_int8 |
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | weight_only_int8 | - | - | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B/weight_only_int8 |
deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | weight_only_int8 | - | - | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B/weight_only_int8 |
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | weight_only_int8 | - | - | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/weight_only_int8 |
deepseek-ai/DeepSeek-R1-Distill-Llama-8B | weight_only_int8 | - | - | deepseek-ai/DeepSeek-R1-Distill-Llama-8B/weight_only_int8 |
deepseek-ai/DeepSeek-R1-Distill-Llama-70B | weight_only_int8 | - | - | deepseek-ai/DeepSeek-R1-Distill-Llama-70B/weight_only_int8 |
LLaMA#
Model Name | Static Graph Download model_name |
---|---|
facebook/llama-7b | π§ |
facebook/llama-13b | π§ |
facebook/llama-30b | π§ |
facebook/llama-65b | π§ |
Llama2#
Model Name | Static Graph Download model_name |
---|---|
meta-llama/Llama-2-7b | π§ |
meta-llama/Llama-2-7b-chat | π§ |
meta-llama/Llama-2-13b | π§ |
meta-llama/Llama-2-13b-chat | π§ |
meta-llama/Llama-2-70b | π§ |
meta-llama/Llama-2-70b-chat | π§ |
Llama3#
Deployment Hardware Requirements:
Append-Attn:
Minimum supported architecture: SM80 (A100/A800)
Requires CUDA 11.8+
Block-Attn:
Minimum supported architecture: SM70 (V100)
Requires CUDA 11.8+
Model Name | Static Graph Download model_name |
---|---|
meta-llama/Meta-Llama-3-8B | π§ |
meta-llama/Meta-Llama-3-8B-Instruct | meta-llama/Meta-Llama-3-8B-Instruct-Append-Attn/bfloat16,meta-llama/Meta-Llama-3-8B-Instruct-Block-Attn/float16 |
meta-llama/Meta-Llama-3-70B | π§ |
meta-llama/Meta-Llama-3-70B-Instruct | π§ |
Llama3.1#
Model Name | Static Graph Download model_name |
---|---|
meta-llama/Meta-Llama-3.1-8B | π§ |
meta-llama/Meta-Llama-3.1-8B-Instruct | π§ |
meta-llama/Meta-Llama-3.1-70B | π§ |
meta-llama/Meta-Llama-3.1-70B-Instruct | π§ |
meta-llama/Meta-Llama-3.1-405B | π§ |
meta-llama/Meta-Llama-3.1-405B-Instruct | π§ |
meta-llama/Llama-Guard-3-8B | π§ |
Llama3.2#
Model Name | Static Graph Download model_name |
---|---|
meta-llama/Llama-3.2-1B | π§ |
meta-llama/Llama-3.2-1B-Instruct | π§ |
meta-llama/Llama-3.2-3B | π§ |
meta-llama/Llama-3.2-3B-Instruct | π§ |
meta-llama/Llama-Guard-3-1B | π§ |
Llama3.3#
Model Name | Static Graph Download model_name |
---|---|
meta-llama/Llama-3.3-70B-Instruct | π§ |
Mixtral#
Model Name | Static Graph Download model_name |
---|---|
mistralai/Mixtral-8x7B-Instruct-v0.1 | π§ |
Qwen#
Model Name | Static Graph Download model_name |
---|---|
qwen/qwen-7b | π§ |
qwen/qwen-7b-chat | π§ |
qwen/qwen-14b | π§ |
qwen/qwen-14b-chat | π§ |
qwen/qwen-72b | π§ |
qwen/qwen-72b-chat | π§ |
Qwen1.5#
Deployment Hardware Requirements:
Block-Attn:
Minimum supported architecture: SM70 (V100)
Requires CUDA 11.8+
Model Name | Static Graph Download model_name |
---|---|
Qwen/Qwen1.5-0.5B | Qwen/Qwen1.5-0.5B-Block-Attn/bfloat16,Qwen/Qwen1.5-0.5B-Block-Attn/float16 |
Qwen/Qwen1.5-0.5B-Chat | π§ |
Qwen/Qwen1.5-1.8B | π§ |
Qwen/Qwen1.5-1.8B-Chat | π§ |
Qwen/Qwen1.5-4B | π§ |
Qwen/Qwen1.5-4B-Chat | π§ |
Qwen/Qwen1.5-7B | π§ |
Qwen/Qwen1.5-7B-Chat | π§ |
Qwen/Qwen1.5-14B | π§ |
Qwen/Qwen1.5-14B-Chat | π§ |
Qwen/Qwen1.5-32B | π§ |
Qwen/Qwen1.5-32B-Chat | π§ |
Qwen/Qwen1.5-72B | π§ |
Qwen/Qwen1.5-72B-Chat | π§ |
Qwen/Qwen1.5-110B | π§ |
Qwen/Qwen1.5-110B-Chat | π§ |
Qwen/Qwen1.5-MoE-A2.7B | π§ |
Qwen/Qwen1.5-MoE-A2.7B-Chat | π§ |
Qwen2#
Deployment Hardware Requirements:
Append-Attn:
Minimum supported architecture: SM80 (A100/A800)
Requires CUDA 11.8+
Block-Attn:
Minimum supported architecture: SM70 (V100)
Requires CUDA 11.8+
Model Name | Static Graph Download model_name |
---|---|
Qwen/Qwen2-0.5B | π§ |
Qwen/Qwen2-0.5B-Instruct | π§ |
Qwen/Qwen2-1.5B | π§ |
Qwen/Qwen2-1.5B-Instruct | Qwen/Qwen2-1.5B-Instruct-Append-Attn/bfloat16, Qwen/Qwen2-1.5B-Instruct-Block-Attn/float16 |
Qwen/Qwen2-7B | π§ |
Qwen/Qwen2-7B-Instruct | π§ |
Qwen/Qwen2-72B | π§ |
Qwen/Qwen2-72B-Instruct | π§ |
Qwen/Qwen2-57B-A14B | π§ |
Qwen/Qwen2-57B-A14B-Instruct | π§ |
Qwen2-Math#
Model Name | Static Graph Download model_name |
---|---|
Qwen/Qwen2-Math-1.5B | π§ |
Qwen/Qwen2-Math-1.5B-Instruct | π§ |
Qwen/Qwen2-Math-7B | π§ |
Qwen/Qwen2-Math-7B-Instruct | π§ |
Qwen/Qwen2-Math-72B | π§ |
Qwen/Qwen2-Math-72B-Instruct | π§ |
Qwen/Qwen2-Math-RM-72B | π§ |
Qwen2.5#
Model Name | Static Graph Download model_name |
---|---|
Qwen/Qwen2.5-0.5B | π§ |
Qwen/Qwen2.5-0.5B-Instruct | π§ |
Qwen/Qwen2.5-1.5B | π§ |
Qwen/Qwen2.5-1.5B-Instruct | π§ |
Qwen/Qwen2.5-3B | π§ |
Qwen/Qwen2.5-3B-Instruct | π§ |
Qwen/Qwen2.5-7B | π§ |
Qwen/Qwen2.5-7B-Instruct | π§ |
Qwen/Qwen2.5-14B | π§ |
Qwen/Qwen2.5-14B-Instruct | π§ |
Qwen/Qwen2.5-32B | π§ |
Qwen/Qwen2.5-32B-Instruct | π§ |
Qwen/Qwen2.5-72B | π§ |
Qwen/Qwen2.5-72B-Instruct | π§ |
Qwen2.5-Math#
Model Name | Static Graph Download model_name |
---|---|
Qwen/Qwen2.5-Math-1.5B | π§ |
Qwen/Qwen2.5-Math-1.5B-Instruct | π§ |
Qwen/Qwen2.5-Math-7B | π§ |
Qwen/Qwen2.5-Math-7B-Instruct | π§ |
Qwen/Qwen2.5-Math-72B | π§ |
Qwen/Qwen2.5-Math-72B-Instruct | π§ |
Qwen/Qwen2.5-Math-RM-72B | π§ |
Qwen2.5-Coder#
Model Name | Static Graph Download model_name |
---|---|
Qwen/Qwen2.5-Coder-1.5B | π§ |
Qwen/Qwen2.5-Coder-1.5B-Instruct | π§ |
Qwen/Qwen2.5-Coder-7B | π§ |
Qwen/Qwen2.5-Coder-7B-Instruct | π§ |