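Classic Configuration Parameters for PD-Disaggregated Deployment

Currently only the classic configuration parameters can be configured. You provide the max_seq_len maximum sequence length (the maxSeqLen parameter in the tables below) and the number of D instances, and the remaining parameters are filled in automatically from those two values.

Classic configuration parameters for the Atlas 800I A2 inference server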

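Table 1 Key parameters for the Atlas 800I A2 inference server (the 16k, 64k, and 128k columns are context sequence lengths)

| Parameter type | Parameter name | 16k | 64k | 128k |
| --- | --- | --- | --- | --- |
| P instance parameters (mindie_server_prefill_config) | maxSeqLen | 18000 | 68000 | 134000 |
| | maxInputTokenLen | 18000 | 68000 | 134000 |
| | dp | 2 | 1 | 1 |
| | cp | 1 | 2 | 2 |
| | tp | 8 | 8 | 8 |
| | sp | 1 | 8 | 8 |
| | pp | 1 | 1 | 1 |
| | moe_ep | 4 | 16 | 16 |
| | moe_tp | 4 | 1 | 1 |
| | ep_level | 1 | 1 | 1 |
| | MTP | Enabled | Enabled | Disabled |
| | enable_init_routing_cutoff | false | true | true |
| | topk_scaling_factor | No effect | 0.25 | 0.25 |
| | maxPrefillTokens | 18000 | 68000 | 134000 |
| D instance parameters (mindie_server_decode_config) | maxSeqLen | 18000 | 68000 | 134000 |
| | maxInputTokenLen | 18000 | 68000 | 134000 |
| | dp | 4-node D instance: 32; 8-node D instance: 64 | 4-node D instance: 32; 8-node D instance: 64 | 4-node D instance: 32; 8-node D instance: 64 |
| | tp | 1 | 1 | 1 |
| | sp | 1 | 1 | 1 |
| | cp | 1 | 1 | 1 |
| | pp | 1 | 1 | 1 |
| | moe_ep | 4-node D instance: 32; 8-node D instance: 64 | 4-node D instance: 32; 8-node D instance: 64 | 4-node D instance: 32; 8-node D instance: 64 |
| | moe_tp | 1 | 1 | 1 |
| | ep_level | 2 | 2 | 2 |
| | MTP | Enabled | Enabled | Disabled |
| | maxPrefillTokens | 18000 | 68000 | 134000 |
| | maxIterTimes | 18000 | 68000 | 134000 |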
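A quick way to sanity-check a hand-edited P-instance configuration is to compare the product dp × cp × tp with the product moe_ep × moe_tp. In the P-instance rows of Table 1 both products come to 16 for every context length. Treating this equality as a rule is an assumption made for this sketch; the source only lists the values. The helper names below are hypothetical, and sp is left out because the source does not describe how it maps onto devices.

```python
# P-instance values copied from Table 1 (Atlas 800I A2), keyed by context length.
P_INSTANCE_A2 = {
    "16k": {"dp": 2, "cp": 1, "tp": 8, "moe_ep": 4, "moe_tp": 4},
    "64k": {"dp": 1, "cp": 2, "tp": 8, "moe_ep": 16, "moe_tp": 1},
    "128k": {"dp": 1, "cp": 2, "tp": 8, "moe_ep": 16, "moe_tp": 1},
}


def attention_product(cfg):
    """dp * cp * tp for one column of the table."""
    return cfg["dp"] * cfg["cp"] * cfg["tp"]


def moe_product(cfg):
    """moe_ep * moe_tp for one column of the table."""
    return cfg["moe_ep"] * cfg["moe_tp"]


def check_consistency(table):
    """Return {context: (attention_product, moe_product)}.

    Raises ValueError when the two products differ. That they must match
    is an assumption of this sketch, not a documented requirement.
    """
    result = {}
    for context, cfg in table.items():
        attn, moe = attention_product(cfg), moe_product(cfg)
        if attn != moe:
            raise ValueError(f"{context}: dp*cp*tp={attn} but moe_ep*moe_tp={moe}")
        result[context] = (attn, moe)
    return result


if __name__ == "__main__":
    print(check_consistency(P_INSTANCE_A2))
```

For the values in Table 1 the check passes and every column reports the pair (16, 16).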

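Table 2 Typical intelligent computing node configurations for the Atlas 800I A2 inference server

| Node configuration | PD-disaggregated configuration | Switch selection reference |
| --- | --- | --- |
| 8 servers + 2 servers + 1 hot-standby server | 2*2P + 1*4D + 2 servers (dual-server) + 1 online hot-standby server | Switch specification reference: 32*400G, for example XH9210: 3 Leaf, 2 Spine |
| 16 servers + 1 hot-standby server | 4*2P + 2*4D + 1 online hot-standby server | Switch specification reference: 32*400G, for example XH9210: 5 Leaf, 4 Spine |
| N*16 servers | N*(4*2P + 1*8D); scales linearly from the 16-node best-performance configuration (EP64) | Switch specification reference: 32*400G, for example XH9210: 32 Leaf, 16 Spine (taking N = 8, 1024 NPUs in total, as the example) |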

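The large-scale expert parallelism solution supports only the Atlas 800I A2 inference server (64 GB HCCS model). The NPU on-chip memory must be 64 GB, and the NPU network-port optical modules must be 200G.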
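The node counts in Table 2 can be reproduced with simple arithmetic. The sketch below reads a term such as 4*2P as "four P instances of two servers each", which is an interpretation rather than something the source spells out, and it assumes 8 NPUs per Atlas 800I A2 server, which is what the table's figure of 1024 NPUs for N = 8 implies. The function and variable names are hypothetical.

```python
NPUS_PER_A2_SERVER = 8  # assumption, consistent with 1024 NPUs for N = 8


def servers_in(layout):
    """Count servers in a layout given as (instance_count, servers_per_instance) pairs."""
    return sum(count * size for count, size in layout)


# "2*2P + 1*4D": two 2-server P instances and one 4-server D instance.
small = [(2, 2), (1, 4)]
# "4*2P + 2*4D": four 2-server P instances and two 4-server D instances.
medium = [(4, 2), (2, 4)]
# "N*(4*2P + 1*8D)": the 16-server unit that is repeated N times.
unit = [(4, 2), (1, 8)]


def scaled_totals(n):
    """Return (servers, npus) for N copies of the 16-server unit."""
    servers = n * servers_in(unit)
    return servers, servers * NPUS_PER_A2_SERVER


if __name__ == "__main__":
    print(servers_in(small))   # PD servers in the first row, excluding extra and standby servers
    print(servers_in(medium))  # PD servers in the second row, excluding the standby server
    print(scaled_totals(8))    # servers and NPUs for N = 8
```

With N = 8 this gives 128 servers and 1024 NPUs, matching the example in the third row of the table.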

Classic configuration parameters for the Atlas 800I A3 SuperPoD server

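Table 3 Key parameters for the Atlas 800I A3 SuperPoD server (the 16k, 64k, and 128k columns are context sequence lengths)

| Parameter type | Parameter name | 16k | 64k | 128k |
| --- | --- | --- | --- | --- |
| P instance parameters (mindie_server_prefill_config) | maxSeqLen | 18000 | 68000 | 134000 |
| | maxInputTokenLen | 18000 | 68000 | 134000 |
| | dp | 2 | 1 | 1 |
| | cp | 1 | 2 | 2 |
| | tp | 8 | 8 | 8 |
| | sp | 1 | 8 | 8 |
| | pp | 1 | 1 | 1 |
| | moe_ep | 16 | 16 | 16 |
| | moe_tp | 1 | 1 | 1 |
| | ep_level | 2 | 2 | 2 |
| | MTP | Enabled | Enabled | Disabled |
| | maxPrefillTokens | 18000 | 68000 | 134000 |
| D instance parameters (mindie_server_decode_config) | maxSeqLen | 18000 | 68000 | 134000 |
| | maxInputTokenLen | 18000 | 68000 | 134000 |
| | dp | 4-node D instance: 64; 8-node D instance: 128 | 4-node D instance: 64; 8-node D instance: 128 | 4-node D instance: 64; 8-node D instance: 128 |
| | tp | 1 | 1 | 1 |
| | sp | 1 | 1 | 1 |
| | cp | 1 | 1 | 1 |
| | pp | 1 | 1 | 1 |
| | moe_ep | 4-node D instance: 64; 8-node D instance: 128 | 4-node D instance: 64; 8-node D instance: 128 | 4-node D instance: 64; 8-node D instance: 128 |
| | moe_tp | 1 | 1 | 1 |
| | ep_level | 2 | 2 | 2 |
| | MTP | Enabled | Enabled | Disabled |
| | maxPrefillTokens | 18000 | 68000 | 134000 |
| | maxIterTimes | 18000 | 68000 | 134000 |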
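The D-instance rows of Table 1 and Table 3 follow a regular pattern: dp and moe_ep equal the number of D nodes multiplied by 8 on Atlas 800I A2 and by 16 on Atlas 800I A3, and the length-related parameters depend only on the context tier. The sketch below illustrates how the automatic fill described at the top of this section could be expressed as a lookup. The multipliers are inferred from the table values, and the function name, the "A2"/"A3" keys, and the boolean encoding of MTP are hypothetical rather than part of the deployment tool.

```python
# Length-related values per context tier, copied from Tables 1 and 3.
SEQ_LEN = {"16k": 18000, "64k": 68000, "128k": 134000}
# MTP is enabled for 16k and 64k and disabled for 128k in both tables.
MTP_ENABLED = {"16k": True, "64k": True, "128k": False}
# Multiplier inferred from the tables: 32/4 = 8 on A2, 64/4 = 16 on A3.
RANKS_PER_D_NODE = {"A2": 8, "A3": 16}


def fill_decode_config(server_type, context, d_nodes):
    """Return the D-instance parameters for one table cell combination."""
    if d_nodes not in (4, 8):
        raise ValueError("the tables only cover 4-node and 8-node D instances")
    ranks = RANKS_PER_D_NODE[server_type] * d_nodes
    length = SEQ_LEN[context]
    return {
        "maxSeqLen": length,
        "maxInputTokenLen": length,
        "dp": ranks,
        "tp": 1,
        "sp": 1,
        "cp": 1,
        "pp": 1,
        "moe_ep": ranks,
        "moe_tp": 1,
        "ep_level": 2,
        "MTP": MTP_ENABLED[context],
        "maxPrefillTokens": length,
        "maxIterTimes": length,
    }


if __name__ == "__main__":
    print(fill_decode_config("A3", "64k", 8))
```

Keeping the table values in small dictionaries makes it easy to compare a generated configuration against the published tables when they change.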

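Table 4 Typical intelligent computing node configurations for the Atlas 800I A3 SuperPoD server

| Node configuration | PD-disaggregated configuration | Number of bus network switches (L2) |
| --- | --- | --- |
| 8 servers + 1 redundant A3 node (optional) | 4*1P + 2*2D + 1 redundant A3 node (optional) | 14 |
| 16 servers + 1 redundant A3 node (optional) | 8*1P + 2*4D + 1 redundant A3 node (optional) | 28 |
| 32 servers + 1 redundant A3 node (optional) | 16*1P + 4*4D + 1 redundant A3 node (optional) | 56 |
| 48 servers | 24*1P + 6*4D | 56 |
| N*48 servers | N*(24*1P + 6*4D) | N*56 |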
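For capacity planning, the last row of Table 4 can be expanded for a concrete N. The sketch below assumes that 24*1P means 24 single-node P instances and 6*4D means six 4-node D instances, which adds up to the 48 nodes in the table. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class A3Deployment:
    nodes: int
    p_instances: int
    d_instances: int
    l2_switches: int


def scale_a3(n):
    """Expand the N*(24*1P + 6*4D) row of Table 4 for a given N."""
    if n < 1:
        raise ValueError("N must be at least 1")
    p_instances = 24 * n            # each P instance occupies 1 node
    d_instances = 6 * n             # each D instance occupies 4 nodes
    nodes = p_instances * 1 + d_instances * 4
    return A3Deployment(nodes, p_instances, d_instances, l2_switches=56 * n)


if __name__ == "__main__":
    print(scale_a3(2))
```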

Certificate configuration (optional)

The certificate parameters are stored in $HOME/ascend-deployer/ascend_deployer/group_vars/master/tls_config.yaml. An example of the file content follows.

# group_vars/tls_config.yaml
tls_config:
  tls_enable: false
  kmc_ksf_master: "./security/master/tools/pmt/master/ksfa"
  kmc_ksf_standby: "./security/standby/tools/pmt/standby/ksfb"
  infer_tls_items:
    ca_cert: "./security/infer/security/certs/ca.pem"
    tls_cert: "./security/infer/security/certs/cert.pem"
    tls_key: "./security/infer/security/keys/cert.key.pem"
    tls_passwd: "./security/infer/security/pass/key_pwd.txt"
    tls_crl: "infer"
  management_tls_items:
    ca_cert: "./security/management/security/certs/ca.pem"
    tls_cert: "./security/management/security/certs/cert.pem"
    tls_key: "./security/management/security/keys/cert.key.pem"
    tls_passwd: "./security/management/security/pass/key_pwd.txt"
    tls_crl: "management"

  # On the Atlas 800I A2 inference server, ccae_tls_enable and ccae_tls_items do not need to be configured
  ccae_tls_enable: false
  ccae_tls_items:
    ca_cert: "./security/ccae/security/certs/ca.pem"
    tls_cert: "./security/ccae/security/certs/cert.pem"
    tls_key: "./security/ccae/security/keys/cert.key.pem"
    tls_passwd: "./security/ccae/security/pass/key_pwd.txt"
    tls_crl: "ccae"
  cluster_tls_enable: false
  cluster_tls_items:
    ca_cert: "./security/clusterd/security/certs/ca.pem"
    tls_cert: "./security/clusterd/security/certs/cert.pem"
    tls_key: "./security/clusterd/security/keys/cert.key.pem"
    tls_passwd: "./security/clusterd/security/pass/key_pwd.txt"
    tls_crl: "clusterd"
  etcd_server_tls_enable: false
  etcd_server_tls_items:
    ca_cert: "./security/etcd_server/security/certs/ca.pem"
    tls_cert: "./security/etcd_server/security/certs/cert.pem"
    tls_key: "./security/etcd_server/security/keys/cert.key.pem"
    tls_passwd: "./security/etcd_server/security/pass/key_pwd.txt"
    kmc_ksf_master: "./security/etcd_server/tools/pmt/master/ksfa"
    kmc_ksf_standby: "./security/etcd_server/tools/pmt/standby/ksfb"
    tls_crl: ""

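For details on how to configure and use certificates, see "Cluster Service Deployment > PD-Disaggregated Service Deployment > Installation and Deployment > Configuring Automatically Generated Certificates" in the MindIE Motor Development Guide.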
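Before enabling TLS it can help to confirm that the files named in tls_config.yaml are actually present. The sketch below takes the tls_config mapping as an already-loaded Python dict and lists the entries whose files are missing. Several points are assumptions: that tls_enable governs both infer_tls_items and management_tls_items, that the relative paths are resolved against a base directory you supply, and that tls_crl is not a file path (it holds values such as "infer" in the example). The function and constant names are hypothetical.

```python
from pathlib import Path

# Keys whose values look like file paths in the example file.
PATH_KEYS = ("ca_cert", "tls_cert", "tls_key", "tls_passwd")

# Assumed pairing between item groups and their enable flags.
FLAG_FOR_GROUP = {
    "infer_tls_items": "tls_enable",
    "management_tls_items": "tls_enable",
    "ccae_tls_items": "ccae_tls_enable",
    "cluster_tls_items": "cluster_tls_enable",
    "etcd_server_tls_items": "etcd_server_tls_enable",
}


def missing_tls_files(tls_config, base_dir="."):
    """Return 'group.key' names whose file is absent, for enabled groups only."""
    base = Path(base_dir)
    missing = []
    for group, flag in FLAG_FOR_GROUP.items():
        if not tls_config.get(flag, False):
            continue  # group is disabled, nothing to check
        items = tls_config.get(group) or {}
        for key in PATH_KEYS:
            value = items.get(key)
            if not value or not (base / value).is_file():
                missing.append(f"{group}.{key}")
    return missing


if __name__ == "__main__":
    example = {
        "tls_enable": False,
        "cluster_tls_enable": True,
        "cluster_tls_items": {
            "ca_cert": "./security/clusterd/security/certs/ca.pem",
            "tls_cert": "./security/clusterd/security/certs/cert.pem",
            "tls_key": "./security/clusterd/security/keys/cert.key.pem",
            "tls_passwd": "./security/clusterd/security/pass/key_pwd.txt",
            "tls_crl": "clusterd",
        },
    }
    print(missing_tls_files(example))
```

An empty list means every enabled group has all four files in place. The check does not validate certificate contents or key permissions.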