openPangu-Ultra-MoE-718B-V1.1 is officially open-sourced today: the deployment guide is here!

Ascend Deployment

Published on 2025/10/15

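On October 15, 2025, Huawei officially released and open-sourced openPangu-Ultra-MoE-718B-V1.1. It is a large-scale Mixture-of-Experts (MoE) language model trained on Ascend NPUs, with 718B total parameters and 39B activated parameters. Within a single architecture, the model combines "fast thinking" and "slow thinking" capabilities for more efficient and more capable reasoning and decision-making.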

openPangu-Ultra-MoE-718B-V1.1 is now available on the GitCode community:

Model: https://ai.gitcode.com/ascend-tribe/openPangu-Ultra-MoE-718B-V1.1

The Int8 quantized version is open-sourced at the same time: https://gitcode.com/ascend-tribe/openPangu-Ultra-MoE-718B-V1.1-Int8

Quick start with openPangu-Ultra-MoE-718B-V1.1

1. Environment preparation

Hardware requirements

Atlas 800T A2 (64GB, >=32 cards). For the driver and firmware installation packages, see:

[Atlas 800T A2](https://www.hiascend.com/hardware/firmware-drivers/community?product=4&model=26&cann=8.2.RC1.alpha003&driver=Ascend+HDK+25.0.RC1)
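The 32-card minimum can be sanity-checked with simple arithmetic. The sketch below is a rough estimate only: it assumes the 64GB figure is per card, that weights are sharded evenly across cards, and it ignores activations, KV cache, and runtime overhead. The helper name `weight_gb_per_card` is hypothetical.

```python
# Back-of-the-envelope weight memory per NPU, assuming even sharding.
TOTAL_PARAMS = 718e9      # total parameters, from the model description
CARD_MEMORY_GB = 64       # assumed to be per-card memory of the listed spec
NUM_CARDS = 32            # minimum card count listed in the hardware spec


def weight_gb_per_card(total_params: float, bytes_per_param: int, num_cards: int) -> float:
    """Return weight storage per card in GB (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / num_cards / 1e9


bf16_per_card = weight_gb_per_card(TOTAL_PARAMS, 2, NUM_CARDS)  # bfloat16 = 2 bytes
int8_per_card = weight_gb_per_card(TOTAL_PARAMS, 1, NUM_CARDS)  # int8 = 1 byte

print(f"bf16 weights per card: {bf16_per_card:.1f} GB of {CARD_MEMORY_GB} GB")
print(f"int8 weights per card: {int8_per_card:.1f} GB of {CARD_MEMORY_GB} GB")
```

Under these assumptions the bfloat16 weights alone take about 44.9 GB per card, which leaves limited headroom on a 64GB device and is consistent with 32 cards being listed as the minimum.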

Software environment

- Option 1: install the following software stack on bare metal

  - Operating system: Linux (openEuler>=24.03 recommended)

  - CANN==8.1.RC1; for installation preparation and procedure, see [CANN Install](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/82RC1alpha002/softwareinst/instg/instg_0001.html?Mode=PmIns&OS=Ubuntu&Software=cannToolKit)

  - python==3.10

  - torch==2.1.0

  - torch-npu==2.1.0.post12

  - transformers>=4.48.2

- Option 2: start a container from the docker image

  See the [Docker usage guide](doc/docker.md).

The software versions above have been verified; later versions should in principle also work. If you have questions, please file an issue.
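Before going further, it can help to confirm that the Python-side packages match the listed pins. The following is a minimal sketch, not part of the model repository: the helper names are hypothetical, the distribution name `torch-npu` is assumed to match what pip installed, and CANN, the driver, and the firmware are not checked. Since later versions may also work, a reported mismatch is a prompt to investigate rather than a hard failure.

```python
import re
import sys
from importlib import metadata

# Pins copied from the bare-metal list; "==" means exact, ">=" means minimum.
REQUIREMENTS = {
    "torch": ("==", "2.1.0"),
    "torch-npu": ("==", "2.1.0.post12"),
    "transformers": (">=", "4.48.2"),
}


def numeric_prefix(version: str) -> tuple:
    """Turn '4.48.2' or '2.1.0+cpu' into a tuple of leading integers, e.g. (4, 48, 2)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies(installed: str, op: str, wanted: str) -> bool:
    """Check one installed version string against one pin."""
    if op == "==":
        # Ignore a local build suffix such as "+cpu" when comparing exact pins.
        return installed.split("+")[0] == wanted
    return numeric_prefix(installed) >= numeric_prefix(wanted)


def check_environment() -> list:
    """Return human-readable problems; an empty list means every pin is met."""
    problems = []
    if sys.version_info[:2] != (3, 10):
        problems.append(f"python {sys.version_info[0]}.{sys.version_info[1]} != 3.10")
    for name, (op, wanted) in REQUIREMENTS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if not satisfies(installed, op, wanted):
            problems.append(f"{name} {installed} does not satisfy {op}{wanted}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["all listed pins satisfied"]:
        print(line)
```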

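2. Inference weight conversion

In this example, openPangu-Ultra-MoE-718B-V1.1 inference uses a Tensor Parallel strategy combined with Ascend NPU fused operators, so the safetensors weights must be split in advance. The following is a weight-splitting example for 32-card parallel inference; the split weights are saved under the `model/` directory: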

```bash
cd inference
bash split_weight.sh
```
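To illustrate what splitting a weight for tensor parallelism means, here is a toy column-wise split in plain Python. This is only a conceptual sketch: the actual layout produced by `split_weight.sh` is not described here, column-parallel splitting is just one common scheme, and `split_columns` and `matvec` are hypothetical helpers.

```python
def split_columns(matrix, world_size):
    """Split a row-major 2D list into `world_size` equal column shards."""
    num_cols = len(matrix[0])
    if num_cols % world_size != 0:
        raise ValueError("column count must be divisible by world_size")
    width = num_cols // world_size
    return [
        [row[rank * width:(rank + 1) * width] for row in matrix]
        for rank in range(world_size)
    ]


def matvec(x, matrix):
    """Compute x @ matrix for a vector x and a row-major matrix."""
    return [
        sum(x[i] * matrix[i][j] for i in range(len(x)))
        for j in range(len(matrix[0]))
    ]


weight = [[1, 2, 3, 4], [5, 6, 7, 8]]  # toy 2x4 weight matrix
x = [1, 10]                            # toy input vector

# Each "rank" holds one shard and computes its slice of the output.
shards = split_columns(weight, world_size=2)
partial = [matvec(x, shard) for shard in shards]

# Concatenating the per-rank slices reproduces the unsplit result.
gathered = [value for part in partial for value in part]
assert gathered == matvec(x, weight)
print(gathered)
```

Each rank only needs its own shard on disk and in memory, which is why the split is done once ahead of time rather than at every launch.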

3. Inference example

Example of bfloat16 inference for openPangu-Ultra-MoE-718B-V1.1 on Atlas 800T A2 with 4 nodes and 32 cards, using node IP0 as the master node:

```bash
cd inference
# Master node IP0; arguments: ${NNODES} ${NODE_RANK} ${NPROC_PER_NODE} ${MASTER_ADDR} ${PROMPT}
bash generate.sh 4 0 8 IP0 "3*7=?"
# Worker node IP1
bash generate.sh 4 1 8 IP0 "3*7=?"
# Worker node IP2
bash generate.sh 4 2 8 IP0 "3*7=?"
# Worker node IP3
bash generate.sh 4 3 8 IP0 "3*7=?"
```
```
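The four commands differ only in the node rank, so they can be generated rather than typed by hand. The sketch below builds the same command lines from the argument order shown in the example; `launch_commands` and `global_rank` are hypothetical helpers, and the rank formula is the usual multi-node convention, assumed here rather than taken from `generate.sh`.

```python
import shlex


def launch_commands(nnodes, nproc_per_node, master_addr, prompt):
    """Return one generate.sh command per node, in node-rank order."""
    return [
        f"bash generate.sh {nnodes} {node_rank} {nproc_per_node} "
        f"{master_addr} {shlex.quote(prompt)}"
        for node_rank in range(nnodes)
    ]


def global_rank(node_rank, local_rank, nproc_per_node):
    """Conventional global rank of one process in a multi-node launch (assumed)."""
    return node_rank * nproc_per_node + local_rank


commands = launch_commands(4, 8, "IP0", "3*7=?")
for node_rank, command in enumerate(commands):
    print(f"node {node_rank}: {command}")

# 4 nodes x 8 processes per node gives the 32 cards used in the example.
world_size = 4 * 8
print("world size:", world_size)
```

`shlex.quote` wraps the prompt in single quotes, which keeps the shell from expanding `*` and `?` as glob characters, just as the double quotes do in the example.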

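The model defaults to slow-thinking mode. To switch to fast-thinking mode, append the ` /no_think` marker to the end of the user input, as shown by `fast_thinking_template` in the `generate.py` example; this switches the current turn to fast thinking.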
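As a small illustration of the marker, the helper below appends ` /no_think` to a user message. It is a hypothetical sketch and not the `fast_thinking_template` from `generate.py`, whose exact contents are not reproduced here; it only shows where the marker goes.

```python
NO_THINK_TAG = " /no_think"  # marker described in the text, including the leading space


def to_fast_thinking(user_input: str) -> str:
    """Append the fast-thinking marker unless the input already ends with it."""
    if user_input.endswith(NO_THINK_TAG):
        return user_input
    return user_input + NO_THINK_TAG


print(to_fast_thinking("3*7=?"))
```

Because the marker applies to the current turn only, a multi-turn conversation would need it on every user message that should be answered in fast-thinking mode.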

4. Using an inference framework

- vllm_ascend: see [vllm_ascend_for_openPangu_ultra_moe_718b](doc/vllm_ascend_for_openpangu_ultra_moe_718b.md)

For more information about the model, see the model repositories:

https://ai.gitcode.com/ascend-tribe/openPangu-Ultra-MoE-718B-V1.1

https://gitcode.com/ascend-tribe/openPangu-Ultra-MoE-718B-V1.1-Int8
