torch.distributed

If an API's "Supported" column is "Yes" and its "Restrictions and Notes" column is "-", the API's level of support is identical to that of the native API.

| API Name | Supported | Restrictions and Notes |
| --- | --- | --- |
| torch.distributed.is_available | Yes | - |
| torch.distributed.init_process_group | Yes | When the `pg_options` argument is of type `torch_npu._C._distributed_c10d.ProcessGroupHCCL.Options()`, setting its `hccl_config` attribute controls the HCCL communication-domain buffer size; for an example, see the "hccl_buffer_size" section of the *PyTorch Training Model Migration and Tuning Guide*. Setting the `group_name` field of the `hccl_config` attribute assigns a custom name to the HCCL communication group; the value is a string of at most 32 characters. |
| torch.distributed.is_initialized | Yes | - |
| torch.distributed.is_mpi_available | Yes | - |
| torch.distributed.is_nccl_available | Yes | - |
| torch.distributed.is_gloo_available | Yes | - |
| torch.distributed.is_torchelastic_launched | Yes | - |
| torch.distributed.Backend | Yes | - |
| torch.distributed.Backend.register_backend | Yes | - |
| torch.distributed.get_backend | Yes | - |
| torch.distributed.get_rank | Yes | - |
| torch.distributed.get_world_size | Yes | - |
| torch.distributed.Store | Yes | - |
| torch.distributed.TCPStore | Yes | - |
| torch.distributed.HashStore | Yes | - |
| torch.distributed.FileStore | Yes | - |
| torch.distributed.PrefixStore | Yes | - |
| torch.distributed.Store.set | Yes | - |
| torch.distributed.Store.get | Yes | - |
| torch.distributed.Store.add | Yes | - |
| torch.distributed.Store.compare_set | Yes | - |
| torch.distributed.Store.wait | Yes | - |
| torch.distributed.Store.num_keys | Yes | - |
| torch.distributed.Store.delete_key | Yes | - |
| torch.distributed.Store.set_timeout | Yes | - |
| torch.distributed.new_group | Yes | When the `pg_options` argument is of type `torch_npu._C._distributed_c10d.ProcessGroupHCCL.Options()`, setting its `hccl_config` attribute controls the HCCL communication-domain buffer size; for an example, see the "hccl_buffer_size" section of the *PyTorch Training Model Migration and Tuning Guide*. Setting the `group_name` field of the `hccl_config` attribute assigns a custom name to the HCCL communication group; the value is a string of at most 32 characters. |
| torch.distributed.get_group_rank | Yes | - |
| torch.distributed.get_global_rank | Yes | - |
| torch.distributed.get_process_group_ranks | Yes | - |
| torch.distributed.device_mesh.DeviceMesh | Yes | - |
| torch.distributed.send | Yes | Supports bf16, fp16, fp32, fp64, uint8, int8, int16, int32, int64, bool |
| torch.distributed.recv | Yes | Supports bf16, fp16, fp32, fp64, uint8, int8, int16, int32, int64, bool |
| torch.distributed.isend | Yes | Supports bf16, fp16, fp32, fp64, uint8, int8, int16, int32, int64, bool |
| torch.distributed.irecv | Yes | Supports bf16, fp16, fp32, fp64, uint8, int8, int16, int32, int64, bool |
| torch.distributed.batch_isend_irecv | Yes | Supports bf16, fp16, fp32, fp64, uint8, int8, int16, int32, int64, bool |
| torch.distributed.P2POp | Yes | Supports bf16, fp16, fp32, fp64, uint8, int8, int16, int32, int64, bool |
| torch.distributed.broadcast | Yes | Supports bf16, fp16, fp32, fp64, uint8, int8, int16, int32, int64, bool |
| torch.distributed.broadcast_object_list | Yes | - |
| torch.distributed.all_reduce | Yes | Supports fp16, fp32, int32, int64, bool |
| torch.distributed.reduce | Yes | Supports bf16, fp16, fp32, uint8, int8, int32, int64, bool |
| torch.distributed.all_gather | Yes | Supports bf16, fp16, fp32, int8, int32, bool |
| torch.distributed.all_gather_into_tensor | Yes | Supports bf16, fp16, fp32, int8, int32, bool. World sizes 3, 5, 6, and 7 are not supported. |
| torch.distributed.all_gather_object | Yes | - |
| torch.distributed.gather | Yes | Supports bf16, fp16, fp32, int8, int32, bool |
| torch.distributed.gather_object | Yes | The supported input type is Python objects |
| torch.distributed.scatter | Yes | Supports bf16, fp16, fp32, fp64, uint8, int8, int16, int32, int64, bool |
| torch.distributed.scatter_object_list | Yes | No dtype parameter is involved |
| torch.distributed.reduce_scatter | Yes | Supports bf16, fp16, fp32, int8, int32 |
| torch.distributed.reduce_scatter_tensor | Yes | Supports bf16, fp16, fp32, int8, int32. World sizes 3, 5, 6, and 7 are not supported. |
| torch.distributed.all_to_all_single | Yes | Supports fp32 |
| torch.distributed.all_to_all | Yes | Supports fp32 |
| torch.distributed.barrier | Yes | - |
| torch.distributed.monitored_barrier | Yes | - |
| torch.distributed.ReduceOp | Yes | Supports bf16, fp16, fp32, uint8, int8, int32, int64, bool |
| torch.distributed.reduce_op | Yes | Supports bf16, fp16, fp32, uint8, int8, int32, int64 |
| torch.distributed.DistBackendError | Yes | - |
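The `pg_options` notes for `torch.distributed.init_process_group` and `torch.distributed.new_group` can be sketched as follows. This is a minimal sketch assuming an Ascend NPU environment with `torch_npu` installed; the buffer-size value, group name, ranks, and rendezvous address are illustrative placeholders, and the exact `hccl_config` semantics may differ across `torch_npu` versions (see the "hccl_buffer_size" section of the *PyTorch Training Model Migration and Tuning Guide* for the authoritative example).

```python
# Hedged sketch: requires an Ascend NPU environment with torch_npu installed.
# The buffer size (in MB), group name, ranks, and address below are
# illustrative only, not values mandated by this support list.
import torch
import torch_npu  # registers the "hccl" backend with torch.distributed
import torch.distributed as dist

options = torch_npu._C._distributed_c10d.ProcessGroupHCCL.Options()
# hccl_config controls the HCCL communication-domain buffer size and,
# via group_name, assigns a custom name (string of at most 32 characters)
# to the HCCL communication group.
options.hccl_config = {"hccl_buffer_size": 200, "group_name": "my_hccl_group"}

# Pass the options when creating the default process group ...
dist.init_process_group(
    backend="hccl",
    init_method="tcp://127.0.0.1:29500",
    rank=0,
    world_size=1,
    pg_options=options,
)

# ... or when creating a sub-group with torch.distributed.new_group.
sub_group = dist.new_group(ranks=[0], pg_options=options)
```

Because both APIs accept the same `Options` object, a single `hccl_config` can be reused, but note that `group_name` names the communication group it is passed to, so distinct sub-groups would normally get distinct names.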