HcclCommConfig

Description

Defines the configuration of a communicator (buffer size, deterministic computing switch, communicator name, and user-defined information) that is applied when the communicator is initialized with specific configurations.

If deterministic computing is disabled, repeated executions may produce different results. This is generally caused by asynchronous multi-threaded execution within operator implementations, which changes the order in which floating-point numbers are accumulated. When deterministic computing is enabled, an operator that is executed multiple times on the same hardware with the same input produces the same output.

Deterministic computing is disabled by default and usually does not need to be enabled. If a model produces different results across runs, or if you are tuning accuracy, you can enable it to assist debugging and optimization. Note that enabling deterministic computing slows down operator execution and therefore degrades performance.
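The effect of accumulation order can be reproduced without HCCL. The following minimal sketch sums the same three single-precision values in two different orders; the function names `sum_forward` and `sum_reverse` are hypothetical and exist only for this illustration. It assumes IEEE 754 single-precision arithmetic without extended intermediate precision or fast-math optimizations.

```cpp
#include <vector>

// Accumulates the values from first to last in single precision.
float sum_forward(const std::vector<float>& values) {
    float acc = 0.0f;
    for (float x : values) {
        acc += x;
    }
    return acc;
}

// Accumulates the same values from last to first in single precision.
float sum_reverse(const std::vector<float>& values) {
    float acc = 0.0f;
    for (auto it = values.rbegin(); it != values.rend(); ++it) {
        acc += *it;
    }
    return acc;
}
```

For the input `{1e8f, -1e8f, 1.0f}`, the forward order cancels the two large values first and keeps the `1.0f`, while the reverse order adds `1.0f` to `-1e8f` first, where it is lost to rounding. The two orders therefore return different sums for identical input, which is the kind of run-to-run difference that deterministic computing is meant to eliminate.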

Prototype

const uint32_t HCCL_COMM_CONFIG_INFO_BYTES = 24;
const uint32_t COMM_NAME_MAX_LENGTH = 128;
const uint32_t UDI_MAX_LENGTH = 128; 
typedef struct HcclCommConfigDef {
    char reserved[HCCL_COMM_CONFIG_INFO_BYTES];    /* Reserved field, which cannot be modified. */
    uint32_t hcclBufferSize;                       /* Buffer size of the shared data. The value must be greater than or equal to 1. The default value is 200. The unit is MB. */
    uint32_t hcclDeterministic;                    /* Deterministic computing switch. 0 (default): disabled; 1: enabled. */
    char hcclCommName[COMM_NAME_MAX_LENGTH];       /* Communicator name. The value contains a maximum of 128 characters. If this value is not specified, it is automatically generated by HCCL. */
    char hcclUdi[UDI_MAX_LENGTH];                  /* User-defined information. The value contains a maximum of 128 characters and is empty by default. */
} HcclCommConfig;
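As an illustration of the field constraints described in the comments above, the following sketch overrides the buffer size, deterministic switch, and communicator name on a configuration structure. It is not part of HCCL: the helper `set_comm_options` is hypothetical, and the structure is redefined locally only so that the example compiles on its own (real code would include the HCCL header instead). The sketch assumes that the structure has already been initialized through the HCCL-provided initialization routine, that `hcclCommName` is stored as a NUL-terminated string (so at most 127 characters plus the terminator fit), and it deliberately leaves `reserved` untouched.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Local copy of the prototype, for illustration only.
const uint32_t HCCL_COMM_CONFIG_INFO_BYTES = 24;
const uint32_t COMM_NAME_MAX_LENGTH = 128;
const uint32_t UDI_MAX_LENGTH = 128;
typedef struct HcclCommConfigDef {
    char reserved[HCCL_COMM_CONFIG_INFO_BYTES];
    uint32_t hcclBufferSize;
    uint32_t hcclDeterministic;
    char hcclCommName[COMM_NAME_MAX_LENGTH];
    char hcclUdi[UDI_MAX_LENGTH];
} HcclCommConfig;

// Hypothetical helper (not an HCCL API): validates the arguments against the
// documented constraints and, if they pass, writes them into cfg.
// Returns false and leaves cfg unchanged when a constraint is violated.
bool set_comm_options(HcclCommConfig& cfg, uint32_t bufferSizeMB,
                      bool deterministic, const std::string& name) {
    if (bufferSizeMB < 1) {
        return false;  // buffer size must be >= 1 (unit: MB)
    }
    if (name.size() >= COMM_NAME_MAX_LENGTH) {
        return false;  // assumption: one byte is kept for the '\0' terminator
    }
    cfg.hcclBufferSize = bufferSizeMB;
    cfg.hcclDeterministic = deterministic ? 1u : 0u;
    // Copy the name including its terminating '\0'.
    std::memcpy(cfg.hcclCommName, name.c_str(), name.size() + 1);
    return true;
}
```

A caller might, for example, request a 400 MB buffer with deterministic computing enabled through `set_comm_options(cfg, 400, true, "tp_group_0")` and then pass `cfg` to the communicator initialization interface that accepts an `HcclCommConfig`.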