HCCL_DEBUG_CONFIG

Description

Controls whether run logs (that is, the logs in $HOME/ascend/log/run) include detailed running information about specific HCCL submodules. Currently, the following four configuration items are supported:
  • ALG or alg: algorithm orchestration module.
  • TASK or task: task orchestration module.
  • RESOURCE or resource: resource management module, including resource allocation and release operations.
  • AIV_OPS_EXC or aiv_ops_exc: AIV operator log printing module, including communication memory operations and resource synchronization operations during operator execution.

This environment variable can be configured in either of the following ways:
  • Forward configuration: List one or more modules, separated by commas (,). The module names TASK (or task), ALG (or alg), RESOURCE (or resource), and AIV_OPS_EXC (or aiv_ops_exc) are case insensitive.
    # Record the running information about the task module in run logs.
    export HCCL_DEBUG_CONFIG="TASK" 
    # Record the running information about the alg, task, resource, and aiv_ops_exc modules in run logs.
    export HCCL_DEBUG_CONFIG="alg,task,resource,aiv_ops_exc" 
  • Reverse configuration: Add ^ before the first module name. Detailed running information is then recorded in run logs for all modules except the listed ones.
    # Record the running information about all modules except the task module in run logs. (In the current version, only the running information about the alg, resource, and aiv_ops_exc modules is recorded.)
    export HCCL_DEBUG_CONFIG="^task"
    # Record the running information about all modules except the task and alg modules in run logs. (In the current version, only the running information about the resource and aiv_ops_exc modules is recorded.)
    export HCCL_DEBUG_CONFIG="^task,alg"
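
The forward and reverse rules above can be sketched as a small shell function that lists which modules a given value would enable. This is only an illustration of the documented semantics, not the parser that HCCL itself uses; the function name hccl_debug_modules is hypothetical, and the sketch assumes the four modules listed in the Description are the only ones available.

```shell
# Hypothetical helper (not part of HCCL): prints the modules that a given
# HCCL_DEBUG_CONFIG value would enable under the rules described above.
hccl_debug_modules() {
    known="alg task resource aiv_ops_exc"
    # Module names are case insensitive, so normalize to lowercase.
    value=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    invert=0
    case "$value" in
        '^'*) invert=1; value=${value#'^'} ;;   # leading ^ selects reverse configuration
    esac
    result=""
    for module in $known; do
        case ",$value," in
            *",$module,"*) listed=1 ;;
            *) listed=0 ;;
        esac
        # Forward: keep the listed modules. Reverse: keep the unlisted ones.
        if [ "$listed" -ne "$invert" ]; then
            result="${result:+$result }$module"
        fi
    done
    printf '%s\n' "$result"
}

hccl_debug_modules "TASK"        # prints: task
hccl_debug_modules "^task,alg"   # prints: resource aiv_ops_exc
```

Note that this sketch silently ignores unknown names; how HCCL treats them is not stated here.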

Note: Do not add redundant spaces when configuring this environment variable. Otherwise, the configuration is invalid. For example, export HCCL_DEBUG_CONFIG="alg, task " is invalid because of the spaces before and after task.
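
One way to avoid this mistake in launch scripts is to reject values that contain whitespace before exporting them. The following is a minimal sketch; the wrapper name set_hccl_debug_config is hypothetical and is not provided by HCCL.

```shell
# Hypothetical guard (not part of HCCL): exports HCCL_DEBUG_CONFIG only if
# the value contains no whitespace; otherwise reports an error and returns 1.
set_hccl_debug_config() {
    case "$1" in
        *[[:space:]]*)
            echo "HCCL_DEBUG_CONFIG must not contain spaces: '$1'" >&2
            return 1
            ;;
    esac
    export HCCL_DEBUG_CONFIG="$1"
}

set_hccl_debug_config "alg,task"                                 # accepted
set_hccl_debug_config "alg, task " || echo "value was rejected"  # rejected
```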

Example

export HCCL_DEBUG_CONFIG="ALG,TASK,RESOURCE,AIV_OPS_EXC" 

Restrictions

None

Applicability

Atlas A3 training products/Atlas A3 inference products

Atlas A2 training products/Atlas A2 inference products (only the Atlas 800T A2 training server, Atlas 900 A2 PoD cluster basic unit, and Atlas 200T A2 Box16 heterogeneous subrack are supported)