Environment Variable Configuration

Table 1 describes the Rec SDK TensorFlow environment variables. To compile operators in C/C++, you also need to set the compilation environment variables described in Table 2.

Table 1 Environment variables

Environment Variable

Meaning

Mandatory/Optional

Description

MXREC_LOG_LEVEL

Framework log level.

Optional

The value can be INFO (default), DEBUG, or ERROR.

TF_DEVICE

Specifies whether to combine tables.

Optional

The value can be NPU, GPU, CPU, or NONE (default).

  • If the value is set to GPU, CPU, or NONE, tables are not combined.
  • If the value is set to NPU, tables are combined.

AclTimeout

AscendCL timeout interval.

Optional

The value ranges from -1 (default) to the maximum value of int32 (2147483647).

HD_CHANNEL_SIZE

Depth of the data channel processed by the CPU.

Optional

The value ranges from 2 to 8192, and the default value is 40.

KEY_PROCESS_THREAD_NUM

Number of KEY_PROCESS threads.

Optional

The value ranges from 1 to 10, and the default value is 6.

MAX_UNIQUE_THREAD_NUM

Maximum number of UNIQUE threads.

Optional

The value ranges from 1 to 8, and the default value is 8.
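The ranges listed for HD_CHANNEL_SIZE, KEY_PROCESS_THREAD_NUM, and MAX_UNIQUE_THREAD_NUM can be checked before a job starts. The following is a minimal bash sketch. The check_int_range helper is hypothetical and is not part of Rec SDK. It assumes that an unset or empty variable falls back to the documented default.

```shell
#!/usr/bin/env bash
# check_int_range NAME MIN MAX (hypothetical helper, not part of Rec SDK):
# succeeds if the variable NAME is unset or empty, or holds a non-negative
# integer within [MIN, MAX]; fails otherwise.
check_int_range() {
    local name="$1" min="$2" max="$3"
    local value="${!name-}"            # indirect expansion: value of the variable named by $name
    [ -z "$value" ] && return 0        # nothing to validate (assumed to use the default)
    if ! [[ "$value" =~ ^[0-9]+$ ]]; then
        echo "$name: '$value' is not a non-negative integer" >&2
        return 1
    fi
    # 10# forces base 10 so that a value such as 08 is not read as octal.
    if (( 10#$value < min || 10#$value > max )); then
        echo "$name: $value is outside [$min, $max]" >&2
        return 1
    fi
}

# Example values: the documented defaults.
export HD_CHANNEL_SIZE=40
export KEY_PROCESS_THREAD_NUM=6
export MAX_UNIQUE_THREAD_NUM=8

check_int_range HD_CHANNEL_SIZE 2 8192 &&
check_int_range KEY_PROCESS_THREAD_NUM 1 10 &&
check_int_range MAX_UNIQUE_THREAD_NUM 1 8 &&
echo "channel and thread settings are within the documented ranges"
```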

FAST_UNIQUE

Whether to enable a self-implemented encoding algorithm for deduplication.

Optional

Valid values: 0 (default) or 1. Any other value causes unexpected behavior.

  • 0: The self-implemented algorithm is disabled.
  • 1: The self-implemented algorithm is enabled.

HOT_EMB_UPDATE_STEP

Update step count of Hot Embedding.

Optional

The value ranges from 1 to 1000, and the default value is 1000.

GLOG_stderrthreshold

Glog log level.

Optional

The value ranges from -2 to 2, and the default value is 0.

  • -2: TRACE
  • -1: DEBUG
  • 0: INFO
  • 1: WARNING
  • 2: ERROR

USE_COMBINE_FAAE

Whether to collect statistics by combined table.

Optional

Valid values: 0 (default) or 1. Any other value causes unexpected behavior.

  • If USE_COMBINE_FAAE is set to 0, statistics are collected separately for each table. The count records of keys in each table are independent of each other.
  • If USE_COMBINE_FAAE is set to 1, statistics are collected for all tables together. Multiple tables maintain one count record.

CM_CHIEF_IP

IP address of the master node.

Optional

This parameter is mandatory when the rank table is removed.

CM_CHIEF_PORT

Listening port of the master node, for example, 60000.

Optional

This parameter is mandatory when the rank table is removed.

NOTE:
  • You can run the following command to specify a group of local reserved ports. These ports will be reserved by the system and will not be used by other applications.
    sysctl -w net.ipv4.ip_local_reserved_ports=60000-60015

    Then, set CM_CHIEF_PORT to the port within the range specified by the preceding command.

  • Check whether the port is in use.
    netstat -anp | grep <port number>

    If the port is occupied, the ID and name of the process that occupies the port are displayed.

CM_CHIEF_DEVICE

Device ID of the master node.

Optional

Logical ID of the device that collects statistics on the server cluster information on the master node.

Value range: [0, Number of devices visible in the environment - 1]. This parameter is mandatory when the rank table is removed.

CM_WORKER_IP

IP address of the current node.

Optional

This parameter is mandatory when the rank table is removed.

CM_WORKER_SIZE

Number of devices participating in cluster training.

Optional

Value range: [0, 512]. This parameter is mandatory when the rank table is removed.
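The CM_* variables above are all marked as mandatory when the rank table is removed, so they are typically set together. The following is a minimal bash sketch. The IP addresses, device count, and the check_cm_vars helper are hypothetical placeholders, not values or tools provided by Rec SDK.

```shell
#!/usr/bin/env bash
# Hypothetical values for illustration only; replace them with your own.
export CM_CHIEF_IP=192.0.2.10     # master node IP (documentation-only example address)
export CM_CHIEF_PORT=60000        # a port inside the range reserved with sysctl
export CM_CHIEF_DEVICE=0          # logical device ID on the master node
export CM_WORKER_IP=192.0.2.11    # IP address of the node that runs this script
export CM_WORKER_SIZE=16          # number of devices participating in training

# check_cm_vars (hypothetical helper): report every CM_* variable that is
# unset or empty, and return non-zero if any is missing.
check_cm_vars() {
    local name missing=0
    for name in CM_CHIEF_IP CM_CHIEF_PORT CM_CHIEF_DEVICE CM_WORKER_IP CM_WORKER_SIZE; do
        if [ -z "${!name-}" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

check_cm_vars && echo "all CM_* variables are set"
```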

RANK_TABLE_FILE

Collective communication file that adapts to Ascend chips.

Optional

Path of the collective communication file. The default value is "". This parameter is mandatory when the rank table is removed.

ASCEND_VISIBLE_DEVICES

Ascend AI Processor devices visible to the program. Use it to restrict the program to a subset of the devices.

Mandatory

You can use this environment variable to specify the NPU devices for training. (Run the ls /dev/ | grep davinci* command to query the NPU devices on the host.) Specify devices by serial number: a single device, a range of devices, or a combination of both. See the following examples:

  • ASCEND_VISIBLE_DEVICES=0 indicates that device 0 (/dev/davinci0) is mounted to the container.
  • ASCEND_VISIBLE_DEVICES=1,3 indicates that devices 1 and 3 are mounted to the container.
  • ASCEND_VISIBLE_DEVICES=0-2 indicates that devices 0 to 2 (including devices 0 and 2) are mounted to the container. The effect is the same as that of ASCEND_VISIBLE_DEVICES=0,1,2.

  • ASCEND_VISIBLE_DEVICES=0-2,4 indicates that devices 0 to 2 and device 4 are mounted to the container. The effect is the same as that of ASCEND_VISIBLE_DEVICES=0,1,2,4.
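The equivalence between range notation and the explicit list in the examples above can be sketched as a small bash function. The expand_devices helper is hypothetical. It only illustrates the notation and is not how the SDK or the container runtime parses the variable.

```shell
#!/usr/bin/env bash
# expand_devices SPEC (hypothetical helper): print the comma-separated device
# IDs that SPEC stands for, expanding each "start-end" range.
expand_devices() {
    local spec="$1" part id
    local -a ids=()
    local IFS=','                      # split SPEC on commas; also used to join the result
    for part in $spec; do
        if [[ "$part" =~ ^([0-9]+)-([0-9]+)$ ]]; then
            # Range entry: add every ID from start to end, inclusive.
            for (( id = 10#${BASH_REMATCH[1]}; id <= 10#${BASH_REMATCH[2]}; id++ )); do
                ids+=("$id")
            done
        elif [[ "$part" =~ ^[0-9]+$ ]]; then
            ids+=("$part")             # single device ID
        else
            echo "invalid entry: '$part'" >&2
            return 1
        fi
    done
    echo "${ids[*]}"                   # joined with the first character of IFS (a comma)
}

expand_devices "0-2,4"    # prints 0,1,2,4
```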

RECORD_KEY_COUNT

Whether to record keys and the key counts.

Optional

The value can be 0 (default) or 1. Any other value causes unexpected behavior.

  • 0: Disable recording of the key and count information.
  • 1: Enable recording of the key and count information.

LCAL_COMM_ID

Specifies the primary node for exchanging LCAL metadata.

Optional

The format is IP address:Port, based on socket communication. If this parameter is not specified, the default primary communication node is the process corresponding to the minimum rank ID of the current task, and the default port number is 10067.

LCCL_DETERMINISTIC

Enables LCCL deterministic computing.

Optional

The default value is 0, indicating that LCCL deterministic computing is disabled.

If deterministic computing is required, set this parameter to 1. The GatherUss operator ensures that the computing is ordered.

Any value other than 0 or 1 causes unexpected behavior.

USE_SHM_SWAP

Improves the PCIe throughput performance.

Optional

The value can be 0 (default) or 1. Any other value causes unexpected behavior.

  • 0: Disable this function.
  • 1: Enable this function.

HUGE_TLB_ENABLE

Whether to enable huge page memory.

Optional

The value can be 0 (default) or 1. Any other value causes unexpected behavior.

  • 0: Disable this function.
  • 1: Enable this function.

SSD_SAVE_COMPACT_LEVEL

Compression level applied when SSD data is saved.

Optional

The value range is [0, 2]. The default value is 2.

  • 0: No compression.
  • 1: Compress only files that exceed the threshold.
  • 2: Compress all files.
NOTE:

Rec SDK TensorFlow depends on the OMPI_COMM_WORLD_SIZE, OMPI_COMM_WORLD_LOCAL_SIZE, and OMPI_COMM_WORLD_RANK environment variables during distributed training and inference started by OpenMPI. The OpenMPI launcher injects these variables automatically, so you do not need to set them manually.
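To see what OpenMPI has injected, a launch script can print the three variables. The following is a minimal bash sketch. The fallback values (1, 1, and 0) are an assumption for runs started outside OpenMPI and are not documented Rec SDK behavior.

```shell
#!/usr/bin/env bash
# Read the variables injected by OpenMPI. The fallbacks after ":-" are
# assumptions for a single-process run that was not started by OpenMPI.
world_size="${OMPI_COMM_WORLD_SIZE:-1}"
local_size="${OMPI_COMM_WORLD_LOCAL_SIZE:-1}"
rank="${OMPI_COMM_WORLD_RANK:-0}"

echo "rank ${rank} of ${world_size} (${local_size} processes on this node)"
```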

Table 2 Environment variables for C++ compilation

Environment Variable

Meaning

Mandatory/Optional

Description

CC

C compiler

Mandatory

Set it to gcc.

CXX

C++ compiler

Mandatory

Set it to g++.
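The settings in Table 2 can be applied in the shell that runs the build. The following is a minimal sketch. The PATH check is an optional addition and is not required by the SDK.

```shell
#!/usr/bin/env bash
export CC=gcc
export CXX=g++

# Optional: confirm that both compilers are on PATH before starting the build.
if command -v "$CC" >/dev/null && command -v "$CXX" >/dev/null; then
    echo "using $CC and $CXX"
else
    echo "gcc or g++ was not found on PATH" >&2
fi
```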