Environment Variable Configuration
Table 1 describes the environment variables of Rec SDK TensorFlow. To compile with C/C++, you also need to set the compilation environment variables. For details about how to compile operators in C++, see Table 2.
| Environment Variable | Meaning | Mandatory/Optional | Description |
|---|---|---|---|
| MXREC_LOG_LEVEL | Framework log level. | Optional | The value can be INFO (default), DEBUG, or ERROR. |
| TF_DEVICE | Specifies whether to combine tables. | Optional | The value can be NPU, GPU, CPU, or NONE (default). |
| AclTimeout | AscendCL timeout interval. | Optional | The value ranges from -1 (default) to the maximum value of int32 (2147483647). |
| HD_CHANNEL_SIZE | Depth of the data channel processed by the CPU. | Optional | The value ranges from 2 to 8192, and the default value is 40. |
| KEY_PROCESS_THREAD_NUM | Number of KEY_PROCESS threads. | Optional | The value ranges from 1 to 10, and the default value is 6. |
| MAX_UNIQUE_THREAD_NUM | Maximum number of UNIQUE threads. | Optional | The value ranges from 1 to 8, and the default value is 8. |
| FAST_UNIQUE | Whether to enable a self-implemented encoding algorithm for deduplication. | Optional | The value can be 0 (default) or 1. Values outside the specified range cause unexpected behavior. |
| HOT_EMB_UPDATE_STEP | Update step count of Hot Embedding. | Optional | The value ranges from 1 to 1000 (default). |
| GLOG_stderrthreshold | Glog log level. | Optional | The value ranges from -2 to 2, and the default value is 0. |
| USE_COMBINE_FAAE | Whether to collect statistics by combined table. | Optional | The value can be 0 (default) or 1. Values outside the specified range cause unexpected behavior. |
| CM_CHIEF_IP | IP address of the master node. | Optional | This parameter is mandatory when the rank table is removed. |
| CM_CHIEF_PORT | Listening port of the master node, for example, 60000. | Optional | This parameter is mandatory when the rank table is removed. |
| CM_CHIEF_DEVICE | Device ID of the master node. | Optional | Logical ID of the device that collects statistics on the server cluster information on the master node. Value range: [0, number of devices visible in the environment - 1]. This parameter is mandatory when the rank table is removed. |
| CM_WORKER_IP | IP address of the current node. | Optional | This parameter is mandatory when the rank table is removed. |
| CM_WORKER_SIZE | Number of devices participating in cluster training. | Optional | Value range: [0, 512]. This parameter is mandatory when the rank table is removed. |
| RANK_TABLE_FILE | Collective communication file that adapts to Ascend chips. | Optional | Path of the collective communication file. The default value is "". This parameter is mandatory when the rank table is removed. |
| ASCEND_VISIBLE_DEVICES | Ascend AI Processor devices visible to the program. Use it to restrict the program to a subset of the devices. | Mandatory | Use this environment variable to specify the NPU devices for training. (Run the ls /dev/ \| grep davinci* command to query the NPU devices on the host.) You can also use the device serial number to specify an NPU device. You can specify a single NPU device or a range of NPU devices, and you can combine both forms. |
| RECORD_KEY_COUNT | Whether to record keys and the key counts. | Optional | The value can be 0 (default) or 1. Values outside the specified range cause unexpected behavior. |
| LCAL_COMM_ID | Specifies the primary node for exchanging LCAL metadata. | Optional | The format is IP address:Port, based on socket communication. If this parameter is not specified, the default primary communication node is the process corresponding to the minimum rank ID of the current task, and the default port number is 10067. |
| LCCL_DETERMINISTIC | Enables LCCL deterministic computing. | Optional | The default value is 0, indicating that LCCL deterministic computing is disabled. To enable deterministic computing, set this parameter to 1. The GatherUss operator ensures that the computing is ordered. Values outside the specified range cause unexpected behavior. |
| USE_SHM_SWAP | Improves the PCIe throughput performance. | Optional | The value can be 0 (default) or 1. Values outside the specified range cause unexpected behavior. |
| HUGE_TLB_ENABLE | Whether to enable hugepage memory. | Optional | The value can be 0 (default) or 1. Values outside the specified range cause unexpected behavior. |
| SSD_SAVE_COMPACT_LEVEL | Compression level used when data is saved to the SSD. | Optional | The value range is [0, 2]. The default value is 2. |

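Because several variables in Table 1 cause unexpected behavior when set outside their documented range, it can help to check them before launching a job. The following is a minimal sketch, not part of Rec SDK TensorFlow: the helper name check_env and the two lookup tables are hypothetical, and only the ranges and 0/1 switches are taken from Table 1.

```python
import os

# Integer ranges (min, max) copied from Table 1.
INT_RANGES = {
    "AclTimeout": (-1, 2147483647),
    "HD_CHANNEL_SIZE": (2, 8192),
    "KEY_PROCESS_THREAD_NUM": (1, 10),
    "MAX_UNIQUE_THREAD_NUM": (1, 8),
    "HOT_EMB_UPDATE_STEP": (1, 1000),
    "GLOG_stderrthreshold": (-2, 2),
    "SSD_SAVE_COMPACT_LEVEL": (0, 2),
}

# Switches that Table 1 documents as accepting only 0 or 1.
BOOL_FLAGS = (
    "FAST_UNIQUE",
    "USE_COMBINE_FAAE",
    "RECORD_KEY_COUNT",
    "LCCL_DETERMINISTIC",
    "USE_SHM_SWAP",
    "HUGE_TLB_ENABLE",
)


def check_env(env=None):
    """Hypothetical helper: return a list of problems; empty means no issue found."""
    env = os.environ if env is None else env
    problems = []
    for name, (low, high) in INT_RANGES.items():
        raw = env.get(name)
        if raw is None:
            continue  # unset, so the documented default applies
        try:
            value = int(raw)
        except ValueError:
            problems.append(f"{name}={raw!r} is not an integer")
            continue
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside [{low}, {high}]")
    for name in BOOL_FLAGS:
        raw = env.get(name)
        if raw is not None and raw not in ("0", "1"):
            problems.append(f"{name}={raw!r} must be 0 or 1")
    return problems
```

Calling check_env() with no argument inspects the current process environment. Passing a dictionary such as {"HD_CHANNEL_SIZE": "1"} lets you test a planned configuration before exporting it.
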
NOTE:
Rec SDK TensorFlow depends on the OMPI_COMM_WORLD_SIZE, OMPI_COMM_WORLD_LOCAL_SIZE, and OMPI_COMM_WORLD_RANK environment variables for distributed training and inference started by OpenMPI. The OpenMPI launcher injects these environment variables automatically, so you do not need to set them manually.
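
To illustrate the note about the OpenMPI variables, the sketch below reads the three values from the environment of a process started by the OpenMPI launcher. The function name read_ompi_context is hypothetical and is not an API of Rec SDK TensorFlow. The sketch only shows that the values are expected to be present already, so a missing variable suggests the process was not started through OpenMPI.

```python
import os

# Variables that the OpenMPI launcher injects into every rank it starts.
OMPI_VARS = (
    "OMPI_COMM_WORLD_SIZE",        # total number of processes in the job
    "OMPI_COMM_WORLD_LOCAL_SIZE",  # number of processes on this node
    "OMPI_COMM_WORLD_RANK",        # rank of this process in the job
)


def read_ompi_context(env=None):
    """Hypothetical helper: return the three OpenMPI values as integers."""
    env = os.environ if env is None else env
    missing = [name for name in OMPI_VARS if name not in env]
    if missing:
        # Do not set these by hand; start the job through the OpenMPI launcher.
        raise RuntimeError("missing OpenMPI variables: " + ", ".join(missing))
    return {name: int(env[name]) for name in OMPI_VARS}
```
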