Environment Variable Description

After MindIE LLM is installed, the set_env.sh script is provided to set process-level environment variables automatically.

set_env.sh Description

**Table 1** Environment variables in the **set_env.sh** script
Environment Variable	Description	Value Range	Default Value
MindIE LLM environment variables
MINDIE_LLM_HOME_PATH	Home path of MindIE LLM.	N/A	N/A
MINDIE_LLM_RECOMPUTE_THRESHOLD	Recomputation threshold in MindIE LLM.	[0,1]	0.5
PYTORCH_INSTALL_PATH	Installation path of the third-party component Torch. To obtain the path, run the following command: python3 -c 'import torch, os; print(os.path.dirname(os.path.abspath(torch.__file__)))'.	N/A	N/A
PYTORCH_NPU_INSTALL_PATH	Installation path of the third-party component torch_npu. To obtain the value, run python3 -c 'import torch, torch_npu, os; print(os.path.dirname(os.path.abspath(torch_npu.__file__)))'.	N/A	N/A
ATB Models environment variables
ATB_OPERATION_EXECUTE_ASYNC	Asynchronous scheduling of ATB graphs. By default, level-2 pipeline is used. When the number of CPUs is not limited, you can enable level-3 pipeline for performance tuning.	0: disabled 1: level-2 pipeline 2: level-3 pipeline	1
ATB_SPEED_HOME_PATH	(Required) Environment variable of the lib path of ATB Models.	The value must be the lib path of ATB Models.	None
HCCL_INTRA_PCIE_ENABLE	Whether to enable All2All layered communication and INT8 communication features. This function can be enabled only when both HCCL_INTRA_PCIE_ENABLE and HCCL_INTRA_ROCE_ENABLE are enabled. For more information about the two environment variables, see "Collective Communication" in CANN Environment Variable Reference. NOTE: You are advised to enable this function in the Combine INT8 operator scenario of the MoE model in Atlas 800I A2 inference server and Atlas 800I A3 SuperPoD Server to improve performance.	0: disabled 1: enabled	N/A
HCCL_INTRA_ROCE_ENABLE		0: enabled 1: disabled	N/A
Ascend Extension for PyTorch environment variables
MASTER_IP	Host IP address for multi-server serving.	If the value is not empty, the IP address must be valid.	None
MASTER_PORT	Host port for multi-server serving.	If the value is not empty, the port number ranges from 0 to 65535.	None
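
The value ranges given for MASTER_IP and MASTER_PORT can be checked before a multi-server launch. The following is a minimal sketch, not part of MindIE: the helper name read_master_endpoint is hypothetical, and it assumes that an unset or empty variable simply means "not configured".

```python
import ipaddress
import os


def read_master_endpoint(env=os.environ):
    """Return (ip, port) from MASTER_IP / MASTER_PORT; None for unset or empty values."""
    ip_text = env.get("MASTER_IP", "")
    port_text = env.get("MASTER_PORT", "")

    ip = None
    if ip_text:
        # ipaddress.ip_address raises ValueError for a malformed address.
        ip = str(ipaddress.ip_address(ip_text))

    port = None
    if port_text:
        port = int(port_text)
        # Table 1 gives the valid range as 0 to 65535.
        if not 0 <= port <= 65535:
            raise ValueError(f"MASTER_PORT out of range: {port}")

    return ip, port
```

Passing a plain dictionary instead of os.environ makes the check easy to try without changing the real environment.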

Other Environment Variables

For details about Server environment variables, see Table 2.

**Table 2** Server environment variables
Parameter	Description	Value Range	Default Value
MIES_INSTALL_PATH	Server installation path.	Path parameters	/usr/local/Ascend/mindie/latest/mindie-service
MIES_CONFIG_JSON_PATH	Path of the config.json file. If the environment variable exists, the value is read. If not, the ${MIES_INSTALL_PATH}/conf/config.json file is read.	Path parameters	N/A
MIES_CONTAINER_IP	Container IP address, which is configured during container deployment. It is the IP address bound to the service-plane RESTful API provided by EndPoint, and it is also used for gRPC communication in multi-server inference scenarios. This environment variable must be set for multi-server inference.	IPv4 address	N/A
MIES_CONTAINER_MANAGEMENT_IP	IP address bound to the internal RESTful API provided by EndPoint.	IPv4 address	N/A
MIES_MEMORY_DETECTOR_MODE	Whether to monitor memory status through instrumentation points.	0: disabled 1: enabled	0
MIES_PROFILER_MODE	Whether to monitor performance status through instrumentation points.	0: disabled 1: enabled	0
LD_LIBRARY_PATH	Path of lib.	Path parameters	${MIES_INSTALL_PATH}/lib:${LD_LIBRARY_PATH}
ASCEND_SLOG_PRINT_TO_STDOUT	CANNDEV log printing switch	1: prints logs. 0: writes logs to the ~/ascend directory.	0
ASCEND_GLOBAL_LOG_LEVEL	CANNDEV log level.	0: debug 1: info 2: warn 3: error	3
ASCEND_GLOBAL_EVENT_ENABLE	Whether to enable event logging for applications.	0: disabled 1: enabled	0
HCCL_BUFFSIZE	Size of the buffer that controls shared data between two NPUs.	≥ 1, in MB	120
EP_OPENSSL_PATH	After HTTPS authentication is enabled for EndPoint, this environment variable is used to specify the runtime .so file loaded by OpenSSL. This environment variable is automatically set when the EndPoint module is started. You do not need to manually set it.	Path parameters	${MIES_INSTALL_PATH}/lib
HSECEASY_PATH	After HTTPS authentication is enabled for EndPoint, use the HSECEASY tool to encrypt keys and passwords. This environment variable specifies the path of the runtime .so file loaded by HSECEASY.	Path parameters	${MIES_INSTALL_PATH}/lib
MIES_CERTS_LOG_TO_FILE	Environment variable of the certificate management tool, indicating whether logs are exported to a file.	0: Output logs to a file. 1: Do not output logs to a file.	0
MIES_CERTS_LOG_TO_STDOUT	Environment variable of the certificate management tool, indicating whether to print logs.	0: Do not print logs. 1: Print logs.	1
MIES_CERTS_LOG_LEVEL	Environment variable of the certificate management tool, which specifies the log level.	DEBUG INFO WARNING ERROR FATAL	INFO
MIES_CERTS_LOG_PATH	Environment variable of the certificate management tool, which specifies the log path.	Path parameters	/workspace/log/certs.log
DYNAMIC_AVERAGE_WINDOW_SIZE	Size of the dynamic window for dynamically collecting statistics on the average metric value in the /metrics-json interface.	Positive number	1000
MIES_SERVICE_MONITOR_MODE	Whether to enable the online metric management and control for inference serving. The /metrics interface can be requested only when this function is enabled.	0: disabled 1: enabled	0
LOCAL_CACHE_DIR	Specifies the temporary path for storing images after a multimodal request is received.	Path parameters	~/mindie/cache
TOKENIZER_ENCODE_TIMEOUT	Timeout interval for truncating TOKENIZER Encode, in seconds.	[5, 300]	60
MINDIE_ASYNC_SCHEDULING_ENABLE	Whether to enable asynchronous scheduling.	1: enabled Other values: disabled	N/A
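
The lookup order described for MIES_CONFIG_JSON_PATH can be sketched as follows. This is an illustration only, not the Server implementation: the function name resolve_config_path is hypothetical, an empty value is treated like an unset one (an assumption), and the fallback installation path is the default listed for MIES_INSTALL_PATH in Table 2.

```python
import os
from pathlib import PurePosixPath

# Default value of MIES_INSTALL_PATH as listed in Table 2.
DEFAULT_INSTALL_PATH = "/usr/local/Ascend/mindie/latest/mindie-service"


def resolve_config_path(env=os.environ):
    """Return the config.json path following the documented precedence."""
    explicit = env.get("MIES_CONFIG_JSON_PATH")
    if explicit:
        # The environment variable exists, so its value is used directly.
        return PurePosixPath(explicit)
    # Otherwise fall back to ${MIES_INSTALL_PATH}/conf/config.json.
    install_path = env.get("MIES_INSTALL_PATH", DEFAULT_INSTALL_PATH)
    return PurePosixPath(install_path) / "conf" / "config.json"
```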

For details about MindIE LLM environment variables, see Table 3.

**Table 3** MindIE LLM environment variables
Environment Variable	Description	Value Range	Default Value
HOST_IP	Host IP address. IP address of the physical machine that provides the inference API. This parameter needs to be configured only for Coordinator.	N/A	N/A
LOCAL_RANK	Local ID of a device.	[0, ${WORLD_SIZE} - 1]	0
MIES_USE_MB_SWAPPER	High-performance swap switch.	0: disabled 1: enabled	0
MINDIE_CHECK_INPUTFILES_PERMISSION	Whether to verify the permission of external files, including the write permission of the file owner and others.	0: The permission of external files is not verified. Other values or None: The permission of external files is verified.	None
MINDIE_LLM_BENCHMARK_ENABLE	Whether to enable the benchmark function of the MindIE LLM module. After the function is enabled, performance data is exported to a specified file path.	0: disabled 1: enabled	0
MINDIE_LLM_BENCHMARK_FILEPATH	Path of the performance data file generated by the benchmark function of the MindIE LLM module.	N/A	"{MINDIE_LLM_HOME_PATH}/logs/benchmark.jsonl"
MINDIE_LLM_BENCHMARK_RESERVING_RATIO	When the size of a performance data file exceeds the upper limit, the new data will overwrite the old data. This environment variable specifies the reserving ratio of old data. The default value is 0.1.	[0.0, 1.0]	0.1
MINDIE_LLM_FRAMEWORK_BACKEND	MindIE LLM backend type. The value can be atb (ATB; default value) or ms (MindSpore).	ATB MS (The value is case insensitive.)	ATB
NPU_DEVICE_IDS	ID of the NPU used.	[0, NPU ID] Example: [0, 1, 2, ...]	N/A
NPU_MEMORY_FRACTION	NPU memory usage, expressed as the fraction of the total graphics memory allocated to the model weights, KV cache, and workspace. Memory requested by HCCL and PTA is not included. You are advised to set this parameter to the minimum value at which the service can start. To find that value, start the service with the default configuration. If the service fails to start, increase the value until the service just starts. If the service starts successfully, decrease the value until the service can only just start. Provided that the service starts properly, a smaller value gives higher system stability.	(0.0, 1.0] NOTE: For the Kimi K2 model, the recommended value is 0.9.	For ATB Models, the default value is 1.0. For MindIE LLM, the default value is 0.8.
PERFORMANCE_PREFIX_TREE_ENABLE	Whether to enable the high performance trie-tree of memory_decoding.	0: disabled 1: enabled	0
POST_PROCESSING_SPEED_MODE_TYPE	Postprocessing acceleration mode.	0: Acceleration is disabled. 1: Enables top_p approximate calculation. 2: Enables index acceleration. 3: Enables top_p approximate calculation and index acceleration at the same time.	0
RANK	Global ID of a device.	[0, ${WORLD_SIZE})	0
SOURCE_DATE_EPOCH	Eliminates the bep differences of the .whl package.	N/A	N/A
WORLD_SIZE	Number of devices used for inference.	[1,1048576]	N/A
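
The tuning procedure described for NPU_MEMORY_FRACTION amounts to a simple stepwise search. The sketch below illustrates it under stated assumptions: can_start is a hypothetical callable that launches the service with a given fraction and returns True if startup succeeds, and the step size of 0.05 is an arbitrary choice, not a documented value.

```python
def find_min_fraction(can_start, default=0.8, step=0.05):
    """Return the smallest tested fraction at which can_start(fraction) is True.

    can_start is a hypothetical callback: it starts the service with the given
    NPU_MEMORY_FRACTION and reports whether startup succeeded.
    """
    # Count in whole steps to limit floating-point drift.
    steps_max = round(1.0 / step)
    k = round(default / step)

    if can_start(k * step):
        # Startup succeeded: lower the value while the service still starts.
        while k > 1 and can_start((k - 1) * step):
            k -= 1
        return round(k * step, 4)

    # Startup failed: raise the value until the service starts.
    while k < steps_max:
        k += 1
        if can_start(k * step):
            return round(k * step, 4)
    return None  # the service did not start even at 1.0
```

In practice each call to can_start is a full service launch, so a coarse step keeps the number of attempts small.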

For details about ATB Models environment variables, see Table 4.

**Table 4** ATB Models environment variables
Environment Variable	Description	Value Range	Default Value
ATB_LLM_BENCHMARK_ENABLE	Whether to enable the function of obtaining performance data.	0: disabled Other values: enabled	0
ATB_LLM_BENCHMARK_FILEPATH	Path for storing performance data.	All values	None
ATB_LLM_ENABLE_AUTO_TRANSPOSE	Whether to enable automatic transpose optimization of the weight right matrix.	None or 1: enabled Other values: disabled	None
ATB_LLM_HCCL_ENABLE	Whether to enable the HCCL communication backend. By default, this function is enabled for the Atlas 300I Duo inference card.	0: disabled 1: enabled	0
ATB_LLM_LCOC_ENABLE	Whether to enable the communication and computation overlapping.	None or 1: enabled Other values: disabled	None
ATB_LLM_LOGITS_SAVE_ENABLE	Whether to save logits information.	0: no Other values: yes	0
ATB_LLM_LOGITS_SAVE_FOLDER	Folder for saving logits information.	All values	None
ATB_LLM_RAZOR_ATTENTION_ENABLE	Whether to enable RA compression.	0: disabled 1: enabled	0
ATB_LLM_RAZOR_ATTENTION_ROPE	Whether to enable the Razor attention compression algorithm of RoPE.	0: disabled 1: enabled	0
ATB_LLM_TOKEN_IDS_SAVE_ENABLE	Whether to save token information.	0: no Other values: yes	0
ATB_LLM_TOKEN_IDS_SAVE_FOLDER	Folder for saving token information.	All values	None
ATB_PROFILING_ENABLE	Whether to collect performance profiling data.	1: yes Other values or None: no	None
ATB_USE_TILING_COPY_STREAM	Whether to enable dual-stream.	1: enabled Other values or None: disabled	None
BIND_CPU	Whether to bind processes running on NPUs to cores based on the CPU affinity.	None or 1: enabled Other values: disabled	None
CPU_BINDING_NUM	Number of cores bound to each device.	[0, Number of CPU cores/Number of devices on NUMA]	None
HCCL_DETERMINISTIC	Deterministic computation of HCCL communication. You are advised to enable this function for multi-server inference.	false: disabled true: enabled	Generally, the value is true and depends on the model.
IS_ALIBI_MASK_FREE	Whether to support Speculate.	1: enabled Other values or None: disabled	None
LCCL_DETERMINISTIC	Deterministic computation of LCCL communication.	0: disabled 1: enabled	Generally, the value is 1 and depends on the model.
LONG_SEQ_ENABLE	Whether to enable the long sequence feature.	1: yes Other values or None: no	None
MINDIE_ACLNN_CACHE_GLOBAL_COUNT	Number of global caches of aclExecutor and the corresponding aclTensor in Plugin Op.	[0, 100)	16
PROFILING_FILEPATH	Path of the profiling files. By default, the profiling files are saved in the profiling folder in the current path.	N/A	N/A
PROFILING_LEVEL	ProfilerLevel.	Level0 Level1 Level2 Level_none	Level0
RESERVED_MEMORY_GB	Size of the graphics memory pool that is dynamically allocated during model running.	[0, 64)	3
MINDIE_ENABLE_EXPERT_HOTPOT_GATHER	Whether to collect expert hotspot information for load balancing.	1: enabled Other values or None: disabled	None
MINDIE_EXPERT_HOTPOT_DUMP_PATH	Path for storing expert hotspot information for load balancing.	All values	None
REMOVE_GENERATION_CONFIG_DICT	After this function is enabled, the model postprocessing parameters are set to the default values (valid only for LLMs).	1: enabled Other values or None: disabled	None
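
Table 4 uses two different switch conventions, and they differ in what an unset variable means. The sketch below contrasts them. Both function names are hypothetical, and the exact string comparison is an assumption, since the actual parsing in ATB Models is not described here.

```python
import os


def enabled_unless_overridden(name, env=os.environ):
    """'None or 1: enabled; other values: disabled' (for example BIND_CPU)."""
    value = env.get(name)
    return value is None or value == "1"


def enabled_only_if_set(name, env=os.environ):
    """'1: enabled; other values or None: disabled' (for example LONG_SEQ_ENABLE)."""
    return env.get(name) == "1"
```

With an empty environment, the first convention reports the feature as enabled and the second reports it as disabled, so an unset BIND_CPU still binds cores while an unset LONG_SEQ_ENABLE leaves the long sequence feature off.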

For details about log-related environment variables, see Table 5.

**Table 5** Log-related environment variables
Environment Variable	Description	Value Range	Default Value
MINDIE_LOG_LEVEL	Log level.	DEBUG INFO WARN ERROR CRITICAL	INFO
MINDIE_LOG_PATH	Path for storing logs.	N/A	"mindie/log/debug"
MINDIE_LOG_ROTATE	Size and number of logs to be rotated.	-fs: size of each log file, in MB. The value range is [1, 500]. -r: number of log files that can be written by each process. The value range is [1, 64]. Example: export MINDIE_LOG_ROTATE="-fs 40 -r 2"	-fs: 20 -r: 10 NOTE: PYTHON_LOG_MAXSIZE is compatible with MINDIE_LOG_ROTATE, and PYTHON_LOG_MAXSIZE has a higher priority than the -fs parameter in MINDIE_LOG_ROTATE.
MINDIE_LOG_TO_FILE	Whether to save logs to files. The value 1 indicates that logs are saved to files.	{0, 1, true, false}	true
MINDIE_LOG_TO_STDOUT	Whether to print logs. The value 1 indicates that logs are printed.	{0, 1, true, false}	false
MINDIE_LOG_VERBOSE	Whether to add optional content to logs.	{0, 1, true, false}	true
PYTHON_LOG_MAXSIZE	Maximum size of a single ATB Python log file, in bytes.	[0, 524288000]	None NOTE: PYTHON_LOG_MAXSIZE will be retired at the end of 2026. It remains compatible with MINDIE_LOG_ROTATE and takes priority over the -fs parameter in MINDIE_LOG_ROTATE. If PYTHON_LOG_MAXSIZE is not set, the -fs parameter in MINDIE_LOG_ROTATE is used.
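
The interaction between MINDIE_LOG_ROTATE and PYTHON_LOG_MAXSIZE can be sketched as follows. This is an illustrative parser, not the MindIE one: the function name parse_log_rotate is hypothetical, and 1 MB is assumed to be 1024 * 1024 bytes, which matches the PYTHON_LOG_MAXSIZE upper bound of 524288000 bytes for 500 MB.

```python
import os
import shlex

MB = 1024 * 1024  # assumption: 500 MB corresponds to 524288000 bytes


def parse_log_rotate(env=os.environ):
    """Return (max_file_bytes, file_count) from the log rotation variables."""
    size_mb, count = 20, 10  # documented defaults for -fs and -r
    tokens = shlex.split(env.get("MINDIE_LOG_ROTATE", ""))
    # Options arrive as "flag value" pairs, for example "-fs 40 -r 2".
    for flag, value in zip(tokens[0::2], tokens[1::2]):
        if flag == "-fs":
            size_mb = int(value)
        elif flag == "-r":
            count = int(value)

    if not 1 <= size_mb <= 500:
        raise ValueError(f"-fs out of range [1, 500]: {size_mb}")
    if not 1 <= count <= 64:
        raise ValueError(f"-r out of range [1, 64]: {count}")

    max_bytes = size_mb * MB
    legacy = env.get("PYTHON_LOG_MAXSIZE")
    if legacy:
        # PYTHON_LOG_MAXSIZE (in bytes) has priority over -fs.
        max_bytes = int(legacy)
    return max_bytes, count
```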

For details about the ATB environment variables, see Table 6.

**Table 6** ATB environment variables
Environment Variable	Description	Value Range	Default Value
ASCEND_LAUNCH_BLOCKING	Whether to enable synchronous operator delivery, which is used in the debugging scenario.	0: disabled 1: enabled	0
ASCEND_RT_VISIBLE_DEVICES	Device ID.	[0, Device ID] Example: [0, 1, 2, ...]	N/A
ATB_HOME_PATH	Environment variable of the ATB path. There is no default value, and this parameter is required.	N/A	N/A
ATB_OPSRUNNER_KERNEL_CACHE_GLOABL_COUNT	Number of slots for the global kernel cache. If the number of slots is increased, the cache hit ratio increases, but the retrieval efficiency decreases. If the number of slots is reduced, the retrieval efficiency increases, but the cache hit ratio decreases.	[1, 1024]	16
ATB_OPSRUNNER_KERNEL_CACHE_LOCAL_COUNT	Number of slots for the local kernel cache. If the number of slots is increased, the cache hit ratio increases, but the retrieval efficiency decreases. If the number of slots is reduced, the retrieval efficiency increases, but the cache hit ratio decreases.	[1, 1024]	1
ATB_WORKSPACE_MEM_ALLOC_GLOBAL	Whether to use the global intermediate tensor memory allocation algorithm. After this algorithm is used, the size of the intermediate tensor memory is computed and allocated.	0: disabled 1: enabled	1

For more ATB environment variables, see "Environment Variable Reference" in CANN ATB Development Guide.

For more PyTorch environment variables such as INF_NAN_MODE_ENABLE, TASK_QUEUE_ENABLE, and RANK_TABLE_FILE, see "INF_NAN_MODE_ENABLE" in Environment Variables.
When BIND_CPU is enabled, execute_command is called to run the following commands:
execute_command(["npu-smi", "info", "-i", f"{npu_id}", "-t", "memory"]).split("\n")[1:]
execute_command(["npu-smi", "info", "-i", f"{npu_id}", "-t", "usages"]).split("\n")[1:]
execute_command(["npu-smi", "info", "-m"]).strip().split("\n")[1:]
execute_command(["npu-smi", "info", "-t", "board", "-i", f"{device_info.npu_id}", "-c", f"{device_info.chip_id}"]).strip().split("\n")
execute_command(["lspci", "-s", f"{pcie_no}", "-vvv"]).split("\n")
execute_command(["lscpu"]).split("\n")
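
The calls above pass an argument list and then split the returned text into lines, skipping the header line with [1:] where needed. The actual execute_command helper is not shown in this document. The following is a hypothetical stand-in built on subprocess.run that only illustrates the calling pattern.

```python
import subprocess
import sys


def execute_command(cmd_list):
    """Hypothetical stand-in: run a command without a shell and return its stdout."""
    result = subprocess.run(cmd_list, capture_output=True, text=True, check=True)
    return result.stdout


# Demonstration with a portable command instead of npu-smi: the first output
# line plays the role of a table header and is dropped by the [1:] slice.
demo_cmd = [sys.executable, "-c", "print('header'); print('row')"]
rows = execute_command(demo_cmd).strip().split("\n")[1:]
```

Passing the command as a list avoids shell interpretation of values such as npu_id or pcie_no.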
