Glossary

Table 1 Glossary. Each entry lists the glossary term, its full form, and a description.

A

AI

Artificial Intelligence

Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans.

AIPP

Artificial Intelligence Pre-Processing

AI pre-processing (AIPP) implements AI Core–based image preprocessing including image resizing, color space conversion (CSC), and mean subtraction and factor multiplication (for pixel modification), prior to model inference.
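
The mean subtraction and factor multiplication step can be sketched as simple per-channel arithmetic. This is an illustrative sketch of the math only, not the actual AIPP API; the function name and parameters are hypothetical.

```python
# Illustrative sketch of AIPP-style pixel normalization (not the real AIPP API):
# for each channel, output = (pixel - mean) * factor.
def normalize_pixel(pixel, means, factors):
    """Apply mean subtraction and factor multiplication per channel."""
    return [(p - m) * f for p, m, f in zip(pixel, means, factors)]

# Example: an RGB pixel normalized with per-channel means and a 1/255 scale factor.
rgb = [128, 64, 255]
out = normalize_pixel(rgb, means=[104.0, 117.0, 123.0], factors=[1 / 255] * 3)
```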

Ascend EP

Ascend Endpoint

Ascend Endpoint (Ascend EP) refers to Ascend AI Processors that serve as secondary devices, for example, PCIe accelerator cards. They work with primary devices (x86 or Arm servers) for purposes such as inference, training, and image recognition.

Ascend RC

Ascend Root Complex

Ascend Root Complex (Ascend RC) refers to Ascend AI Processors that serve as primary devices, for example, the Atlas 200 DK. They provide the host control function and are mainly applicable to mobile devices.

AscendCL

Ascend Computing Language

Ascend Computing Language (AscendCL) provides a collection of C APIs for users to develop deep neural network (DNN) apps for target recognition and image classification, ranging from device, context, stream, and memory management, to model and operator loading and execution, as well as media data processing.

ASHA

Asynchronous Successive Halving Algorithm

Asynchronous Successive Halving Algorithm (ASHA) is a hyperparameter optimization algorithm based on dynamic resource allocation. The basic idea is to train many hyperparameter configurations in parallel, each with a small number of training iterations per round. It evaluates and ranks all configurations, and applies early stopping to those ranked in the lower half. The next round of evaluation is then performed on the remaining configurations, and the halving repeats until the optimization goal is achieved.
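
The halving idea above can be sketched in a few lines. ASHA itself runs asynchronously; this simplified synchronous sketch (with a toy scoring function, not a real training loop) only shows how the candidate pool shrinks by half each round.

```python
def successive_halving(configs, evaluate, rounds=3):
    """Keep the better-scoring half of the configurations each round.
    ASHA applies this halving asynchronously; this sketch is synchronous."""
    survivors = list(configs)
    for r in range(rounds):
        if len(survivors) <= 1:
            break
        # Evaluate every surviving configuration with this round's budget.
        scored = sorted(survivors, key=lambda c: evaluate(c, budget=2 ** r), reverse=True)
        # Early-stop the lower-ranked half.
        survivors = scored[: max(1, len(scored) // 2)]
    return survivors

# Toy example: "configurations" are learning rates scored by closeness to 0.1.
configs = [0.001, 0.01, 0.1, 0.5, 1.0, 0.05, 0.2, 0.02]
best = successive_halving(configs, lambda c, budget: -abs(c - 0.1))
```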

ATC

Ascend Tensor Compiler

Ascend Tensor Compiler (ATC)

  • Converts network models under open-source frameworks, such as Caffe and TensorFlow, into offline models supported by Ascend AI Processors. Implements operator scheduling tuning, weight data rearrangement, and memory usage tuning during model conversion.
  • Supports operator building.

AutoML

Automated Machine Learning

Automated machine learning (AutoML) refers to a series of automation algorithms covering feature extraction, model selection, and parameter optimization, enabling valuable models to be trained automatically.

B

BOHB

Bayesian Optimization and Hyperband

Bayesian Optimization and Hyperband (BOHB) combines the Hyperband algorithm with Bayesian optimization.

It uses Hyperband's ability to sample many configurations with a small budget to quickly and efficiently explore the hyperparameter search space and find promising configurations. It then uses the predictive power of Bayesian optimization to propose configurations close to the optimum.

BOSS

Bayesian Optimization via Sub-Sampling

Bayesian Optimization via Sub-Sampling (BOSS) is a general hyperparameter optimization algorithm based on Bayesian optimization. It enables efficient hyperparameter search under restricted computing resources.

BP Point

Backpropagation Point

The backpropagation point (BP Point) refers to the end position of the backward operators in the iteration trajectory of a training network.

C

CPU

Central Processing Unit

The central processing unit (CPU) is one of the main components of a computer, alongside internal memory and input/output devices. It interprets computer instructions and processes data for computer software.

D

DDR

Double Data Rate

In computing, a computer bus operating with double data rate (DDR) transfers data on both the rising and falling edges of the clock signal.
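Because data moves on both clock edges, the effective transfer rate is twice the clock frequency. A small illustrative calculation (the function name is ours, not from any DDR specification):

```python
def ddr_transfer_rate(clock_hz, bus_width_bits):
    """DDR transfers data on both clock edges, so the data rate is
    twice the clock frequency times the bus width (converted to bytes)."""
    transfers_per_second = 2 * clock_hz  # both rising and falling edges
    return transfers_per_second * bus_width_bits // 8  # bytes per second

# Example: a 64-bit bus clocked at 1600 MHz.
rate = ddr_transfer_rate(clock_hz=1_600_000_000, bus_width_bits=64)
```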

DiffThd

Difference Threshold

-

DSL

Domain-Specific Language

Domain-specific language (DSL) is an operator development method in which users only need to use DSL APIs to express the computation process. Operator scheduling, optimization, and compilation are then completed by existing APIs with minimal manual effort.

DVPP

Digital Vision Pre-Processing

Digital vision pre-processing (DVPP) provides operations such as decoding and scaling of videos and images in specific formats, and encodes and outputs processed videos and images.

F

FP Point

Forward Propagation Point

Forward propagation point (FP Point) refers to the start position of a forward operator in the iterative trajectory of a training network.

FpDiff

Floating-point Difference

-

G

GDB

GNU Debugger

The GNU Debugger (GDB) is a portable debugger that runs on many Unix-like systems.

GE

Graph Engine

Graph Engine (GE) provides a set of secure and easy-to-use APIs for graph/operator intermediate representation (IR). These APIs can be called to build a network model, and set graphs in the model, operators in the graphs, and attributes of the model and operators.

GPU

Graphics Processing Unit

Graphics processing unit (GPU) is a microprocessor that performs image and graphics computing on PCs, workstations, game consoles, and mobile devices such as tablets and smartphones.

H

HCCL

Huawei Collective Communication Library

Huawei Collective Communication Library (HCCL) provides high-performance collective communication between servers for training in deep learning.

HCCS

High Confidence Computing Systems

High Confidence Computing Systems (HCCS) provides high-performance inter-device data communication in multi-device scenarios.

HPO

Hyperparameter Optimization

Hyperparameter optimization (HPO) means using automatic algorithms to optimize hyperparameters, such as the learning rate, activation function, and optimizer, that cannot be optimized through training in the original machine learning or deep learning algorithm.

HWTS

Hardware Task Scheduler

A hardware task scheduler (HWTS) schedules AI Core tasks in hardware, reducing the scheduling latency.

I

IR

Intermediate Representation

An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive for further processing, such as optimization and translation.

J

JDK

Java Development Kit

The Java Development Kit (JDK) is a collection of Java-based software development tools.

K

KLD

Kullback-Leibler Divergence

Kullback-Leibler divergence (KLD) measures how much one probability distribution differs from another. Its value ranges from 0 to infinity: the smaller the KLD, the closer the approximate distribution is to the true distribution.
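
For discrete distributions, KLD is a short sum. A minimal sketch, assuming distributions given as probability lists:

```python
import math

def kl_divergence(p, q):
    """Discrete KL divergence D(P || Q) = sum_i p_i * log(p_i / q_i).
    It is 0 when the distributions match and grows as they diverge."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
identical = kl_divergence(p, [0.5, 0.5])  # 0.0: distributions match
skewed = kl_divergence(p, [0.9, 0.1])     # positive: Q is far from P
```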

L

L2 Cache

Level 2 Cache

Level 2 cache (L2 cache) is a shared second-level cache that is accessed before main memory.

LLC

Last Level Cache

The last level cache (LLC) refers to the shared highest-level cache, which is accessed before main memory.

M

msproftx

msprof Tool Extension

msprof tool extension (msproftx) is an extension to the MindStudio system tuning tool.

MTE1

Memory Transfer Engine 1

Memory transfer engine 1 (MTE1) copies data from the L1 buffer.

MTE2

Memory Transfer Engine 2

Memory transfer engine 2 (MTE2) copies data from the DDR or L2 buffer.

MTE3

Memory Transfer Engine 3

Memory transfer engine 3 (MTE3) copies data from the UB.

N

NAS

Neural Architecture Search

Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANNs). NAS has been used to design networks that are on par with or outperform hand-designed architectures. It can effectively reduce the cost of using and implementing neural networks.

NIC

Network Interface Controller

Network interface controller (NIC) is also known as network interface card, network adapter, LAN adapter, and other similar terms. It refers to a hardware component that connects a computer to a computer network.

NPU

Neural-Network Processing Unit

A neural-network processing unit (NPU) uses the data-driven parallel computing architecture and is capable of efficiently processing massive video and image multimedia data. It is dedicated to processing a large number of computing tasks in artificial intelligence applications.

O

OP

Operator

An operator in a neural network, such as ReLU, Conv, Pooling, Scale, or Softmax.

OPP

Operator Package

-

OS

Operating System

-

P

PCIe

Peripheral Component Interconnect Express

Peripheral Component Interconnect Express (PCIe) is a high-speed serial point-to-point, dual-channel, high-bandwidth transmission technology. Connected devices are allocated exclusive channels and do not share bus bandwidth. PCIe supports proactive power management, error reporting, peer-to-peer reliable transmission, hot swap, and quality of service (QoS).

PctRlt

Percent Result

-

PctThd

Percent Threshold

-

R

RateDiff

Rate Difference

-

RoCE

RDMA over Converged Ethernet

RDMA over Converged Ethernet (RoCE) is a network protocol that enables remote direct memory access (RDMA) over an Ethernet network. RDMA provides remote memory management and allows application memory on different servers to exchange data directly, without CPU intervention.

Runtime

-

Runtime runs in the application process space and provides applications with functions (specific to Ascend AI Processors) for managing memory, device, stream, and events, and executing kernels.

S

Sample-based

-

In sample-based mode, profiling collects AI Core profile data at fixed sampling intervals.

SDK

Software Development Kit

A software development kit (SDK) is typically a set of software development tools that allows the creation of applications for a certain software package, software framework, hardware platform, operating system, or similar development platform.

Step Trace

-

The step trace contains the start and end times of forward propagation, backpropagation, gradient update, and data augmentation.

T

Task-based

-

In task-based mode, profiling collects AI Core profile data on a per-task basis.

TBE

Tensor Boost Engine

Tensor Boost Engine (TBE) provides APIs for implementing operators using the Python language, and compiles and generates CCE operators.

Tensor

-

A tensor is the main data structure in TensorFlow programs. Tensors are N-dimensional (where N may be very large) and often take the form of a scalar, vector, or matrix. The elements of a tensor can be integer, floating-point, or string values.
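
The rank (number of dimensions) of a tensor distinguishes scalars, vectors, and matrices. A minimal sketch using plain nested lists rather than TensorFlow, with a helper function of our own naming:

```python
def rank(tensor):
    """Rank (number of dimensions) of a nested-list tensor:
    a scalar is rank 0, a vector rank 1, a matrix rank 2, and so on."""
    r = 0
    while isinstance(tensor, list):
        r += 1
        tensor = tensor[0] if tensor else None
    return r

scalar, vector, matrix = 3.0, [1.0, 2.0, 3.0], [[1, 2], [3, 4]]
ranks = (rank(scalar), rank(vector), rank(matrix))  # (0, 1, 2)
```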

TIK

Tensor Iterator Kernel

Tensor Iterator Kernel (TIK) is a dynamic programming framework based on Python. Developers can call the APIs (TIK DSL) provided by TIK to create custom operators in Python. The TIK compiler compiles the TIK DSL into the binary file adaptive to the Ascend AI Processor.

TransData

-

TransData is a format conversion operator.

TS

Task Scheduler

The task scheduler (TS) is used to distribute different kernels to the AI CPU or AI Core for execution.

V

Vector

-

Refers to vector operations.