Glossary

Table 1 Glossary. Each entry lists the glossary term, its full form, and a description.

A

AI

Artificial Intelligence

Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans.

AIPP

Artificial Intelligence Pre-Processing

AI pre-processing (AIPP) implements AI Core–based image preprocessing including image resizing, color space conversion (CSC), and mean subtraction and factor multiplication (for pixel modification), prior to model inference.
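
The mean subtraction and factor multiplication step can be sketched as simple per-channel arithmetic. This is an illustrative sketch of the math only, not the actual AIPP API; the function name and parameters are hypothetical.

```python
# Illustrative sketch of AIPP-style pixel normalization (not the real AIPP API):
# for each channel, output = (pixel - mean) * factor.
def normalize_pixel(pixel, means, factors):
    """Apply mean subtraction and factor multiplication per channel."""
    return [(p - m) * f for p, m, f in zip(pixel, means, factors)]

# Example: an RGB pixel normalized with per-channel means and a 1/255 scale factor.
rgb = [128, 64, 255]
out = normalize_pixel(rgb, means=[104.0, 117.0, 123.0], factors=[1 / 255] * 3)
```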

Ascend EP

Ascend Endpoint

Ascend Endpoint (Ascend EP) refers to Ascend AI Processors that serve as secondary devices, for example, PCIe accelerator cards. They work with primary devices (x86 or Arm servers) for purposes such as inference, training, and image recognition.

Ascend RC

Ascend Root Complex

Ascend Root Complex (Ascend RC) refers to Ascend AI Processors that serve as primary devices, for example, the Atlas 200 DK. They provide the host control function and are mainly applicable to mobile devices.

AscendCL

Ascend Computing Language

Ascend Computing Language (AscendCL) provides a collection of C APIs for users to develop deep neural network (DNN) apps for target recognition and image classification, ranging from device, context, stream, and memory management, to model and operator loading and execution, as well as media data processing.

ASHA

Asynchronous Successive Halving Algorithm

Asynchronous Successive Halving Algorithm (ASHA) is a hyperparameter optimization algorithm based on dynamic resource allocation. The basic idea is to train many hyperparameter configurations in parallel, each with a small number of training iterations per round. It evaluates and ranks all configurations, and applies early stopping to those ranked in the lower half. The next round of evaluation is then performed on the remaining configurations, and the halving repeats until the optimization goal is achieved.
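
The halving idea above can be sketched in a few lines. ASHA itself runs asynchronously; this simplified synchronous sketch (with a toy scoring function, not a real training loop) only shows how the candidate pool shrinks by half each round.

```python
def successive_halving(configs, evaluate, rounds=3):
    """Keep the better-scoring half of the configurations each round.
    ASHA applies this halving asynchronously; this sketch is synchronous."""
    survivors = list(configs)
    for r in range(rounds):
        if len(survivors) <= 1:
            break
        # Evaluate every surviving configuration with this round's budget.
        scored = sorted(survivors, key=lambda c: evaluate(c, budget=2 ** r), reverse=True)
        # Early-stop the lower-ranked half.
        survivors = scored[: max(1, len(scored) // 2)]
    return survivors

# Toy example: "configurations" are learning rates scored by closeness to 0.1.
configs = [0.001, 0.01, 0.1, 0.5, 1.0, 0.05, 0.2, 0.02]
best = successive_halving(configs, lambda c, budget: -abs(c - 0.1))
```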

ATC

Ascend Tensor Compiler

Ascend Tensor Compiler (ATC)

  • Converts network models under open-source frameworks, such as Caffe and TensorFlow, into offline models supported by Ascend AI Processors. Implements operator scheduling tuning, weight data rearrangement, and memory usage tuning during model conversion.
  • Supports operator building.

AutoML

Automated Machine Learning

Automated machine learning (AutoML) refers to a series of automation algorithms covering feature extraction, model selection, and parameter optimization, enabling valuable models to be trained automatically.

B

BOHB

Bayesian Optimization and Hyperband

Bayesian Optimization and Hyperband (BOHB) combines the Hyperband algorithm with Bayesian optimization.

It uses Hyperband's ability to sample many configurations with a small budget to quickly and efficiently explore the hyperparameter search space and find promising configurations. It then uses the predictive power of Bayesian optimization to propose configurations close to the optimum.

BOSS

Bayesian Optimization via Sub-Sampling

Bayesian Optimization via Sub-Sampling (BOSS) is a general hyperparameter optimization algorithm based on Bayesian optimization. It enables efficient hyperparameter search under restricted computing resources.

BP Point

Backpropagation Point

The backpropagation point (BP Point) refers to the end position of the backward operators in the iteration trajectory of a training network.

C

CPU

Central Processing Unit

The central processing unit (CPU) is one of the main components of a computer, alongside internal memory and input/output devices. It interprets computer instructions and processes data for computer software.

D

DDR

Double Data Rate

In computing, a computer bus operating with double data rate (DDR) transfers data on both the rising and falling edges of the clock signal.
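Because data moves on both clock edges, the effective transfer rate is twice the clock frequency. A small illustrative calculation (the function name is ours, not from any DDR specification):

```python
def ddr_transfer_rate(clock_hz, bus_width_bits):
    """DDR transfers data on both clock edges, so the data rate is
    twice the clock frequency times the bus width (converted to bytes)."""
    transfers_per_second = 2 * clock_hz  # both rising and falling edges
    return transfers_per_second * bus_width_bits // 8  # bytes per second

# Example: a 64-bit bus clocked at 1600 MHz.
rate = ddr_transfer_rate(clock_hz=1_600_000_000, bus_width_bits=64)
```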

DiffThd

Difference Threshold

-

DSL

Domain-Specific Language

Domain-specific language (DSL) is an operator development method in which users only need to use DSL APIs to express the computation process. Operator scheduling, optimization, and compilation are then completed by existing APIs with minimal manual effort.

DVPP

Digital Vision Pre-Processing

Digital vision pre-processing (DVPP) provides operations such as decoding and scaling of videos and images in specific formats, and encodes and outputs processed videos and images.

F

FP Point

Forward Propagation Point

Forward propagation point (FP Point) refers to the start position of a forward operator in the iterative trajectory of a training network.

FpDiff

Floating-point Difference

-

G

GDB

GNU Debugger

The GNU Debugger (GDB) is a portable debugger that runs on many Unix-like systems.

GE

Graph Engine

Graph Engine (GE) provides a set of secure and easy-to-use APIs for graph/operator intermediate representation (IR). These APIs can be called to build a network model, and set graphs in the model, operators in the graphs, and attributes of the model and operators.

GPU

Graphics Processing Unit

Graphics processing unit (GPU) is a microprocessor that performs image and graphics computing on PCs, workstations, game consoles, and mobile devices such as tablets and smartphones.

H

HCCL

Huawei Collective Communication Library

Huawei Collective Communication Library (HCCL) provides high-performance collective communication between servers for training in deep learning.

HCCS

High Confidence Computing Systems

High Confidence Computing Systems (HCCS) provides high-performance inter-device data communication in multi-device scenarios.

HPO

Hyperparameter Optimization

Hyperparameter optimization (HPO) means using automatic algorithms to optimize hyperparameters, such as the learning rate, activation function, and optimizer, that cannot be optimized through training in the original machine learning or deep learning algorithm.

HWTS

Hardware Task Scheduler

A hardware task scheduler (HWTS) schedules AI Core tasks in hardware, reducing the scheduling latency.

I

IR

Intermediate Representation

An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive for further processing, such as optimization and translation.

J

JDK

Java Development Kit

The Java Development Kit (JDK) is a collection of Java-based software development tools.

K

KLD

Kullback-Leibler Divergence

Kullback-Leibler divergence (KLD) measures how much one probability distribution differs from another. Its value ranges from 0 to infinity: the smaller the KLD, the closer the approximate distribution is to the true distribution.
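
For discrete distributions, KLD is a short sum. A minimal sketch, assuming distributions given as probability lists:

```python
import math

def kl_divergence(p, q):
    """Discrete KL divergence D(P || Q) = sum_i p_i * log(p_i / q_i).
    It is 0 when the distributions match and grows as they diverge."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
identical = kl_divergence(p, [0.5, 0.5])  # 0.0: distributions match
skewed = kl_divergence(p, [0.9, 0.1])     # positive: Q is far from P
```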

L

L2 Cache

Level 2 Cache

Level 2 cache (L2 cache) is a shared second-level cache that is accessed before main memory.

LLC

Last Level Cache

The last level cache (LLC) refers to the shared highest-level cache, which is accessed before main memory.

M

msproftx

msprof Tool Extension

msprof tool extension (msproftx) is an extension to the MindStudio system tuning tool.

MTE1

Memory Transfer Engine 1

Memory transfer engine 1 (MTE1) copies data from the L1 buffer.

MTE2

Memory Transfer Engine 2

Memory transfer engine 2 (MTE2) copies data from the DDR or L2 buffer.

MTE3

Memory Transfer Engine 3

Memory transfer engine 3 (MTE3) copies data from the UB.

N

NAS

Neural Architecture Search

Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANNs). NAS has been used to design networks that are on par with or outperform hand-designed architectures. It can effectively reduce the cost of using and implementing neural networks.

NIC

Network Interface Controller

Network interface controller (NIC) is also known as network interface card, network adapter, LAN adapter, and other similar terms. It refers to a hardware component that connects a computer to a computer network.

NPU

Neural-Network Processing Unit

A neural-network processing unit (NPU) uses the data-driven parallel computing architecture and is capable of efficiently processing massive video and image multimedia data. It is dedicated to processing a large number of computing tasks in artificial intelligence applications.

O

OP

Operator

An operator in a neural network, such as ReLU, Conv, Pooling, Scale, or Softmax.

OPP

Operator Package

-

OS

Operating System

-

P

PCIe

Peripheral Component Interconnect Express

Peripheral Component Interconnect Express (PCIe) is a high-speed serial point-to-point, dual-channel, high-bandwidth transmission technology. Connected devices are allocated exclusive channels and do not share bus bandwidth. PCIe supports proactive power management, error reporting, peer-to-peer reliable transmission, hot swap, and quality of service (QoS).

PctRlt

Percent Result

-

PctThd

Percent Threshold

-

R

RateDiff

Rate Difference

-

RoCE

RDMA over Converged Ethernet

RDMA over Converged Ethernet (RoCE) is a network protocol that enables remote direct memory access (RDMA) over an Ethernet network. RDMA provides remote memory management and allows application memory on different servers to exchange data directly, without CPU intervention.

Runtime

-

Runtime runs in the application process space and provides applications with functions (specific to Ascend AI Processors) for managing memory, device, stream, and events, and executing kernels.

S

Sample-based

-

In sample-based mode, profiling collects AI Core profile data at fixed sampling intervals.

SDK

Software Development Kit

A software development kit (SDK) is typically a set of software development tools that allows the creation of applications for a certain software package, software framework, hardware platform, operating system, or similar development platform.

Step Trace

-

The step trace contains the start and end times of forward propagation, backpropagation, gradient update, and data augmentation.

T

Task-based

-

In task-based mode, profiling collects AI Core profile data on a per-task basis.

TBE

Tensor Boost Engine

Tensor Boost Engine (TBE) provides APIs for implementing operators using the Python language, and compiles and generates CCE operators.

Tensor

-

A tensor is the main data structure in TensorFlow programs. Tensors are N-dimensional (where N may be very large) and often take the form of a scalar, vector, or matrix. The elements of a tensor can be integer, floating-point, or string values.
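
The rank (number of dimensions) of a tensor distinguishes scalars, vectors, and matrices. A minimal sketch using plain nested lists rather than TensorFlow, with a helper function of our own naming:

```python
def rank(tensor):
    """Rank (number of dimensions) of a nested-list tensor:
    a scalar is rank 0, a vector rank 1, a matrix rank 2, and so on."""
    r = 0
    while isinstance(tensor, list):
        r += 1
        tensor = tensor[0] if tensor else None
    return r

scalar, vector, matrix = 3.0, [1.0, 2.0, 3.0], [[1, 2], [3, 4]]
ranks = (rank(scalar), rank(vector), rank(matrix))  # (0, 1, 2)
```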

TIK

Tensor Iterator Kernel

Tensor Iterator Kernel (TIK) is a dynamic programming framework based on Python. Developers can call the APIs (TIK DSL) provided by TIK to create custom operators in Python. The TIK compiler compiles the TIK DSL into the binary file adaptive to the Ascend AI Processor.

TransData

-

TransData is a format conversion operator.

TS

Task Scheduler

The task scheduler (TS) is used to distribute different kernels to the AI CPU or AI Core for execution.

V

Vector

-

Refers to vector operations.