APIs

TF Adapter provides APIs for users to develop training or online inference scripts based on the deep learning framework TensorFlow 2.6.5.

Figure 1 TF Adapter

API path: {install_path}/python/site-packages/npu_device

Table 1 TF Adapter APIs

API

Description

npu.open

Registers an NPU device. Use it together with as_default to set the NPU as the default device.

npu.global_options

Returns the global singleton configuration object used to initialize the NPU device. Modify the options on this object to control how the NPU device is initialized. Call this API before npu.open.
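The ordering between npu.global_options and npu.open can be sketched as follows. This is a minimal illustration, not a reference script. It assumes the package is importable as npu_device (matching the API path above) and that precision_mode is among the options exposed by the configuration object; both the option name and the value shown should be checked against the option reference for your release. The helper name init_npu is hypothetical, and the import is guarded so the snippet still runs on a machine without the NPU software stack.

```python
# Sketch only: set global options first, then register the NPU.
try:
    import npu_device as npu  # located under {install_path}/python/site-packages
    NPU_AVAILABLE = True
except ImportError:
    npu = None
    NPU_AVAILABLE = False


def init_npu():
    """Configure initialization options, then open the NPU as the default device.

    Returns True if the NPU was registered, False if npu_device is missing.
    """
    if not NPU_AVAILABLE:
        return False
    # Assumption: precision_mode and this value are valid in your release.
    npu.global_options().precision_mode = "allow_mix_precision"
    # Options take effect at registration, so they are set before this call.
    npu.open().as_default()
    return True
```

Calling init_npu() once at the start of a training script, before any model or dataset is built, keeps the option changes ahead of device registration.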

npu.distribute.all_reduce

Performs an aggregation operation across workers in distributed NPU training.

npu.distribute.broadcast

Synchronizes variables between workers in distributed NPU training.

npu.distribute.npu_distributed_keras_optimizer_wrapper

Adds the NPU AllReduce operation to aggregate gradients across workers before the gradient update is performed. This API applies only to distributed training.

npu.distribute.shard_and_rebatch_dataset

Shards the dataset and splits the global batch size across workers in distributed NPU training.
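To make the idea concrete, the following pure-Python sketch shows one plausible way to shard a dataset and derive a per-worker batch size from a global batch size. It does not call npu.distribute.shard_and_rebatch_dataset and is not its implementation: the function name shard_and_rebatch, the round-robin sharding, and the even division of the batch size are all assumptions made for illustration.

```python
def shard_and_rebatch(samples, global_batch_size, num_workers, worker_index):
    """Return the batches a single worker would see (illustrative only)."""
    if global_batch_size % num_workers != 0:
        raise ValueError("global batch size must divide evenly across workers")
    per_worker_batch = global_batch_size // num_workers
    # Round-robin shard: worker i takes samples i, i + n, i + 2n, ...
    shard = samples[worker_index::num_workers]
    # Re-batch the shard using the smaller per-worker batch size.
    return [shard[i:i + per_worker_batch]
            for i in range(0, len(shard), per_worker_batch)]


data = list(range(16))
for rank in range(2):
    batches = shard_and_rebatch(data, global_batch_size=8,
                                num_workers=2, worker_index=rank)
    print(rank, batches)
```

With 2 workers and a global batch size of 8, each worker processes batches of 4, so one step across all workers still consumes 8 samples.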

npu.keep_dtype_scope

Specifies the operators that keep their original precision. If the precision of an operator in the original network model is not supported by the Ascend AI Processor, the system automatically computes with the high precision that the operator supports.

npu.set_npu_loop_size

Sets the number of iterations (or steps) per loop offloaded to the NPU.

npu.train.optimizer.NpuLossScaleOptimizer

Replacement for LossScaleOptimizer. When floating-point computation runs in saturation mode, an overflow/underflow on the NPU may not output Inf or NaN. Replace LossScaleOptimizer in the script with NpuLossScaleOptimizer to mask the differences in overflow/underflow detection.

npu.ops.gelu

Computes the Gaussian Error Linear Unit (GELU) activation function. Each input element x is multiplied by P(X <= x), where X follows the standard normal distribution N(0, 1).
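The definition above, x multiplied by P(X <= x), can be written out for a single scalar using only the Python standard library. This is a reference sketch of the mathematical definition, not the NPU kernel; whether npu.ops.gelu evaluates this exact form or an approximation of it is not stated here. The function names are chosen for this example.

```python
import math


def std_normal_cdf(x):
    """P(X <= x) for X ~ N(0, 1), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu(x):
    """Exact GELU for a scalar input: x * P(X <= x)."""
    return x * std_normal_cdf(x)


for v in (-1.0, 0.0, 1.0):
    print(v, round(gelu(v), 4))
```

Large positive inputs pass through almost unchanged because P(X <= x) approaches 1, while large negative inputs are pushed toward 0.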

set_device_sat_mode

Sets the process-level overflow mode for floating-point compute. Two overflow modes are supported: saturation mode and Inf/NaN mode.
  • Saturation mode: When overflow occurs during compute, the result is saturated to the floating-point extremum (±MAX).
  • Inf/NaN mode: Complies with IEEE 754 and outputs Inf or NaN as the compute result, as the standard defines.
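The difference between the two modes can be illustrated with a small host-side model. The sketch below does not call set_device_sat_mode and does not run on the NPU. It models the half-precision range only, and the function name finish_fp16 and the mode strings "saturation" and "inf_nan" are hypothetical labels chosen for this example.

```python
import math

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def finish_fp16(value, mode):
    """Model how an out-of-range float16 result is reported in each mode."""
    if math.isnan(value):
        # Simplification: this sketch models overflow only, so NaN passes through.
        return value
    if abs(value) <= FP16_MAX:
        return value  # in range: both modes agree
    if mode == "saturation":
        return math.copysign(FP16_MAX, value)  # clamp to +MAX or -MAX
    if mode == "inf_nan":
        return math.copysign(math.inf, value)  # IEEE 754 overflow to Inf
    raise ValueError(f"unknown mode: {mode}")


product = 60000.0 * 2.0  # exceeds the float16 range
print(finish_fp16(product, "saturation"))
print(finish_fp16(product, "inf_nan"))
```

In the saturated case the result is still a finite number, so a check such as math.isinf would not flag the overflow. This is the detection gap that the NpuLossScaleOptimizer entry in Table 1 refers to.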