LogSoftMax

Applicability

Product	Supported
Atlas A3 training products / Atlas A3 inference products	√
Atlas A2 training products / Atlas A2 inference products	√
Atlas 200I/500 A2 inference products	x
Atlas inference product's AI Core	√
Atlas inference product's Vector Core	x
Atlas training products	x

Function

Performs the LogSoftMax computation on the input tensor along its last axis, using the following formula:

$\mathrm{LogSoftMax}(x_i) = \log_{10}\left(\dfrac{\exp(x_i - \max(x))}{\sum_{j} \exp(x_j - \max(x))}\right)$

For ease of understanding, the following Python script expresses the formula, where src is the source operand (input) and dst, sum, and max are the destination operands (outputs).

      
    import numpy as np

    def log_softmax(src):
        # max and sum mirror the operand names; they shadow Python built-ins only inside this function.
        # Perform rowmax (taking the maximum value by row) along the last axis.
        max = np.max(src, axis=-1, keepdims=True)
        sub = src - max
        exp = np.exp(sub)
        # Perform rowsum (taking the sum by row) along the last axis.
        sum = np.sum(exp, axis=-1, keepdims=True)
        dst = exp / sum
        # The logarithm is base 10.
        dst = np.log10(dst)
        return dst, max, sum
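
As a quick sanity check of the script above, the following minimal sketch restates the same computation for a single row using only the Python standard library. The helper name `log_softmax_row` and the sample values are illustrative assumptions, not part of the API. It shows two properties that follow from the formula: raising 10 to each output recovers probabilities that sum to 1, and the base-10 result equals the natural-log form divided by ln 10.

```python
import math

def log_softmax_row(row):
    """Base-10 log-softmax of one row (hypothetical helper mirroring the script above)."""
    row_max = max(row)                             # rowmax
    exps = [math.exp(x - row_max) for x in row]    # sub, then exp
    row_sum = sum(exps)                            # rowsum
    dst = [math.log10(e / row_sum) for e in exps]  # div, then log10
    return dst, row_max, row_sum

src_row = [1.0, 2.0, 3.0]  # illustrative input
dst, row_max, row_sum = log_softmax_row(src_row)

# 10 ** dst recovers the softmax probabilities, which sum to 1.
probs = [10.0 ** d for d in dst]

# Equivalent natural-log form: (x - max - ln(sum)) / ln(10).
alt = [(x - row_max - math.log(row_sum)) / math.log(10.0) for x in src_row]
```

Because every probability is at most 1, every element of dst is less than or equal to 0.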

Principles

The following figure shows the internal algorithm of the LogSoftMax high-level API, using an input tensor of the float type in ND format with shape [m, k] as an example.

Figure 1 Diagram of the LogSoftMax algorithm

The computation process is divided into the following steps, all of which are performed on vectors:

reducemax: Compute the maximum value of each row of input x to obtain [m, 1]. The computation result is saved to the temporary space temp.
broadcast: Extend each row of the [m, 1] data in temp to one data block. For example, for the float type, [m, 1] is extended to [m, 8]. The result is output as max.
sub: Subtract max from all data of input x by row.
exp: Compute exp for all data after sub.
reducesum: Sum up each row of data after exp is performed to obtain [m, 1]. The computation result is saved to temp.
broadcast: Extend each row of the [m, 1] data in temp to one data block. For example, for the float type, [m, 1] is extended to [m, 8]. The result is output as sum.
div: Divide all data generated after exp by sum at each row.
log: Perform log10 computation on all data after div by row and output y.
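
The steps above can be sketched in plain Python for a float input of shape [m, k]. This is an illustrative model of the data flow only, not the on-device implementation: the function name `log_softmax_steps`, the constant `BLOCK_ELEMS = 8` (taken from the float example above, where [m, 1] is extended to [m, 8]), and the list-of-lists layout are all assumptions.

```python
import math

BLOCK_ELEMS = 8  # assumption: padded width for float, per the [m, 1] -> [m, 8] example

def broadcast(col):
    """Extend an [m, 1] column to [m, BLOCK_ELEMS] by repeating each row's value."""
    return [[row[0]] * BLOCK_ELEMS for row in col]

def log_softmax_steps(x):
    # reducemax: per-row maximum, shape [m, 1], held in temp
    temp = [[max(row)] for row in x]
    # broadcast: extend temp to [m, 8]; this is the max output
    max_out = broadcast(temp)
    # sub: subtract the row maximum from every element of the row
    sub = [[v - m_row[0] for v in row] for row, m_row in zip(x, max_out)]
    # exp: element-wise exponential
    exp = [[math.exp(v) for v in row] for row in sub]
    # reducesum: per-row sum, shape [m, 1], held in temp
    temp = [[sum(row)] for row in exp]
    # broadcast: extend temp to [m, 8]; this is the sum output
    sum_out = broadcast(temp)
    # div: divide each row by its sum
    div = [[v / s_row[0] for v in row] for row, s_row in zip(exp, sum_out)]
    # log: element-wise log10; this is the output y
    y = [[math.log10(v) for v in row] for row in div]
    return y, sum_out, max_out

x = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0]]  # illustrative [2, 4] input
y, sum_out, max_out = log_softmax_steps(x)
```

In this sketch y keeps the input shape [m, k], while max_out and sum_out have shape [m, 8]. In the second row all four inputs are equal, so each output is log10(1/4), approximately -0.602.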

Prototype

      
    template <typename T, bool isReuseSource = false, bool isDataFormatNZ = false>
    __aicore__ inline void LogSoftMax(const LocalTensor<T>& dst,
                                      const LocalTensor<T>& sum,
                                      const LocalTensor<T>& max,
                                      const LocalTensor<T>& src,
                                      const LocalTensor<uint8_t>& sharedTmpBuffer,
                                      const LogSoftMaxTiling& tiling,
                                      const SoftMaxShapeInfo& softmaxShapeInfo = {})

Because the internal implementation of this API involves complex mathematical computation, extra temporary space is required to store the intermediate variables generated during computation. Developers pass this temporary space through the sharedTmpBuffer input parameter. To obtain the size of the temporary space (BufferSize) to reserve, use the API provided in LogSoftMax Tiling.

Parameters

**Table 1** Template parameters
Parameter	Description
T	Data type of the operand. For the Atlas A3 training products / Atlas A3 inference products, the supported data types are half and float. For the Atlas A2 training products / Atlas A2 inference products, the supported data types are half and float. For the Atlas inference product's AI Core, the supported data types are half and float.
isReuseSource	Whether the source operand can be modified. This parameter is reserved. Pass the default value false.
isDataFormatNZ	Whether the source operand is in NZ format. The default value is false.

**Table 2** API parameters
Parameter	Input/Output	Description
dst	Output	Destination operand.