Data Type Casting

Overview

Because float16 has inherently reduced precision, it is advisable to cast float16 inputs to float32 inside the operator implementation. Doing so improves the accuracy of the computed result, especially for complex computations.
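
The effect of float16 rounding on intermediate results can be illustrated on the host with plain Python. The following is a minimal sketch, not TBE code: the helpers to_f16 and to_f32 are hypothetical names that emulate IEEE round-to-nearest with the standard struct module, and the behavior of the actual hardware may differ.

```python
import struct

def to_f16(value):
    """Round a Python float to the nearest float16 value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

def to_f32(value):
    """Round a Python float to the nearest float32 value (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

# float16 keeps 11 significant bits, so not every integer above 2048 fits.
print(to_f16(2049.0))  # 2048.0
print(to_f32(2049.0))  # 2049.0

# Both inputs are exactly representable in float16.
x = to_f16(1.0 + 2.0 ** -10)
c = to_f16(1.0 + 2.0 ** -9)

# x * x - c with every intermediate rounded to float16.
low = to_f16(to_f16(x * x) - c)

# Same expression with float32 intermediates, cast to float16 only at the end.
high = to_f16(to_f32(to_f32(x * x) - c))

print(low)   # 0.0: the small term of x * x was rounded away
print(high)  # 2**-20: the float32 intermediate preserved it
```

In this sketch, the float16 path loses the result completely, while the float32 path still delivers a value that float16 can represent.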

Take the implementation of res = x * y as an example.

Given input tensors x and y of type float16, cast them to the float32 type before computation and cast the result data res back to the original data type float16 after computation. The following is the sample operator code:

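dtype = data_x.dtype
# Widen float16 inputs to float32 so that the multiplication runs at higher precision.
if dtype == "float16":
    data_x = tbe.dsl.cast_to(data_x, "float32")
    data_y = tbe.dsl.cast_to(data_y, "float32")
# Compute the product for every input type, not only for float16.
res = tbe.dsl.vmul(data_x, data_y)
# Cast the result back to the original data type for output.
if dtype == "float16":
    res = tbe.dsl.cast_to(res, "float16")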

In accuracy testing, the test data must likewise be converted to float32 before the computation.
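
One way to read this requirement is that the expected data for res = x * y is also computed through float32. The following is a minimal host-side sketch under that assumption: golden_mul, within_tolerance, to_f16, and to_f32 are hypothetical names, and the tolerance values are placeholders rather than values defined by this document.

```python
import struct

def to_f16(value):
    """Round a Python float to the nearest float16 value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

def to_f32(value):
    """Round a Python float to the nearest float32 value (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

def golden_mul(xs, ys):
    """Expected output: float16 inputs, float32 multiplication, float16 result."""
    expected = []
    for x, y in zip(xs, ys):
        x32 = to_f32(to_f16(x))  # float16 input widened to float32
        y32 = to_f32(to_f16(y))
        expected.append(to_f16(to_f32(x32 * y32)))  # narrowed for output
    return expected

def within_tolerance(actual, expected, rtol=1e-3, atol=1e-3):
    """Element-wise comparison; rtol and atol are assumed placeholder values."""
    return all(abs(a - e) <= atol + rtol * abs(e)
               for a, e in zip(actual, expected))

expected = golden_mul([1.5, 0.1], [2.0, 3.0])
print(expected[0])  # 3.0
```

The operator output read back from the device would then be passed as actual to within_tolerance together with expected.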

Restrictions

Pay attention to the following points when converting data types:

The output must be cast back to the original data type.
Casting from float16 to float32 degrades runtime performance. Therefore, keep the float16 data type if its computation precision is within the allowed range.
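
Whether float16 precision is within the allowed range can be estimated offline on representative data before deciding on the cast. The following is a minimal sketch, assuming a sum-of-products computation and a relative tolerance of 1e-3. All function names are hypothetical, and the rounding is emulated with the standard struct module rather than measured on the device.

```python
import struct

def to_f16(value):
    """Round a Python float to the nearest float16 value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

def to_f32(value):
    """Round a Python float to the nearest float32 value (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

def sum_of_products(xs, ys, rnd):
    """Accumulate x * y over float16 inputs, rounding each intermediate with rnd."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = rnd(acc + rnd(to_f16(x) * to_f16(y)))
    return acc

def float16_is_sufficient(xs, ys, rtol=1e-3):
    """Compare the float16-only path against the float32 path (rtol is assumed)."""
    low = sum_of_products(xs, ys, to_f16)
    high = to_f16(sum_of_products(xs, ys, to_f32))
    return abs(low - high) <= rtol * abs(high)

# Short computation: both paths agree, so the cast can be skipped.
print(float16_is_sufficient([1.0, 2.0], [3.0, 4.0]))

# Long accumulation: float16 rounding errors add up, so the cast is worth its cost.
print(float16_is_sufficient([0.1] * 1000, [1.0] * 1000))
```

A check of this kind only indicates a trend. The final decision should rest on accuracy tests of the operator itself.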
