fp16_t
The APIs described in this section are reserved and may be changed or deprecated in the future. You do not need to pay attention to them.
| API Definition | Description |
|---|---|
| tagFp16(void) | Default constructor of fp16_t. Takes no parameters. |
| tagFp16(const T &value) | Constructor of fp16_t that takes a value of any data type T. |
| tagFp16(const bfloat16& value) | Constructor of fp16_t that takes a bfloat16 value. |
| tagFp16(const uint16_t &uiVal) | Constructor of fp16_t that takes a uint16_t value. |
| tagFp16(const tagFp16 &fp) | Copy constructor of fp16_t. Takes another fp16_t value. |
| float() | Conversion operator that converts fp16_t to float (fp32). |
| bfloat16() | Conversion operator that converts fp16_t to bfloat16. |
| double() | Conversion operator that converts fp16_t to double (fp64). |
| int8_t() | Conversion operator that converts fp16_t to int8_t. |
| uint8_t() | Conversion operator that converts fp16_t to uint8_t. |
| int16_t() | Conversion operator that converts fp16_t to int16_t. |
| uint16_t() | Conversion operator that converts fp16_t to uint16_t. |
| int32_t() | Conversion operator that converts fp16_t to int32_t. |
| uint32_t() | Conversion operator that converts fp16_t to uint32_t. |
| int64_t() | Conversion operator that converts fp16_t to int64_t. |
| uint64_t() | Conversion operator that converts fp16_t to uint64_t. |
| bool() | Conversion operator that converts fp16_t to bool. |
| IsInf() | Checks whether the fp16_t value is infinite. Returns 1 for positive infinity, -1 for negative infinity, and 0 otherwise. |
| toFloat() | Converts fp16_t to float (fp32). |
| toDouble() | Converts fp16_t to double (fp64). |
| toInt8() | Converts fp16_t to int8_t. |
| toUInt8() | Converts fp16_t to uint8_t. |
| toInt16() | Converts fp16_t to int16_t. |
| toUInt16() | Converts fp16_t to uint16_t. |
| toInt32() | Converts fp16_t to int32_t. |
| toUInt32() | Converts fp16_t to uint32_t. |
| ExtractFP16(const uint16_t &val, uint16_t *s, int16_t *e, uint16_t *m) | Extracts the sign (s), exponent (e), and mantissa (m) of the fp16_t value val. |
| ReverseMan(bool negative, T *man) | Computes the two's complement of the mantissa when the sign is negative. |
| MinMan(const int16_t &ea, T *ma, const int16_t &eb, T *mb) | Right-shifts the mantissa of whichever operand has the smaller exponent. |
| RightShift(T man, int16_t shift) | Shifts the mantissa bits right. |
| GetManSum(int16_t ea, const T &ma, int16_t eb, const T &mb) | Obtains the mantissa sum of two fp16_t numbers. The supported types (T) are uint16_t, uint32_t, and uint64_t. |
| ManRoundToNearest(bool bit0, bool bit1, bool bitLeft, T man, uint16_t shift = 0) | Rounds the mantissa of an fp16_t or float value to the nearest value. |
| GetManBitLength(T man) | Obtains the bit length of the mantissa of a floating-point number. |
| isnan(op::fp16_t value) | Checks whether the value is not a number (NaN). |
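
The sign, exponent, and mantissa handling described for ExtractFP16, IsInf, and isnan can be illustrated with a minimal standalone sketch. It assumes that fp16_t stores the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits), which this section does not state. All names below (HalfParts, SplitHalf, HalfIsInf, HalfIsNan, HalfToFloat, RunExamples) are hypothetical and are not part of the reserved API. The code works on raw uint16_t bit patterns and is not the library's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper type for illustration only; not part of op::fp16_t.
struct HalfParts {
    uint16_t sign;      // 0 = positive, 1 = negative
    int16_t exponent;   // raw (biased) 5-bit exponent field
    uint16_t mantissa;  // raw 10-bit fraction field
};

// Splits an assumed IEEE 754 binary16 bit pattern into its three fields.
HalfParts SplitHalf(uint16_t bits) {
    HalfParts p;
    p.sign = static_cast<uint16_t>((bits >> 15) & 0x1);
    p.exponent = static_cast<int16_t>((bits >> 10) & 0x1F);
    p.mantissa = static_cast<uint16_t>(bits & 0x3FF);
    return p;
}

// Returns 1 for +inf, -1 for -inf, and 0 otherwise,
// following the return convention described for IsInf().
int HalfIsInf(uint16_t bits) {
    HalfParts p = SplitHalf(bits);
    if (p.exponent == 0x1F && p.mantissa == 0) {
        return p.sign ? -1 : 1;
    }
    return 0;
}

// NaN: exponent field all ones and a non-zero mantissa.
bool HalfIsNan(uint16_t bits) {
    HalfParts p = SplitHalf(bits);
    return p.exponent == 0x1F && p.mantissa != 0;
}

// Converts the bit pattern to float (fp32); every binary16 value is exact in fp32.
float HalfToFloat(uint16_t bits) {
    HalfParts p = SplitHalf(bits);
    float sign = p.sign ? -1.0f : 1.0f;
    if (p.exponent == 0x1F) {
        return p.mantissa != 0 ? std::numeric_limits<float>::quiet_NaN()
                               : sign * std::numeric_limits<float>::infinity();
    }
    if (p.exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24.
        return sign * std::ldexp(static_cast<float>(p.mantissa), -24);
    }
    // Normal: (1024 + mantissa) * 2^(exponent - 15 - 10).
    return sign * std::ldexp(static_cast<float>(p.mantissa + 1024), p.exponent - 25);
}

// Small self-check using well-known binary16 bit patterns.
void RunExamples() {
    assert(HalfToFloat(0x3C00) == 1.0f);   // 0 01111 0000000000
    assert(HalfToFloat(0xC000) == -2.0f);  // 1 10000 0000000000
    assert(HalfIsInf(0x7C00) == 1);        // +inf
    assert(HalfIsInf(0xFC00) == -1);       // -inf
    assert(HalfIsNan(0x7E00));             // a NaN pattern
    assert(!HalfIsNan(0x7C00));            // infinity is not NaN
}
```

Calling RunExamples() from a test or from main exercises each helper. Splitting the fields first, as ExtractFP16 is described as doing, keeps the infinity and NaN checks down to simple comparisons on the exponent and mantissa.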