Instruction Accuracy Optimization

Instruction accuracy optimization concerns only the TBE DSL development mode.

You can alter the formula to avoid calling APIs with low accuracy. This section uses the vexp instruction and the besseli0e operator as an example to illustrate this idea. The besseli0e function is an even function, as shown in the following figure.

Figure 1 Diagram of the besseli0e function

Therefore, only the (0, +∞) range needs to be considered. Two formulas can be used for fitting in the (0, 3.75) and (3.75, +∞) ranges with piecewise-defined functions. The following uses the formula in the (0, 3.75) range as an example.

Given coefficient

For and , the formula of the besseli0e operator is as follows:

According to the test result, the relative error of some points is greater than 1‰. Use Octave to analyze the test data in a visualized manner and locate problems. The following figure shows all the points whose precision does not meet the requirement in the test data.

Figure 2 Visualized test result

The points with a relative error greater than 1‰ are clearly located near the 0 and 3.75 boundaries, that is, in the ranges (0, 0.2) and (3.35, 3.75). However, when the relative errors of all points in the domain are plotted for further analysis, the ranges where the relative error is close to 1‰ turn out to be (0.7, 1) and (2.7, 3), which do not coincide with the ranges where the failing points are actually located. Therefore, the accuracy problem is not caused by the formula itself.
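The check for points whose relative error exceeds 1‰ can be sketched in plain Python. This is only an illustration, not the TBE test flow: the names `to_fp16` and `find_failures` and the sample data are hypothetical, the "operator output" is a float16-rounded reference value, and one error is injected on purpose.

```python
import math
import struct

REL_TOL = 1e-3  # 1 per mille, the accuracy requirement quoted in the text


def to_fp16(value):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def relative_error(actual, expected):
    """Relative error of `actual` with respect to a nonzero `expected`."""
    return abs(actual - expected) / abs(expected)


def find_failures(xs, actual, expected, tol=REL_TOL):
    """Return (x, relative error) for every point whose error exceeds `tol`."""
    return [
        (x, relative_error(a, e))
        for x, a, e in zip(xs, actual, expected)
        if relative_error(a, e) > tol
    ]


# Hypothetical data: float16-rounded e^-x stands in for the operator output.
xs = [i * 0.25 for i in range(1, 15)]
expected = [math.exp(-x) for x in xs]
actual = [to_fp16(e) for e in expected]
actual[3] *= 1.002  # inject a 2 per mille error at x = 1.0 (illustration only)

failures = find_failures(xs, actual, expected)
for x, err in failures:
    print(f"x = {x:.2f}: relative error {err:.2e}")
```

Plotting the returned x values against the full input domain then shows at a glance whether the failures cluster in specific ranges.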

Next, analyze the ranges (0, 0.2) and (3.35, 3.75). The computed results for most x-coordinates in these ranges are small, and the abnormal points whose relative error is greater than 1‰ are distributed across the entire ranges, as shown in the following figures.

Figure 3 Visualized analysis of the (0, 0.2) range

Figure 4 Visualized analysis of the (3.35, 3.75) range

Through visualized analysis, we can reach the following preliminary conclusions:

The points with high relative error are located in the ranges (0, 0.2) and (3.35, 3.75).
The actual values of besseli0e are generally small, and the abnormal values are small.
The points with high relative error are abnormal points, which are irrelevant to the fitting formula of besseli0e.

Based on the preliminary conclusions, the accuracy problem is likely caused by the vexp instruction. However, we still need to analyze the intermediate results of the compute process to locate the specific step. Specifically, return the intermediate result of the operator, modify the comparison data generated using NumPy accordingly, and perform ST to verify the correctness of each step until the faulty step of the compute process is found. Intermediate errors can accumulate or attenuate as the compute process proceeds. For the besseli0e operator, pay attention to the points that yield a small compute result with a relative error greater than 0.08%. The following figures show the relative error distribution before and after e^x/e^x with the vexp instruction.

Figure 5 Relative error of e^x after vexp compute process

Figure 6 Relative error of e^x/e^x after vexp compute process

It can now be determined that the relative error is caused by the vexp instruction. The result of the vexp instruction is relatively large while the result of besseli0e is relatively small, so the error of some points in the ranges (0, 0.2) and (3.35, 3.75) reaches about 1‰.
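The step-by-step localization used to reach this conclusion can be sketched as follows. This is a minimal illustration under stated assumptions: the three-step pipeline in `run_pipeline` is hypothetical and is not the real besseli0e compute process, and `biased_exp` is an invented stand-in with a 0.09% systematic error, not a model of vexp. Only the idea of comparing every intermediate result with a reference is taken from the text.

```python
import math

STEP_TOL = 8e-4  # 0.08 %, the per-step threshold mentioned in the text


def max_relative_error(actual, expected):
    """Largest relative error over paired values (expected must be nonzero)."""
    return max(abs(a - e) / abs(e) for a, e in zip(actual, expected))


def first_failing_step(actual_steps, expected_steps, tol=STEP_TOL):
    """Walk the steps in order and return the first one that exceeds `tol`."""
    for name in expected_steps:
        worst = max_relative_error(actual_steps[name], expected_steps[name])
        print(f"{name}: max relative error {worst:.2e}")
        if worst > tol:
            return name
    return None


def run_pipeline(xs, exp_fn):
    """Hypothetical pipeline that records every intermediate result."""
    steps = {}
    steps["abs"] = [abs(x) for x in xs]
    steps["exp"] = [exp_fn(-a) for a in steps["abs"]]
    steps["scaled"] = [0.5 * e for e in steps["exp"]]
    return steps


def biased_exp(v):
    # Assumption for illustration: an exp with a 0.09 % systematic error.
    return math.exp(v) * (1.0 + 9e-4)


xs = [-0.15, -0.05, 0.05, 0.15]
expected_steps = run_pipeline(xs, math.exp)   # reference results
actual_steps = run_pipeline(xs, biased_exp)   # "operator" under test
culprit = first_failing_step(actual_steps, expected_steps)
print("first failing step:", culprit)
```

Stopping at the first failing step matters because later steps inherit the error, so their own mismatch says nothing about where it originated.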

For the accuracy problem caused by a single vexp instruction, we can consider fitting e^x by Taylor series expansion. Taylor series expansion at x = 0 uses a simple formula. Perform Taylor series expansion on the low accuracy ranges (0, 0.2) and (3.35, 3.75) to test the accuracy after expansion, as shown in the following figures.

The accuracy outside the range (–2, +2) clearly does not meet the requirement, so the range (3.35, 3.75) cannot be handled directly by the Taylor series expanded at x = 0. Further analysis of the range (0, 0.2) shows that the relative error there is within 1‰. Add the e^x implemented by Taylor series expansion to the besseli0e operator for verification. The relative error is shown in the following figure.
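The dependence of the Taylor accuracy on the distance from the expansion point can be reproduced with a short sketch. The polynomial degree is an assumption (the source does not state how many terms were used), and `exp_taylor` and `max_relative_error` are hypothetical helper names.

```python
import math

REL_TOL = 1e-3  # 1 per mille accuracy requirement from the text
DEGREE = 5      # assumed polynomial degree; the source does not state one


def exp_taylor(x, degree=DEGREE):
    """Taylor polynomial of e^x expanded at x = 0."""
    term, total = 1.0, 1.0
    for k in range(1, degree + 1):
        term *= x / k      # term is now x^k / k!
        total += term
    return total


def max_relative_error(lo, hi, samples=201):
    """Largest relative error of exp_taylor on evenly spaced points in [lo, hi]."""
    worst = 0.0
    for i in range(samples):
        x = lo + (hi - lo) * i / (samples - 1)
        exact = math.exp(x)
        worst = max(worst, abs(exp_taylor(x) - exact) / exact)
    return worst


for lo, hi in [(0.0, 0.2), (3.35, 3.75)]:
    err = max_relative_error(lo, hi)
    status = "meets" if err <= REL_TOL else "misses"
    print(f"({lo}, {hi}): max relative error {err:.2e}, {status} the 1 per mille target")
```

With this assumed degree, the expansion is far inside the tolerance near 0 and far outside it near 3.55, which is the behavior the text describes.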

The accuracy in the range (0, 0.2) meets the requirement. Because inputs in the range (–0.2, 0) are replaced by their absolute values before the computation, the input in the range (3.35, 3.75) can be mapped to the range (–0.2, +0.2) for computation. Based on the properties of the e^x function, select a proper formula for the derivation, for example:

x = Qln2 + v, v ∈ (–0.2, +0.2)

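Q = 5.1216 is obtained, so that Q ln 2 ≈ 3.55 is the midpoint of the range (3.35, 3.75). Map x to v = x – Q ln 2 and compute e^v by Taylor series expansion. Since e^x = 2^Q · e^v, multiplying e^v by the constant 2^Q yields e^x. So far, we have obtained the e^x formulas for the whole range (0, 3.75).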
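The mapping can be checked numerically with the sketch below. Q = 5.1216 comes from the text. The Taylor degree and the names `SHIFT`, `SCALE`, `exp_taylor`, and `exp_mapped` are assumptions introduced for illustration.

```python
import math

Q = 5.1216                  # value given in the text
SHIFT = Q * math.log(2.0)   # Q * ln 2, approximately 3.55
SCALE = 2.0 ** Q            # constant factor 2^Q, can be precomputed


def exp_taylor(v, degree=5):
    """Taylor polynomial of e^v expanded at v = 0 (degree is an assumption)."""
    term, total = 1.0, 1.0
    for k in range(1, degree + 1):
        term *= v / k
        total += term
    return total


def exp_mapped(x):
    """e^x for x in (3.35, 3.75), computed as 2^Q * e^v with v = x - Q * ln 2."""
    v = x - SHIFT           # v falls in roughly (-0.2, +0.2)
    return SCALE * exp_taylor(v)


for x in (3.35, 3.55, 3.75):
    rel = abs(exp_mapped(x) - math.exp(x)) / math.exp(x)
    print(f"x = {x}: v = {x - SHIFT:+.4f}, relative error {rel:.1e}")
```

Because 2^Q is a constant, the mapping costs only one subtraction and one multiplication on top of the short Taylor polynomial.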

Select the corresponding formula to implement e^x in each of the three ranges. ST verifies that e^x implemented by Taylor series expansion meets the accuracy requirement. Add it to besseli0e to verify the operator accuracy; the result is also as expected.
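A scalar sketch of the three-range selection is shown below. It assumes that the middle range (0.2, 3.35) keeps the original exponential instruction. Here `math.exp` merely stands in for that instruction, and `exp_piecewise`, the Taylor degree, and the exact switch points are hypothetical choices for illustration.

```python
import math

Q = 5.1216                  # value given in the text
SHIFT = Q * math.log(2.0)   # Q * ln 2
SCALE = 2.0 ** Q            # 2^Q


def exp_taylor(v, degree=5):
    """Taylor polynomial of e^v expanded at v = 0 (degree is an assumption)."""
    term, total = 1.0, 1.0
    for k in range(1, degree + 1):
        term *= v / k
        total += term
    return total


def exp_piecewise(x, hw_exp=math.exp):
    """e^x on (0, 3.75); hw_exp is a stand-in for the hardware exponential."""
    if x < 0.2:
        return exp_taylor(x)                  # Taylor expansion at 0
    if x > 3.35:
        return SCALE * exp_taylor(x - SHIFT)  # range mapping to (-0.2, +0.2)
    return hw_exp(x)                          # middle range, unchanged


worst = max(
    abs(exp_piecewise(x) - math.exp(x)) / math.exp(x)
    for x in (i * 0.01 for i in range(1, 375))
)
print(f"max relative error on (0, 3.75): {worst:.1e}")
```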

The expansion step increases the build and execution time and therefore reduces performance. You are advised to avoid expansion as long as the accuracy meets the requirements. For a range that does not meet the requirement, Taylor series expansion can only guarantee the accuracy of a short range near the expansion point. Therefore, determine the target range before deriving the formula, and map the computation of the range that does not meet the requirement to the computation of a range that does. For details, see Method 1: Range Mapping.
