Enabling Deterministic Computing

When training runs on a GPU or NPU, repeated executions of the same job may produce different results. The difference is generally caused by asynchronous multi-threaded execution inside operator implementations, which can change the order in which floating-point numbers are accumulated. Enabling deterministic computing on the NPU ensures that repeated executions produce the same results, which makes precision comparison more reliable. However, operators take longer to execute, so performance deteriorates. Decide whether to enable deterministic computing based on your requirements.
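The effect of accumulation order can be reproduced on a CPU with plain Python floats. The following is a minimal sketch, not NPU code: the helper ordered_sum and the sample values are hypothetical and only illustrate why a different accumulation order can change a floating-point result.

```python
def ordered_sum(values, order):
    """Accumulate values in the given index order, as one thread schedule might."""
    total = 0.0
    for index in order:
        total += values[index]
    return total


# Illustrative values: 1.0 is smaller than the spacing between floats near 1e16.
values = [1e16, 1.0, -1e16]

sum_a = ordered_sum(values, [0, 1, 2])  # (1e16 + 1.0) - 1e16: the 1.0 is rounded away
sum_b = ordered_sum(values, [0, 2, 1])  # (1e16 - 1e16) + 1.0: the 1.0 survives

print(sum_a, sum_b, sum_a == sum_b)  # prints: 0.0 1.0 False
```

Both sums add the same three numbers, yet the results differ because floating-point addition is not associative. Deterministic computing fixes the accumulation order so that this kind of variation does not appear between runs.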

  • Training configuration example in session.run mode:
    custom_op.parameter_map["deterministic"].i = 1

    For details, see "Session Configuration Parameters" in TF Adapter APIs (1.x).

  • Training configuration example in Estimator mode:
    config = NPURunConfig(deterministic=1)

    For details, see "NPURunConfig Configuration Parameters" in TF Adapter APIs (1.x).
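
The two one-line settings above can be placed in context as follows. This is a hedged sketch rather than a verified script: it assumes TensorFlow 1.x with the TF Adapter installed, that the adapter is imported through npu_bridge.npu_init, that this import also exposes NPURunConfig, and that the custom optimizer is registered under the name "NpuOptimizer". The tf.no_op() call is a placeholder for a real training op. Check each of these against TF Adapter APIs (1.x) before use.

```python
import tensorflow as tf
from tensorflow.core.protobuf.rewriter_config_pb2 import RewriterConfig

# Assumption: the TF Adapter is installed and npu_init exports NPURunConfig.
from npu_bridge.npu_init import *

# --- session.run mode ---
session_config = tf.ConfigProto()
custom_op = session_config.graph_options.rewrite_options.custom_optimizers.add()
custom_op.name = "NpuOptimizer"  # assumed registration name of the NPU optimizer
custom_op.parameter_map["deterministic"].i = 1  # 1 enables deterministic computing
# Assumption: remapping is turned off, as is common in NPU session setups.
session_config.graph_options.rewrite_options.remapping = RewriterConfig.OFF

with tf.Session(config=session_config) as sess:
    sess.run(tf.no_op())  # placeholder for the real training op

# --- Estimator mode ---
run_config = NPURunConfig(deterministic=1)  # pass run_config to the estimator
```

Because deterministic computing slows operator execution, one option is to enable it only for precision-comparison runs and leave it disabled for normal training.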