Troubleshooting Workflow

Inference Scenario

Figure 1 Troubleshooting workflow in inference scenario
  1. Encounter an AI Core error in the inference process.
  2. Configure the AI Core Error Analyzer parameters including op_debug_level and debug_dir, for locating the code line numbers of error operators.
  3. Convert the model again to generate an instruction mapping file.
  4. Analyze the error using the AI Core Error Analyzer.

Training Scenario

Figure 2 Troubleshooting workflow in training scenario
  1. Encounter an AI Core error in the model training process.
  2. Configure the AI Core Error Analyzer parameters, including:
    • op_debug_level for locating the code line numbers of error operators.
    • enable_exception_dump for dumping error operators.
  3. Run the model training script again to generate an instruction mapping file and a dump file of each error operator.
  4. Analyze the error using the AI Core Error Analyzer.