Selecting Bad Cases

A bad case is an input (prompt) for which the model's inference result differs from the benchmark model's result. Identifying a bad case is the prerequisite for analyzing and diagnosing LLM accuracy issues. To identify and select appropriate bad cases, refer to How to Identify a Bad Case.

For example, suppose a code-generation question is given to Model A. If the external device returns a normal response but the Ascend platform frequently returns null, this input is a bad case. It can then be used as the model's inference prompt to troubleshoot the issue further.
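
The selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names CaseResult, is_bad_case, and select_bad_cases are hypothetical and not part of any Ascend tool, the outputs are assumed to have been collected already from both platforms, and exact string comparison stands in for whatever accuracy criterion the project actually uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CaseResult:
    """Hypothetical record pairing one prompt with the outputs of both platforms."""
    prompt: str
    reference_output: str  # output from the benchmark (reference) platform
    target_output: str     # output from the platform under test


def is_bad_case(result: CaseResult) -> bool:
    """Return True if the target output is empty or differs from the reference."""
    reference = result.reference_output.strip()
    target = result.target_output.strip()
    if reference and not target:
        # The reference answered normally but the target returned nothing.
        return True
    # Assumption: exact match is the criterion; real checks may be looser.
    return target != reference


def select_bad_cases(results: List[CaseResult]) -> List[str]:
    """Collect the prompts that qualify as bad cases."""
    return [r.prompt for r in results if is_bad_case(r)]


# Illustrative data only; these are not real model outputs.
results = [
    CaseResult(
        prompt="Write a function that reverses a string.",
        reference_output="def reverse(s):\n    return s[::-1]",
        target_output="",
    ),
    CaseResult(
        prompt="What is 2 + 2?",
        reference_output="4",
        target_output="4",
    ),
]

bad_prompts = select_bad_cases(results)
print(bad_prompts)
```

Because the example describes a platform that "frequently" returns null, a practical check would probably run each prompt several times and count failures rather than rely on a single comparison.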