Constant Validation Accuracy

Objective

A good benchmark model should produce the same predictions no matter how many times the training script is executed.

Unless otherwise specified, training more than once presupposes randomly shuffled datasets.

Principle

If the benchmark model does not meet this standard, check whether the following conditions, which help avoid large differences between predictions, hold:

  • The model has a stable algorithm.
  • The dataset quality is high.
  • The hyperparameters are stable.

Procedure

For a well-developed model (with hyperparameters borrowed from the benchmark):

  1. Check the hyperparameters in use and ensure that they are consistent with the given benchmark.
  2. For cluster training, check that the cluster training mode is the same as in the given benchmark.
  3. Check that the dataset files are the same as those of the given benchmark.
  4. Check the model code and parameters and ensure that the compute logic is consistent with the given benchmark.
  5. Check the computational graph and ensure that the computation process and operator shapes are consistent with the given benchmark.
  6. Retrain the model and validate its accuracy. If the accuracy still differs from the benchmark accuracy, repeat the preceding steps until it matches.
  7. Train more than three times and check that the validation accuracy of each run is the same as the benchmark accuracy. Repeat the preceding steps until all the preceding conditions are met.
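
Steps 1 and 3 lend themselves to an automated comparison. The following is a minimal sketch, not part of any framework: it assumes the hyperparameters are available as plain dictionaries and that a manifest mapping dataset file names to SHA-256 hashes was recorded for the benchmark. The function names and the manifest format are hypothetical.

```python
import hashlib
from pathlib import Path


def diff_hyperparameters(current: dict, benchmark: dict) -> dict:
    """Return {name: (current_value, benchmark_value)} for every mismatch.

    A key that exists on only one side is reported with None on the other.
    """
    missing = object()  # sentinel for keys present on only one side
    mismatches = {}
    for name in sorted(set(current) | set(benchmark)):
        cur = current.get(name, missing)
        ref = benchmark.get(name, missing)
        if cur != ref:
            mismatches[name] = (
                None if cur is missing else cur,
                None if ref is missing else ref,
            )
    return mismatches


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large dataset files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_dataset_files(data_dir: Path, expected_hashes: dict) -> list:
    """List files that are missing or whose hash differs from the manifest."""
    changed = []
    for name, expected in sorted(expected_hashes.items()):
        file_path = Path(data_dir) / name
        if not file_path.is_file() or sha256_of_file(file_path) != expected:
            changed.append(name)
    return changed
```

An empty result from both functions means the hyperparameters and dataset files match the recorded benchmark. It says nothing about the model code or the computational graph, which still need the checks in steps 4 and 5.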

For a well-developed model with user-defined hyperparameters:

  1. If the model is trained on a custom dataset, check that the labels are correct in both content and format.
  2. If the quality of your dataset is not guaranteed, use the dataset matched with the well-developed model directly (or tailor it to your needs).
  3. Generate a set of candidate hyperparameters by adapting those matched with the well-developed model to the training dataset in use and the cluster scale.
  4. Train more than three times and check that the validation accuracy is constant. Repeat the preceding steps until all the preceding conditions are met.
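
The format half of the label check in step 1 can be automated. The sketch below assumes a classification task whose labels are plain Python integers in the range 0 to num_classes - 1; both helper names are hypothetical. It cannot tell whether a well-formed label is the right one for its sample, so the content half of the check still needs manual review of sampled data.

```python
from collections import Counter


def is_valid_label(label, num_classes):
    """True if the label is an integer class id in [0, num_classes)."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(label, bool) or not isinstance(label, int):
        return False
    return 0 <= label < num_classes


def find_label_problems(labels, num_classes):
    """Return (index, label, reason) for every label that fails the format check."""
    problems = []
    for index, label in enumerate(labels):
        if isinstance(label, bool) or not isinstance(label, int):
            problems.append((index, label, "not an integer"))
        elif not 0 <= label < num_classes:
            problems.append((index, label, "out of range"))
    return problems


def class_counts(labels, num_classes):
    """Count valid samples per class so that empty or skewed classes stand out."""
    counts = Counter(label for label in labels if is_valid_label(label, num_classes))
    return [counts.get(class_id, 0) for class_id in range(num_classes)]
```

For example, `find_label_problems([0, 2, 5, "1", -1], 3)` flags the entries at indices 2, 3, and 4, and `class_counts` on the same data shows that class 1 has no samples.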

For a custom model:

  1. Check that the dataset samples are correctly labeled.
  2. Select a group of stable candidate hyperparameters after debugging.
  3. Perform training more than three times and check that the validation accuracy is constant. Adjust the hyperparameters and model structure, and repeat the preceding steps until all the preceding conditions are met.
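
The repeated-training check that closes each procedure above can be sketched as follows. This is an illustrative helper under stated assumptions, not a prescribed tool: `summarize_runs`, `min_runs`, and `tolerance` are hypothetical names. The default tolerance of 0.0 corresponds to the strict reading of "constant"; any looser value is a choice the source does not specify.

```python
import statistics


def summarize_runs(accuracies, benchmark=None, tolerance=0.0, min_runs=4):
    """Summarize validation accuracies collected from repeated training runs.

    min_runs defaults to 4 because the procedure asks for more than three runs.
    tolerance is the largest accepted difference (assumption, not in the source).
    """
    if len(accuracies) < min_runs:
        raise ValueError(f"need at least {min_runs} runs, got {len(accuracies)}")
    # Spread is the gap between the best and the worst run.
    spread = max(accuracies) - min(accuracies)
    summary = {
        "runs": len(accuracies),
        "mean": statistics.fmean(accuracies),
        "spread": spread,
        "constant": spread <= tolerance,
    }
    if benchmark is not None:
        # Every single run must be within tolerance of the benchmark accuracy.
        summary["matches_benchmark"] = all(
            abs(accuracy - benchmark) <= tolerance for accuracy in accuracies
        )
    return summary
```

Pass the benchmark accuracy when checking a well-developed model against its benchmark, and omit it for a custom model, where only the run-to-run spread matters.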