2:4 Structured Sparsity Algorithm

AMCT implements 2:4 structured sparsity with the L1SelectivePrune algorithm, which decides which weights to retain by comparing their L1 values (absolute values). In every group of four consecutive weights, the two with the largest absolute values are retained and the other two are pruned.
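The selection rule can be illustrated with a minimal pure-Python sketch. This is not AMCT code: the function name two_of_four_mask is hypothetical, the weights are treated as a flat list, and ties are broken in favor of the earlier element, which is an assumption rather than documented AMCT behavior.

```python
def two_of_four_mask(weights):
    """Return a 0/1 mask that keeps the 2 largest-|w| entries in each group of 4.

    Illustrative only; not part of the AMCT API.
    """
    if len(weights) % 4 != 0:
        raise ValueError("number of weights must be a multiple of 4")
    mask = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Rank the four positions by absolute value, largest first.
        # sorted() is stable, so on a tie the earlier position wins (assumption).
        order = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:2])
        mask.extend(1 if i in keep else 0 for i in range(4))
    return mask


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.1]
mask = two_of_four_mask(weights)
# Pruned positions are set to zero; retained weights keep their value.
pruned = [w if m else 0.0 for w, m in zip(weights, mask)]
print(mask)  # [1, 0, 0, 1, 0, 1, 1, 0]
```

In the first group, 0.9 and -0.7 have the largest absolute values, so the first and fourth positions are kept; in the second group, 0.3 and -0.4 are kept.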

The L1SelectivePruner field in the sparsity configuration file controls the L1SelectivePrune algorithm. The following description uses PyTorch as an example. For details about the field, see Simplified QAT Configuration File.

  • update_freq: Interval, in training batches, at which the 2:4 sparsity selection is updated, that is, at which the elements to be retained are recalculated. During retraining, weights change with every training batch, so the ranking of their L1 values may change. For example, the first and second of four elements may be retained initially, while the first and third are retained after an update.

    If update_freq is set to 0, the elements to be retained are calculated only in the first training batch. If update_freq is set to 2, they are recalculated every two batches, and so on for other values. The default value is 0.

  • n_out_of_m_type: Sparsity pattern. Currently, only M4N2 is supported, that is, two of every four consecutive weights are retained.
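The effect of update_freq on the training loop can be sketched as follows. The helper should_update_mask is hypothetical and not part of AMCT; the 0-based batch index and the exact batches on which a recalculation falls are assumptions made for illustration only.

```python
def should_update_mask(batch_idx, update_freq):
    """Decide whether the 2:4 selection is recalculated on this batch.

    batch_idx is 0-based. Illustrative only; not part of the AMCT API.
    """
    if batch_idx == 0:
        # The selection is always calculated on the first batch.
        return True
    if update_freq == 0:
        # 0 means the first-batch selection is never recalculated.
        return False
    # Otherwise recalculate once every update_freq batches (assumed offset).
    return batch_idx % update_freq == 0


for freq in (0, 2):
    updated = [i for i in range(6) if should_update_mask(i, freq)]
    print(f"update_freq={freq}: recalculated on batches {updated}")
```

With update_freq set to 0, the selection made on the first batch stays fixed for the rest of retraining; with a nonzero value, the retained positions can shift as the weights change.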