Neural network pruning removes weights to make models smaller. The standard approach: identify weights with minimal impact and remove them. The network degrades slightly; retraining recovers some of the loss. The measure of impact — magnitude, gradient, sensitivity — determines which weights to cut.
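The baseline is easy to state concretely. A minimal magnitude-pruning sketch in NumPy (the function name and threshold choice are illustrative, not from any particular paper):

```python
import numpy as np

def magnitude_prune(W, sparsity):
    """Zero out the smallest-magnitude entries of W.

    Standard magnitude pruning: entries whose absolute value falls
    below the sparsity-quantile threshold are removed (set to zero).
    Returns the pruned matrix and the binary keep-mask.
    """
    threshold = np.quantile(np.abs(W), sparsity)
    mask = np.abs(W) >= threshold
    return W * mask, mask

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_pruned, mask = magnitude_prune(W, 0.5)  # roughly half the entries zeroed
```

Retraining after this step is what recovers the lost accuracy; the sketch covers only the removal criterion.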
Ballini and colleagues (arXiv:2602.20467) observe that biases don't count toward sparsity: a network with 90% of its weights removed but all biases intact is still 90% sparse. The biases are free parameters with respect to the sparsity budget. So instead of merely removing weights, adjust the biases to compensate for the removed weights. The compensation is computed using automatic differentiation: for each candidate weight removal, the bias perturbation that minimizes the resulting change in the network's output can be calculated efficiently.
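The paper computes the perturbation with automatic differentiation; for a single linear layer y = Wx + b the optimal shift has a closed form that shows what compensation buys. A sketch, assuming the change is measured as mean squared output deviation over calibration inputs (the data, shapes, and indices below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=1.0, scale=0.5, size=(1000, 8))  # calibration inputs
W = rng.normal(size=(4, 8))
b = np.zeros(4)

# Removing weight W[i, j] changes output i by -W[i, j] * x_j.
i, j = 2, 5
mean_xj = X[:, j].mean()

# The bias shift minimizing the expected squared output change over
# the calibration data is the mean of what was removed:
delta_b = W[i, j] * mean_xj

err_no_comp = -W[i, j] * X[:, j]            # output change, no compensation
err_comp = -W[i, j] * X[:, j] + delta_b     # change after the bias shift

# The bias absorbs the mean; only the variance around it remains:
# E[err_comp^2] = W[i,j]^2 * Var(x_j)  <=  W[i,j]^2 * E[x_j^2] = E[err_no_comp^2]
assert np.mean(err_comp**2) < np.mean(err_no_comp**2)
```

The residual error after compensation, W[i,j]·(x_j − E[x_j]), has zero mean; the bias has soaked up everything it can.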
This changes which weights are “important.” Under standard pruning, a weight is important if its removal changes the network output significantly. Under elimination-compensation pruning, a weight is important if its removal changes the network output significantly after the biases have been optimally adjusted. Weights that seemed important become dispensable when the biases absorb their function.
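To see the re-ranking concretely: under mean-absorbing compensation in a single linear layer, the residual error of removing W[i,j] scales as |W[i,j]| · std(x_j) rather than |W[i,j]| alone, so a large weight feeding from a nearly constant input becomes cheap to remove. A toy illustration (the score formulas here are a simplification of the idea, not the paper's autodiff-based criterion):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x_const = np.full(n, 3.0) + rng.normal(scale=0.01, size=n)  # near-constant feature
x_var = rng.normal(scale=1.0, size=n)                       # high-variance feature

w_big, w_small = 5.0, 0.5  # the weight on the constant feature is 10x larger

# Standard importance: magnitude alone says the big weight matters more.
standard = {"w_big": abs(w_big), "w_small": abs(w_small)}

# Compensated importance: after the bias absorbs w * E[x], the residual
# error scales as |w| * std(x), so the big weight on the near-constant
# input drops to the bottom of the ranking.
compensated = {"w_big": abs(w_big) * x_const.std(),
               "w_small": abs(w_small) * x_var.std()}
```

The two criteria disagree on which weight to cut first; that disagreement is the whole point.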
The bias adjustments can be applied independently after weight removal — they don't interact with each other in ways that require joint optimization. This makes the method practical: compute the compensation for each weight independently, remove the least-important weights (after compensation), apply the bias adjustments.
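Putting the steps together for one linear layer y = Wx + b. A sketch of the full procedure, assuming a closed-form mean compensation in place of the paper's autodiff computation (helper name and calibration setup are illustrative):

```python
import numpy as np

def prune_with_bias_compensation(W, b, X, sparsity):
    """Sketch of elimination-compensation pruning for one linear layer.

    Scores weights by their post-compensation residual error, removes
    the least important, and applies the per-weight bias shifts. The
    shifts are independent: those landing on the same output simply add.
    """
    mean_x = X.mean(axis=0)  # E[x_j] over calibration inputs
    std_x = X.std(axis=0)

    # Importance after compensation: residual error scale |W[i,j]| * std(x_j).
    scores = np.abs(W) * std_x
    keep = scores > np.quantile(scores, sparsity)

    # Each removed weight contributes an independent shift W[i,j] * E[x_j].
    delta_b = (np.where(keep, 0.0, W) * mean_x).sum(axis=1)
    return W * keep, b + delta_b

rng = np.random.default_rng(0)
X = rng.normal(loc=1.0, size=(500, 16))
W = rng.normal(size=(8, 16))
b = np.zeros(8)

W_p, b_p = prune_with_bias_compensation(W, b, X, 0.5)

dense = X @ W.T + b
err_comp = np.mean((X @ W_p.T + b_p - dense) ** 2)  # pruned + compensated
err_raw = np.mean((X @ W_p.T + b - dense) ** 2)     # same mask, no compensation
assert err_comp <= err_raw
```

Because each shift is the per-output mean of what its weight removed, summing them is exact; no joint optimization over the bias vector is needed, which is precisely the independence property above.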
The general observation: when a system has two types of parameters — one constrained (weights under sparsity) and one free (biases) — removing elements of the constrained type while adjusting the free type can preserve function more effectively than removing elements alone. The free parameters absorb the cost of the constrained ones. Importance is conditional on what can compensate.