Abstract
We introduce a family of pruning algorithms that sparsifies the parameters of a trained model in a way that approximately preserves the model’s predictive accuracy. Our algorithms use a small batch of input points to construct a data-informed importance sampling distribution over the network’s parameters and use either a sampling-based or a deterministic pruning procedure, or an adaptive mixture of both, to discard redundant weights. Our methods are simultaneously computationally efficient, provably accurate, and broadly applicable to various network architectures and data distributions. The presented approaches are simple to implement and can be easily integrated into standard prune-retrain pipelines. We present empirical comparisons showing that our algorithms reliably generate highly compressed networks that incur minimal loss in performance, regardless of whether the original network is fully trained or randomly initialized.
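The abstract's core idea, building a data-informed importance distribution over weights from a small input batch and then sampling which weights to keep, can be sketched as follows. This is a minimal illustrative version, not the paper's exact estimator: the importance score `|W[i, j]| * mean |X[:, j]|`, the function name `sample_prune`, and the single-layer setting are all assumptions made for the example.

```python
import numpy as np

def sample_prune(W, X, keep, rng):
    """Sparsify one linear layer W (out x in) using a small batch X (n x in).

    Each weight's importance is taken as |W[i, j]| times the mean absolute
    activation of input feature j over the batch -- a simple data-informed
    sensitivity score (an illustrative choice). `keep` entries are sampled
    with replacement in proportion to importance; unsampled weights are
    zeroed out, yielding a sparse layer.
    """
    scores = np.abs(W) * np.mean(np.abs(X), axis=0)  # data-informed importance
    probs = scores.ravel() / scores.sum()            # sampling distribution
    idx = rng.choice(W.size, size=keep, replace=True, p=probs)
    mask = np.zeros(W.size, dtype=bool)
    mask[idx] = True
    return np.where(mask.reshape(W.shape), W, 0.0)

# Toy usage: prune a 4x8 layer, keeping at most 10 sampled entries.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
X = rng.normal(size=(16, 8))
W_pruned = sample_prune(W, X, keep=10, rng=rng)
```

A deterministic variant of the same idea would simply keep the top-`keep` weights by score instead of sampling; the paper's adaptive mixture interpolates between the two regimes.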
| Original language | English |
|---|---|
| Pages (from-to) | 26-45 |
| Number of pages | 20 |
| Journal | SIAM Journal on Mathematics of Data Science |
| Volume | 4 |
| Issue number | 1 |
| State | Published - 2022 |
Bibliographical note
Publisher Copyright: © 2022 Society for Industrial and Applied Mathematics.
Keywords
- compression
- generalization
- pruning
ASJC Scopus subject areas
- Applied Mathematics
- Computational Mathematics
- Statistics and Probability