Previous work has shown empirically that large neural networks can be significantly reduced in size while preserving their accuracy. Model compression has become a central research topic, as it is crucial for deploying neural networks on devices with limited computational and memory resources. Most compression methods are based on heuristics and offer no worst-case guarantees on the trade-off between the compression rate and the approximation error for an arbitrary new sample. We propose the first efficient, data-independent neural pruning algorithm with a provable trade-off between its compression rate and the approximation error for any future test sample. Our method is based on the coreset framework, which finds a small weighted subset of points that provably approximates the original inputs. Specifically, we approximate the output of a layer of neurons by a coreset of neurons in the previous layer and discard the rest. We apply this framework in a layer-by-layer fashion from the top to the bottom. Unlike previous works, our coreset is data-independent, meaning that it provably guarantees the accuracy of the function for any input x ∈ ℝ^d, including an adversarial one. We demonstrate the effectiveness of our method on popular network architectures. In particular, our coresets yield 90% compression of the LeNet-300-100 architecture on MNIST while improving classification accuracy.
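The idea of replacing a layer's neurons by a small weighted subset can be illustrated with a minimal sketch. This is not the paper's algorithm and carries none of its guarantees: it uses generic importance sampling with an assumed score (the largest absolute outgoing weight of each neuron), and the function name `sample_neuron_coreset`, the layer sizes, and the random weights are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_neuron_coreset(W_next, m, rng):
    """Sample m neurons of a layer (with replacement) and reweight their
    outgoing weights so the next layer's pre-activation is preserved in
    expectation.

    W_next: array of shape (n_out, n), weights from the n neurons of this
            layer to the n_out neurons of the next layer.
    Returns (kept, W_small): indices of the kept neurons and the reweighted
    outgoing weight matrix of shape (n_out, len(kept)).
    """
    n = W_next.shape[1]
    # Assumed importance score, chosen for illustration only:
    # the largest absolute outgoing weight of each neuron.
    score = np.abs(W_next).max(axis=0)
    if score.sum() == 0:
        raise ValueError("all outgoing weights are zero")
    p = score / score.sum()

    # Draw m neurons with replacement, then merge repeated draws.
    draws = rng.choice(n, size=m, replace=True, p=p)
    kept, counts = np.unique(draws, return_counts=True)

    # Standard importance-sampling reweighting: count / (m * p).
    scale = counts / (m * p[kept])
    W_small = W_next[:, kept] * scale
    return kept, W_small


# Hypothetical two-layer example with random weights.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(100, 784))   # input -> hidden
W2 = rng.normal(size=(10, 100))    # hidden -> output
x = rng.random(784)

h = np.maximum(W1 @ x, 0.0)        # full hidden layer (ReLU)
full = W2 @ h

kept, W2_small = sample_neuron_coreset(W2, m=30, rng=rng)
h_small = np.maximum(W1[kept] @ x, 0.0)   # only the kept neurons are computed
approx = W2_small @ h_small

print("neurons kept:", len(kept), "of", W2.shape[1])
print("relative error:", np.linalg.norm(full - approx) / np.linalg.norm(full))
```

The scores here depend only on the weights and not on any training data, which mirrors the data-independent setting described in the abstract, but the error of this simple scheme is not bounded the way the abstract's coreset is; the printed relative error is only a sanity check on one random input.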
Published - 2020
8th International Conference on Learning Representations, ICLR 2020 - Addis Ababa, Ethiopia
Duration: 30 Apr 2020 → …
Bibliographical note
Publisher Copyright:
© 2020 8th International Conference on Learning Representations, ICLR 2020. All rights reserved.
ASJC Scopus subject areas
- Linguistics and Language
- Language and Linguistics
- Computer Science Applications