DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks
Neural pruning is a widely used compression technique for Deep Neural Networks (DNNs). Recent innovations in hardware architectures (e.g. Nvidia Ampere Sparse Tensor Core) and N:M fine-grained sparse neural network algorithms (i.e. every group of M weights contains N non-zero values) reveal a promising research line for neural pruning. However, existing N:M algorithms only address the challenge of how to train N:M sparse neural networks in a uniform fashion (i.e. every layer has the same N:M sparsity) and suffer from a significant accuracy drop at high sparsity (i.e. when sparsity > 80%). To tackle this problem, we present a novel technique, DominoSearch, that finds mixed N:M sparsity schemes from pre-trained dense deep neural networks and achieves higher accuracy than the uniform-sparsity scheme under equivalent complexity constraints (e.g. model size or FLOPs). For instance, at the same model size of 2.1M parameters (87.5% sparsity), our layer-wise N:M sparse ResNet18 outperforms its uniform counterpart by 2.1% top-1 accuracy on the large-scale ImageNet dataset. At the same computational complexity of 227M FLOPs, our layer-wise sparse ResNet18 outperforms the uniform one by 1.3% top-1 accuracy. Furthermore, our layer-wise fine-grained N:M sparse ResNet50 achieves 76.7% top-1 accuracy with 5.0M parameters. This is competitive with the 76.2% top-1 accuracy achieved by the state-of-the-art layer-wise unstructured-sparsity model with the same number of parameters, which is believed to be the upper bound of neural network pruning with respect to the accuracy-sparsity trade-off. We believe that our work can build a strong baseline for further sparse DNN research and encourage future hardware-algorithm co-design work.
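To make the N:M pattern described in the abstract concrete, the following is a minimal sketch of magnitude-based N:M pruning on a single weight tensor. It is not the DominoSearch procedure, which searches for a different N:M ratio per layer. The function names `nm_prune` and `sparsity` are hypothetical, and keeping the largest-magnitude weights in each group is an assumed selection rule chosen only for illustration. The sketch also assumes the tensor size is divisible by M.

```python
import numpy as np


def nm_prune(weights: np.ndarray, n: int, m: int) -> np.ndarray:
    """Keep the n largest-magnitude weights in every group of m consecutive weights.

    Illustrative only: assumes weights.size is divisible by m and that
    groups are formed along the flattened (row-major) order.
    """
    groups = weights.reshape(-1, m)
    # Indices of the (m - n) smallest-magnitude entries in each group.
    drop = np.argsort(np.abs(groups), axis=1)[:, : m - n]
    pruned = groups.copy()
    np.put_along_axis(pruned, drop, 0.0, axis=1)
    return pruned.reshape(weights.shape)


def sparsity(weights: np.ndarray) -> float:
    """Fraction of entries that are exactly zero."""
    return 1.0 - np.count_nonzero(weights) / weights.size


rng = np.random.default_rng(0)
dense = rng.standard_normal((16, 32))
sparse = nm_prune(dense, n=1, m=8)
print(sparsity(sparse))
```

With N = 1 and M = 8, each group of eight weights keeps one non-zero value, so the sparsity of the pruned tensor is 1 - 1/8 = 87.5%. A layer-wise scheme in the sense of the abstract would call such a routine with a different (N, M) pair for each layer rather than one pair for the whole network.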
- DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks
W. Sun, A. Zhou, S. Stuijk, R. Wijnhoven, A. Nelson, H. Li, and H. Corporaal.
In Neural Information Processing Systems, NeurIPS 21, Proceedings, pages xyz-xyz. Virtual, 6-14 December, 2021. NeurIPS, 2021.