HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authors
Patrick Glandorf, Timo Kaiser, Bodo Rosenhahn

Organisational units

Details

Original language: English
Title of host publication: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1226-1235
Number of pages: 10
ISBN (electronic): 9798350307443
ISBN (print): 9798350307450
Publication status: Published - 2023
Event: 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023 - Paris, France
Duration: 2 Oct 2023 - 6 Oct 2023

Abstract

Sparse neural networks are a key factor in developing resource-efficient machine learning applications. We propose the novel and powerful sparse learning method Adaptive Regularized Training (ART) to compress dense into sparse networks. Instead of the commonly used binary mask during training to reduce the number of model weights, we inherently shrink weights close to zero in an iterative manner with increasing weight regularization. Our method compresses the pre-trained model "knowledge" into the weights of highest magnitude. Therefore, we introduce a novel regularization loss named HyperSparse that exploits the highest weights while conserving the ability of weight exploration. Extensive experiments on CIFAR and TinyImageNet show that our method leads to notable performance gains compared to other sparsification methods, especially in extremely high sparsity regimes up to 99.8% model sparsity. Additional investigations provide new insights into the patterns that are encoded in weights with high magnitudes.
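The abstract outlines the core idea: instead of pruning with a hard binary mask, training adds a weight-regularization term whose strength grows over the course of training, so low-magnitude weights are progressively driven toward zero while the highest-magnitude weights retain the model "knowledge". Below is a minimal sketch of such an adaptive regularization loop; the magnitude-thresholded L1 penalty, the exponential lambda schedule, and all hyperparameters are illustrative assumptions, not the exact HyperSparse loss or ART schedule from the paper.

import torch
import torch.nn as nn

def sparsity_penalty(model: nn.Module, keep_ratio: float = 0.01) -> torch.Tensor:
    # Illustrative stand-in for the HyperSparse idea: penalize all weights except
    # the top keep_ratio fraction by magnitude, so the largest weights are
    # "exploited" while the remaining weights are shrunk toward zero.
    mags = torch.cat([p.abs().flatten() for p in model.parameters() if p.dim() > 1])
    k = max(1, int(keep_ratio * mags.numel()))
    threshold = torch.topk(mags, k, largest=True).values.min()
    return mags[mags < threshold].sum()

def adaptive_regularized_training(model, loader, epochs=10, lr=0.1,
                                  lam_init=1e-6, growth=1.5):
    # Assumed training loop: the regularization strength lam grows each epoch,
    # gradually shifting the network from weight exploration to exploitation.
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    lam = lam_init
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = criterion(model(x), y) + lam * sparsity_penalty(model)
            loss.backward()
            opt.step()
        lam *= growth  # increase weight regularization iteratively
    return model

After such a run, the sparse network would be obtained by keeping only the highest-magnitude weights up to the target sparsity (e.g. 99.8%) and discarding the rest.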

ASJC Scopus subject areas

Cite

HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization. / Glandorf, Patrick; Kaiser, Timo; Rosenhahn, Bodo.
2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Institute of Electrical and Electronics Engineers Inc., 2023. pp. 1226-1235.

Glandorf, P, Kaiser, T & Rosenhahn, B 2023, HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization. in 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Institute of Electrical and Electronics Engineers Inc., pp. 1226-1235, 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023, Paris, France, 2 Oct. 2023. https://doi.org/10.48550/arXiv.2308.07163, https://doi.org/10.1109/ICCVW60793.2023.00133
Glandorf, P., Kaiser, T., & Rosenhahn, B. (2023). HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization. In 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) (pp. 1226-1235). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.48550/arXiv.2308.07163, https://doi.org/10.1109/ICCVW60793.2023.00133
Glandorf P, Kaiser T, Rosenhahn B. HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization. In 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Institute of Electrical and Electronics Engineers Inc. 2023. p. 1226-1235. doi: 10.48550/arXiv.2308.07163, 10.1109/ICCVW60793.2023.00133
Glandorf, Patrick ; Kaiser, Timo ; Rosenhahn, Bodo. / HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization. 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Institute of Electrical and Electronics Engineers Inc., 2023. pp. 1226-1235
Download (BibTeX)
@inproceedings{02264f81a4854e268c6daafe8fb8ba90,
title = "HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization",
abstract = "Sparse neural networks are a key factor in developing resource-efficient machine learning applications. We propose the novel and powerful sparse learning method Adaptive Regularized Training (ART) to compress dense into sparse networks. Instead of the commonly used binary mask during training to reduce the number of model weights, we inherently shrink weights close to zero in an iterative manner with increasing weight regularization. Our method compresses the pre-trained model {"}knowledge{"} into the weights of highest magnitude. Therefore, we introduce a novel regularization loss named HyperSparse that exploits the highest weights while conserving the ability of weight exploration. Extensive experiments on CIFAR and TinyImageNet show that our method leads to notable performance gains compared to other sparsification methods, especially in extremely high sparsity regimes up to 99.8% model sparsity. Additional investigations provide new insights into the patterns that are encoded in weights with high magnitudes.",
keywords = "Neural Networks, Pruning, Sparsity, Unstructured Pruning",
author = "Patrick Glandorf and Timo Kaiser and Bodo Rosenhahn",
note = "Funding Information: This work was supported by the Federal Ministry of Education and Research (BMBF), Germany under the project AI service center KISSKI (grant no. 01IS22093C), the Deutsche Forschungsgemeinschaft (DFG) under Germany's Excellence Strategy within the Cluster of Excellence PhoenixD (EXC 2122), and by the Federal Ministry of the Environment, Nature Conservation, Nuclear Safety and Consumer Protection, Germany under the project GreenAutoML4FAS (grant no. 67KI32007A). ; 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023 ; Conference date: 02-10-2023 Through 06-10-2023",
year = "2023",
doi = "10.48550/arXiv.2308.07163",
language = "English",
isbn = "9798350307450",
pages = "1226--1235",
booktitle = "2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

Download (RIS)

TY - GEN

T1 - HyperSparse Neural Networks

T2 - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023

AU - Glandorf, Patrick

AU - Kaiser, Timo

AU - Rosenhahn, Bodo

N1 - Funding Information: This work was supported by the Federal Ministry of Education and Research (BMBF), Germany under the project AI service center KISSKI (grant no. 01IS22093C), the Deutsche Forschungsgemeinschaft (DFG) under Germany's Excellence Strategy within the Cluster of Excellence PhoenixD (EXC 2122), and by the Federal Ministry of the Environment, Nature Conservation, Nuclear Safety and Consumer Protection, Germany under the project GreenAutoML4FAS (grant no. 67KI32007A).

PY - 2023

Y1 - 2023

N2 - Sparse neural networks are a key factor in developing resource-efficient machine learning applications. We propose the novel and powerful sparse learning method Adaptive Regularized Training (ART) to compress dense into sparse networks. Instead of the commonly used binary mask during training to reduce the number of model weights, we inherently shrink weights close to zero in an iterative manner with increasing weight regularization. Our method compresses the pre-trained model "knowledge" into the weights of highest magnitude. Therefore, we introduce a novel regularization loss named HyperSparse that exploits the highest weights while conserving the ability of weight exploration. Extensive experiments on CIFAR and TinyImageNet show that our method leads to notable performance gains compared to other sparsification methods, especially in extremely high sparsity regimes up to 99.8% model sparsity. Additional investigations provide new insights into the patterns that are encoded in weights with high magnitudes.

AB - Sparse neural networks are a key factor in developing resource-efficient machine learning applications. We propose the novel and powerful sparse learning method Adaptive Regularized Training (ART) to compress dense into sparse networks. Instead of the commonly used binary mask during training to reduce the number of model weights, we inherently shrink weights close to zero in an iterative manner with increasing weight regularization. Our method compresses the pre-trained model "knowledge" into the weights of highest magnitude. Therefore, we introduce a novel regularization loss named HyperSparse that exploits the highest weights while conserving the ability of weight exploration. Extensive experiments on CIFAR and TinyImageNet show that our method leads to notable performance gains compared to other sparsification methods, especially in extremely high sparsity regimes up to 99.8% model sparsity. Additional investigations provide new insights into the patterns that are encoded in weights with high magnitudes.

KW - Neural Networks

KW - Pruning

KW - Sparsity

KW - Unstructured Pruning

UR - http://www.scopus.com/inward/record.url?scp=85180564637&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2308.07163

DO - 10.48550/arXiv.2308.07163

M3 - Conference contribution

AN - SCOPUS:85180564637

SN - 9798350307450

SP - 1226

EP - 1235

BT - 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 2 October 2023 through 6 October 2023

ER -
