Details
Original language | English |
---|---|
Title of host publication | Findings of the Association for Computational Linguistics |
Subtitle | EMNLP 2024 |
Editors | Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen |
Pages | 4727-4739 |
Number of pages | 13 |
ISBN (electronic) | 9798891761681 |
Publication status | Published - 12 Nov 2024 |
Event | 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024 - Hybrid, Miami, United States. Duration: 12 Nov 2024 → 16 Nov 2024 |
Abstract
Machine *unlearning* refers to methods for deleting information about specific training instances from a trained machine learning model. This enables models to delete user information and comply with privacy regulations. Retraining the model from scratch on the training set, excluding the instances to be “forgotten”, would yield the desired unlearned model, but the size of datasets and models makes this infeasible. Hence, unlearning algorithms have been developed, where the goal is to obtain an unlearned model that behaves as closely as possible to the retrained model. Consequently, evaluating an unlearning method involves (i) randomly selecting a forget set (i.e., the training instances to be unlearned), (ii) obtaining an unlearned and a retrained model, and (iii) comparing the performance of the unlearned and the retrained model on the test and forget sets. However, when the forget set is randomly selected, the unlearned model is almost always similar to the original (i.e., prior to unlearning) model. Hence, it is unclear whether the model really unlearned or simply copied the weights from the original model. For a more robust evaluation, we instead propose to consider training instances with significant influence on the trained model. When such influential instances are included in the forget set, we observe that the unlearned model deviates significantly from the retrained model. Such deviations are also observed when the size of the forget set is increased. Lastly, the choice of dataset for evaluation could also lead to a misleading interpretation of results.
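The three-step evaluation protocol described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the nearest-centroid classifier, the function names (`train_centroids`, `select_forget_set`, `evaluate_unlearning`), and the influence proxy (how far removing a point would shift its class centroid) are all hypothetical stand-ins chosen so the example runs without external models or datasets.

```python
import numpy as np

def train_centroids(X, y):
    """Toy 'training' (assumption): one mean vector per class."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def accuracy(model, X, y):
    """Fraction of points whose nearest centroid carries the right label."""
    classes, centroids = model
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y).mean())

def select_forget_set(X, y, k, strategy="random", seed=0):
    """Step (i): choose k training indices to forget.

    'random' is the usual protocol; 'influence' is a hypothetical proxy that
    ranks points by how far their removal would move their class centroid.
    """
    if strategy == "random":
        return np.random.default_rng(seed).choice(len(X), size=k, replace=False)
    if strategy != "influence":
        raise ValueError(f"unknown strategy: {strategy}")
    classes, centroids = train_centroids(X, y)
    counts = np.array([(y == c).sum() for c in classes])
    pos = np.searchsorted(classes, y)  # class position of every sample
    # Removing x from a class of n points shifts its mean by ||x - mu|| / (n - 1).
    shift = np.linalg.norm(X - centroids[pos], axis=1) / np.maximum(counts[pos] - 1, 1)
    return np.argsort(shift)[-k:]

def evaluate_unlearning(X_train, y_train, X_test, y_test, forget_idx, unlearn_fn):
    """Steps (ii) and (iii): build the models and compare them on test and forget sets."""
    keep = np.setdiff1d(np.arange(len(X_train)), forget_idx)
    original = train_centroids(X_train, y_train)
    retrained = train_centroids(X_train[keep], y_train[keep])  # reference model
    unlearned = unlearn_fn(original, X_train, y_train, forget_idx)
    X_f, y_f = X_train[forget_idx], y_train[forget_idx]
    models = {"original": original, "retrained": retrained, "unlearned": unlearned}
    return {
        name: {"test": accuracy(m, X_test, y_test), "forget": accuracy(m, X_f, y_f)}
        for name, m in models.items()
    }

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    # A do-nothing "unlearning" method that simply returns the original weights.
    copy_weights = lambda model, X_, y_, forget: model
    for strategy in ("random", "influence"):
        idx = select_forget_set(X, y, 10, strategy)
        print(strategy, evaluate_unlearning(X, y, X, y, idx, copy_weights))
```

The do-nothing `copy_weights` baseline is included deliberately: if it scores about as close to the retrained model as a real unlearning method does, the comparison cannot distinguish unlearning from copying the original weights, which is the concern the abstract raises for randomly selected forget sets.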
ASJC Scopus subject areas
- Computer Science (all)
  - Computational Theory and Mathematics
  - Computer Science Applications
  - Information Systems
- Social Sciences (all)
  - Linguistics and Language
Cite this
Findings of the Association for Computational Linguistics: EMNLP 2024. Ed. / Yaser Al-Onaizan; Mohit Bansal; Yun-Nung Chen. 2024. pp. 4727-4739.
Research output: Chapter in book/report/anthology/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Rethinking Evaluation Methods for Machine Unlearning
AU - Wichert, Leon
AU - Sikdar, Sandipan
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024/11/12
Y1 - 2024/11/12
UR - http://www.scopus.com/inward/record.url?scp=85217615154&partnerID=8YFLogxK
U2 - 10.18653/v1/2024.findings-emnlp.271
DO - 10.18653/v1/2024.findings-emnlp.271
M3 - Conference contribution
AN - SCOPUS:85217615154
SP - 4727
EP - 4739
BT - Findings of the Association for Computational Linguistics
A2 - Al-Onaizan, Yaser
A2 - Bansal, Mohit
A2 - Chen, Yun-Nung
T2 - 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024
Y2 - 12 November 2024 through 16 November 2024
ER -