Details
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | Findings of the Association for Computational Linguistics |
| Subtitle of host publication | EMNLP 2024 |
| Editors | Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 4727-4739 |
| Number of pages | 13 |
| ISBN (electronic) | 9798891761681 |
| Publication status | Published - 12 Nov 2024 |
| Event | 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024 - Hybrid, Miami, United States |
| Event duration | 12 Nov 2024 → 16 Nov 2024 |
Abstract
Machine *unlearning* refers to methods for deleting information about specific training instances from a trained machine learning model, enabling models to remove user information and comply with privacy regulations. While retraining the model from scratch on the training set excluding the instances to be “forgotten” would yield the desired unlearned model, this is infeasible given the size of modern datasets and models. Unlearning algorithms have therefore been developed, with the goal of obtaining an unlearned model that behaves as closely as possible to the retrained model. Evaluating an unlearning method consequently involves (i) randomly selecting a forget set (i.e., the training instances to be unlearned), (ii) obtaining an unlearned and a retrained model, and (iii) comparing the performance of the unlearned and retrained models on the test and forget sets. However, when the forget set is selected at random, the unlearned model is almost always similar to the original (i.e., pre-unlearning) model, making it unclear whether the model really unlearned or simply retained the original weights. For a more robust evaluation, we instead propose building the forget set from training instances with significant influence on the trained model. When such influential instances are included in the forget set, we observe that the unlearned model deviates significantly from the retrained model; such deviations are also observed as the size of the forget set increases. Lastly, the choice of evaluation dataset can itself lead to misleading interpretations of results.
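To make the evaluation protocol in the abstract concrete, the following is a minimal, self-contained Python sketch (not the authors' code): it trains a toy logistic-regression model, builds a forget set either at random or from high-"influence" instances, and compares an unlearned model against the retrained-from-scratch reference. The gradient-norm influence proxy and the gradient-ascent unlearning baseline are illustrative assumptions chosen for brevity, not the paper's actual methods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two Gaussian blobs in d dimensions.
n, d = 400, 10
X = np.vstack([rng.normal(-1.0, 1.0, (n // 2, d)),
               rng.normal(1.0, 1.0, (n // 2, d))])
y = np.concatenate([np.zeros(n // 2), np.ones(n // 2)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, epochs=300, lr=0.5):
    """Logistic regression via full-batch gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

def unlearn(w, X_f, y_f, steps=5, lr=0.5):
    """A simple unlearning baseline (NOT the paper's method):
    gradient ascent on the forget-set loss, starting from w."""
    w = w.copy()
    for _ in range(steps):
        w += lr * X_f.T @ (sigmoid(X_f @ w) - y_f) / len(y_f)
    return w

w_orig = train(X, y)

# Step (i): choose a forget set. As an assumed stand-in for real influence
# scores, rank instances by the norm of their per-example loss gradient
# at the trained weights w_orig.
per_example_grad = (sigmoid(X @ w_orig) - y)[:, None] * X
influence_proxy = np.linalg.norm(per_example_grad, axis=1)

k = 40
forget_random = rng.choice(n, size=k, replace=False)
forget_influential = np.argsort(-influence_proxy)[:k]

def accuracy(w, idx):
    return np.mean((sigmoid(X[idx] @ w) > 0.5) == y[idx])

for name, forget in [("random", forget_random),
                     ("influential", forget_influential)]:
    retain = np.setdiff1d(np.arange(n), forget)
    w_retrained = train(X[retain], y[retain])        # step (ii): gold standard
    w_unlearned = unlearn(w_orig, X[forget], y[forget])
    # Step (iii): compare behaviour on the forget set.
    print(f"{name:12s} forget-set acc: "
          f"original={accuracy(w_orig, forget):.2f} "
          f"retrained={accuracy(w_retrained, forget):.2f} "
          f"unlearned={accuracy(w_unlearned, forget):.2f}")
```

On toy data like this, a randomly chosen forget set typically leaves the original, retrained, and unlearned models nearly indistinguishable on the forget set, which mirrors the evaluation blind spot the abstract describes; an influence-ranked forget set is where gaps between the unlearned and retrained models become visible.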
ASJC Scopus subject areas
- Computer Science (all)
- Computational Theory and Mathematics
- Computer Science Applications
- Information Systems
- Social Sciences (all)
- Linguistics and Language
Cite this
Rethinking Evaluation Methods for Machine Unlearning. / Wichert, Leon; Sikdar, Sandipan. Findings of the Association for Computational Linguistics: EMNLP 2024. ed. / Yaser Al-Onaizan; Mohit Bansal; Yun-Nung Chen. Association for Computational Linguistics (ACL), 2024. p. 4727-4739.
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Rethinking Evaluation Methods for Machine Unlearning
AU - Wichert, Leon
AU - Sikdar, Sandipan
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024/11/12
Y1 - 2024/11/12
AB - Machine *unlearning* refers to methods for deleting information about specific training instances from a trained machine learning model, enabling models to remove user information and comply with privacy regulations. While retraining the model from scratch on the training set excluding the instances to be “forgotten” would yield the desired unlearned model, this is infeasible given the size of modern datasets and models. Unlearning algorithms have therefore been developed, with the goal of obtaining an unlearned model that behaves as closely as possible to the retrained model. Evaluating an unlearning method consequently involves (i) randomly selecting a forget set (i.e., the training instances to be unlearned), (ii) obtaining an unlearned and a retrained model, and (iii) comparing the performance of the unlearned and retrained models on the test and forget sets. However, when the forget set is selected at random, the unlearned model is almost always similar to the original (i.e., pre-unlearning) model, making it unclear whether the model really unlearned or simply retained the original weights. For a more robust evaluation, we instead propose building the forget set from training instances with significant influence on the trained model. When such influential instances are included in the forget set, we observe that the unlearned model deviates significantly from the retrained model; such deviations are also observed as the size of the forget set increases. Lastly, the choice of evaluation dataset can itself lead to misleading interpretations of results.
UR - http://www.scopus.com/inward/record.url?scp=85217615154&partnerID=8YFLogxK
U2 - 10.18653/v1/2024.findings-emnlp.271
DO - 10.18653/v1/2024.findings-emnlp.271
M3 - Conference contribution
AN - SCOPUS:85217615154
SP - 4727
EP - 4739
BT - Findings of the Association for Computational Linguistics
A2 - Al-Onaizan, Yaser
A2 - Bansal, Mohit
A2 - Chen, Yun-Nung
PB - Association for Computational Linguistics (ACL)
T2 - 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024
Y2 - 12 November 2024 through 16 November 2024
ER -