
Rethinking Evaluation Methods for Machine Unlearning

Publication: Chapter in book/report/conference proceedings › Conference paper › Research › Peer-reviewed

Authors

  • Leon Wichert
  • Sandipan Sikdar

Organizational units

Details

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle: EMNLP 2024
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Pages: 4727-4739
Number of pages: 13
ISBN (electronic): 9798891761681
Publication status: Published - 12 Nov 2024
Event: 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024 - Hybrid, Miami, United States
Duration: 12 Nov 2024 - 16 Nov 2024

Abstract

Machine *unlearning* refers to methods for deleting information about specific training instances from a trained machine learning model. This enables models to delete user information and comply with privacy regulations. While retraining the model from scratch on the training set, excluding the instances to be “forgotten”, would yield the desired unlearned model, this is infeasible owing to the size of datasets and models. Hence, unlearning algorithms have been developed, with the goal of obtaining an unlearned model that behaves as closely as possible to the retrained model. Consequently, evaluating an unlearning method involves (i) randomly selecting a forget set (i.e., the training instances to be unlearned), (ii) obtaining an unlearned and a retrained model, and (iii) comparing the performance of the unlearned and the retrained model on the test and forget sets. However, when the forget set is randomly selected, the unlearned model is almost always similar to the original (i.e., pre-unlearning) model. Hence, it is unclear whether the model really unlearned the instances or simply copied the weights from the original model. For a more robust evaluation, we instead propose to consider training instances with significant influence on the trained model. When such influential instances are included in the forget set, we observe that the unlearned model deviates significantly from the retrained model. Such deviations are also observed when the size of the forget set is increased. Lastly, the choice of evaluation dataset can also lead to misleading interpretations of the results.
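The protocol described in the abstract (pick a forget set, unlearn, retrain, compare the two models) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the precomputed influence scores, and the simple disagreement metric are illustrative placeholders; the paper's actual influence estimation and comparison measures are more involved.

```python
import random

def select_forget_set(influence_scores, k, strategy="influential"):
    """Pick k training indices to unlearn.

    strategy="random" mirrors the common evaluation setup the paper
    critiques; strategy="influential" picks the k instances with the
    highest (hypothetical, precomputed) influence on the trained model,
    in the spirit of what the paper proposes.
    """
    indices = list(range(len(influence_scores)))
    if strategy == "random":
        return set(random.sample(indices, k))
    # Rank instances by influence score, highest first.
    ranked = sorted(indices, key=lambda i: influence_scores[i], reverse=True)
    return set(ranked[:k])

def model_disagreement(preds_a, preds_b):
    """Fraction of instances on which two models' predictions differ:
    a crude stand-in for comparing the unlearned model against the
    retrained reference model on the test and forget sets."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)

# Illustrative usage with made-up scores and predictions:
scores = [0.9, 0.1, 0.5, 0.7]
forget = select_forget_set(scores, 2)           # {0, 3}: most influential
gap = model_disagreement([1, 0, 1, 1], [1, 1, 1, 0])  # 0.5
```

Under random selection the forget set rarely contains influential points, so the disagreement between the unlearned and original models stays near zero; selecting by influence is what makes the comparison informative.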

ASJC Scopus subject areas

Cite

Rethinking Evaluation Methods for Machine Unlearning. / Wichert, Leon; Sikdar, Sandipan.
Findings of the Association for Computational Linguistics: EMNLP 2024. Ed. / Yaser Al-Onaizan; Mohit Bansal; Yun-Nung Chen. 2024. pp. 4727-4739.

Publication: Chapter in book/report/conference proceedings › Conference paper › Research › Peer-reviewed

Wichert, L & Sikdar, S 2024, Rethinking Evaluation Methods for Machine Unlearning. in Y Al-Onaizan, M Bansal & Y-N Chen (eds), Findings of the Association for Computational Linguistics: EMNLP 2024. pp. 4727-4739, 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024, Hybrid, Miami, United States, 12 Nov 2024. https://doi.org/10.18653/v1/2024.findings-emnlp.271
Wichert, L., & Sikdar, S. (2024). Rethinking Evaluation Methods for Machine Unlearning. In Y. Al-Onaizan, M. Bansal, & Y.-N. Chen (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2024 (pp. 4727-4739). https://doi.org/10.18653/v1/2024.findings-emnlp.271
Wichert L, Sikdar S. Rethinking Evaluation Methods for Machine Unlearning. In Al-Onaizan Y, Bansal M, Chen YN, editors, Findings of the Association for Computational Linguistics: EMNLP 2024. 2024. p. 4727-4739. doi: 10.18653/v1/2024.findings-emnlp.271
Wichert, Leon ; Sikdar, Sandipan. / Rethinking Evaluation Methods for Machine Unlearning. Findings of the Association for Computational Linguistics: EMNLP 2024. Ed. / Yaser Al-Onaizan ; Mohit Bansal ; Yun-Nung Chen. 2024. pp. 4727-4739
BibTeX
@inproceedings{a7d13687cdba42899327e26944099c2a,
title = "Rethinking Evaluation Methods for Machine Unlearning",
abstract = "Machine *unlearning* refers to methods for deleting information about specific training instances from a trained machine learning model. This enables models to delete user information and comply with privacy regulations. While retraining the model from scratch on the training set, excluding the instances to be “forgotten”, would yield the desired unlearned model, this is infeasible owing to the size of datasets and models. Hence, unlearning algorithms have been developed, with the goal of obtaining an unlearned model that behaves as closely as possible to the retrained model. Consequently, evaluating an unlearning method involves (i) randomly selecting a forget set (i.e., the training instances to be unlearned), (ii) obtaining an unlearned and a retrained model, and (iii) comparing the performance of the unlearned and the retrained model on the test and forget sets. However, when the forget set is randomly selected, the unlearned model is almost always similar to the original (i.e., pre-unlearning) model. Hence, it is unclear whether the model really unlearned the instances or simply copied the weights from the original model. For a more robust evaluation, we instead propose to consider training instances with significant influence on the trained model. When such influential instances are included in the forget set, we observe that the unlearned model deviates significantly from the retrained model. Such deviations are also observed when the size of the forget set is increased. Lastly, the choice of evaluation dataset can also lead to misleading interpretations of the results.",
author = "Leon Wichert and Sandipan Sikdar",
note = "Publisher Copyright: {\textcopyright} 2024 Association for Computational Linguistics.; 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024 ; Conference date: 12-11-2024 Through 16-11-2024",
year = "2024",
month = nov,
day = "12",
doi = "10.18653/v1/2024.findings-emnlp.271",
language = "English",
pages = "4727--4739",
editor = "Yaser Al-Onaizan and Mohit Bansal and Yun-Nung Chen",
booktitle = "Findings of the Association for Computational Linguistics",

}

RIS

TY - GEN

T1 - Rethinking Evaluation Methods for Machine Unlearning

AU - Wichert, Leon

AU - Sikdar, Sandipan

N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.

PY - 2024/11/12

Y1 - 2024/11/12

N2 - Machine *unlearning* refers to methods for deleting information about specific training instances from a trained machine learning model. This enables models to delete user information and comply with privacy regulations. While retraining the model from scratch on the training set, excluding the instances to be “forgotten”, would yield the desired unlearned model, this is infeasible owing to the size of datasets and models. Hence, unlearning algorithms have been developed, with the goal of obtaining an unlearned model that behaves as closely as possible to the retrained model. Consequently, evaluating an unlearning method involves (i) randomly selecting a forget set (i.e., the training instances to be unlearned), (ii) obtaining an unlearned and a retrained model, and (iii) comparing the performance of the unlearned and the retrained model on the test and forget sets. However, when the forget set is randomly selected, the unlearned model is almost always similar to the original (i.e., pre-unlearning) model. Hence, it is unclear whether the model really unlearned the instances or simply copied the weights from the original model. For a more robust evaluation, we instead propose to consider training instances with significant influence on the trained model. When such influential instances are included in the forget set, we observe that the unlearned model deviates significantly from the retrained model. Such deviations are also observed when the size of the forget set is increased. Lastly, the choice of evaluation dataset can also lead to misleading interpretations of the results.

AB - Machine *unlearning* refers to methods for deleting information about specific training instances from a trained machine learning model. This enables models to delete user information and comply with privacy regulations. While retraining the model from scratch on the training set, excluding the instances to be “forgotten”, would yield the desired unlearned model, this is infeasible owing to the size of datasets and models. Hence, unlearning algorithms have been developed, with the goal of obtaining an unlearned model that behaves as closely as possible to the retrained model. Consequently, evaluating an unlearning method involves (i) randomly selecting a forget set (i.e., the training instances to be unlearned), (ii) obtaining an unlearned and a retrained model, and (iii) comparing the performance of the unlearned and the retrained model on the test and forget sets. However, when the forget set is randomly selected, the unlearned model is almost always similar to the original (i.e., pre-unlearning) model. Hence, it is unclear whether the model really unlearned the instances or simply copied the weights from the original model. For a more robust evaluation, we instead propose to consider training instances with significant influence on the trained model. When such influential instances are included in the forget set, we observe that the unlearned model deviates significantly from the retrained model. Such deviations are also observed when the size of the forget set is increased. Lastly, the choice of evaluation dataset can also lead to misleading interpretations of the results.

UR - http://www.scopus.com/inward/record.url?scp=85217615154&partnerID=8YFLogxK

U2 - 10.18653/v1/2024.findings-emnlp.271

DO - 10.18653/v1/2024.findings-emnlp.271

M3 - Conference contribution

AN - SCOPUS:85217615154

SP - 4727

EP - 4739

BT - Findings of the Association for Computational Linguistics

A2 - Al-Onaizan, Yaser

A2 - Bansal, Mohit

A2 - Chen, Yun-Nung

T2 - 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024

Y2 - 12 November 2024 through 16 November 2024

ER -