Fair-Capacitated Clustering

Publikation: Beitrag in Buch/Bericht/Sammelwerk/KonferenzbandAufsatz in KonferenzbandForschungPeer-Review

Autoren

  • Tai Le Quy
  • Arjun Roy
  • Gunnar Friege
  • Eirini Ntoutsi

Externe Organisationen

  • Freie Universität Berlin (FU Berlin)
Forschungs-netzwerk anzeigen

Details

OriginalspracheEnglisch
Titel des SammelwerksProceedings of The 14th International Conference on Educational Data Mining (EDM 2021)
Herausgeber/-innenI-Han Hsiao, Shaghayegh Sahebi, Francois Bouchet, Jill-Jenn Vie
Seiten407-414
PublikationsstatusVeröffentlicht - 2021
Veranstaltung14th International Conference on Educational Data Mining 2021 - Paris, Frankreich
Dauer: 29 Juni 20212 Juli 2021
Konferenznummer: 14

Abstract

Traditionally, clustering algorithms focus on partitioning the data into groups of similar instances. The similarity objective, however, is not sufficient in applications where a fair-representation of the groups in terms of protected attributes like gender or race, is required for each cluster. Moreover, in many applications, to make the clusters useful for the end-user, a balanced cardinality among the clusters is required. Our motivation comes from the education domain where studies indicate that students might learn better in diverse student groups and of course groups of similar cardinality are more practical e.g., for group assignments. To this end, we introduce the fair-capacitated clustering problem that partitions the data into clusters of similar instances while ensuring cluster fairness and balancing cluster cardinalities. We propose a two-step solution to the problem: i) we rely on fairlets to generate minimal sets that satisfy the fair constraint and ii) we propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain the fair-capacitated clustering. The hierarchical approach embeds the additional cardinality requirements during the merging step while the partitioning-based one alters the assignment step using a knapsack problem formulation to satisfy the additional requirements. Our experiments on four educational datasets show that our approaches deliver well-balanced clusters in terms of both fairness and cardinality while maintaining a good clustering quality.

Zitieren

Fair-Capacitated Clustering. / Quy, Tai Le; Roy, Arjun; Friege, Gunnar et al.
Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021). Hrsg. / I-Han Hsiao; Shaghayegh Sahebi; Francois Bouchet; Jill-Jenn Vie . 2021. S. 407-414.

Publikation: Beitrag in Buch/Bericht/Sammelwerk/KonferenzbandAufsatz in KonferenzbandForschungPeer-Review

Quy, TL, Roy, A, Friege, G & Ntoutsi, E 2021, Fair-Capacitated Clustering. in I-H Hsiao, S Sahebi, F Bouchet & J-J Vie (Hrsg.), Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021). S. 407-414, 14th International Conference on Educational Data Mining 2021, Paris, Frankreich, 29 Juni 2021. https://doi.org/10.48550/arXiv.2104.12116
Quy, T. L., Roy, A., Friege, G., & Ntoutsi, E. (2021). Fair-Capacitated Clustering. In I.-H. Hsiao, S. Sahebi, F. Bouchet, & J.-J. Vie (Hrsg.), Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021) (S. 407-414) https://doi.org/10.48550/arXiv.2104.12116
Quy TL, Roy A, Friege G, Ntoutsi E. Fair-Capacitated Clustering. in Hsiao IH, Sahebi S, Bouchet F, Vie JJ, Hrsg., Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021). 2021. S. 407-414 doi: 10.48550/arXiv.2104.12116
Quy, Tai Le ; Roy, Arjun ; Friege, Gunnar et al. / Fair-Capacitated Clustering. Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021). Hrsg. / I-Han Hsiao ; Shaghayegh Sahebi ; Francois Bouchet ; Jill-Jenn Vie . 2021. S. 407-414
Download
@inproceedings{2409bf525208449ab36719189e9ff374,
title = "Fair-Capacitated Clustering",
abstract = " Traditionally, clustering algorithms focus on partitioning the data into groups of similar instances. The similarity objective, however, is not sufficient in applications where a fair-representation of the groups in terms of protected attributes like gender or race, is required for each cluster. Moreover, in many applications, to make the clusters useful for the end-user, a balanced cardinality among the clusters is required. Our motivation comes from the education domain where studies indicate that students might learn better in diverse student groups and of course groups of similar cardinality are more practical e.g., for group assignments. To this end, we introduce the fair-capacitated clustering problem that partitions the data into clusters of similar instances while ensuring cluster fairness and balancing cluster cardinalities. We propose a two-step solution to the problem: i) we rely on fairlets to generate minimal sets that satisfy the fair constraint and ii) we propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain the fair-capacitated clustering. The hierarchical approach embeds the additional cardinality requirements during the merging step while the partitioning-based one alters the assignment step using a knapsack problem formulation to satisfy the additional requirements. Our experiments on four educational datasets show that our approaches deliver well-balanced clusters in terms of both fairness and cardinality while maintaining a good clustering quality. ",
keywords = "cs.LG, cs.DC",
author = "Quy, {Tai Le} and Arjun Roy and Gunnar Friege and Eirini Ntoutsi",
year = "2021",
doi = "10.48550/arXiv.2104.12116",
language = "English",
pages = "407--414",
editor = "I-Han Hsiao and Sahebi, {Shaghayegh } and Francois Bouchet and {Vie }, Jill-Jenn",
booktitle = "Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021)",
note = "14th International Conference on Educational Data Mining 2021 ; Conference date: 29-06-2021 Through 02-07-2021",

}

Download

TY - GEN

T1 - Fair-Capacitated Clustering

AU - Quy, Tai Le

AU - Roy, Arjun

AU - Friege, Gunnar

AU - Ntoutsi, Eirini

N1 - Conference code: 14

PY - 2021

Y1 - 2021

N2 - Traditionally, clustering algorithms focus on partitioning the data into groups of similar instances. The similarity objective, however, is not sufficient in applications where a fair-representation of the groups in terms of protected attributes like gender or race, is required for each cluster. Moreover, in many applications, to make the clusters useful for the end-user, a balanced cardinality among the clusters is required. Our motivation comes from the education domain where studies indicate that students might learn better in diverse student groups and of course groups of similar cardinality are more practical e.g., for group assignments. To this end, we introduce the fair-capacitated clustering problem that partitions the data into clusters of similar instances while ensuring cluster fairness and balancing cluster cardinalities. We propose a two-step solution to the problem: i) we rely on fairlets to generate minimal sets that satisfy the fair constraint and ii) we propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain the fair-capacitated clustering. The hierarchical approach embeds the additional cardinality requirements during the merging step while the partitioning-based one alters the assignment step using a knapsack problem formulation to satisfy the additional requirements. Our experiments on four educational datasets show that our approaches deliver well-balanced clusters in terms of both fairness and cardinality while maintaining a good clustering quality.

AB - Traditionally, clustering algorithms focus on partitioning the data into groups of similar instances. The similarity objective, however, is not sufficient in applications where a fair-representation of the groups in terms of protected attributes like gender or race, is required for each cluster. Moreover, in many applications, to make the clusters useful for the end-user, a balanced cardinality among the clusters is required. Our motivation comes from the education domain where studies indicate that students might learn better in diverse student groups and of course groups of similar cardinality are more practical e.g., for group assignments. To this end, we introduce the fair-capacitated clustering problem that partitions the data into clusters of similar instances while ensuring cluster fairness and balancing cluster cardinalities. We propose a two-step solution to the problem: i) we rely on fairlets to generate minimal sets that satisfy the fair constraint and ii) we propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain the fair-capacitated clustering. The hierarchical approach embeds the additional cardinality requirements during the merging step while the partitioning-based one alters the assignment step using a knapsack problem formulation to satisfy the additional requirements. Our experiments on four educational datasets show that our approaches deliver well-balanced clusters in terms of both fairness and cardinality while maintaining a good clustering quality.

KW - cs.LG

KW - cs.DC

U2 - 10.48550/arXiv.2104.12116

DO - 10.48550/arXiv.2104.12116

M3 - Conference contribution

SP - 407

EP - 414

BT - Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021)

A2 - Hsiao, I-Han

A2 - Sahebi, Shaghayegh

A2 - Bouchet, Francois

A2 - Vie , Jill-Jenn

T2 - 14th International Conference on Educational Data Mining 2021

Y2 - 29 June 2021 through 2 July 2021

ER -