Fair-Capacitated Clustering

Tai Le Quy; Arjun Roy; Gunnar Friege; Eirini Ntoutsi

doi:10.48550/arXiv.2104.12116

Details

Original language	English
Title of host publication	Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021)
Editors	I-Han Hsiao, Shaghayegh Sahebi, Francois Bouchet, Jill-Jenn Vie
Pages	407-414
Publication status	Published - 2021
Event	14th International Conference on Educational Data Mining 2021 - Paris, France Duration: 29 June 2021 → 2 July 2021 Conference number: 14

Abstract

Traditionally, clustering algorithms focus on partitioning the data into groups of similar instances. The similarity objective, however, is not sufficient in applications where a fair representation of the groups in terms of protected attributes, such as gender or race, is required for each cluster. Moreover, in many applications, the clusters must have balanced cardinalities to be useful for the end user. Our motivation comes from the education domain, where studies indicate that students might learn better in diverse groups and where groups of similar cardinality are more practical, e.g., for group assignments. To this end, we introduce the fair-capacitated clustering problem, which partitions the data into clusters of similar instances while ensuring cluster fairness and balancing cluster cardinalities. We propose a two-step solution to the problem: i) we rely on fairlets to generate minimal sets that satisfy the fairness constraint, and ii) we propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain the fair-capacitated clustering. The hierarchical approach embeds the additional cardinality requirements in the merging step, while the partitioning-based one alters the assignment step using a knapsack problem formulation to satisfy the additional requirements. Our experiments on four educational datasets show that our approaches deliver well-balanced clusters in terms of both fairness and cardinality while maintaining good clustering quality.
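The two properties named in the abstract, cluster fairness and balanced cardinality, can be checked on a finished clustering with a few lines of Python. The sketch below is illustrative only and is not the authors' implementation: it assumes a binary protected attribute and uses the balance measure common in the fairlet literature (the smaller of the two group ratios), which the abstract itself does not specify. The function names and the toy data are hypothetical.

```python
from collections import Counter


def balance(protected_values):
    """Balance of one cluster for a binary protected attribute.

    Returns min(a/b, b/a) for the two group counts a and b, or 0.0 when
    fewer than two groups are present. The measure is an assumption of
    this sketch; it is not stated in the abstract.
    """
    counts = Counter(protected_values)
    if len(counts) > 2:
        raise ValueError("this sketch handles a binary protected attribute only")
    if len(counts) < 2:
        return 0.0
    a, b = counts.values()
    return min(a / b, b / a)


def summarize(clusters):
    """Return one (cardinality, balance) pair per cluster."""
    return [(len(members), balance(members)) for members in clusters]


# Hypothetical toy data: each cluster lists the protected value of its members.
clusters = [["F", "M", "F", "M"], ["F", "M", "M"]]
summary = summarize(clusters)
print(summary)
```

A clustering in the sense of the abstract would aim for cardinalities that are close to each other and balance values that stay above a chosen threshold in every cluster.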

Cite

Fair-Capacitated Clustering. / Quy, Tai Le; Roy, Arjun; Friege, Gunnar et al.
Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021). Eds. / I-Han Hsiao; Shaghayegh Sahebi; Francois Bouchet; Jill-Jenn Vie. 2021. pp. 407-414.

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Quy, TL, Roy, A, Friege, G & Ntoutsi, E 2021, Fair-Capacitated Clustering. in I-H Hsiao, S Sahebi, F Bouchet & J-J Vie (eds), Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021). pp. 407-414, 14th International Conference on Educational Data Mining 2021, Paris, France, 29 June 2021. https://doi.org/10.48550/arXiv.2104.12116

Quy, T. L., Roy, A., Friege, G., & Ntoutsi, E. (2021). Fair-Capacitated Clustering. In I.-H. Hsiao, S. Sahebi, F. Bouchet, & J.-J. Vie (Eds.), Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021) (pp. 407-414). https://doi.org/10.48550/arXiv.2104.12116

Quy TL, Roy A, Friege G, Ntoutsi E. Fair-Capacitated Clustering. In Hsiao IH, Sahebi S, Bouchet F, Vie JJ, editors, Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021). 2021. p. 407-414. doi: 10.48550/arXiv.2104.12116

Quy, Tai Le; Roy, Arjun; Friege, Gunnar et al. / Fair-Capacitated Clustering. Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021). Eds. / I-Han Hsiao; Shaghayegh Sahebi; Francois Bouchet; Jill-Jenn Vie. 2021. pp. 407-414

@inproceedings{2409bf525208449ab36719189e9ff374,

title = "Fair-Capacitated Clustering",

abstract = "Traditionally, clustering algorithms focus on partitioning the data into groups of similar instances. The similarity objective, however, is not sufficient in applications where a fair representation of the groups in terms of protected attributes, such as gender or race, is required for each cluster. Moreover, in many applications, the clusters must have balanced cardinalities to be useful for the end user. Our motivation comes from the education domain, where studies indicate that students might learn better in diverse groups and where groups of similar cardinality are more practical, e.g., for group assignments. To this end, we introduce the fair-capacitated clustering problem, which partitions the data into clusters of similar instances while ensuring cluster fairness and balancing cluster cardinalities. We propose a two-step solution to the problem: i) we rely on fairlets to generate minimal sets that satisfy the fairness constraint, and ii) we propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain the fair-capacitated clustering. The hierarchical approach embeds the additional cardinality requirements in the merging step, while the partitioning-based one alters the assignment step using a knapsack problem formulation to satisfy the additional requirements. Our experiments on four educational datasets show that our approaches deliver well-balanced clusters in terms of both fairness and cardinality while maintaining good clustering quality.",

keywords = "cs.LG, cs.DC",

author = "Quy, {Tai Le} and Arjun Roy and Gunnar Friege and Eirini Ntoutsi",

year = "2021",

doi = "10.48550/arXiv.2104.12116",

language = "English",

pages = "407--414",

editor = "I-Han Hsiao and Shaghayegh Sahebi and Francois Bouchet and Jill-Jenn Vie",

booktitle = "Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021)",

note = "14th International Conference on Educational Data Mining 2021 ; Conference date: 29-06-2021 Through 02-07-2021",

}

TY - GEN

T1 - Fair-Capacitated Clustering

AU - Quy, Tai Le

AU - Roy, Arjun

AU - Friege, Gunnar

AU - Ntoutsi, Eirini

N1 - Conference code: 14

PY - 2021

Y1 - 2021

AB - Traditionally, clustering algorithms focus on partitioning the data into groups of similar instances. The similarity objective, however, is not sufficient in applications where a fair representation of the groups in terms of protected attributes, such as gender or race, is required for each cluster. Moreover, in many applications, the clusters must have balanced cardinalities to be useful for the end user. Our motivation comes from the education domain, where studies indicate that students might learn better in diverse groups and where groups of similar cardinality are more practical, e.g., for group assignments. To this end, we introduce the fair-capacitated clustering problem, which partitions the data into clusters of similar instances while ensuring cluster fairness and balancing cluster cardinalities. We propose a two-step solution to the problem: i) we rely on fairlets to generate minimal sets that satisfy the fairness constraint, and ii) we propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain the fair-capacitated clustering. The hierarchical approach embeds the additional cardinality requirements in the merging step, while the partitioning-based one alters the assignment step using a knapsack problem formulation to satisfy the additional requirements. Our experiments on four educational datasets show that our approaches deliver well-balanced clusters in terms of both fairness and cardinality while maintaining good clustering quality.

KW - cs.LG

KW - cs.DC

UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85124270100&origin=inward&txGid=d311b1b95938fe8f2d065bd87ba9e712

U2 - 10.48550/arXiv.2104.12116

DO - 10.48550/arXiv.2104.12116

M3 - Conference contribution

SP - 407

EP - 414

BT - Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021)

A2 - Hsiao, I-Han

A2 - Sahebi, Shaghayegh

A2 - Bouchet, Francois

A2 - Vie, Jill-Jenn

T2 - 14th International Conference on Educational Data Mining 2021

Y2 - 29 June 2021 through 2 July 2021

ER -
