Supervised Contrastive Learning Approach for Contextual Ranking

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Abhijit Anand
  • Jurek Leonhardt
  • Koustav Rudra
  • Avishek Anand

External Research Organisations

  • Indian School of Mines University
  • Delft University of Technology

Details

Original language: English
Title of host publication: ICTIR 2022
Subtitle of host publication: Proceedings of the 2022 ACM SIGIR International Conference on the Theory of Information Retrieval
Pages: 61-71
Number of pages: 11
ISBN (electronic): 9781450394123
Publication status: Published - 25 Aug 2022
Event: 8th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2022 - Virtual, Online, Spain
Duration: 11 Jul 2022 - 12 Jul 2022

Abstract

Contextual ranking models have delivered impressive performance improvements over classical models in the document ranking task. However, these highly over-parameterized models tend to be data-hungry and require large amounts of data even for fine-tuning. This paper proposes a simple yet effective method to improve ranking performance on smaller datasets using supervised contrastive learning for the document ranking problem. We perform data augmentation by creating training data using parts of the relevant documents in the query-document pairs. We then use a supervised contrastive learning objective to learn an effective ranking model from the augmented dataset. Our experiments on subsets of the TREC-DL dataset show that, although data augmentation increases the size of the training data, it does not necessarily improve performance under existing pointwise or pairwise training objectives. However, our proposed supervised contrastive loss objective leads to performance improvements over the standard non-augmented setting, showcasing the utility of data augmentation with contrastive losses. Finally, we demonstrate the real benefit of supervised contrastive learning objectives through marked improvements on smaller ranking datasets relating to news (Robust04), finance (FiQA), and scientific fact checking (SciFact).
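The abstract describes two ingredients: augmenting training data by splitting relevant documents into passages, and training with a supervised contrastive objective. The sketch below is illustrative only and is not the authors' implementation; it uses the generic SupCon loss formulation (Khosla et al., 2020), where augmented passages sharing a relevance label act as positives for one another. All function names, the `passage_len` parameter, and the toy embeddings are hypothetical.

```python
import numpy as np

def make_augmented_examples(query, relevant_doc, passage_len=50):
    """Hypothetical augmentation: split a relevant document into passages;
    each passage inherits the query's relevance label, enlarging the training set."""
    words = relevant_doc.split()
    return [(query, " ".join(words[i:i + passage_len]))
            for i in range(0, len(words), passage_len)]

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive (SupCon) loss: for each anchor, every other
    sample with the same label is a positive; all remaining samples are negatives."""
    # L2-normalize so dot products are cosine similarities
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    per_anchor = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # anchors without positives contribute nothing
        # log of the softmax denominator over all non-anchor samples
        log_denom = np.log(np.sum(np.exp(sim[i, others])))
        per_anchor.append(-np.mean([sim[i, j] - log_denom for j in positives]))
    return float(np.mean(per_anchor))
```

Embeddings that cluster by label yield a lower loss than embeddings that do not, which is the property the training objective exploits to pull same-document passages together in representation space.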

Keywords

    data augmentation, document ranking, interpolation, ranking performance, supervised contrastive loss

Cite this

Supervised Contrastive Learning Approach for Contextual Ranking. / Anand, Abhijit; Leonhardt, Jurek; Rudra, Koustav et al.
ICTIR 2022: Proceedings of the 2022 ACM SIGIR International Conference on the Theory of Information Retrieval. 2022. p. 61-71.


Anand, A, Leonhardt, J, Rudra, K & Anand, A 2022, Supervised Contrastive Learning Approach for Contextual Ranking. in ICTIR 2022: Proceedings of the 2022 ACM SIGIR International Conference on the Theory of Information Retrieval. pp. 61-71, 8th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2022, Virtual, Online, Spain, 11 Jul 2022. https://doi.org/10.48550/arXiv.2207.03153, https://doi.org/10.1145/3539813.3545139
Anand, A., Leonhardt, J., Rudra, K., & Anand, A. (2022). Supervised Contrastive Learning Approach for Contextual Ranking. In ICTIR 2022: Proceedings of the 2022 ACM SIGIR International Conference on the Theory of Information Retrieval (pp. 61-71) https://doi.org/10.48550/arXiv.2207.03153, https://doi.org/10.1145/3539813.3545139
Anand A, Leonhardt J, Rudra K, Anand A. Supervised Contrastive Learning Approach for Contextual Ranking. In ICTIR 2022: Proceedings of the 2022 ACM SIGIR International Conference on the Theory of Information Retrieval. 2022. p. 61-71 doi: 10.48550/arXiv.2207.03153, 10.1145/3539813.3545139
Anand, Abhijit ; Leonhardt, Jurek ; Rudra, Koustav et al. / Supervised Contrastive Learning Approach for Contextual Ranking. ICTIR 2022: Proceedings of the 2022 ACM SIGIR International Conference on the Theory of Information Retrieval. 2022. pp. 61-71
@inproceedings{87a9195aa1eb4aa489e760fa505ba9b5,
title = "Supervised Contrastive Learning Approach for Contextual Ranking",
abstract = "Contextual ranking models have delivered impressive performance improvements over classical models in the document ranking task. However, these highly over-parameterized models tend to be data-hungry and require large amounts of data even for fine-tuning. This paper proposes a simple yet effective method to improve ranking performance on smaller datasets using supervised contrastive learning for the document ranking problem. We perform data augmentation by creating training data using parts of the relevant documents in the query-document pairs. We then use a supervised contrastive learning objective to learn an effective ranking model from the augmented dataset. Our experiments on subsets of the TREC-DL dataset show that, although data augmentation increases the size of the training data, it does not necessarily improve performance under existing pointwise or pairwise training objectives. However, our proposed supervised contrastive loss objective leads to performance improvements over the standard non-augmented setting, showcasing the utility of data augmentation with contrastive losses. Finally, we demonstrate the real benefit of supervised contrastive learning objectives through marked improvements on smaller ranking datasets relating to news (Robust04), finance (FiQA), and scientific fact checking (SciFact).",
keywords = "data augmentation, document ranking, interpolation, ranking performance, supervised contrastive loss",
author = "Abhijit Anand and Jurek Leonhardt and Koustav Rudra and Avishek Anand",
note = "Funding Information: This work is supported by the European Union – Horizon 2020 Program under the scheme “INFRAIA-01-2018-2019 – Integrating Activities for Advanced Communities”, Grant Agreement n.871042, “SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics” (http://www.sobigdata.eu).; 8th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2022 ; Conference date: 11-07-2022 Through 12-07-2022",
year = "2022",
month = aug,
day = "25",
doi = "10.48550/arXiv.2207.03153",
language = "English",
pages = "61--71",
booktitle = "ICTIR 2022",

}


TY - GEN

T1 - Supervised Contrastive Learning Approach for Contextual Ranking

AU - Anand, Abhijit

AU - Leonhardt, Jurek

AU - Rudra, Koustav

AU - Anand, Avishek

N1 - Funding Information: This work is supported by the European Union – Horizon 2020 Program under the scheme “INFRAIA-01-2018-2019 – Integrating Activities for Advanced Communities”, Grant Agreement n.871042, “SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics” (http://www.sobigdata.eu).

PY - 2022/8/25

Y1 - 2022/8/25

N2 - Contextual ranking models have delivered impressive performance improvements over classical models in the document ranking task. However, these highly over-parameterized models tend to be data-hungry and require large amounts of data even for fine-tuning. This paper proposes a simple yet effective method to improve ranking performance on smaller datasets using supervised contrastive learning for the document ranking problem. We perform data augmentation by creating training data using parts of the relevant documents in the query-document pairs. We then use a supervised contrastive learning objective to learn an effective ranking model from the augmented dataset. Our experiments on subsets of the TREC-DL dataset show that, although data augmentation increases the size of the training data, it does not necessarily improve performance under existing pointwise or pairwise training objectives. However, our proposed supervised contrastive loss objective leads to performance improvements over the standard non-augmented setting, showcasing the utility of data augmentation with contrastive losses. Finally, we demonstrate the real benefit of supervised contrastive learning objectives through marked improvements on smaller ranking datasets relating to news (Robust04), finance (FiQA), and scientific fact checking (SciFact).

AB - Contextual ranking models have delivered impressive performance improvements over classical models in the document ranking task. However, these highly over-parameterized models tend to be data-hungry and require large amounts of data even for fine-tuning. This paper proposes a simple yet effective method to improve ranking performance on smaller datasets using supervised contrastive learning for the document ranking problem. We perform data augmentation by creating training data using parts of the relevant documents in the query-document pairs. We then use a supervised contrastive learning objective to learn an effective ranking model from the augmented dataset. Our experiments on subsets of the TREC-DL dataset show that, although data augmentation increases the size of the training data, it does not necessarily improve performance under existing pointwise or pairwise training objectives. However, our proposed supervised contrastive loss objective leads to performance improvements over the standard non-augmented setting, showcasing the utility of data augmentation with contrastive losses. Finally, we demonstrate the real benefit of supervised contrastive learning objectives through marked improvements on smaller ranking datasets relating to news (Robust04), finance (FiQA), and scientific fact checking (SciFact).

KW - data augmentation

KW - document ranking

KW - interpolation

KW - ranking performance

KW - supervised contrastive loss

UR - http://www.scopus.com/inward/record.url?scp=85138395726&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2207.03153

DO - 10.48550/arXiv.2207.03153

M3 - Conference contribution

AN - SCOPUS:85138395726

SP - 61

EP - 71

BT - ICTIR 2022

T2 - 8th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2022

Y2 - 11 July 2022 through 12 July 2022

ER -