Details
Original language | English |
---|---|
Title of host publication | ICTIR 2022 |
Subtitle | Proceedings of the 2022 ACM SIGIR International Conference on the Theory of Information Retrieval |
Pages | 61-71 |
Number of pages | 11 |
ISBN (electronic) | 9781450394123 |
Publication status | Published - 25 Aug 2022 |
Event | 8th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2022 - Virtual, Online, Spain. Duration: 11 Jul 2022 → 12 Jul 2022 |
Abstract
Contextual ranking models have delivered impressive performance improvements over classical models in the document ranking task. However, these highly over-parameterized models tend to be data-hungry and require large amounts of data even for fine-tuning. This paper proposes a simple yet effective method to improve ranking performance on smaller datasets using supervised contrastive learning for the document ranking problem. We perform data augmentation by creating training data from parts of the relevant documents in the query-document pairs. We then use a supervised contrastive learning objective to learn an effective ranking model from the augmented dataset. Our experiments on subsets of the TREC-DL dataset show that, although data augmentation increases the size of the training data, it does not necessarily improve performance under existing pointwise or pairwise training objectives. However, our proposed supervised contrastive loss objective leads to performance improvements over the standard non-augmented setting, showcasing the utility of data augmentation when combined with contrastive losses. Finally, we show the real benefit of supervised contrastive learning objectives through marked improvements on smaller ranking datasets relating to news (Robust04), finance (FiQA), and scientific fact checking (SciFact).
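The two ingredients named in the abstract, augmentation from parts of relevant documents and a supervised contrastive objective, can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `augment_with_passages` and `supervised_contrastive_loss`, the sentence-window splitting, the `window` and `temperature` parameters, and the use of relevance labels as contrastive classes are all assumptions made for illustration. The loss follows the general supervised contrastive form (positives are items sharing the anchor's label), which may differ from the exact objective used in the paper.

```python
import math


def augment_with_passages(query, relevant_doc, window=3):
    """Hypothetical augmentation: cut a relevant document into windows of
    `window` sentences and emit each window as an extra positive pair."""
    sentences = [s.strip() for s in relevant_doc.split(".") if s.strip()]
    passages = [
        ". ".join(sentences[i:i + window]) + "."
        for i in range(0, len(sentences), window)
    ]
    # Label 1 marks the pair as relevant (assumed labeling scheme).
    return [(query, passage, 1) for passage in passages]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over L2-normalised vectors.

    For each anchor that has at least one other item with the same label,
    average the negative log-softmax similarity over those positives,
    then average over all such anchors. Returns 0.0 if no anchor qualifies.
    """
    def normalise(vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    z = [normalise(vec) for vec in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other item.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    pairs = augment_with_passages("q", "A. B. C. D.", window=2)
    print(pairs)
    vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(vectors, [1, 1, 0, 0]))
```

In this sketch the embeddings would come from the ranking model's representation of each query-document pair; the loss is small when pairs with the same relevance label lie close together and large when they are interleaved with pairs of the other label.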
ASJC Scopus subject areas
- Computer Science (all)
- Computer Science (miscellaneous)
- Computer Science (all)
- Information Systems
ICTIR 2022: Proceedings of the 2022 ACM SIGIR International Conference on the Theory of Information Retrieval. 2022. pp. 61-71.
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Supervised Contrastive Learning Approach for Contextual Ranking
AU - Anand, Abhijit
AU - Leonhardt, Jurek
AU - Rudra, Koustav
AU - Anand, Avishek
N1 - Funding Information: This work is supported by the European Union – Horizon 2020 Program under the scheme “INFRAIA-01-2018-2019 – Integrating Activities for Advanced Communities”, Grant Agreement n.871042, “SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics” (http://www.sobigdata.eu).
PY - 2022/8/25
Y1 - 2022/8/25
KW - data augmentation
KW - document ranking
KW - interpolation
KW - ranking performance
KW - supervised contrastive loss
UR - http://www.scopus.com/inward/record.url?scp=85138395726&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2207.03153
DO - 10.48550/arXiv.2207.03153
M3 - Conference contribution
AN - SCOPUS:85138395726
SP - 61
EP - 71
BT - ICTIR 2022
T2 - 8th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2022
Y2 - 11 July 2022 through 12 July 2022
ER -