Details
Original language | English |
---|---|
Article number | 7240 |
Journal | Scientific Reports |
Volume | 13 |
Publication status | Published - 4 May 2023 |
Abstract
Knowledge graphs have gained increasing popularity in science and technology over the last decade. However, knowledge graphs are currently relatively simple to moderately complex semantic structures that are mainly collections of factual statements. Question answering (QA) benchmarks and systems have so far been geared mainly towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2465 questions that can also be answered with the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.
In: Scientific Reports, Vol. 13, 7240, 04.05.2023.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge
AU - Auer, Sören
AU - Barone, Dante A.C.
AU - Bartz, Cassiano
AU - Cortes, Eduardo G.
AU - Jaradeh, Mohamad Yaser
AU - Karras, Oliver
AU - Koubarakis, Manolis
AU - Mouromtsev, Dmitry
AU - Pliukhin, Dmitrii
AU - Radyush, Daniil
AU - Shilin, Ivan
AU - Stocker, Markus
AU - Tsalapati, Eleni
N1 - Funding Information: This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and by the German Federal Ministry of Education and Research (BMBF) under the project LeibnizKILabor (Grant no. 01DD20003), German Research Foundation DFG for NFDI4Ing (No. 442146713) and NFDI4DataScience (No. 460234259). It has, also, received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant agreement No. 101032307. It is, also, financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brasil (CAPES)-Finance Code 001.
PY - 2023/5/4
Y1 - 2023/5/4
UR - http://www.scopus.com/inward/record.url?scp=85157959553&partnerID=8YFLogxK
U2 - 10.1038/s41598-023-33607-z
DO - 10.1038/s41598-023-33607-z
M3 - Article
C2 - 37142627
AN - SCOPUS:85157959553
VL - 13
JO - Scientific reports
JF - Scientific reports
SN - 2045-2322
M1 - 7240
ER -