Ranking Archived Documents for Structured Queries on Semantic Layers

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authors: Pavlos Fafalios, Vaibhav Kasturia, Wolfgang Nejdl

Details

Original language: English
Title of host publication: JCDL 2018 - Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 155-164
Number of pages: 10
ISBN (electronic): 9781450351782
Publication status: Published - 23 May 2018
Event: 18th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2018 - Fort Worth, United States
Duration: 3 June 2018 to 7 June 2018

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
ISSN (Print): 1552-5996

Abstract

Archived collections of documents (like newspaper and web archives) serve as important information sources in a variety of disciplines, including Digital Humanities, Historical Science, and Journalism. However, the absence of efficient and meaningful exploration methods remains a major hurdle to turning them into usable sources of information. A semantic layer is an RDF graph that describes metadata and semantic information about a collection of archived documents and that can be queried through a semantic query language (SPARQL). This allows running advanced queries that combine document metadata (like publication date) with content-based semantic information (like entities mentioned in the documents). However, the results returned by such structured queries can be numerous and, moreover, they all match the query equally. In this paper, we address this problem and formalize the task of ranking archived documents for structured queries on semantic layers. We then propose two ranking models for the problem at hand which jointly consider: i) the relativeness of documents to entities, ii) the timeliness of documents, and iii) the temporal relations among the entities. Experimental results on a new evaluation dataset show the effectiveness of the proposed models and allow us to understand their limitations.
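
To make the ranking problem in the abstract concrete, the following is a minimal Python sketch and not the models proposed in the paper. It assumes a toy in-memory collection in place of an RDF semantic layer, a boolean filter in place of a SPARQL query, and two illustrative scores that are assumptions of this sketch: "relativeness" as the share of a document's entity mentions that refer to the query entity, and "timeliness" as the share of matching documents published in the same month. The temporal relations among entities are left out. All names (Doc, matches, rank, E1, E2) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    doc_id: str
    published: date
    mentions: dict  # entity name -> number of mentions in the document


def matches(doc, entity, start, end):
    """Boolean filter standing in for the structured query (assumption)."""
    return entity in doc.mentions and start <= doc.published <= end


def rank(docs, entity, start, end):
    """Order the matching documents by a toy relativeness * timeliness score."""
    hits = [d for d in docs if matches(d, entity, start, end)]
    # Toy timeliness: months with more matching documents count as more important.
    per_month = Counter((d.published.year, d.published.month) for d in hits)
    scored = []
    for d in hits:
        # Toy relativeness: fraction of the document's mentions about the entity.
        relativeness = d.mentions[entity] / sum(d.mentions.values())
        timeliness = per_month[(d.published.year, d.published.month)] / len(hits)
        scored.append((relativeness * timeliness, d.doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Hypothetical documents; "E1" and "E2" are placeholder entity names.
docs = [
    Doc("a", date(2010, 3, 1), {"E1": 1, "E2": 9}),
    Doc("b", date(2010, 3, 15), {"E1": 8, "E2": 2}),
    Doc("c", date(2010, 7, 2), {"E1": 5}),
    Doc("d", date(2012, 1, 1), {"E1": 4}),
]
ranking = rank(docs, "E1", date(2010, 1, 1), date(2010, 12, 31))
print(ranking)
```

Documents a, b, and c all satisfy the filter equally, which is the situation the abstract describes; the score only adds an order among them. Multiplying the two factors is one simple way to combine them and is chosen here purely for illustration.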

Cite this

Ranking Archived Documents for Structured Queries on Semantic Layers. / Fafalios, Pavlos; Kasturia, Vaibhav; Nejdl, Wolfgang.
JCDL 2018 - Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 155-164 (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries).

Fafalios, P, Kasturia, V & Nejdl, W 2018, Ranking Archived Documents for Structured Queries on Semantic Layers. in JCDL 2018 - Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries. Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, Institute of Electrical and Electronics Engineers Inc., pp. 155-164, 18th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2018, Fort Worth, United States, 3 June 2018. https://doi.org/10.1145/3197026.3197049
Fafalios, P., Kasturia, V., & Nejdl, W. (2018). Ranking Archived Documents for Structured Queries on Semantic Layers. In JCDL 2018 - Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries (pp. 155-164). (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3197026.3197049
Fafalios P, Kasturia V, Nejdl W. Ranking Archived Documents for Structured Queries on Semantic Layers. in JCDL 2018 - Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries. Institute of Electrical and Electronics Engineers Inc. 2018. pp. 155-164. (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries). doi: 10.1145/3197026.3197049
Fafalios, Pavlos ; Kasturia, Vaibhav ; Nejdl, Wolfgang. / Ranking Archived Documents for Structured Queries on Semantic Layers. JCDL 2018 - Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 155-164 (Proceedings of the ACM/IEEE Joint Conference on Digital Libraries).
@inproceedings{81c661779a3346d89f30d8d7f4077e4b,
title = "Ranking Archived Documents for Structured Queries on Semantic Layers",
abstract = "Archived collections of documents (like newspaper and web archives) serve as important information sources in a variety of disciplines, including Digital Humanities, Historical Science, and Journalism. However, the absence of efficient and meaningful exploration methods still remains a major hurdle in the way of turning them into usable sources of information. A semantic layer is an RDF graph that describes metadata and semantic information about a collection of archived documents, which in turn can be queried through a semantic query language (SPARQL). This allows running advanced queries by combining metadata of the documents (like publication date) and content-based semantic information (like entities mentioned in the documents). However, the results returned by such structured queries can be numerous and moreover they all equally match the query. In this paper, we deal with this problem and formalize the task of ranking archived documents for structured queries on semantic layers. Then, we propose two ranking models for the problem at hand which jointly consider: i) the relativeness of documents to entities, ii) the timeliness of documents, and iii) the temporal relations among the entities. The experimental results on a new evaluation dataset show the effectiveness of the proposed models and allow us to understand their limitations.",
keywords = "archived documents, probabilistic modeling, ranking, semantic layers, stochastic modeling",
author = "Pavlos Fafalios and Vaibhav Kasturia and Wolfgang Nejdl",
note = "Funding information: The work was partially funded by the European Commission for the ERC Advanced Grant ALEXANDRIA (No. 339233).; 18th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2018 ; Conference date: 03-06-2018 Through 07-06-2018",
year = "2018",
month = may,
day = "23",
doi = "10.1145/3197026.3197049",
language = "English",
series = "Proceedings of the ACM/IEEE Joint Conference on Digital Libraries",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "155--164",
booktitle = "JCDL 2018 - Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries",
address = "United States",

}

TY - GEN

T1 - Ranking Archived Documents for Structured Queries on Semantic Layers

AU - Fafalios, Pavlos

AU - Kasturia, Vaibhav

AU - Nejdl, Wolfgang

N1 - Funding information: The work was partially funded by the European Commission for the ERC Advanced Grant ALEXANDRIA (No. 339233).

PY - 2018/5/23

Y1 - 2018/5/23

AB - Archived collections of documents (like newspaper and web archives) serve as important information sources in a variety of disciplines, including Digital Humanities, Historical Science, and Journalism. However, the absence of efficient and meaningful exploration methods still remains a major hurdle in the way of turning them into usable sources of information. A semantic layer is an RDF graph that describes metadata and semantic information about a collection of archived documents, which in turn can be queried through a semantic query language (SPARQL). This allows running advanced queries by combining metadata of the documents (like publication date) and content-based semantic information (like entities mentioned in the documents). However, the results returned by such structured queries can be numerous and moreover they all equally match the query. In this paper, we deal with this problem and formalize the task of ranking archived documents for structured queries on semantic layers. Then, we propose two ranking models for the problem at hand which jointly consider: i) the relativeness of documents to entities, ii) the timeliness of documents, and iii) the temporal relations among the entities. The experimental results on a new evaluation dataset show the effectiveness of the proposed models and allow us to understand their limitations.

KW - archived documents

KW - probabilistic modeling

KW - ranking

KW - semantic layers

KW - stochastic modeling

UR - http://www.scopus.com/inward/record.url?scp=85048856169&partnerID=8YFLogxK

U2 - 10.1145/3197026.3197049

DO - 10.1145/3197026.3197049

M3 - Conference contribution

AN - SCOPUS:85048856169

T3 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries

SP - 155

EP - 164

BT - JCDL 2018 - Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 18th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2018

Y2 - 3 June 2018 through 7 June 2018

ER -