MedTable: Extracting Disease Types from Web Tables

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Maria Koutraki
  • Besnik Fetahu

Details

Original language: English
Title of host publication: The Semantic Web
Subtitle of host publication: ESWC 2020 Satellite Events - ESWC 2020, Revised Selected Papers
Editors: Andreas Harth, Valentina Presutti, Raphaël Troncy, Maribel Acosta, Axel Polleres, Javier D. Fernández, Josiane Xavier Parreira, Olaf Hartig, Katja Hose, Michael Cochez
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 152-157
Number of pages: 6
ISBN (electronic): 978-3-030-62327-2
ISBN (print): 9783030623265
Publication status: Published - 11 Nov 2020
Event: 17th Extended Semantic Web Conference, ESWC 2020 - Heraklion, Greece
Duration: 31 May 2020 - 4 Jun 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12124 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Diseases and their symptoms are a frequent information need for Web users. Diseases are often categorized into sub-types that manifest through different symptoms. Extracting such information from textual corpora is inherently difficult, yet it can be readily extracted from semi-structured resources such as tables. We propose an approach for identifying tables that contain information about sub-type classifications and their attributes. Because tables often have diverse and redundant schemas, we align equivalent columns across disparate schemas so that information about diseases is accessible through a unified, common schema. Experimental evaluation shows that the approach can accurately identify tables containing disease sub-type classifications and align their equivalent columns.
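
The column-alignment idea in the abstract can be illustrated with a small sketch. The abstract does not describe the authors' actual method, so everything here is an assumption made for illustration: the function names (`normalize`, `jaccard`, `column_similarity`, `align_columns`), the equal weighting of header and value similarity, the greedy one-to-one matching, the 0.5 threshold, and the toy tables. The sketch scores each pair of columns from two tables by header-string similarity and cell-value overlap, then keeps the best-scoring pairs.

```python
from difflib import SequenceMatcher


def normalize(header: str) -> str:
    """Lowercase a header and turn punctuation into single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in header.lower())
    return " ".join(cleaned.split())


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def column_similarity(name_a, values_a, name_b, values_b, name_weight=0.5):
    """Weighted mix of header similarity and cell-value overlap (hypothetical scoring)."""
    name_score = SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()
    value_score = jaccard({v.lower() for v in values_a}, {v.lower() for v in values_b})
    return name_weight * name_score + (1 - name_weight) * value_score


def align_columns(table_a, table_b, threshold=0.5):
    """Greedy one-to-one alignment of columns.

    Each table is a dict mapping a header to its list of cell strings.
    Returns a dict from headers of table_a to matched headers of table_b.
    """
    candidates = sorted(
        (
            (column_similarity(ha, va, hb, vb), ha, hb)
            for ha, va in table_a.items()
            for hb, vb in table_b.items()
        ),
        reverse=True,
    )
    used_a, used_b, mapping = set(), set(), {}
    for score, ha, hb in candidates:
        if score < threshold:
            break  # remaining pairs score even lower
        if ha in used_a or hb in used_b:
            continue  # keep the alignment one-to-one
        mapping[ha] = hb
        used_a.add(ha)
        used_b.add(hb)
    return mapping


# Toy tables with different schemas for the same kind of information.
table_a = {"Type": ["Type 1", "Type 2"], "Symptoms": ["thirst", "fatigue"]}
table_b = {
    "Sub-type": ["Type 1", "Type 2", "Gestational"],
    "Common symptoms": ["fatigue", "thirst", "nausea"],
}
alignment = align_columns(table_a, table_b)
```

Once columns are aligned, rows from both tables can be rewritten under one set of header names, which is the "unified, common schema" the abstract refers to. A learned matcher would likely replace the hand-set weights and threshold in practice.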

Cite this

MedTable: Extracting Disease Types from Web Tables. / Koutraki, Maria; Fetahu, Besnik.
The Semantic Web: ESWC 2020 Satellite Events - ESWC 2020, Revised Selected Papers. ed. / Andreas Harth; Valentina Presutti; Raphaël Troncy; Maribel Acosta; Axel Polleres; Javier D. Fernández; Josiane Xavier Parreira; Olaf Hartig; Katja Hose; Michael Cochez. Springer Science and Business Media Deutschland GmbH, 2020. p. 152-157 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12124 LNCS).

Koutraki, M., & Fetahu, B. (2020). MedTable: Extracting Disease Types from Web Tables. In A. Harth, V. Presutti, R. Troncy, M. Acosta, A. Polleres, J. D. Fernández, J. Xavier Parreira, O. Hartig, K. Hose, & M. Cochez (Eds.), The Semantic Web: ESWC 2020 Satellite Events - ESWC 2020, Revised Selected Papers (pp. 152-157). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12124 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-62327-2_26
@inproceedings{64072814015448138a2836b6af1b377f,
title = "MedTable: Extracting Disease Types from Web Tables",
abstract = "Diseases and their symptoms are a frequent information need for Web users. Diseases often are categorized into sub-types, manifested through different symptoms. Extracting such information from textual corpora is inherently difficult. Yet, this can be easily extracted from semi-structured resources like tables. We propose an approach for identifying tables that contain information about sub-type classifications and their attributes. Often tables have diverse and redundant schemas, hence, we align equivalent columns in disparate schemas s.t. information about diseases are accessible through a unified and a common schema. Experimental evaluation shows that we can accurately identify tables containing disease sub-type classifications and additionally align equivalent columns.",
author = "Maria Koutraki and Besnik Fetahu",
year = "2020",
month = nov,
day = "11",
doi = "10.1007/978-3-030-62327-2_26",
language = "English",
isbn = "9783030623265",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "152--157",
editor = "Andreas Harth and Valentina Presutti and Rapha{\"e}l Troncy and Maribel Acosta and Axel Polleres and Fern{\'a}ndez, {Javier D.} and {Xavier Parreira}, Josiane and Olaf Hartig and Katja Hose and Michael Cochez",
booktitle = "The Semantic Web",
address = "Germany",
note = "17th Extended Semantic Web Conference, ESWC 2020 ; Conference date: 31-05-2020 Through 04-06-2020",

}

TY - GEN
T1 - MedTable
T2 - 17th Extended Semantic Web Conference, ESWC 2020
AU - Koutraki, Maria
AU - Fetahu, Besnik
PY - 2020/11/11
Y1 - 2020/11/11
AB - Diseases and their symptoms are a frequent information need for Web users. Diseases often are categorized into sub-types, manifested through different symptoms. Extracting such information from textual corpora is inherently difficult. Yet, this can be easily extracted from semi-structured resources like tables. We propose an approach for identifying tables that contain information about sub-type classifications and their attributes. Often tables have diverse and redundant schemas, hence, we align equivalent columns in disparate schemas s.t. information about diseases are accessible through a unified and a common schema. Experimental evaluation shows that we can accurately identify tables containing disease sub-type classifications and additionally align equivalent columns.
UR - http://www.scopus.com/inward/record.url?scp=85097277511&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-62327-2_26
DO - 10.1007/978-3-030-62327-2_26
M3 - Conference contribution
AN - SCOPUS:85097277511
SN - 9783030623265
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 152
EP - 157
BT - The Semantic Web
A2 - Harth, Andreas
A2 - Presutti, Valentina
A2 - Troncy, Raphaël
A2 - Acosta, Maribel
A2 - Polleres, Axel
A2 - Fernández, Javier D.
A2 - Xavier Parreira, Josiane
A2 - Hartig, Olaf
A2 - Hose, Katja
A2 - Cochez, Michael
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 31 May 2020 through 4 June 2020
ER -