Details
Original language | English |
---|---|
Title of host publication | From Born-Physical to Born-Virtual |
Subtitle | Augmenting Intelligence in Digital Libraries - 24th International Conference on Asian Digital Libraries, ICADL 2022, Proceedings |
Editors | Yuen-Hsien Tseng, Marie Katsurai, Hoa N. Nguyen |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 301-310 |
Number of pages | 10 |
ISBN (Print) | 9783031217555 |
Publication status | Published - 7 Dec 2022 |
Event | 24th International Conference on Asia-Pacific Digital Libraries, ICADL 2022 - Hanoi, Vietnam. Duration: 30 Nov 2022 → 2 Dec 2022 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13636 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (electronic) | 1611-3349 |
Abstract
A plethora of scientific software packages are published in repositories, e.g., Zenodo and figshare. These software packages are crucial for the reproducibility of published research. As an additional route to scholarly knowledge graph construction, we propose an approach for automated extraction of machine-actionable (structured) scholarly knowledge from published software packages by static analysis of their (meta)data and contents (in particular, scripts in languages such as Python). The approach can be summarized as follows. First, we extract metadata information (software description, programming languages, related references) from software packages by leveraging the Software Metadata Extraction Framework (SOMEF) and the GitHub API. Second, we analyze the extracted metadata to find the research articles associated with the corresponding software repository. Third, for software contained in published packages, we create and analyze the Abstract Syntax Tree (AST) representation to extract information about the procedures performed on data. Fourth, we search for the extracted information in the full text of related articles to constrain the extracted information to scholarly knowledge, i.e., information published in the scholarly literature. Finally, we publish the extracted machine-actionable scholarly knowledge in the Open Research Knowledge Graph (ORKG).
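The third step of the abstract (static analysis of a script's Abstract Syntax Tree) can be illustrated with a minimal sketch based on Python's standard `ast` module. This is not the authors' implementation: the helper names `dotted_name` and `extract_calls` and the example script are hypothetical, and the sketch only collects imported modules and called names, assuming that such calls are a reasonable proxy for "procedures performed on data".

```python
import ast


def dotted_name(node):
    """Return a dotted name such as 'pd.read_csv' for Name/Attribute chains, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # e.g. a call on a subscript or on another call's result


def extract_calls(source):
    """Statically collect imported modules and called names from Python source."""
    tree = ast.parse(source)  # parses only; the script is never executed
    imports, calls = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imports.add(node.module)
        elif isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name is not None:
                calls.add(name)
    return sorted(imports), sorted(calls)


# Hypothetical analysis script, used purely as input text for the parser.
example_script = """
import pandas as pd
from scipy import stats

df = pd.read_csv("data.csv")
result = stats.ttest_ind(df["a"], df["b"])
print(result)
"""

imports, calls = extract_calls(example_script)
print(imports)
print(calls)
```

Because the source is only parsed, neither the imported libraries nor `data.csv` need to exist on the machine running the analysis. Resolving aliases such as `pd` back to `pandas`, or matching the extracted names against the full text of an article, would be separate steps that this sketch does not attempt.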
ASJC Scopus subject areas
- Mathematics (all)
- Theoretical Computer Science
- Computer Science (all)
- General Computer Science
Cite this
From Born-Physical to Born-Virtual: Augmenting Intelligence in Digital Libraries - 24th International Conference on Asian Digital Libraries, ICADL 2022, Proceedings. Ed. / Yuen-Hsien Tseng; Marie Katsurai; Hoa N. Nguyen. Springer Science and Business Media Deutschland GmbH, 2022. p. 301-310 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13636 LNCS).
Publication: Chapter in book/report/anthology/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Scholarly Knowledge Extraction from Published Software Packages
AU - Haris, Muhammad
AU - Stocker, Markus
AU - Auer, Sören
N1 - Funding Information: Acknowledgment. This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and TIB–Leibniz Information Centre for Science and Technology.
PY - 2022/12/7
Y1 - 2022/12/7
N2 - A plethora of scientific software packages are published in repositories, e.g., Zenodo and figshare. These software packages are crucial for the reproducibility of published research. As an additional route to scholarly knowledge graph construction, we propose an approach for automated extraction of machine actionable (structured) scholarly knowledge from published software packages by static analysis of their (meta)data and contents (in particular scripts in languages such as Python). The approach can be summarized as follows. First, we extract metadata information (software description, programming languages, related references) from software packages by leveraging the Software Metadata Extraction Framework (SOMEF) and the GitHub API. Second, we analyze the extracted metadata to find the research articles associated with the corresponding software repository. Third, for software contained in published packages, we create and analyze the Abstract Syntax Tree (AST) representation to extract information about the procedures performed on data. Fourth, we search the extracted information in the full text of related articles to constrain the extracted information to scholarly knowledge, i.e. information published in the scholarly literature. Finally, we publish the extracted machine actionable scholarly knowledge in the Open Research Knowledge Graph (ORKG).
KW - Abstract syntax tree
KW - Analyzing software packages
KW - Code analysis
KW - Machine actionability
KW - Open research knowledge graph
KW - Scholarly communication
UR - http://www.scopus.com/inward/record.url?scp=85145010085&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2212.07921
DO - 10.48550/arXiv.2212.07921
M3 - Conference contribution
AN - SCOPUS:85145010085
SN - 9783031217555
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 301
EP - 310
BT - From Born-Physical to Born-Virtual
A2 - Tseng, Yuen-Hsien
A2 - Katsurai, Marie
A2 - Nguyen, Hoa N.
PB - Springer Science and Business Media Deutschland GmbH
T2 - 24th International Conference on Asia-Pacific Digital Libraries, ICADL 2022
Y2 - 30 November 2022 through 2 December 2022
ER -