Extracting topics from open educational resources

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Mohammadreza Molavi
  • Mohammadreza Tavakoli
  • Gábor Kismihók

External Research Organisations

  • Amirkabir University of Technology
  • German National Library of Science and Technology (TIB)

Details

Original language: English
Title of host publication: Addressing Global Challenges and Quality Education - 15th European Conference on Technology Enhanced Learning, EC-TEL 2020, Proceedings
Editors: Carlos Alario-Hoyos, María Jesús Rodríguez-Triana, Maren Scheffel, Inmaculada Arnedillo-Sánchez, Sebastian Maximilian Dennerlein
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 455-460
Number of pages: 6
ISBN (print): 9783030577162
Publication status: Published - 2020
Externally published: Yes
Event: 15th European Conference on Technology Enhanced Learning, EC-TEL 2020 - Heidelberg, Germany
Duration: 14 Sept 2020 to 18 Sept 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12315 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Open Educational Resources (OERs) have high potential to satisfy learners in many different circumstances, as they are available in a wide range of contexts. However, the generally low quality of OER metadata is one of the main reasons behind the lack of personalised, OER-based services such as search and recommendation. As a result, the applicability of OERs remains limited. Nevertheless, learners need OER metadata about covered topics (subjects) to build effective learning pathways towards their individual learning objectives. Therefore, in this paper, we report on a work-in-progress project proposing an OER topic extraction approach that applies text mining techniques to generate high-quality OER metadata about topic distribution. This is done by: 1) collecting 27 lectures from Coursera and Khan Academy on an important Data Science skill (Text Mining, as our first focus), 2) applying Latent Dirichlet Allocation (LDA) to the collected resources in order to extract existing topics related to the skill, and 3) defining the topic distributions covered by a particular OER. To evaluate our model, we used a data set of educational resources from YouTube and compared our topic distribution results with their manually defined target topics, with the help of 3 experts in the area of data science. As a result, our model extracted topics with an F1-score of 76%.
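The evaluation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the resource names, topic labels, probabilities, the 0.2 threshold, and the choice of micro-averaging are all assumptions made for illustration, since the abstract does not say how topics were assigned from the distributions or how the F1-score was aggregated. The sketch takes per-resource topic distributions of the kind an LDA model produces, keeps the topics whose probability reaches the threshold, and compares them with expert-assigned topics.

```python
from typing import Dict, Set

# Hypothetical per-resource topic distributions, shaped like LDA output
# (probabilities over topics that sum to 1 for each resource).
predicted: Dict[str, Dict[str, float]] = {
    "video_a": {"tokenization": 0.50, "classification": 0.25, "clustering": 0.25},
    "video_b": {"tokenization": 0.125, "classification": 0.75, "clustering": 0.125},
}

# Hypothetical expert annotations: the topics each resource actually covers.
expert: Dict[str, Set[str]] = {
    "video_a": {"tokenization", "clustering"},
    "video_b": {"classification", "clustering"},
}


def topics_above(dist: Dict[str, float], threshold: float) -> Set[str]:
    """Return the topics whose probability is at least the threshold."""
    return {topic for topic, prob in dist.items() if prob >= threshold}


def micro_f1(pred: Dict[str, Dict[str, float]],
             truth: Dict[str, Set[str]],
             threshold: float) -> float:
    """Micro-averaged F1 over all (resource, topic) assignments."""
    tp = fp = fn = 0
    for resource_id, dist in pred.items():
        chosen = topics_above(dist, threshold)
        actual = truth.get(resource_id, set())
        tp += len(chosen & actual)   # topics both predicted and annotated
        fp += len(chosen - actual)   # predicted but not annotated
        fn += len(actual - chosen)   # annotated but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # The 0.2 threshold is an arbitrary illustrative choice.
    print(f"micro-F1 = {micro_f1(predicted, expert, 0.2):.2f}")  # micro-F1 = 0.75
```

With these toy values, three topic assignments are correct, one is spurious, and one is missed, so precision and recall are both 0.75. A macro-averaged or per-topic F1 would be an equally plausible reading of the reported figure.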

Keywords

    Machine learning, OER, Open Educational Resource, Text mining, Topic extraction

Cite this

Extracting topics from open educational resources. / Molavi, Mohammadreza; Tavakoli, Mohammadreza; Kismihók, Gábor.
Addressing Global Challenges and Quality Education - 15th European Conference on Technology Enhanced Learning, EC-TEL 2020, Proceedings. ed. / Carlos Alario-Hoyos; María Jesús Rodríguez-Triana; Maren Scheffel; Inmaculada Arnedillo-Sánchez; Sebastian Maximilian Dennerlein. Springer Science and Business Media Deutschland GmbH, 2020. p. 455-460 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12315 LNCS).

Molavi, M, Tavakoli, M & Kismihók, G 2020, Extracting topics from open educational resources. in C Alario-Hoyos, MJ Rodríguez-Triana, M Scheffel, I Arnedillo-Sánchez & SM Dennerlein (eds), Addressing Global Challenges and Quality Education - 15th European Conference on Technology Enhanced Learning, EC-TEL 2020, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12315 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 455-460, 15th European Conference on Technology Enhanced Learning, EC-TEL 2020, Heidelberg, Germany, 14 Sept 2020. https://doi.org/10.48550/arXiv.2006.11109, https://doi.org/10.1007/978-3-030-57717-9_44
Molavi, M., Tavakoli, M., & Kismihók, G. (2020). Extracting topics from open educational resources. In C. Alario-Hoyos, M. J. Rodríguez-Triana, M. Scheffel, I. Arnedillo-Sánchez, & S. M. Dennerlein (Eds.), Addressing Global Challenges and Quality Education - 15th European Conference on Technology Enhanced Learning, EC-TEL 2020, Proceedings (pp. 455-460). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12315 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.48550/arXiv.2006.11109, https://doi.org/10.1007/978-3-030-57717-9_44
Molavi M, Tavakoli M, Kismihók G. Extracting topics from open educational resources. In Alario-Hoyos C, Rodríguez-Triana MJ, Scheffel M, Arnedillo-Sánchez I, Dennerlein SM, editors, Addressing Global Challenges and Quality Education - 15th European Conference on Technology Enhanced Learning, EC-TEL 2020, Proceedings. Springer Science and Business Media Deutschland GmbH. 2020. p. 455-460. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). Epub 2020 Sept 7. doi: 10.48550/arXiv.2006.11109, 10.1007/978-3-030-57717-9_44
Molavi, Mohammadreza ; Tavakoli, Mohammadreza ; Kismihók, Gábor. / Extracting topics from open educational resources. Addressing Global Challenges and Quality Education - 15th European Conference on Technology Enhanced Learning, EC-TEL 2020, Proceedings. editor / Carlos Alario-Hoyos ; María Jesús Rodríguez-Triana ; Maren Scheffel ; Inmaculada Arnedillo-Sánchez ; Sebastian Maximilian Dennerlein. Springer Science and Business Media Deutschland GmbH, 2020. pp. 455-460 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

BibTeX

@inproceedings{fa4d723116f3424b9d4e23611d4449d4,
title = "Extracting topics from open educational resources",
abstract = "OERs have high-potential to satisfy learners in many different circumstances, as they are available in a wide range of contexts. However, the low-quality of OER metadata, in general, is one of the main reasons behind the lack of personalised, OER based services such as search and recommendation. As a result, the applicability of OERs remains limited. Nevertheless, OER metadata about covered topics (subjects) is essentially required by learners to build effective learning pathways towards their individual learning objectives. Therefore, in this paper, we report on a work in progress project proposing an OER topic extraction approach, applying text mining techniques, to generate high-quality OER metadata about topic distribution. This is done by: 1) collecting 27 lectures from Coursera and Khan Academy in the area of an important skill in the area of Data Science (i.e. Text Mining as our first focus), 2) applying Latent Dirichlet Allocation (LDA) on the collected resources in order to extract existing topics related to the skill, and 3) defining topic distributions covered by a particular OER. To evaluate our model, we used the data-set of educational resources from Youtube, and compared our topic distribution results with their manually defined target topics with the help of 3 experts in the area of data science. As a result, our model extracted topics with 76% of F1-score.",
keywords = "Machine learning, OER, Open Educational Resource, Text mining, Topic extraction",
author = "Mohammadreza Molavi and Mohammadreza Tavakoli and G{\'a}bor Kismih{\'o}k",
year = "2020",
doi = "10.48550/arXiv.2006.11109",
language = "English",
isbn = "9783030577162",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "455--460",
editor = "Carlos Alario-Hoyos and Rodr{\'i}guez-Triana, {Mar{\'i}a Jes{\'u}s} and Maren Scheffel and Inmaculada Arnedillo-S{\'a}nchez and Dennerlein, {Sebastian Maximilian}",
booktitle = "Addressing Global Challenges and Quality Education - 15th European Conference on Technology Enhanced Learning, EC-TEL 2020, Proceedings",
address = "Germany",
note = "15th European Conference on Technology Enhanced Learning, EC-TEL 2020 ; Conference date: 14-09-2020 Through 18-09-2020",

}

RIS

TY - GEN
T1 - Extracting topics from open educational resources
AU - Molavi, Mohammadreza
AU - Tavakoli, Mohammadreza
AU - Kismihók, Gábor
PY - 2020
Y1 - 2020
AB - OERs have high-potential to satisfy learners in many different circumstances, as they are available in a wide range of contexts. However, the low-quality of OER metadata, in general, is one of the main reasons behind the lack of personalised, OER based services such as search and recommendation. As a result, the applicability of OERs remains limited. Nevertheless, OER metadata about covered topics (subjects) is essentially required by learners to build effective learning pathways towards their individual learning objectives. Therefore, in this paper, we report on a work in progress project proposing an OER topic extraction approach, applying text mining techniques, to generate high-quality OER metadata about topic distribution. This is done by: 1) collecting 27 lectures from Coursera and Khan Academy in the area of an important skill in the area of Data Science (i.e. Text Mining as our first focus), 2) applying Latent Dirichlet Allocation (LDA) on the collected resources in order to extract existing topics related to the skill, and 3) defining topic distributions covered by a particular OER. To evaluate our model, we used the data-set of educational resources from Youtube, and compared our topic distribution results with their manually defined target topics with the help of 3 experts in the area of data science. As a result, our model extracted topics with 76% of F1-score.

KW - Machine learning
KW - OER
KW - Open Educational Resource
KW - Text mining
KW - Topic extraction
UR - http://www.scopus.com/inward/record.url?scp=85091181893&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2006.11109
DO - 10.48550/arXiv.2006.11109
M3 - Conference contribution
AN - SCOPUS:85091181893
SN - 9783030577162
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 455
EP - 460
BT - Addressing Global Challenges and Quality Education - 15th European Conference on Technology Enhanced Learning, EC-TEL 2020, Proceedings
A2 - Alario-Hoyos, Carlos
A2 - Rodríguez-Triana, María Jesús
A2 - Scheffel, Maren
A2 - Arnedillo-Sánchez, Inmaculada
A2 - Dennerlein, Sebastian Maximilian
PB - Springer Science and Business Media Deutschland GmbH
T2 - 15th European Conference on Technology Enhanced Learning, EC-TEL 2020
Y2 - 14 September 2020 through 18 September 2020
ER -