Details
Original language | English |
---|---|
Title of host publication | 3rd International Conference on Web Intelligence, Mining and Semantics, WIMS 2013 |
Publication status | Published - 2013 |
Event | 3rd International Conference on Web Intelligence, Mining and Semantics, WIMS 2013 - Madrid, Spain; Duration: 12 Jun 2013 → 14 Jun 2013 |
Publication series
Name | ACM International Conference Proceeding Series |
---|---|
Abstract
Document classification is key to ensuring the quality of any digital library. However, classifying documents is a very time-consuming task. In addition, few or none of the documents in a newly created repository are classified. The non-classification of documents not only prevents users from finding information but also hinders the system's ability to recommend relevant items. Moreover, the lack of classified documents prevents any kind of machine learning algorithm from automatically annotating these items. In this work, we propose a novel approach to automatically classifying documents that differs from previous work in that it exploits the wisdom of the crowds available on the Web. Our proposed strategy adapts an automatic tagging approach combined with a straightforward matching algorithm to classify documents in a given domain classification. To validate our findings, we compared our methods against existing ones and performed a user evaluation with 61 participants to estimate the quality of the classifications. Results show that, in 72% of the cases, the automatic classification is relevant and well accepted by participants. In conclusion, automatic classification can facilitate access to relevant documents.
Keywords
- Automatic classification, Cold-start, Digital libraries, Information retrieval, User evaluation
ASJC Scopus subject areas
- Computer Science(all)
- Software
- Human-Computer Interaction
- Computer Vision and Pattern Recognition
- Computer Networks and Communications
Cite this
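Automatic classification of documents in cold-start scenarios. / Kawase, Ricardo; Fisichella, Marco; Nunes, Bernardo Pereira; Ha, Kyung Hun; Bick, Markus. 3rd International Conference on Web Intelligence, Mining and Semantics, WIMS 2013. 2013. 19 (ACM International Conference Proceeding Series).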
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Automatic classification of documents in cold-start scenarios
AU - Kawase, Ricardo
AU - Fisichella, Marco
AU - Nunes, Bernardo Pereira
AU - Ha, Kyung Hun
AU - Bick, Markus
PY - 2013
Y1 - 2013
KW - Automatic classification
KW - Cold-start
KW - Digital libraries
KW - Information retrieval
KW - User evaluation
UR - http://www.scopus.com/inward/record.url?scp=84879751815&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84879751815
SN - 9781450318501
T3 - ACM International Conference Proceeding Series
BT - 3rd International Conference on Web Intelligence, Mining and Semantics, WIMS 2013
T2 - 3rd International Conference on Web Intelligence, Mining and Semantics, WIMS 2013
Y2 - 12 June 2013 through 14 June 2013
ER -