Log Parsing Evaluation in the Era of Modern Software Systems

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Stefan Petrescu
  • Floris Den Hengst
  • Alexandru Uta
  • Jan S. Rellermeyer

External Research Organisations

  • ING-DiBa AG
  • DFINITY Foundation

Details

Original language: English
Title of host publication: Proceedings
Subtitle of host publication: 2023 IEEE 34th International Symposium on Software Reliability Engineering, ISSRE 2023
Publisher: IEEE Computer Society
Pages: 379-390
Number of pages: 12
ISBN (electronic): 9798350315943
Publication status: Published - 2023
Event: 34th IEEE International Symposium on Software Reliability Engineering, ISSRE 2023 - Florence, Italy
Duration: 9 Oct 2023 - 12 Oct 2023

Publication series

Name: Proceedings - International Symposium on Software Reliability Engineering, ISSRE
ISSN (Print): 1071-9458

Abstract

Due to the complexity and size of modern software systems, the amount of logs generated is tremendous. Hence, it is infeasible to manually investigate these data in a reasonable time, thereby requiring automated log analysis to derive insights about the functioning of the systems. Motivated by an industry use-case, we zoom in on one integral part of automated log analysis, log parsing, which is the prerequisite to deriving any insights from logs. Our investigation reveals problematic aspects within the log parsing field, particularly its inefficiency in handling heterogeneous real-world logs. We show this by assessing the 14 most-recognized log parsing approaches in the literature using (i) nine publicly available datasets, (ii) one dataset composed of combined publicly available data, and (iii) one dataset generated within the infrastructure of a large bank. Subsequently, toward improving log parsing robustness in real-world production scenarios, we propose a tool, LOGCHIMERA, that enables estimating log parsing performance in industry contexts by generating synthetic log data that resemble industry logs. Our contributions serve as a foundation to consolidate past research efforts, facilitate future research advancements, and establish a strong link between research and industry log parsing.

Keywords

    automated log analysis, log parsing, reliability

Cite this

Log Parsing Evaluation in the Era of Modern Software Systems. / Petrescu, Stefan; Den Hengst, Floris; Uta, Alexandru et al.
Proceedings: 2023 IEEE 34th International Symposium on Software Reliability Engineering, ISSRE 2023. IEEE Computer Society, 2023. p. 379-390 (Proceedings - International Symposium on Software Reliability Engineering, ISSRE).

Petrescu, S, Den Hengst, F, Uta, A & Rellermeyer, JS 2023, Log Parsing Evaluation in the Era of Modern Software Systems. in Proceedings: 2023 IEEE 34th International Symposium on Software Reliability Engineering, ISSRE 2023. Proceedings - International Symposium on Software Reliability Engineering, ISSRE, IEEE Computer Society, pp. 379-390, 34th IEEE International Symposium on Software Reliability Engineering, ISSRE 2023, Florence, Italy, 9 Oct 2023. https://doi.org/10.48550/arXiv.2308.09003, https://doi.org/10.1109/ISSRE59848.2023.00019
Petrescu, S., Den Hengst, F., Uta, A., & Rellermeyer, J. S. (2023). Log Parsing Evaluation in the Era of Modern Software Systems. In Proceedings: 2023 IEEE 34th International Symposium on Software Reliability Engineering, ISSRE 2023 (pp. 379-390). (Proceedings - International Symposium on Software Reliability Engineering, ISSRE). IEEE Computer Society. https://doi.org/10.48550/arXiv.2308.09003, https://doi.org/10.1109/ISSRE59848.2023.00019
Petrescu S, Den Hengst F, Uta A, Rellermeyer JS. Log Parsing Evaluation in the Era of Modern Software Systems. In Proceedings: 2023 IEEE 34th International Symposium on Software Reliability Engineering, ISSRE 2023. IEEE Computer Society. 2023. p. 379-390. (Proceedings - International Symposium on Software Reliability Engineering, ISSRE). doi: 10.48550/arXiv.2308.09003, 10.1109/ISSRE59848.2023.00019
Petrescu, Stefan ; Den Hengst, Floris ; Uta, Alexandru et al. / Log Parsing Evaluation in the Era of Modern Software Systems. Proceedings: 2023 IEEE 34th International Symposium on Software Reliability Engineering, ISSRE 2023. IEEE Computer Society, 2023. pp. 379-390 (Proceedings - International Symposium on Software Reliability Engineering, ISSRE).
@inproceedings{206953d9e53d4e22b3950abbd9119fb0,
title = "Log Parsing Evaluation in the Era of Modern Software Systems",
abstract = "Due to the complexity and size of modern software systems, the amount of logs generated is tremendous. Hence, it is infeasible to manually investigate these data in a reasonable time, thereby requiring automating log analysis to derive insights about the functioning of the systems. Motivated by an industry use-case, we zoom-in on one integral part of automated log analysis, log parsing, which is the prerequisite to deriving any insights from logs. Our investigation reveals problematic aspects within the log parsing field, particularly its inefficiency in handling heterogeneous real-world logs. We show this by assessing the 14 most-recognized log parsing approaches in the literature using (i) nine publicly available datasets, (ii) one dataset comprised of combined publicly available data, and (iii) one dataset generated within the infrastructure of a large bank. Subsequently, toward improving log parsing robustness in real-world production scenarios, we propose a tool, LOGCHIMERA, that enables estimating log parsing performance in industry contexts through generating synthetic log data that resemble industry logs. Our contributions serve as a foundation to consolidate past research efforts, facilitate future research advancements, and establish a strong link between research and industry log parsing.",
keywords = "automated log analysis, log parsing, reliability",
author = "Stefan Petrescu and {Den Hengst}, Floris and Alexandru Uta and Rellermeyer, {Jan S.}",
year = "2023",
doi = "10.48550/arXiv.2308.09003",
language = "English",
series = "Proceedings - International Symposium on Software Reliability Engineering, ISSRE",
publisher = "IEEE Computer Society",
pages = "379--390",
booktitle = "Proceedings",
address = "United States",
note = "34th IEEE International Symposium on Software Reliability Engineering, ISSRE 2023 ; Conference date: 09-10-2023 Through 12-10-2023",

}

TY - GEN

T1 - Log Parsing Evaluation in the Era of Modern Software Systems

AU - Petrescu, Stefan

AU - Den Hengst, Floris

AU - Uta, Alexandru

AU - Rellermeyer, Jan S.

PY - 2023

Y1 - 2023

AB - Due to the complexity and size of modern software systems, the amount of logs generated is tremendous. Hence, it is infeasible to manually investigate these data in a reasonable time, thereby requiring automating log analysis to derive insights about the functioning of the systems. Motivated by an industry use-case, we zoom-in on one integral part of automated log analysis, log parsing, which is the prerequisite to deriving any insights from logs. Our investigation reveals problematic aspects within the log parsing field, particularly its inefficiency in handling heterogeneous real-world logs. We show this by assessing the 14 most-recognized log parsing approaches in the literature using (i) nine publicly available datasets, (ii) one dataset comprised of combined publicly available data, and (iii) one dataset generated within the infrastructure of a large bank. Subsequently, toward improving log parsing robustness in real-world production scenarios, we propose a tool, LOGCHIMERA, that enables estimating log parsing performance in industry contexts through generating synthetic log data that resemble industry logs. Our contributions serve as a foundation to consolidate past research efforts, facilitate future research advancements, and establish a strong link between research and industry log parsing.

KW - automated log analysis

KW - log parsing

KW - reliability

UR - http://www.scopus.com/inward/record.url?scp=85178082534&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2308.09003

DO - 10.48550/arXiv.2308.09003

M3 - Conference contribution

AN - SCOPUS:85178082534

T3 - Proceedings - International Symposium on Software Reliability Engineering, ISSRE

SP - 379

EP - 390

BT - Proceedings

PB - IEEE Computer Society

T2 - 34th IEEE International Symposium on Software Reliability Engineering, ISSRE 2023

Y2 - 9 October 2023 through 12 October 2023

ER -
