Details
Original language | English |
---|---|
Title of host publication | Proceedings |
Subtitle | 2023 IEEE 34th International Symposium on Software Reliability Engineering, ISSRE 2023 |
Publisher | IEEE Computer Society |
Pages | 379-390 |
Number of pages | 12 |
ISBN (electronic) | 9798350315943 |
Publication status | Published - 2023 |
Event | 34th IEEE International Symposium on Software Reliability Engineering, ISSRE 2023 - Florence, Italy. Duration: 9 Oct 2023 → 12 Oct 2023 |
Publication series
Name | Proceedings - International Symposium on Software Reliability Engineering, ISSRE |
---|---|
ISSN (Print) | 1071-9458 |
Abstract
Due to the complexity and size of modern software systems, the amount of logs generated is tremendous. Hence, it is infeasible to manually investigate these data in a reasonable time, thereby requiring automating log analysis to derive insights about the functioning of the systems. Motivated by an industry use-case, we zoom-in on one integral part of automated log analysis, log parsing, which is the prerequisite to deriving any insights from logs. Our investigation reveals problematic aspects within the log parsing field, particularly its inefficiency in handling heterogeneous real-world logs. We show this by assessing the 14 most-recognized log parsing approaches in the literature using (i) nine publicly available datasets, (ii) one dataset comprised of combined publicly available data, and (iii) one dataset generated within the infrastructure of a large bank. Subsequently, toward improving log parsing robustness in real-world production scenarios, we propose a tool, LOGCHIMERA, that enables estimating log parsing performance in industry contexts through generating synthetic log data that resemble industry logs. Our contributions serve as a foundation to consolidate past research efforts, facilitate future research advancements, and establish a strong link between research and industry log parsing.
ASJC Scopus subject areas
- General Computer Science
- Software
- General Engineering
- Safety, Risk, Reliability and Quality
Cite this
Proceedings: 2023 IEEE 34th International Symposium on Software Reliability Engineering, ISSRE 2023. IEEE Computer Society, 2023. pp. 379-390 (Proceedings - International Symposium on Software Reliability Engineering, ISSRE).
Publication: Chapter in book/report/anthology/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Log Parsing Evaluation in the Era of Modern Software Systems
AU - Petrescu, Stefan
AU - Den Hengst, Floris
AU - Uta, Alexandru
AU - Rellermeyer, Jan S.
PY - 2023
Y1 - 2023
AB - Due to the complexity and size of modern software systems, the amount of logs generated is tremendous. Hence, it is infeasible to manually investigate these data in a reasonable time, thereby requiring automating log analysis to derive insights about the functioning of the systems. Motivated by an industry use-case, we zoom-in on one integral part of automated log analysis, log parsing, which is the prerequisite to deriving any insights from logs. Our investigation reveals problematic aspects within the log parsing field, particularly its inefficiency in handling heterogeneous real-world logs. We show this by assessing the 14 most-recognized log parsing approaches in the literature using (i) nine publicly available datasets, (ii) one dataset comprised of combined publicly available data, and (iii) one dataset generated within the infrastructure of a large bank. Subsequently, toward improving log parsing robustness in real-world production scenarios, we propose a tool, LOGCHIMERA, that enables estimating log parsing performance in industry contexts through generating synthetic log data that resemble industry logs. Our contributions serve as a foundation to consolidate past research efforts, facilitate future research advancements, and establish a strong link between research and industry log parsing.
KW - automated log analysis
KW - log parsing
KW - reliability
UR - http://www.scopus.com/inward/record.url?scp=85178082534&partnerID=8YFLogxK
U2 - https://doi.org/10.48550/arXiv.2308.09003
DO - https://doi.org/10.48550/arXiv.2308.09003
M3 - Conference contribution
AN - SCOPUS:85178082534
T3 - Proceedings - International Symposium on Software Reliability Engineering, ISSRE
SP - 379
EP - 390
BT - Proceedings
PB - IEEE Computer Society
T2 - 34th IEEE International Symposium on Software Reliability Engineering, ISSRE 2023
Y2 - 9 October 2023 through 12 October 2023
ER -