Community Knowledge about Security: Identification and Classification of User Contributions

Research output: Chapter in book/report/conference proceeding, Conference contribution, Research, peer-reviewed

Authors

Fabien Patrick Viertel, Wasja Brunotte, Yannick Evers, Kurt Schneider

Details

Original language: English
Title of host publication: Risks and Security of Internet and Systems
Subtitle of host publication: 15th International Conference, CRiSIS 2020, Paris, France, November 4–6, 2020, Revised Selected Papers
Editors: Joaquin Garcia-Alfaro, Jean Leneutre, Nora Cuppens, Reda Yaich
Pages: 181-197
Number of pages: 17
ISBN (electronic): 978-3-030-68887-5
Publication status: Published - 12 Feb 2021
Event: The 15th International Conference on Risks and Security of Internet and Systems - Online, France
Duration: 3 Nov 2020 to 6 Nov 2020
https://www.crisis-conference.com/

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12528 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Nowadays, confidential data of users and companies are processed by various software applications. Therefore, it is necessary to protect them against security flaws in source code, which could, for example, allow the infringement of privacy. However, developers usually lack the expertise required for this task. Tools such as security code clone detectors help by disclosing vulnerable methods in source code: they search for clones between the written project code and vulnerable code fragments stored in a reference repository. Existing vulnerability databases, for instance the National Vulnerability Database (NVD), contain data on reported weaknesses, but example code for their occurrence, patch, and exploit is scarce. Developers also use community websites to find help for secure implementations. In this paper, we propose a semi-automated process to extract security-related code from the Stack Exchange community network, which includes the coding community Stack Overflow. We classify the obtained code through artificial intelligence combined with natural language processing into three security types: vulnerable, patch, or exploit. In a twofold evaluation, we compared both parts with the manual activity of security experts. First, for the search, our approach shows better precision than the experts as well as a moderate recall. Second, the results show that classifying code fragments into security types is not straightforward. The investigated approaches and the security experts perform with different strengths depending on the security type.

Keywords

    Artificial intelligence, Clone detection, Community knowledge, Security, Source code

Cite this

Community Knowledge about Security: Identification and Classification of User Contributions. / Viertel, Fabien Patrick; Brunotte, Wasja; Evers, Yannick et al.
Risks and Security of Internet and Systems: 15th International Conference, CRiSIS 2020, Paris, France, November 4–6, 2020, Revised Selected Papers. ed. / Joaquin Garcia-Alfaro; Jean Leneutre; Nora Cuppens; Reda Yaich. 2021. p. 181-197 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12528 LNCS).

Viertel, FP, Brunotte, W, Evers, Y & Schneider, K 2021, Community Knowledge about Security: Identification and Classification of User Contributions. in J Garcia-Alfaro, J Leneutre, N Cuppens & R Yaich (eds), Risks and Security of Internet and Systems: 15th International Conference, CRiSIS 2020, Paris, France, November 4–6, 2020, Revised Selected Papers. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12528 LNCS, pp. 181-197, The 15th International Conference on Risks and Security of Internet and Systems, France, 3 Nov 2020. https://doi.org/10.1007/978-3-030-68887-5_11
@inproceedings{0a4c7b1d02ed4fc8ada0a57a8c39efb0,
title = "Community Knowledge about Security: Identification and Classification of User Contributions",
abstract = "Nowadays, confidential data of users and companies are processed by various software applications. Therefore, it is necessary to protect them against security flaws in source code, which could, for example, allow the infringement of privacy. However, developers are usually not equipped with the required expertise to fulfill this task. To their rescue, there are tools like security code clone detectors to disclose vulnerable methods in source code. They try to find clones of written project code and vulnerable code fragments stored in a reference repository. Existing vulnerability databases, for instance the National Vulnerability Database (NVD), contain data on reported weaknesses, but the availability of example code for their occurrence, patch and exploit is scarce. Developers also use community websites to find help for secure implementations. In this paper, we propose a semi-automated process to extract security-related code from the Stack Exchange community network, where also the coding community Stack Overflow belongs. We classify the obtained code through artificial intelligence combined with natural language processing into the three security types: vulnerable, patch or exploit. In a twofold evaluation, we compared both parts with the manual activity of security experts. At first, for the search, our approach shows better precision than the experts as well as a moderate recall. Secondly, the results show that the classification of code fragments in security types is not quite easy. The investigated approaches and security experts perform with different strength regarding types of security.",
keywords = "Artificial intelligence, Clone detection, Community knowledge, Security, Source code",
author = "Viertel, {Fabien Patrick} and Wasja Brunotte and Yannick Evers and Kurt Schneider",
year = "2021",
month = feb,
day = "12",
doi = "10.1007/978-3-030-68887-5_11",
language = "English",
isbn = "978-3-030-68886-8",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "181--197",
editor = "Joaquin Garcia-Alfaro and Jean Leneutre and Nora Cuppens and Reda Yaich",
booktitle = "Risks and Security of Internet and Systems",
note = "The 15th International Conference on Risks and Security of Internet and Systems ; Conference date: 03-11-2020 Through 06-11-2020",
url = "https://www.crisis-conference.com/",

}


TY - GEN

T1 - Community Knowledge about Security

T2 - The 15th International Conference on Risks and Security of Internet and Systems

AU - Viertel, Fabien Patrick

AU - Brunotte, Wasja

AU - Evers, Yannick

AU - Schneider, Kurt

PY - 2021/2/12

Y1 - 2021/2/12

N2 - Nowadays, confidential data of users and companies are processed by various software applications. Therefore, it is necessary to protect them against security flaws in source code, which could, for example, allow the infringement of privacy. However, developers are usually not equipped with the required expertise to fulfill this task. To their rescue, there are tools like security code clone detectors to disclose vulnerable methods in source code. They try to find clones of written project code and vulnerable code fragments stored in a reference repository. Existing vulnerability databases, for instance the National Vulnerability Database (NVD), contain data on reported weaknesses, but the availability of example code for their occurrence, patch and exploit is scarce. Developers also use community websites to find help for secure implementations. In this paper, we propose a semi-automated process to extract security-related code from the Stack Exchange community network, where also the coding community Stack Overflow belongs. We classify the obtained code through artificial intelligence combined with natural language processing into the three security types: vulnerable, patch or exploit. In a twofold evaluation, we compared both parts with the manual activity of security experts. At first, for the search, our approach shows better precision than the experts as well as a moderate recall. Secondly, the results show that the classification of code fragments in security types is not quite easy. The investigated approaches and security experts perform with different strength regarding types of security.

KW - Source Code

KW - Security

KW - Clone Detection

KW - Community Knowledge

KW - Artificial Intelligence

UR - http://www.scopus.com/inward/record.url?scp=85102652194&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-68887-5_11

DO - 10.1007/978-3-030-68887-5_11

M3 - Conference contribution

SN - 978-3-030-68886-8

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 181

EP - 197

BT - Risks and Security of Internet and Systems

A2 - Garcia-Alfaro, Joaquin

A2 - Leneutre, Jean

A2 - Cuppens, Nora

A2 - Yaich, Reda

Y2 - 3 November 2020 through 6 November 2020

ER -
