Community Knowledge about Security: Identification and Classification of User Contributions

Fabien Patrick Viertel; Wasja Brunotte; Yannick Evers; Kurt Schneider

doi:10.1007/978-3-030-68887-5_11

Details

Original language	English
Title of host publication	Risks and Security of Internet and Systems
Subtitle of host publication	15th International Conference, CRiSIS 2020, Paris, France, November 4–6, 2020, Revised Selected Papers
Editors	Joaquin Garcia-Alfaro, Jean Leneutre, Nora Cuppens, Reda Yaich
Pages	181-197
Number of pages	17
ISBN (electronic)	978-3-030-68887-5
Publication status	Published - 12 Feb 2021
Event	The 15th International Conference on Risks and Security of Internet and Systems, Online, France. Duration: 3 Nov 2020 → 6 Nov 2020. https://www.crisis-conference.com/

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	12528 LNCS
ISSN (Print)	0302-9743
ISSN (electronic)	1611-3349

Abstract

Nowadays, confidential data of users and companies are processed by various software applications. It is therefore necessary to protect these data against security flaws in source code, which could, for example, allow privacy violations. However, developers usually lack the expertise required for this task. Tools such as security code clone detectors can help by disclosing vulnerable methods in source code: they search for clones between the project code and vulnerable code fragments stored in a reference repository. Existing vulnerability databases, for instance the National Vulnerability Database (NVD), contain data on reported weaknesses, but example code for their occurrence, patch, and exploit is scarce. Developers also use community websites to find help with secure implementations. In this paper, we propose a semi-automated process to extract security-related code from the Stack Exchange community network, which includes the coding community Stack Overflow. We classify the obtained code, using artificial intelligence combined with natural language processing, into three security types: vulnerable, patch, or exploit. In a twofold evaluation, we compared both parts, the search and the classification, with the manual activity of security experts. First, for the search, our approach shows better precision than the experts and a moderate recall. Second, the results show that classifying code fragments into security types is not easy. The investigated approaches and the security experts perform with different strengths depending on the security type.
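
The abstract reports the search step in terms of precision and recall against expert judgments. As a minimal sketch of how these two measures are computed, and not the authors' evaluation code, the following Python example compares a set of retrieved posts with a set of posts judged security-related. The function name `precision_recall` and all post IDs are hypothetical and invented purely for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of retrieved items against a gold-standard set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: share of retrieved items that are actually relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant items that were retrieved.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical post IDs (assumption, not data from the paper).
found_by_search = {101, 102, 103, 104}
security_related = {102, 103, 104, 105, 106, 107}

precision, recall = precision_recall(found_by_search, security_related)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the four retrieved posts are relevant, while only three of the six relevant posts are found, which mirrors the pattern of high precision with moderate recall described in the abstract.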

Keywords

Artificial intelligence, Clone detection, Community knowledge, Security, Source code

ASJC Scopus subject areas

Mathematics(all)
Theoretical Computer Science
Computer Science(all)
General Computer Science

Cite this

Community Knowledge about Security: Identification and Classification of User Contributions. / Viertel, Fabien Patrick; Brunotte, Wasja; Evers, Yannick et al.
Risks and Security of Internet and Systems: 15th International Conference, CRiSIS 2020, Paris, France, November 4–6, 2020, Revised Selected Papers. ed. / Joaquin Garcia-Alfaro; Jean Leneutre; Nora Cuppens; Reda Yaich. 2021. p. 181-197 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12528 LNCS).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Viertel, FP, Brunotte, W, Evers, Y & Schneider, K 2021, Community Knowledge about Security: Identification and Classification of User Contributions. in J Garcia-Alfaro, J Leneutre, N Cuppens & R Yaich (eds), Risks and Security of Internet and Systems: 15th International Conference, CRiSIS 2020, Paris, France, November 4–6, 2020, Revised Selected Papers. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12528 LNCS, pp. 181-197, The 15th International Conference on Risks and Security of Internet and Systems, France, 3 Nov 2020. https://doi.org/10.1007/978-3-030-68887-5_11

Viertel, F. P., Brunotte, W., Evers, Y., & Schneider, K. (2021). Community Knowledge about Security: Identification and Classification of User Contributions. In J. Garcia-Alfaro, J. Leneutre, N. Cuppens, & R. Yaich (Eds.), Risks and Security of Internet and Systems: 15th International Conference, CRiSIS 2020, Paris, France, November 4–6, 2020, Revised Selected Papers (pp. 181-197). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12528 LNCS). https://doi.org/10.1007/978-3-030-68887-5_11

Viertel FP, Brunotte W, Evers Y, Schneider K. Community Knowledge about Security: Identification and Classification of User Contributions. In Garcia-Alfaro J, Leneutre J, Cuppens N, Yaich R, editors, Risks and Security of Internet and Systems: 15th International Conference, CRiSIS 2020, Paris, France, November 4–6, 2020, Revised Selected Papers. 2021. p. 181-197. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-68887-5_11

Viertel, Fabien Patrick ; Brunotte, Wasja ; Evers, Yannick et al. / Community Knowledge about Security : Identification and Classification of User Contributions. Risks and Security of Internet and Systems: 15th International Conference, CRiSIS 2020, Paris, France, November 4–6, 2020, Revised Selected Papers. editor / Joaquin Garcia-Alfaro ; Jean Leneutre ; Nora Cuppens ; Reda Yaich. 2021. pp. 181-197 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

Download

@inproceedings{0a4c7b1d02ed4fc8ada0a57a8c39efb0,

title = "Community Knowledge about Security: Identification and Classification of User Contributions",

keywords = "Source Code, Security, Clone Detection, Community Knowledge, Artificial Intelligence",

author = "Viertel, {Fabien Patrick} and Wasja Brunotte and Yannick Evers and Kurt Schneider",

year = "2021",

month = feb,

day = "12",

doi = "10.1007/978-3-030-68887-5_11",

language = "English",

isbn = "978-3-030-68886-8",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

pages = "181--197",

editor = "Joaquin Garcia-Alfaro and Jean Leneutre and Nora Cuppens and Reda Yaich",

booktitle = "Risks and Security of Internet and Systems",

note = "The 15th International Conference on Risks and Security of Internet and Systems ; Conference date: 03-11-2020 Through 06-11-2020",

url = "https://www.crisis-conference.com/",

}

Download

TY - GEN

T1 - Community Knowledge about Security

T2 - The 15th International Conference on Risks and Security of Internet and Systems

AU - Viertel, Fabien Patrick

AU - Brunotte, Wasja

AU - Evers, Yannick

AU - Schneider, Kurt

PY - 2021/2/12

Y1 - 2021/2/12

KW - Source Code

KW - Security

KW - Clone Detection

KW - Community Knowledge

KW - Artificial Intelligence

UR - http://www.scopus.com/inward/record.url?scp=85102652194&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-68887-5_11

DO - 10.1007/978-3-030-68887-5_11

M3 - Conference contribution

SN - 978-3-030-68886-8

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 181

EP - 197

BT - Risks and Security of Internet and Systems

A2 - Garcia-Alfaro, Joaquin

A2 - Leneutre, Jean

A2 - Cuppens, Nora

A2 - Yaich, Reda

Y2 - 3 November 2020 through 6 November 2020

ER -

By the same author(s)

Self-Elicitation of Requirements with Automated GUI Prototyping.

What you see is what you trace: a two-stage interview study on traceability practices and eye tracking potential

Paving the Way Towards an Effective Vision Video Usage: An Exploratory Study

Organizing Graphical User Interface tests from behavior‐driven development as videos to obtain stakeholders' feedback

Supporting Value-Aware Software Engineering Through Traceability and Value Tactics
