Criteria and Metrics for the Explainability of Software

Hannah Luca Deters

Details

Original language	English
Qualification	Master of Science
Degree-granting institution	Leibniz Universität Hannover
Supervised by	Schneider, K., Supervisor
Place of publication	Hannover
Publication status	Published - 28 Sept. 2022

Abstract

In this master’s thesis, a concept for evaluating the explainability of
software systems was developed. For this purpose, a comprehensive literature
review was conducted in which 86 relevant papers were selected from an initial
set of 1025 papers. These papers contributed to the conceptualization of
the evaluation method. During this conceptualization, it was found that
the characteristics of explainability are strongly linked to the objective that
the explanations are supposed to achieve. It became clear that an evaluation
of explainability cannot achieve a satisfactory result unless it takes these
objectives into account. It was also found that the literature already
provides methods for evaluating individual aspects of explainability, but
these consist almost exclusively of user studies. Since conducting multiple
user studies would be unrealistically expensive in non-research settings,
heuristics were developed to provide a first estimate of explainability.
Overall, an overarching concept was developed that links the definition of
objectives, the initial assessment with heuristics, and the more reliable
evaluation with user studies.

In the second part of the master’s thesis, a user study was conducted to
evaluate whether the developed heuristics produce reliable results. For this
purpose, the interrater agreement was examined to see whether the heuristics
allow uniform ratings. It was found that a group of evaluators together can
produce a uniform result. Significance tests were then used to determine
whether the heuristics are capable of identifying significant differences in the
explainability of two systems. The tests revealed significant differences
within the different aspects of explainability.
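
As an illustration of how interrater agreement on heuristic ratings could be
quantified, the following is a minimal sketch only. The abstract does not name
the agreement statistic used in the thesis; Fleiss' kappa is an assumption
chosen here because it handles more than two evaluators. The function name,
the data layout, and the example ratings are all hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for items that were each rated by the same number of raters.

    ratings[i] holds the category labels the raters gave to item i
    (hypothetical layout; not taken from the thesis).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2:
        raise ValueError("at least two ratings per item are required")
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: per item, the share of rater pairs that agree.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement, derived from the overall category proportions.
    total = n_items * n_raters
    p_expected = sum(
        (sum(c[cat] for c in counts) / total) ** 2 for cat in categories
    )
    if p_expected == 1.0:
        return 1.0  # only one category was ever used; kappa is undefined
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical example: three evaluators rate four heuristics on a 1-5 scale.
example = [[4, 4, 5], [2, 2, 2], [3, 4, 3], [5, 5, 5]]
print(round(fleiss_kappa(example), 3))
```

Values near 1 indicate that the evaluators rate almost uniformly, values near
0 indicate agreement no better than chance, and negative values indicate
systematic disagreement.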

ASJC Scopus subject areas

Computer Science (all)
General Computer Science

Cite this

Criteria and Metrics for the Explainability of Software. / Deters, Hannah Luca.
Hannover, 2022. 114 p.

Publication: Thesis › Master's thesis

Deters, HL 2022, 'Criteria and Metrics for the Explainability of Software', Master of Science, Gottfried Wilhelm Leibniz Universität Hannover, Hannover. <https://www.pi.uni-hannover.de/fileadmin/pi/se/Stud-Arbeiten/2022/MA-Deters-2022.pdf>

Deters, H. L. (2022). Criteria and Metrics for the Explainability of Software. [Master's thesis, Gottfried Wilhelm Leibniz Universität Hannover]. https://www.pi.uni-hannover.de/fileadmin/pi/se/Stud-Arbeiten/2022/MA-Deters-2022.pdf

Deters HL. Criteria and Metrics for the Explainability of Software. Hannover, 2022. 114 p.

Deters, Hannah Luca. / Criteria and Metrics for the Explainability of Software. Hannover, 2022. 114 p.

@mastersthesis{bc2a1123353b435a9e6259b640015c3e,
title = "Criteria and Metrics for the Explainability of Software",
abstract = "In this master{\textquoteright}s thesis, a concept for evaluating the explainability of software systems was developed. For this purpose, a comprehensive literature review was conducted in which 86 relevant papers were selected from an initial set of 1025 papers. These papers contributed to the conceptualization of the evaluation method. During this conceptualization, it was found that the characteristics of explainability are strongly linked to the objective that the explanations are supposed to achieve. It became clear that an evaluation of explainability cannot achieve a satisfactory result unless it takes these objectives into account. It was also found that the literature already provides methods for evaluating individual aspects of explainability, but these consist almost exclusively of user studies. Since conducting multiple user studies would be unrealistically expensive in non-research settings, heuristics were developed to provide a first estimate of explainability. Overall, an overarching concept was developed that links the definition of objectives, the initial assessment with heuristics, and the more reliable evaluation with user studies. In the second part of the master{\textquoteright}s thesis, a user study was conducted to evaluate whether the developed heuristics produce reliable results. For this purpose, the interrater agreement was examined to see whether the heuristics allow uniform ratings. It was found that a group of evaluators together can produce a uniform result. Significance tests were then used to determine whether the heuristics are capable of identifying significant differences in the explainability of two systems. The tests revealed significant differences within the different aspects of explainability.",
author = "Deters, {Hannah Luca}",
year = "2022",
month = sep,
day = "28",
language = "English",
school = "Leibniz University Hannover",
}

TY - THES
T1 - Criteria and Metrics for the Explainability of Software
AU - Deters, Hannah Luca
PY - 2022/9/28
Y1 - 2022/9/28
N2 - In this master’s thesis, a concept for evaluating the explainability of software systems was developed. For this purpose, a comprehensive literature review was conducted in which 86 relevant papers were selected from an initial set of 1025 papers. These papers contributed to the conceptualization of the evaluation method. During this conceptualization, it was found that the characteristics of explainability are strongly linked to the objective that the explanations are supposed to achieve. It became clear that an evaluation of explainability cannot achieve a satisfactory result unless it takes these objectives into account. It was also found that the literature already provides methods for evaluating individual aspects of explainability, but these consist almost exclusively of user studies. Since conducting multiple user studies would be unrealistically expensive in non-research settings, heuristics were developed to provide a first estimate of explainability. Overall, an overarching concept was developed that links the definition of objectives, the initial assessment with heuristics, and the more reliable evaluation with user studies. In the second part of the master’s thesis, a user study was conducted to evaluate whether the developed heuristics produce reliable results. For this purpose, the interrater agreement was examined to see whether the heuristics allow uniform ratings. It was found that a group of evaluators together can produce a uniform result. Significance tests were then used to determine whether the heuristics are capable of identifying significant differences in the explainability of two systems. The tests revealed significant differences within the different aspects of explainability.
AB - In this master’s thesis, a concept for evaluating the explainability of software systems was developed. For this purpose, a comprehensive literature review was conducted in which 86 relevant papers were selected from an initial set of 1025 papers. These papers contributed to the conceptualization of the evaluation method. During this conceptualization, it was found that the characteristics of explainability are strongly linked to the objective that the explanations are supposed to achieve. It became clear that an evaluation of explainability cannot achieve a satisfactory result unless it takes these objectives into account. It was also found that the literature already provides methods for evaluating individual aspects of explainability, but these consist almost exclusively of user studies. Since conducting multiple user studies would be unrealistically expensive in non-research settings, heuristics were developed to provide a first estimate of explainability. Overall, an overarching concept was developed that links the definition of objectives, the initial assessment with heuristics, and the more reliable evaluation with user studies. In the second part of the master’s thesis, a user study was conducted to evaluate whether the developed heuristics produce reliable results. For this purpose, the interrater agreement was examined to see whether the heuristics allow uniform ratings. It was found that a group of evaluators together can produce a uniform result. Significance tests were then used to determine whether the heuristics are capable of identifying significant differences in the explainability of two systems. The tests revealed significant differences within the different aspects of explainability.
M3 - Master's thesis
CY - Hannover
ER -

By the same authors

Self-Elicitation of Requirements with Automated GUI Prototyping.

What you see is what you trace: a two-stage interview study on traceability practices and eye tracking potential

Paving the Way Towards an Effective Vision Video Usage: An Exploratory Study

Organizing Graphical User Interface tests from behavior‐driven development as videos to obtain stakeholders' feedback

Supporting Value-Aware Software Engineering Through Traceability and Value Tactics
