Criteria and Metrics for the Explainability of Software

Research output: Thesis › Master's thesis

Details

Original language: English
Qualification: Master of Science
Awarding Institution: Leibniz University Hannover
Place of Publication: Hannover
Publication status: Published - 28 Sept 2022

Abstract

In this master's thesis, a concept for evaluating explainability in software
systems was developed. For this purpose, a comprehensive literature review
was conducted in which 86 relevant papers were selected from an initial
set of 1025 papers. These papers informed the conceptualization of the
evaluation method. During this conceptualization, it was found that the
characteristics of explainability are strongly linked to the objectives that
the explanations are supposed to achieve, and that an evaluation of
explainability cannot produce a satisfactory result unless it takes these
objectives into account. It was also found that the literature already
provides methods for evaluating single aspects of explainability, but these
consist almost exclusively of user studies. Since conducting multiple user
studies would be unrealistically expensive in non-research settings,
heuristics were developed to provide a first estimate of explainability.
Overall, an overarching concept was developed that links the definition of
objectives, the initial assessment with heuristics, and the more reliable
evaluation with user studies.

In the second part of the thesis, a user study was conducted to evaluate
whether the developed heuristics produce reliable results. First, the
interrater agreement was examined to see whether the heuristics lead to
consistent ratings. It was found that a group of evaluators together can
produce a consistent result. Significance tests were then used to determine
whether the heuristics can identify significant differences in the
explainability of two systems. Significant differences were found within
the different aspects of explainability.
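
The abstract does not say which agreement statistic the thesis used. As a rough illustration of what examining interrater agreement for a group of evaluators can look like, the sketch below computes Fleiss' kappa, one common choice when more than two raters assign categorical ratings. The function name `fleiss_kappa` and the sample ratings are hypothetical and are not taken from the thesis. Fleiss' kappa also treats ratings as unordered categories, which may not match the scale used there.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list with one label per rater.

    Assumption: every item is rated by the same number of raters (>= 2).
    """
    if not ratings:
        raise ValueError("need at least one rated item")
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # counts[i][c] = number of raters who put item i into category c
    counts = [Counter(item) for item in ratings]
    categories = set().union(*counts)

    # Observed agreement: share of agreeing rater pairs, averaged over items
    p_items = [
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_observed = sum(p_items) / n_items

    # Agreement expected by chance from the overall category proportions
    total = n_items * n_raters
    p_expected = sum(
        (sum(cnt[cat] for cnt in counts) / total) ** 2 for cat in categories
    )
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical data: 5 heuristic items, each rated by 4 evaluators (scale 1-3)
    sample = [
        [1, 1, 1, 2],
        [3, 3, 3, 3],
        [2, 2, 1, 2],
        [1, 1, 1, 1],
        [2, 3, 3, 3],
    ]
    print(f"Fleiss' kappa: {fleiss_kappa(sample):.3f}")
```

A value near 1 indicates that the evaluators rate almost identically, a value near 0 indicates agreement no better than chance, and negative values indicate agreement below chance.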

Cite this

Deters, Hannah Luca. Criteria and Metrics for the Explainability of Software.
Hannover, 2022. 114 p.

@mastersthesis{bc2a1123353b435a9e6259b640015c3e,
title = "Criteria and Metrics for the Explainability of Software",
abstract = "In this master thesis, a concept for the evaluation of explainability in software systems was developed. For this purpose, a comprehensive literature review was conducted in which 86 relevant papers were obtained from an initial set of 1025 papers. These papers contributed to the conceptualization of the evaluation method. During this conceptualization, it was found that the characteristics of explainability are strongly linked to the objective that the explanations are supposed to achieve. It became clear that it is not possible to achieve a satisfactory result if the evaluation of explainability does not take these objectives into account. What has also been noticed is that the literature already provides methods for evaluating single aspects of explainability, but these consist almost exclusively of user studies. Since conducting multiple user studies would be unrealistically expensive in non-research settings, heuristics were developed to provide a first estimate of explainability. Overall, an overarching concept was developed that links the definition of objectives, the initial assessment with heuristics, and the more reliable evaluation with user studies. In the second part of the master{\textquoteright}s thesis, a user study was conducted to evaluate whether the developed heuristics produce reliable results. For this purpose, the interrater agreement was examined to see whether the heuristics allow uniform ratings. It was found that a group of evaluators together can produce a uniform result. Significance tests were then used to determine whether the heuristics are capable of identifying significant differences in the explainability of two systems. It was found that significant differences were revealed within the different aspects of explainability.",
author = "Deters, {Hannah Luca}",
year = "2022",
month = sep,
day = "28",
language = "English",
school = "Leibniz University Hannover",

}

TY - GEN

T1 - Criteria and Metrics for the Explainability of Software

AU - Deters, Hannah Luca

PY - 2022/9/28

Y1 - 2022/9/28

AB - In this master thesis, a concept for the evaluation of explainability in software systems was developed. For this purpose, a comprehensive literature review was conducted in which 86 relevant papers were obtained from an initial set of 1025 papers. These papers contributed to the conceptualization of the evaluation method. During this conceptualization, it was found that the characteristics of explainability are strongly linked to the objective that the explanations are supposed to achieve. It became clear that it is not possible to achieve a satisfactory result if the evaluation of explainability does not take these objectives into account. What has also been noticed is that the literature already provides methods for evaluating single aspects of explainability, but these consist almost exclusively of user studies. Since conducting multiple user studies would be unrealistically expensive in non-research settings, heuristics were developed to provide a first estimate of explainability. Overall, an overarching concept was developed that links the definition of objectives, the initial assessment with heuristics, and the more reliable evaluation with user studies. In the second part of the master’s thesis, a user study was conducted to evaluate whether the developed heuristics produce reliable results. For this purpose, the interrater agreement was examined to see whether the heuristics allow uniform ratings. It was found that a group of evaluators together can produce a uniform result. Significance tests were then used to determine whether the heuristics are capable of identifying significant differences in the explainability of two systems. It was found that significant differences were revealed within the different aspects of explainability.

M3 - Master's thesis

CY - Hannover

ER -
