SemMatch: Semantics-Aware Matching for Causal Inference over Knowledge Graphs

Hao Huang; Maria Esther Vidal

doi:10.1007/978-981-96-0567-5_33

Details

Original language	English
Title of host publication	Web Information Systems Engineering
Subtitle of host publication	WISE 2024 - 25th International Conference, Proceedings
Editors	Mahmoud Barhamgi, Hua Wang, Xin Wang
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	467-483
Number of pages	17
ISBN (electronic)	978-981-96-0567-5
ISBN (print)	9789819605668
Publication status	Published - 2025
Event	25th International Conference on Web Information Systems Engineering, WISE 2024 - Doha, Qatar
Duration	2 Dec 2024 → 5 Dec 2024

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	15437 LNCS
ISSN (Print)	0302-9743
ISSN (electronic)	1611-3349

Abstract

Causal inference is used in various domains such as healthcare, economics, and political science to infer causal effects from observational data where each unit (entity) has different properties. Existing approaches often assume data completeness, and thus exclude all units with incomplete data when performing causal inference, which can lead to inaccurate causal estimates. In addition, existing approaches follow the Closed World Assumption, where facts not present in the database are assumed to be false, limiting the ability to reason under data incompleteness. Knowledge graphs (KGs) are data structures that represent data in semi-structured formats and model the meaning of data via ontologies. We propose a method, SemMatch, based on KGs to enhance causal inference under a data incompleteness assumption. SemMatch relies on a semantic reasoning process, specified by a set of logical rules over KGs, to infer implicit facts and partially address data incompleteness. Then, SemMatch applies machine learning methods to estimate the importance of properties. Finally, SemMatch employs causal estimation methods that consider property importance, facilitating causal reasoning across units with incomplete data to determine the causal effect. We evaluate SemMatch on synthetic datasets and demonstrate that it achieves a lower mean absolute error (MAE) and a lower square root of the precision in estimation of heterogeneous effect (PEHE) in causal effect estimation than existing state-of-the-art methods. The observed results suggest that accounting for semantic reasoning and including units with incomplete data improves causal estimation accuracy.
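The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes per-unit true and estimated treatment effects are available (as is typical for synthetic data) and uses one common convention for each metric; the paper's exact definitions may differ. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def mean_absolute_error(true_effects, estimated_effects):
    """Mean absolute gap between true and estimated per-unit effects.

    Assumption: MAE is taken over per-unit effects; some works instead
    report the absolute error of the average treatment effect.
    """
    if len(true_effects) != len(estimated_effects) or not true_effects:
        raise ValueError("inputs must be non-empty and of equal length")
    gaps = [abs(t - e) for t, e in zip(true_effects, estimated_effects)]
    return sum(gaps) / len(gaps)


def sqrt_pehe(true_effects, estimated_effects):
    """Square root of PEHE: root mean squared error of per-unit effects."""
    if len(true_effects) != len(estimated_effects) or not true_effects:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - e) ** 2 for t, e in zip(true_effects, estimated_effects)]
    return math.sqrt(sum(squared) / len(squared))


# Toy values for illustration only (not results from the paper).
true_ite = [1.0, 2.0, 3.0, 4.0]
estimated_ite = [1.5, 2.0, 2.0, 4.5]

mae_value = mean_absolute_error(true_ite, estimated_ite)
pehe_value = sqrt_pehe(true_ite, estimated_ite)
print(mae_value, pehe_value)
```

Lower values of both metrics indicate estimates closer to the true effects; the square root of PEHE penalises large per-unit errors more heavily than MAE does.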

Keywords

Causal Inference, Knowledge Graph, Matching, Semantics

ASJC Scopus subject areas

Mathematics(all)
Theoretical Computer Science
Computer Science(all)
General Computer Science

Cite this

SemMatch: Semantics-Aware Matching for Causal Inference over Knowledge Graphs. / Huang, Hao; Vidal, Maria Esther.
Web Information Systems Engineering : WISE 2024 - 25th International Conference, Proceedings. ed. / Mahmoud Barhamgi; Hua Wang; Xin Wang. Springer Science and Business Media Deutschland GmbH, 2025. p. 467-483 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 15437 LNCS).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Huang, H & Vidal, ME 2025, SemMatch: Semantics-Aware Matching for Causal Inference over Knowledge Graphs. in M Barhamgi, H Wang & X Wang (eds), Web Information Systems Engineering : WISE 2024 - 25th International Conference, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 15437 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 467-483, 25th International Conference on Web Information Systems Engineering, WISE 2024, Doha, Qatar, 2 Dec 2024. https://doi.org/10.1007/978-981-96-0567-5_33

Huang, H., & Vidal, M. E. (2025). SemMatch: Semantics-Aware Matching for Causal Inference over Knowledge Graphs. In M. Barhamgi, H. Wang, & X. Wang (Eds.), Web Information Systems Engineering : WISE 2024 - 25th International Conference, Proceedings (pp. 467-483). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 15437 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-96-0567-5_33

Huang H, Vidal ME. SemMatch: Semantics-Aware Matching for Causal Inference over Knowledge Graphs. In Barhamgi M, Wang H, Wang X, editors, Web Information Systems Engineering : WISE 2024 - 25th International Conference, Proceedings. Springer Science and Business Media Deutschland GmbH. 2025. p. 467-483. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). Epub 2024 Dec 3. doi: 10.1007/978-981-96-0567-5_33

Huang, Hao ; Vidal, Maria Esther. / SemMatch : Semantics-Aware Matching for Causal Inference over Knowledge Graphs. Web Information Systems Engineering : WISE 2024 - 25th International Conference, Proceedings. editor / Mahmoud Barhamgi ; Hua Wang ; Xin Wang. Springer Science and Business Media Deutschland GmbH, 2025. pp. 467-483 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{2c12fa9a23c74b44a5f62f16f64d1ebc,

title = "SemMatch: Semantics-Aware Matching for Causal Inference over Knowledge Graphs",

abstract = "Causal inference is used in various domains such as healthcare, economics, and political science to infer causal effects from observational data where each unit (entity) has different properties. Existing approaches often assume data completeness, and thus exclude all units with incomplete data when performing causal inference, which can lead to inaccurate causal estimates. In addition, existing approaches follow the Closed World Assumption, where facts not present in the database are assumed to be false, limiting the ability to reason under data incompleteness. Knowledge graphs (KGs) are data structures that represent data in semi-structured formats and model the meaning of data via ontologies. We propose a method, SemMatch, based on KGs to enhance causal inference under a data incompleteness assumption. SemMatch relies on a semantic reasoning process, specified by a set of logical rules over KGs, to infer implicit facts and partially address data incompleteness. Then, SemMatch applies machine learning methods to estimate the importance of properties. Finally, SemMatch employs causal estimation methods that consider property importance, facilitating causal reasoning across units with incomplete data to determine the causal effect. We evaluate SemMatch on synthetic datasets and demonstrate that it achieves a lower mean absolute error (MAE) and a lower square root of the precision in estimation of heterogeneous effect (PEHE) in causal effect estimation than existing state-of-the-art methods. The observed results suggest that accounting for semantic reasoning and including units with incomplete data improves causal estimation accuracy.",

keywords = "Causal Inference, Knowledge Graph, Matching, Semantics",

author = "Hao Huang and Vidal, {Maria Esther}",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.; 25th International Conference on Web Information Systems Engineering, WISE 2024 ; Conference date: 02-12-2024 Through 05-12-2024",

year = "2025",

doi = "10.1007/978-981-96-0567-5_33",

language = "English",

isbn = "9789819605668",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "467--483",

editor = "Mahmoud Barhamgi and Hua Wang and Xin Wang",

booktitle = "Web Information Systems Engineering",

address = "Germany",

}

TY - GEN

T1 - SemMatch: Semantics-Aware Matching for Causal Inference over Knowledge Graphs

T2 - 25th International Conference on Web Information Systems Engineering, WISE 2024

AU - Huang, Hao

AU - Vidal, Maria Esther

N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

PY - 2025

Y1 - 2025

AB - Causal inference is used in various domains such as healthcare, economics, and political science to infer causal effects from observational data where each unit (entity) has different properties. Existing approaches often assume data completeness, and thus exclude all units with incomplete data when performing causal inference, which can lead to inaccurate causal estimates. In addition, existing approaches follow the Closed World Assumption, where facts not present in the database are assumed to be false, limiting the ability to reason under data incompleteness. Knowledge graphs (KGs) are data structures that represent data in semi-structured formats and model the meaning of data via ontologies. We propose a method, SemMatch, based on KGs to enhance causal inference under a data incompleteness assumption. SemMatch relies on a semantic reasoning process, specified by a set of logical rules over KGs, to infer implicit facts and partially address data incompleteness. Then, SemMatch applies machine learning methods to estimate the importance of properties. Finally, SemMatch employs causal estimation methods that consider property importance, facilitating causal reasoning across units with incomplete data to determine the causal effect. We evaluate SemMatch on synthetic datasets and demonstrate that it achieves a lower mean absolute error (MAE) and a lower square root of the precision in estimation of heterogeneous effect (PEHE) in causal effect estimation than existing state-of-the-art methods. The observed results suggest that accounting for semantic reasoning and including units with incomplete data improves causal estimation accuracy.

KW - Causal Inference

KW - Knowledge Graph

KW - Matching

KW - Semantics

UR - http://www.scopus.com/inward/record.url?scp=85211921518&partnerID=8YFLogxK

U2 - 10.1007/978-981-96-0567-5_33

DO - 10.1007/978-981-96-0567-5_33

M3 - Conference contribution

AN - SCOPUS:85211921518

SN - 9789819605668

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 467

EP - 483

BT - Web Information Systems Engineering

A2 - Barhamgi, Mahmoud

A2 - Wang, Hua

A2 - Wang, Xin

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 2 December 2024 through 5 December 2024

ER -
