Entity Matching Across Small Networks Using Node Attributes

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed

Authors

  • Zahra Ahmadi
  • Zijian Zhang
  • Hoang H. Nguyen
  • Sergio Burdisso
  • Daniel Kudenko

Research Organisations

External Research Organisations

  • IDIAP Research Institute
  • HENSOLDT Analytics
  • Phonexia

Details

Original language: English
Title of host publication: ECAI 2024 - 27th European Conference on Artificial Intelligence, Including 13th Conference on Prestigious Applications of Intelligent Systems, PAIS 2024
Editors: Ulle Endriss, Francisco S. Melo, Kerstin Bach, Alberto Bugarin-Diz, Jose M. Alonso-Moral, Senen Barro, Fredrik Heintz
Pages: 4602-4609
Number of pages: 8
ISBN (electronic): 9781643685489
Publication status: Published - 19 Oct 2024
Event: 27th European Conference on Artificial Intelligence, ECAI 2024 - Santiago de Compostela, Spain
Duration: 19 Oct 2024 to 24 Oct 2024

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 392
ISSN (Print): 0922-6389
ISSN (electronic): 1879-8314

Abstract

Entity matching, also known as user identity linkage, is a critical task in data integration. While established techniques primarily focus on large-scale networks, several applications involve small networks, which pose challenges due to limited training data and sparsity. This study addresses entity matching in the field of criminology, where small networks are common and the number of known matching nodes is limited. To support this research, we exploit a multimodal dataset, collected as part of a security-related project, consisting of a network of intercepted telephone calls (i.e., ROXSD data) and a network of social forum interactions (i.e., ROXHOOD data) collected in a simulated environment, although following a real investigation scenario. To improve accuracy and efficiency, we propose a novel approach for entity matching across these two small networks using node attributes. Existing techniques often focus solely on topological consistency between two networks and overlook valuable information, such as node attributes, which makes them vulnerable to structural changes. Inspired by the remarkable success of deep learning, we present UGC-DeepLink, an end-to-end semi-supervised learning framework that leverages user-generated content. UGC-DeepLink encodes network nodes into vector representations, capturing both local and global network structures to align anchor nodes using deep neural networks. A dual learning paradigm and the policy gradient method transfer knowledge and update the linkage. Additionally, node attributes, such as call contents and texts exchanged on the forum, enhance the ranking of matching nodes. Experimental results on ROXSD and ROXHOOD demonstrate that UGC-DeepLink surpasses baselines and state-of-the-art methods in terms of identity-match ranking. The code and dataset are available at https://github.com/erichoang/UGC-DeepLink.
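
As a rough illustration of how node attributes can refine a structure-based ranking of candidate matches, the following sketch combines a similarity over node embeddings with a similarity over node texts. It is not the UGC-DeepLink implementation (the repository linked above holds that); the function names, the bag-of-words text similarity, the mixing weight `alpha`, and the toy embeddings and texts are all assumptions made for illustration only.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def text_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words term counts (a simple stand-in
    for whatever text representation a real system would use)."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(source, candidates, alpha=0.5):
    """Rank candidate nodes for one source node.

    source:     (embedding, text) of a node in the first network
    candidates: dict mapping node name -> (embedding, text) in the second network
    alpha:      hypothetical weight on structural similarity; (1 - alpha)
                goes to the attribute (text) similarity
    """
    src_vec, src_text = source
    scored = []
    for name, (vec, text) in candidates.items():
        score = alpha * cosine(src_vec, vec) + (1 - alpha) * text_similarity(src_text, text)
        scored.append((name, score))
    # Highest combined score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (invented): one caller node and two forum nodes.
source = ([1.0, 0.0], "meet at the harbour tonight")
candidates = {
    "forum_user_a": ([0.9, 0.1], "see you at the harbour tonight"),
    "forum_user_b": ([1.0, 0.0], "selling a used bicycle"),
}

ranking = rank_candidates(source, candidates, alpha=0.5)
structure_only = rank_candidates(source, candidates, alpha=1.0)
```

In this toy case the structure-only ranking (`alpha=1.0`) prefers `forum_user_b`, whose embedding is identical to the source, while the combined ranking prefers `forum_user_a` because its text overlaps with the source's. This mirrors, in a very simplified form, the abstract's point that attributes can correct a ranking driven by topology alone.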

Cite this

Entity Matching Across Small Networks Using Node Attributes. / Ahmadi, Zahra; Zhang, Zijian; Nguyen, Hoang H. et al.
ECAI 2024 - 27th European Conference on Artificial Intelligence, Including 13th Conference on Prestigious Applications of Intelligent Systems, PAIS 2024. ed. / Ulle Endriss; Francisco S. Melo; Kerstin Bach; Alberto Bugarin-Diz; Jose M. Alonso-Moral; Senen Barro; Fredrik Heintz. 2024. p. 4602-4609 (Frontiers in Artificial Intelligence and Applications; Vol. 392).

Ahmadi, Z, Zhang, Z, Nguyen, HH, Burdisso, S, Madikeri, S, Motlicek, P, Dikici, E, Backfried, G, Kovac, M, Maly, K & Kudenko, D 2024, Entity Matching Across Small Networks Using Node Attributes. in U Endriss, FS Melo, K Bach, A Bugarin-Diz, JM Alonso-Moral, S Barro & F Heintz (eds), ECAI 2024 - 27th European Conference on Artificial Intelligence, Including 13th Conference on Prestigious Applications of Intelligent Systems, PAIS 2024. Frontiers in Artificial Intelligence and Applications, vol. 392, pp. 4602-4609, 27th European Conference on Artificial Intelligence, ECAI 2024, Santiago de Compostela, Spain, 19 Oct 2024. https://doi.org/10.3233/FAIA241054
Ahmadi, Z., Zhang, Z., Nguyen, H. H., Burdisso, S., Madikeri, S., Motlicek, P., Dikici, E., Backfried, G., Kovac, M., Maly, K., & Kudenko, D. (2024). Entity Matching Across Small Networks Using Node Attributes. In U. Endriss, F. S. Melo, K. Bach, A. Bugarin-Diz, J. M. Alonso-Moral, S. Barro, & F. Heintz (Eds.), ECAI 2024 - 27th European Conference on Artificial Intelligence, Including 13th Conference on Prestigious Applications of Intelligent Systems, PAIS 2024 (pp. 4602-4609). (Frontiers in Artificial Intelligence and Applications; Vol. 392). https://doi.org/10.3233/FAIA241054
Ahmadi Z, Zhang Z, Nguyen HH, Burdisso S, Madikeri S, Motlicek P et al. Entity Matching Across Small Networks Using Node Attributes. In Endriss U, Melo FS, Bach K, Bugarin-Diz A, Alonso-Moral JM, Barro S, Heintz F, editors, ECAI 2024 - 27th European Conference on Artificial Intelligence, Including 13th Conference on Prestigious Applications of Intelligent Systems, PAIS 2024. 2024. p. 4602-4609. (Frontiers in Artificial Intelligence and Applications). doi: 10.3233/FAIA241054
Ahmadi, Zahra ; Zhang, Zijian ; Nguyen, Hoang H. et al. / Entity Matching Across Small Networks Using Node Attributes. ECAI 2024 - 27th European Conference on Artificial Intelligence, Including 13th Conference on Prestigious Applications of Intelligent Systems, PAIS 2024. editor / Ulle Endriss ; Francisco S. Melo ; Kerstin Bach ; Alberto Bugarin-Diz ; Jose M. Alonso-Moral ; Senen Barro ; Fredrik Heintz. 2024. pp. 4602-4609 (Frontiers in Artificial Intelligence and Applications).
@inproceedings{58398fe1d32c42cfad07d8c9f760c1c2,
title = "Entity Matching Across Small Networks Using Node Attributes",
abstract = "Entity matching, also known as user identity linkage, is a critical task in data integration. While established techniques primarily focus on large-scale networks, there are several applications where small networks pose challenges due to limited training data and sparsity. This study addresses entity matching in the field of criminology, where small networks are common and the number of known matching nodes is restricted. To support this research, we exploit a multimodal dataset, collected as part of a security-related project, consisting of an intercepted telephone calls network (i.e., ROXSD data) and a network of social forum interactions (i.e., ROXHOOD data) collected in a simulated environment, although following real investigation scenario. To improve accuracy and efficiency, we propose a novel approach for entity matching across these two small networks using node attributes. Existing techniques often merely focus on topology consistency between two networks and overlook valuable information, such as network node attributes, making them vulnerable to structural changes. Inspired by the remarkable success of deep learning, we present UGC-DeepLink, an end-to-end semi-supervised learning framework that leverages user-generated content. UGC-DeepLink encodes network nodes into vector representations, capturing both local and global network structures to align anchor nodes using deep neural networks. A dual learning paradigm and the policy gradient method transfer knowledge and update the linkage. Additionally, node attributes, such as call contents and forum exchanged texts, enhance the ranking of matching nodes. Experimental results on ROXSD and ROXHOOD demonstrate that UGC-DeepLink surpasses baselines and state-of-the-art methods in terms of identity-match ranking. The code and dataset are available at https://github.com/erichoang/UGC-DeepLink.",
author = "Zahra Ahmadi and Zijian Zhang and Nguyen, {Hoang H.} and Sergio Burdisso and Srikanth Madikeri and Petr Motlicek and Erinc Dikici and Gerhard Backfried and Marek Kovac and Kv{\v e}toslav Maly and Daniel Kudenko",
note = "Publisher Copyright: {\textcopyright} 2024 The Authors.; 27th European Conference on Artificial Intelligence, ECAI 2024 ; Conference date: 19-10-2024 Through 24-10-2024",
year = "2024",
month = oct,
day = "19",
doi = "10.3233/FAIA241054",
language = "English",
series = "Frontiers in Artificial Intelligence and Applications",
pages = "4602--4609",
editor = "Ulle Endriss and Melo, {Francisco S.} and Kerstin Bach and Alberto Bugarin-Diz and Alonso-Moral, {Jose M.} and Senen Barro and Fredrik Heintz",
booktitle = "ECAI 2024 - 27th European Conference on Artificial Intelligence, Including 13th Conference on Prestigious Applications of Intelligent Systems, PAIS 2024",

}

TY - GEN

T1 - Entity Matching Across Small Networks Using Node Attributes

AU - Ahmadi, Zahra

AU - Zhang, Zijian

AU - Nguyen, Hoang H.

AU - Burdisso, Sergio

AU - Madikeri, Srikanth

AU - Motlicek, Petr

AU - Dikici, Erinc

AU - Backfried, Gerhard

AU - Kovac, Marek

AU - Maly, Květoslav

AU - Kudenko, Daniel

N1 - Publisher Copyright: © 2024 The Authors.

PY - 2024/10/19

Y1 - 2024/10/19

AB - Entity matching, also known as user identity linkage, is a critical task in data integration. While established techniques primarily focus on large-scale networks, there are several applications where small networks pose challenges due to limited training data and sparsity. This study addresses entity matching in the field of criminology, where small networks are common and the number of known matching nodes is restricted. To support this research, we exploit a multimodal dataset, collected as part of a security-related project, consisting of an intercepted telephone calls network (i.e., ROXSD data) and a network of social forum interactions (i.e., ROXHOOD data) collected in a simulated environment, although following real investigation scenario. To improve accuracy and efficiency, we propose a novel approach for entity matching across these two small networks using node attributes. Existing techniques often merely focus on topology consistency between two networks and overlook valuable information, such as network node attributes, making them vulnerable to structural changes. Inspired by the remarkable success of deep learning, we present UGC-DeepLink, an end-to-end semi-supervised learning framework that leverages user-generated content. UGC-DeepLink encodes network nodes into vector representations, capturing both local and global network structures to align anchor nodes using deep neural networks. A dual learning paradigm and the policy gradient method transfer knowledge and update the linkage. Additionally, node attributes, such as call contents and forum exchanged texts, enhance the ranking of matching nodes. Experimental results on ROXSD and ROXHOOD demonstrate that UGC-DeepLink surpasses baselines and state-of-the-art methods in terms of identity-match ranking. The code and dataset are available at https://github.com/erichoang/UGC-DeepLink.

UR - http://www.scopus.com/inward/record.url?scp=85216652739&partnerID=8YFLogxK

U2 - 10.3233/FAIA241054

DO - 10.3233/FAIA241054

M3 - Conference contribution

AN - SCOPUS:85216652739

T3 - Frontiers in Artificial Intelligence and Applications

SP - 4602

EP - 4609

BT - ECAI 2024 - 27th European Conference on Artificial Intelligence, Including 13th Conference on Prestigious Applications of Intelligent Systems, PAIS 2024

A2 - Endriss, Ulle

A2 - Melo, Francisco S.

A2 - Bach, Kerstin

A2 - Bugarin-Diz, Alberto

A2 - Alonso-Moral, Jose M.

A2 - Barro, Senen

A2 - Heintz, Fredrik

T2 - 27th European Conference on Artificial Intelligence, ECAI 2024

Y2 - 19 October 2024 through 24 October 2024

ER -