A Comparative Study for Unsupervised Network Representation Learning

Megha Khosla; Vinay Setty; Avishek Anand

doi:10.1109/TKDE.2019.2951398

Details

Original language	English
Pages (from - to)	1807-1818
Number of pages	12
Journal	IEEE Transactions on Knowledge and Data Engineering
Volume	33
Issue number	5
Early online date	4 Nov 2019
Publication status	Published - 1 May 2021

Abstract

There has been significant recent progress in unsupervised network representation learning (UNRL) approaches over graphs, with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most UNRL approaches model and exploit the neighborhood, or what we call the context information, of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random-walk based, matrix factorization based, and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences, and later use these differences to explain their performance on downstream tasks. We conduct a large-scale empirical study considering nine popular and recent UNRL techniques, 11 real-world datasets with varying structural properties, and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner, and that the choice of a suitable method is dictated by certain properties of the embedding methods, the task, and the structural properties of the underlying graph. In addition, we report common pitfalls in the evaluation of UNRL methods and offer suggestions for experimental design and interpretation of results.

ASJC Scopus subject areas

Computer Science (all)
Information Systems
Computer Science (all)
Computer Science Applications
Computer Science (all)
Computational Theory and Mathematics

Cite this

A Comparative Study for Unsupervised Network Representation Learning. / Khosla, Megha; Setty, Vinay; Anand, Avishek.
In: IEEE Transactions on Knowledge and Data Engineering, Vol. 33, No. 5, 1 May 2021, pp. 1807-1818.

Research output: Contribution to journal › Article › Research › Peer-reviewed

Khosla, M, Setty, V & Anand, A 2021, 'A Comparative Study for Unsupervised Network Representation Learning', IEEE Transactions on Knowledge and Data Engineering, vol. 33, no. 5, pp. 1807-1818. https://doi.org/10.1109/TKDE.2019.2951398

Khosla, M., Setty, V., & Anand, A. (2021). A Comparative Study for Unsupervised Network Representation Learning. IEEE Transactions on Knowledge and Data Engineering, 33(5), 1807-1818. https://doi.org/10.1109/TKDE.2019.2951398

Khosla M, Setty V, Anand A. A Comparative Study for Unsupervised Network Representation Learning. IEEE Transactions on Knowledge and Data Engineering. 2021 May 1;33(5):1807-1818. Epub 2019 Nov 4. doi: 10.1109/TKDE.2019.2951398

Khosla, Megha ; Setty, Vinay ; Anand, Avishek. / A Comparative Study for Unsupervised Network Representation Learning. In: IEEE Transactions on Knowledge and Data Engineering. 2021 ; Vol. 33, No. 5. pp. 1807-1818.

Download

@article{c3ef0cca7d684a759499dfb3e53ecd25,

title = "A Comparative Study for Unsupervised Network Representation Learning",

abstract = "There has been significant recent progress in unsupervised network representation learning (UNRL) approaches over graphs, with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most UNRL approaches model and exploit the neighborhood, or what we call the context information, of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random-walk based, matrix factorization based, and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences, and later use these differences to explain their performance on downstream tasks. We conduct a large-scale empirical study considering nine popular and recent UNRL techniques, 11 real-world datasets with varying structural properties, and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner, and that the choice of a suitable method is dictated by certain properties of the embedding methods, the task, and the structural properties of the underlying graph. In addition, we report common pitfalls in the evaluation of UNRL methods and offer suggestions for experimental design and interpretation of results.",

author = "Megha Khosla and Vinay Setty and Avishek Anand",

note = "Funding Information: This work is partially funded by SoBigData (European Union's Horizon 2020 Research and Innovation Programme under Grant agreement No. 654024).",

year = "2021",

month = may,

day = "1",

doi = "10.1109/TKDE.2019.2951398",

language = "English",

volume = "33",

pages = "1807--1818",

journal = "IEEE Transactions on Knowledge and Data Engineering",

issn = "1041-4347",

publisher = "IEEE Computer Society",

number = "5",

}

Download

TY - JOUR

T1 - A Comparative Study for Unsupervised Network Representation Learning

AU - Khosla, Megha

AU - Setty, Vinay

AU - Anand, Avishek

N1 - Funding Information: This work is partially funded by SoBigData (European Union's Horizon 2020 Research and Innovation Programme under Grant agreement No. 654024).

PY - 2021/5/1

Y1 - 2021/5/1

N2 - There has been significant recent progress in unsupervised network representation learning (UNRL) approaches over graphs, with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most UNRL approaches model and exploit the neighborhood, or what we call the context information, of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random-walk based, matrix factorization based, and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences, and later use these differences to explain their performance on downstream tasks. We conduct a large-scale empirical study considering nine popular and recent UNRL techniques, 11 real-world datasets with varying structural properties, and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner, and that the choice of a suitable method is dictated by certain properties of the embedding methods, the task, and the structural properties of the underlying graph. In addition, we report common pitfalls in the evaluation of UNRL methods and offer suggestions for experimental design and interpretation of results.

AB - There has been significant recent progress in unsupervised network representation learning (UNRL) approaches over graphs, with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most UNRL approaches model and exploit the neighborhood, or what we call the context information, of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random-walk based, matrix factorization based, and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences, and later use these differences to explain their performance on downstream tasks. We conduct a large-scale empirical study considering nine popular and recent UNRL techniques, 11 real-world datasets with varying structural properties, and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner, and that the choice of a suitable method is dictated by certain properties of the embedding methods, the task, and the structural properties of the underlying graph. In addition, we report common pitfalls in the evaluation of UNRL methods and offer suggestions for experimental design and interpretation of results.

UR - http://www.scopus.com/inward/record.url?scp=85104043134&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2019.2951398

DO - 10.1109/TKDE.2019.2951398

M3 - Article

AN - SCOPUS:85104043134

VL - 33

SP - 1807

EP - 1818

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 5

ER -
