A Comparative Study for Unsupervised Network Representation Learning

Megha Khosla; Vinay Setty; Avishek Anand

doi:10.1109/TKDE.2019.2951398

Details

Original language	English
Pages (from-to)	1807-1818
Number of pages	12
Journal	IEEE Transactions on Knowledge and Data Engineering
Volume	33
Issue number	5
Early online date	4 Nov 2019
Publication status	Published - 1 May 2021

Abstract

There has been significant progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most of the UNRL approaches either model and exploit neighborhood or what we call context information of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random walk based, matrix factorization, and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study their differences which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering nine popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition, we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.

ASJC Scopus subject areas

Computer Science (all): Information Systems
Computer Science (all): Computer Science Applications
Computer Science (all): Computational Theory and Mathematics

Cite this

A Comparative Study for Unsupervised Network Representation Learning. / Khosla, Megha; Setty, Vinay; Anand, Avishek.
In: IEEE Transactions on Knowledge and Data Engineering, Vol. 33, No. 5, 01.05.2021, p. 1807-1818.

Research output: Contribution to journal › Article › Research › peer review

Khosla, M, Setty, V & Anand, A 2021, 'A Comparative Study for Unsupervised Network Representation Learning', IEEE Transactions on Knowledge and Data Engineering, vol. 33, no. 5, pp. 1807-1818. https://doi.org/10.1109/TKDE.2019.2951398

Khosla, M., Setty, V., & Anand, A. (2021). A Comparative Study for Unsupervised Network Representation Learning. IEEE Transactions on Knowledge and Data Engineering, 33(5), 1807-1818. https://doi.org/10.1109/TKDE.2019.2951398

Khosla M, Setty V, Anand A. A Comparative Study for Unsupervised Network Representation Learning. IEEE Transactions on Knowledge and Data Engineering. 2021 May 1;33(5):1807-1818. Epub 2019 Nov 4. doi: 10.1109/TKDE.2019.2951398

Khosla, Megha ; Setty, Vinay ; Anand, Avishek. / A Comparative Study for Unsupervised Network Representation Learning. In: IEEE Transactions on Knowledge and Data Engineering. 2021 ; Vol. 33, No. 5. pp. 1807-1818.

@article{c3ef0cca7d684a759499dfb3e53ecd25,

title = "A Comparative Study for Unsupervised Network Representation Learning",

abstract = "There has been significant progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most of the UNRL approaches either model and exploit neighborhood or what we call context information of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random walk based, matrix factorization, and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study their differences which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering nine popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition, we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.",

author = "Megha Khosla and Vinay Setty and Avishek Anand",

note = "Funding Information: This work is partially funded by the SoBigData (European Union's Horizon 2020 Research and Innovation Programme under Grant agreement No. 654024).",

year = "2021",

month = may,

day = "1",

doi = "10.1109/TKDE.2019.2951398",

language = "English",

volume = "33",

pages = "1807--1818",

journal = "IEEE Transactions on Knowledge and Data Engineering",

issn = "1041-4347",

publisher = "IEEE Computer Society",

number = "5",

}

TY - JOUR

T1 - A Comparative Study for Unsupervised Network Representation Learning

AU - Khosla, Megha

AU - Setty, Vinay

AU - Anand, Avishek

N1 - Funding Information: This work is partially funded by the SoBigData (European Union's Horizon 2020 Research and Innovation Programme under Grant agreement No. 654024).

PY - 2021/5/1

Y1 - 2021/5/1

N2 - There has been significant progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most of the UNRL approaches either model and exploit neighborhood or what we call context information of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random walk based, matrix factorization, and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study their differences which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering nine popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition, we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.

AB - There has been significant progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most of the UNRL approaches either model and exploit neighborhood or what we call context information of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random walk based, matrix factorization, and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study their differences which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering nine popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition, we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.

UR - http://www.scopus.com/inward/record.url?scp=85104043134&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2019.2951398

DO - 10.1109/TKDE.2019.2951398

M3 - Article

AN - SCOPUS:85104043134

VL - 33

SP - 1807

EP - 1818

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 5

ER -
