A Comparative Study for Unsupervised Network Representation Learning

Research output: Contribution to journal > Article > Research > peer review

Authors

  • Megha Khosla
  • Vinay Setty
  • Avishek Anand

Research Organisations

External Research Organisations

  • University of Stavanger

Details

Original language: English
Pages (from-to): 1807-1818
Number of pages: 12
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 33
Issue number: 5
Early online date: 4 Nov 2019
Publication status: Published - 1 May 2021

Abstract

There has been significant progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most of the UNRL approaches either model and exploit neighborhood or what we call context information of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random walk based, matrix factorization and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study their differences which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering nine popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition, we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.

Cite this

A Comparative Study for Unsupervised Network Representation Learning. / Khosla, Megha; Setty, Vinay; Anand, Avishek.
In: IEEE Transactions on Knowledge and Data Engineering, Vol. 33, No. 5, 01.05.2021, p. 1807-1818.

Khosla M, Setty V, Anand A. A Comparative Study for Unsupervised Network Representation Learning. IEEE Transactions on Knowledge and Data Engineering. 2021 May 1;33(5):1807-1818. Epub 2019 Nov 4. doi: 10.1109/TKDE.2019.2951398
Khosla, Megha ; Setty, Vinay ; Anand, Avishek. / A Comparative Study for Unsupervised Network Representation Learning. In: IEEE Transactions on Knowledge and Data Engineering. 2021 ; Vol. 33, No. 5. pp. 1807-1818.
@article{c3ef0cca7d684a759499dfb3e53ecd25,
title = "A Comparative Study for Unsupervised Network Representation Learning",
abstract = "There has been significant progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most of the UNRL approaches either model and exploit neighborhood or what we call context information of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random walk based, matrix factorization and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study their differences which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering nine popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition, we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.",
author = "Megha Khosla and Vinay Setty and Avishek Anand",
note = "Funding Information: This work is partially funded by the SoBigData (European Union's Horizon 2020 Research and Innovation Programme under Grant agreement No. 654024).",
year = "2021",
month = may,
day = "1",
doi = "10.1109/TKDE.2019.2951398",
language = "English",
volume = "33",
pages = "1807--1818",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "5",

}

TY - JOUR

T1 - A Comparative Study for Unsupervised Network Representation Learning

AU - Khosla, Megha

AU - Setty, Vinay

AU - Anand, Avishek

N1 - Funding Information: This work is partially funded by the SoBigData (European Union's Horizon 2020 Research and Innovation Programme under Grant agreement No. 654024).

PY - 2021/5/1

Y1 - 2021/5/1

N2 - There has been significant progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. We argue that most of the UNRL approaches either model and exploit neighborhood or what we call context information of a node. These methods largely differ in their definitions and exploitation of context. Consequently, we propose a framework that casts a variety of approaches (random walk based, matrix factorization and deep learning based) into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study their differences which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering nine popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks: node classification and link prediction. We find that for non-attributed graphs there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition, we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.

UR - http://www.scopus.com/inward/record.url?scp=85104043134&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2019.2951398

DO - 10.1109/TKDE.2019.2951398

M3 - Article

AN - SCOPUS:85104043134

VL - 33

SP - 1807

EP - 1818

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 5

ER -