NODIS: Neural Ordinary Differential Scene Understanding

Research output: Chapter in book/report/conference proceeding, Conference contribution, Research, peer review

Authors

  • Cong Yuren
  • Hanno Ackermann
  • Wentong Liao
  • Michael Ying Yang
  • Bodo Rosenhahn

External Research Organisations

  • University of Twente

Details

Original language: English
Title of host publication: Computer Vision – ECCV 2020
Subtitle of host publication: 16th European Conference Glasgow, UK, August 23–28, 2020 Proceedings, Part XX
Editors: Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm
Pages: 636-653
Number of pages: 18
ISBN (electronic): 978-3-030-58565-5
Publication status: Published - 14 Nov 2020
Event: 16th European Conference on Computer Vision, Glasgow
Duration: 23 Aug 2020 to 28 Aug 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12365 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Semantic image understanding is a challenging topic in computer vision. It requires detecting all objects in an image and identifying all the relations between them. Detected objects, their labels and the discovered relations can be used to construct a scene graph, which provides an abstract semantic interpretation of an image. In previous works, relations were identified by solving an assignment problem formulated as Mixed-Integer Linear Programs. In this work, we interpret that formulation as an Ordinary Differential Equation (ODE). The proposed architecture performs scene graph inference by solving a neural variant of an ODE by end-to-end learning. It achieves state-of-the-art results on all three benchmark tasks: scene graph generation (SGGen), classification (SGCls) and visual relationship detection (PredCls) on the Visual Genome benchmark.
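
As a rough illustration of what "solving a neural variant of an ODE" can mean in general, the sketch below integrates a hidden state h(t) under dh/dt = f_theta(h) with a fixed-step forward Euler solver, where f_theta is a tiny one-layer network. Everything in it is an assumption made for illustration: the names `dynamics` and `odeint_euler`, the tanh layer, the toy weights and the choice of Euler integration are hypothetical and are not taken from the paper, whose network and solver may differ.

```python
import math


def dynamics(h, weights, bias):
    """Hypothetical f_theta: one tanh layer mapping the state h to dh/dt."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, h)) + b)
        for row, b in zip(weights, bias)
    ]


def odeint_euler(h0, weights, bias, t0=0.0, t1=1.0, steps=100):
    """Integrate dh/dt = f_theta(h) from t0 to t1 with fixed-step forward Euler."""
    h = list(h0)
    dt = (t1 - t0) / steps
    for _ in range(steps):
        dh = dynamics(h, weights, bias)
        # Euler update: h(t + dt) is approximated by h(t) + dt * f_theta(h(t)).
        h = [x + dt * d for x, d in zip(h, dh)]
    return h


# Toy parameters, made up for illustration (not learned).
W = [[0.0, 1.0], [-1.0, 0.0]]
b = [0.0, 0.0]
h1 = odeint_euler([1.0, 0.0], W, b)
```

In a learned setting the weights would be fitted end to end by differentiating through the solver; this sketch shows only the forward integration.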

Keywords

    cs.CV, Visual relationship detection, Scene graph, Semantic image understanding

Cite this

NODIS: Neural Ordinary Differential Scene Understanding. / Yuren, Cong; Ackermann, Hanno; Liao, Wentong et al.
Computer Vision – ECCV 2020: 16th European Conference Glasgow, UK, August 23–28, 2020 Proceedings, Part XX. ed. / Andrea Vedaldi; Horst Bischof; Thomas Brox; Jan-Michael Frahm. 2020. p. 636-653 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12365 LNCS).

Yuren, C, Ackermann, H, Liao, W, Yang, MY & Rosenhahn, B 2020, NODIS: Neural Ordinary Differential Scene Understanding. in A Vedaldi, H Bischof, T Brox & J-M Frahm (eds), Computer Vision – ECCV 2020: 16th European Conference Glasgow, UK, August 23–28, 2020 Proceedings, Part XX. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12365 LNCS, pp. 636-653, 16th European Conference on Computer Vision, Glasgow, 23 Aug 2020. https://doi.org/10.1007/978-3-030-58565-5_38
Yuren, C., Ackermann, H., Liao, W., Yang, M. Y., & Rosenhahn, B. (2020). NODIS: Neural Ordinary Differential Scene Understanding. In A. Vedaldi, H. Bischof, T. Brox, & J.-M. Frahm (Eds.), Computer Vision – ECCV 2020: 16th European Conference Glasgow, UK, August 23–28, 2020 Proceedings, Part XX (pp. 636-653). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12365 LNCS). https://doi.org/10.1007/978-3-030-58565-5_38
Yuren C, Ackermann H, Liao W, Yang MY, Rosenhahn B. NODIS: Neural Ordinary Differential Scene Understanding. In Vedaldi A, Bischof H, Brox T, Frahm JM, editors, Computer Vision – ECCV 2020: 16th European Conference Glasgow, UK, August 23–28, 2020 Proceedings, Part XX. 2020. p. 636-653. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). Epub 2020. doi: 10.1007/978-3-030-58565-5_38
Yuren, Cong ; Ackermann, Hanno ; Liao, Wentong et al. / NODIS : Neural Ordinary Differential Scene Understanding. Computer Vision – ECCV 2020: 16th European Conference Glasgow, UK, August 23–28, 2020 Proceedings, Part XX. editor / Andrea Vedaldi ; Horst Bischof ; Thomas Brox ; Jan-Michael Frahm. 2020. pp. 636-653 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{32f6fb4ae2a840b28083f52fb56cbd0b,
title = "NODIS: Neural Ordinary Differential Scene Understanding",
abstract = "Semantic image understanding is a challenging topic in computer vision. It requires detecting all objects in an image and identifying all the relations between them. Detected objects, their labels and the discovered relations can be used to construct a scene graph, which provides an abstract semantic interpretation of an image. In previous works, relations were identified by solving an assignment problem formulated as Mixed-Integer Linear Programs. In this work, we interpret that formulation as an Ordinary Differential Equation (ODE). The proposed architecture performs scene graph inference by solving a neural variant of an ODE by end-to-end learning. It achieves state-of-the-art results on all three benchmark tasks: scene graph generation (SGGen), classification (SGCls) and visual relationship detection (PredCls) on the Visual Genome benchmark.",
keywords = "cs.CV, Visual relationship detection, Scene graph, Semantic image understanding",
author = "Cong Yuren and Hanno Ackermann and Wentong Liao and Yang, {Michael Ying} and Bodo Rosenhahn",
note = "Funding Information: Acknowledgement. This work was partially supported by the DFG grant COVMAP (RO 2497/12-2) and EXC 2122.; 16th European Conference on Computer Vision, ECCV 2020 ; Conference date: 23-08-2020 Through 28-08-2020",
year = "2020",
month = nov,
day = "14",
doi = "10.1007/978-3-030-58565-5_38",
language = "English",
isbn = "978-3-030-58564-8",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "636--653",
editor = "Andrea Vedaldi and Horst Bischof and Thomas Brox and Jan-Michael Frahm",
booktitle = "Computer Vision – ECCV 2020",

}

TY  - GEN
T1  - NODIS
T2  - 16th European Conference on Computer Vision
AU  - Yuren, Cong
AU  - Ackermann, Hanno
AU  - Liao, Wentong
AU  - Yang, Michael Ying
AU  - Rosenhahn, Bodo
N1  - Funding Information: Acknowledgement. This work was partially supported by the DFG grant COVMAP (RO 2497/12-2) and EXC 2122.
PY  - 2020/11/14
Y1  - 2020/11/14
N2  - Semantic image understanding is a challenging topic in computer vision. It requires detecting all objects in an image and identifying all the relations between them. Detected objects, their labels and the discovered relations can be used to construct a scene graph, which provides an abstract semantic interpretation of an image. In previous works, relations were identified by solving an assignment problem formulated as Mixed-Integer Linear Programs. In this work, we interpret that formulation as an Ordinary Differential Equation (ODE). The proposed architecture performs scene graph inference by solving a neural variant of an ODE by end-to-end learning. It achieves state-of-the-art results on all three benchmark tasks: scene graph generation (SGGen), classification (SGCls) and visual relationship detection (PredCls) on the Visual Genome benchmark.
AB  - Semantic image understanding is a challenging topic in computer vision. It requires detecting all objects in an image and identifying all the relations between them. Detected objects, their labels and the discovered relations can be used to construct a scene graph, which provides an abstract semantic interpretation of an image. In previous works, relations were identified by solving an assignment problem formulated as Mixed-Integer Linear Programs. In this work, we interpret that formulation as an Ordinary Differential Equation (ODE). The proposed architecture performs scene graph inference by solving a neural variant of an ODE by end-to-end learning. It achieves state-of-the-art results on all three benchmark tasks: scene graph generation (SGGen), classification (SGCls) and visual relationship detection (PredCls) on the Visual Genome benchmark.
KW  - cs.CV
KW  - Visual relationship detection
KW  - Scene graph
KW  - Semantic image understanding
UR  - http://www.scopus.com/inward/record.url?scp=85097433441&partnerID=8YFLogxK
U2  - 10.1007/978-3-030-58565-5_38
DO  - 10.1007/978-3-030-58565-5_38
M3  - Conference contribution
SN  - 978-3-030-58564-8
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 636
EP  - 653
BT  - Computer Vision – ECCV 2020
A2  - Vedaldi, Andrea
A2  - Bischof, Horst
A2  - Brox, Thomas
A2  - Frahm, Jan-Michael
Y2  - 23 August 2020 through 28 August 2020
ER  -
