Exploiting Attention for Visual Relationship Detection

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Tongxin Hu
  • Wentong Liao
  • Michael Ying Yang
  • Bodo Rosenhahn

Research Organisations

External Research Organisations

  • University of Twente

Details

Original language: English
Title of host publication: Pattern Recognition
Subtitle of host publication: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings
Place of publication: Cham
Publisher: Springer Nature
Pages: 331-344
Number of pages: 14
ISBN (print): 9783030336752
Publication status: Published - 25 Oct 2019
Event: 41st DAGM German Conference on Pattern Recognition, DAGM GCPR 2019 - Dortmund, Germany
Duration: 10 Sept 2019 - 13 Sept 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11824 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Visual relationship detection aims to predict the categories of predicates and object pairs, and also to localize the object pairs. Recognizing the relationships between individual objects is important for describing visual scenes in static images. In this paper, we propose a novel end-to-end framework for the visual relationship detection task. First, we design a spatial attention model for specializing predicate features. Compared to a normal ROI-pooling layer, this structure significantly improves Predicate Classification performance. Second, to extract the relative spatial configuration, we propose mapping simple geometric representations to a high-dimensional space, which boosts relationship detection accuracy. Third, we implement a feature embedding model with a bi-directional RNN that treats subject, predicate and object as a time sequence. We evaluate our method on three tasks. The experiments demonstrate that our method achieves competitive results compared to state-of-the-art methods.
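Two components of the abstract can be sketched in code: lifting a simple geometric representation of a subject-object box pair to a higher dimension, and running a bi-directional RNN over the (subject, predicate, object) "time sequence". The feature choices, the random sinusoidal projection, and the plain tanh RNN below are illustrative assumptions, not the paper's actual (learned, end-to-end) implementation; the box values and feature stand-ins are hypothetical.

```python
import numpy as np

def relative_geometry(subj_box, obj_box):
    """4-d relative spatial configuration of two (x, y, w, h) boxes.
    A common choice of features; the paper's exact representation may differ."""
    sx, sy, sw, sh = subj_box
    ox, oy, ow, oh = obj_box
    return np.array([
        (ox - sx) / sw,   # normalised horizontal offset
        (oy - sy) / sh,   # normalised vertical offset
        np.log(ow / sw),  # log width ratio
        np.log(oh / sh),  # log height ratio
    ])

def lift_to_high_dim(feat, dim=64, rng=None):
    """Map the low-dimensional geometry to `dim` dimensions via a random
    projection plus sinusoids; the paper presumably learns this mapping."""
    rng = np.random.default_rng(0) if rng is None else rng
    W = rng.normal(size=(dim // 2, feat.shape[0]))
    proj = W @ feat
    return np.concatenate([np.sin(proj), np.cos(proj)])

def rnn_pass(seq, Wx, Wh, h0):
    """Plain tanh RNN over a list of feature vectors."""
    h, out = h0, []
    for x in seq:
        h = np.tanh(Wx @ x + Wh @ h)
        out.append(h)
    return out

def birnn_embed(subj, pred, obj, hidden=32, rng=None):
    """Run forward and backward RNN passes over the (subject, predicate,
    object) sequence and concatenate the hidden states per step."""
    rng = np.random.default_rng(1) if rng is None else rng
    d = subj.shape[0]
    Wxf = 0.1 * rng.normal(size=(hidden, d))
    Whf = 0.1 * rng.normal(size=(hidden, hidden))
    Wxb = 0.1 * rng.normal(size=(hidden, d))
    Whb = 0.1 * rng.normal(size=(hidden, hidden))
    seq, h0 = [subj, pred, obj], np.zeros(hidden)
    fwd = rnn_pass(seq, Wxf, Whf, h0)
    bwd = rnn_pass(seq[::-1], Wxb, Whb, h0)[::-1]
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

# Hypothetical (x, y, w, h) boxes for a "person rides skateboard" pair.
subj_box, obj_box = (10, 20, 40, 80), (30, 90, 25, 15)
g = relative_geometry(subj_box, obj_box)     # shape (4,)
geo = lift_to_high_dim(g, dim=64)            # shape (64,)
subj_feat = obj_feat = np.ones(64)           # stand-ins for visual features
emb = birnn_embed(subj_feat, geo, obj_feat)  # 3 vectors of shape (64,)
print(g.shape, geo.shape, len(emb), emb[0].shape)
```

The bi-directional pass lets each element's embedding depend on both its predecessors and successors, so the predicate representation is conditioned on subject and object simultaneously, which is the intuition behind treating the triplet as a sequence.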


Cite this

Exploiting Attention for Visual Relationship Detection. / Hu, Tongxin; Liao, Wentong; Yang, Michael Ying et al.
Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings. Cham: Springer Nature, 2019. p. 331-344 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11824 LNCS).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Hu, T, Liao, W, Yang, MY & Rosenhahn, B 2019, Exploiting Attention for Visual Relationship Detection. in Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11824 LNCS, Springer Nature, Cham, pp. 331-344, 41st DAGM German Conference on Pattern Recognition, DAGM GCPR 2019, Dortmund, Germany, 10 Sept 2019. https://doi.org/10.1007/978-3-030-33676-9_23
Hu, T., Liao, W., Yang, M. Y., & Rosenhahn, B. (2019). Exploiting Attention for Visual Relationship Detection. In Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings (pp. 331-344). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11824 LNCS). Springer Nature. https://doi.org/10.1007/978-3-030-33676-9_23
Hu T, Liao W, Yang MY, Rosenhahn B. Exploiting Attention for Visual Relationship Detection. In Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings. Cham: Springer Nature. 2019. p. 331-344. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-33676-9_23
Hu, Tongxin ; Liao, Wentong ; Yang, Michael Ying et al. / Exploiting Attention for Visual Relationship Detection. Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings. Cham : Springer Nature, 2019. pp. 331-344 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{1e59d3112a694df08db00a2a1c38d7e2,
title = "Exploiting Attention for Visual Relationship Detection",
abstract = "Visual relationship detection targets on predicting categories of predicates and object pairs, and also locating the object pairs. Recognizing the relationships between individual objects is important for describing visual scenes in static images. In this paper, we propose a novel end-to-end framework on the visual relationship detection task. First, we design a spatial attention model for specializing predicate features. Compared to a normal ROI-pooling layer, this structure significantly improves Predicate Classification performance. Second, for extracting relative spatial configuration, we propose to map simple geometric representations to a high dimension, which boosts relationship detection accuracy. Third, we implement a feature embedding model with a bi-directional RNN which considers subject, predicate and object as a time sequence. We evaluate our method on three tasks. The experiments demonstrate that our method achieves competitive results compared to state-of-the-art methods.",
author = "Tongxin Hu and Wentong Liao and Yang, {Michael Ying} and Bodo Rosenhahn",
note = "41st DAGM German Conference on Pattern Recognition, DAGM GCPR 2019 ; Conference date: 10-09-2019 Through 13-09-2019",
year = "2019",
month = oct,
day = "25",
doi = "10.1007/978-3-030-33676-9_23",
language = "English",
isbn = "9783030336752",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Nature",
pages = "331--344",
booktitle = "Pattern Recognition",
address = "United States",

}


TY - GEN

T1 - Exploiting Attention for Visual Relationship Detection

AU - Hu, Tongxin

AU - Liao, Wentong

AU - Yang, Michael Ying

AU - Rosenhahn, Bodo

PY - 2019/10/25

Y1 - 2019/10/25

N2 - Visual relationship detection targets on predicting categories of predicates and object pairs, and also locating the object pairs. Recognizing the relationships between individual objects is important for describing visual scenes in static images. In this paper, we propose a novel end-to-end framework on the visual relationship detection task. First, we design a spatial attention model for specializing predicate features. Compared to a normal ROI-pooling layer, this structure significantly improves Predicate Classification performance. Second, for extracting relative spatial configuration, we propose to map simple geometric representations to a high dimension, which boosts relationship detection accuracy. Third, we implement a feature embedding model with a bi-directional RNN which considers subject, predicate and object as a time sequence. We evaluate our method on three tasks. The experiments demonstrate that our method achieves competitive results compared to state-of-the-art methods.

AB - Visual relationship detection targets on predicting categories of predicates and object pairs, and also locating the object pairs. Recognizing the relationships between individual objects is important for describing visual scenes in static images. In this paper, we propose a novel end-to-end framework on the visual relationship detection task. First, we design a spatial attention model for specializing predicate features. Compared to a normal ROI-pooling layer, this structure significantly improves Predicate Classification performance. Second, for extracting relative spatial configuration, we propose to map simple geometric representations to a high dimension, which boosts relationship detection accuracy. Third, we implement a feature embedding model with a bi-directional RNN which considers subject, predicate and object as a time sequence. We evaluate our method on three tasks. The experiments demonstrate that our method achieves competitive results compared to state-of-the-art methods.

UR - http://www.scopus.com/inward/record.url?scp=85076175014&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-33676-9_23

DO - 10.1007/978-3-030-33676-9_23

M3 - Conference contribution

AN - SCOPUS:85076175014

SN - 9783030336752

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 331

EP - 344

BT - Pattern Recognition

PB - Springer Nature

CY - Cham

T2 - 41st DAGM German Conference on Pattern Recognition, DAGM GCPR 2019

Y2 - 10 September 2019 through 13 September 2019

ER -
