Exploiting Attention for Visual Relationship Detection

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Tongxin Hu
  • Wentong Liao
  • Michael Ying Yang
  • Bodo Rosenhahn

Research Organisations

External Research Organisations

  • University of Twente

Details

Original language: English
Title of host publication: Pattern Recognition
Subtitle of host publication: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings
Place of publication: Cham
Publisher: Springer Nature
Pages: 331-344
Number of pages: 14
ISBN (print): 9783030336752
Publication status: Published - 25 Oct 2019
Event: 41st DAGM German Conference on Pattern Recognition, DAGM GCPR 2019 - Dortmund, Germany
Duration: 10 Sept 2019 - 13 Sept 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11824 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Visual relationship detection aims to predict the categories of predicates and object pairs, and also to localize the object pairs. Recognizing the relationships between individual objects is important for describing visual scenes in static images. In this paper, we propose a novel end-to-end framework for the visual relationship detection task. First, we design a spatial attention model for specializing predicate features. Compared to a normal ROI-pooling layer, this structure significantly improves Predicate Classification performance. Second, to extract the relative spatial configuration, we propose mapping simple geometric representations to a high-dimensional space, which boosts relationship detection accuracy. Third, we implement a feature embedding model with a bi-directional RNN that treats subject, predicate and object as a time sequence. We evaluate our method on three tasks. The experiments demonstrate that our method achieves competitive results compared to state-of-the-art methods.
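Two components of the abstract can be sketched in code: lifting a simple geometric representation of a subject-object box pair to a higher dimension, and running a bi-directional RNN over the (subject, predicate, object) "time sequence". The feature choices, the random sinusoidal projection, and the plain tanh RNN below are illustrative assumptions, not the paper's actual (learned, end-to-end) implementation; the box values and feature stand-ins are hypothetical.

```python
import numpy as np

def relative_geometry(subj_box, obj_box):
    """4-d relative spatial configuration of two (x, y, w, h) boxes.
    A common choice of features; the paper's exact representation may differ."""
    sx, sy, sw, sh = subj_box
    ox, oy, ow, oh = obj_box
    return np.array([
        (ox - sx) / sw,   # normalised horizontal offset
        (oy - sy) / sh,   # normalised vertical offset
        np.log(ow / sw),  # log width ratio
        np.log(oh / sh),  # log height ratio
    ])

def lift_to_high_dim(feat, dim=64, rng=None):
    """Map the low-dimensional geometry to `dim` dimensions via a random
    projection plus sinusoids; the paper presumably learns this mapping."""
    rng = np.random.default_rng(0) if rng is None else rng
    W = rng.normal(size=(dim // 2, feat.shape[0]))
    proj = W @ feat
    return np.concatenate([np.sin(proj), np.cos(proj)])

def rnn_pass(seq, Wx, Wh, h0):
    """Plain tanh RNN over a list of feature vectors."""
    h, out = h0, []
    for x in seq:
        h = np.tanh(Wx @ x + Wh @ h)
        out.append(h)
    return out

def birnn_embed(subj, pred, obj, hidden=32, rng=None):
    """Run forward and backward RNN passes over the (subject, predicate,
    object) sequence and concatenate the hidden states per step."""
    rng = np.random.default_rng(1) if rng is None else rng
    d = subj.shape[0]
    Wxf = 0.1 * rng.normal(size=(hidden, d))
    Whf = 0.1 * rng.normal(size=(hidden, hidden))
    Wxb = 0.1 * rng.normal(size=(hidden, d))
    Whb = 0.1 * rng.normal(size=(hidden, hidden))
    seq, h0 = [subj, pred, obj], np.zeros(hidden)
    fwd = rnn_pass(seq, Wxf, Whf, h0)
    bwd = rnn_pass(seq[::-1], Wxb, Whb, h0)[::-1]
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

# Hypothetical (x, y, w, h) boxes for a "person rides skateboard" pair.
subj_box, obj_box = (10, 20, 40, 80), (30, 90, 25, 15)
g = relative_geometry(subj_box, obj_box)     # shape (4,)
geo = lift_to_high_dim(g, dim=64)            # shape (64,)
subj_feat = obj_feat = np.ones(64)           # stand-ins for visual features
emb = birnn_embed(subj_feat, geo, obj_feat)  # 3 vectors of shape (64,)
print(g.shape, geo.shape, len(emb), emb[0].shape)
```

The bi-directional pass lets each element's embedding depend on both its predecessors and successors, so the predicate representation is conditioned on subject and object simultaneously, which is the intuition behind treating the triplet as a sequence.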


Cite this

Exploiting Attention for Visual Relationship Detection. / Hu, Tongxin; Liao, Wentong; Yang, Michael Ying et al.
Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings. Cham: Springer Nature, 2019. p. 331-344 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11824 LNCS).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Hu, T, Liao, W, Yang, MY & Rosenhahn, B 2019, Exploiting Attention for Visual Relationship Detection. in Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11824 LNCS, Springer Nature, Cham, pp. 331-344, 41st DAGM German Conference on Pattern Recognition, DAGM GCPR 2019, Dortmund, Germany, 10 Sept 2019. https://doi.org/10.1007/978-3-030-33676-9_23
Hu, T., Liao, W., Yang, M. Y., & Rosenhahn, B. (2019). Exploiting Attention for Visual Relationship Detection. In Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings (pp. 331-344). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11824 LNCS). Springer Nature. https://doi.org/10.1007/978-3-030-33676-9_23
Hu T, Liao W, Yang MY, Rosenhahn B. Exploiting Attention for Visual Relationship Detection. In Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings. Cham: Springer Nature. 2019. p. 331-344. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-33676-9_23
Hu, Tongxin ; Liao, Wentong ; Yang, Michael Ying et al. / Exploiting Attention for Visual Relationship Detection. Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Proceedings. Cham : Springer Nature, 2019. pp. 331-344 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{1e59d3112a694df08db00a2a1c38d7e2,
title = "Exploiting Attention for Visual Relationship Detection",
abstract = "Visual relationship detection targets on predicting categories of predicates and object pairs, and also locating the object pairs. Recognizing the relationships between individual objects is important for describing visual scenes in static images. In this paper, we propose a novel end-to-end framework on the visual relationship detection task. First, we design a spatial attention model for specializing predicate features. Compared to a normal ROI-pooling layer, this structure significantly improves Predicate Classification performance. Second, for extracting relative spatial configuration, we propose to map simple geometric representations to a high dimension, which boosts relationship detection accuracy. Third, we implement a feature embedding model with a bi-directional RNN which considers subject, predicate and object as a time sequence. We evaluate our method on three tasks. The experiments demonstrate that our method achieves competitive results compared to state-of-the-art methods.",
author = "Tongxin Hu and Wentong Liao and Yang, {Michael Ying} and Bodo Rosenhahn",
note = "41st DAGM German Conference on Pattern Recognition, DAGM GCPR 2019 ; Conference date: 10-09-2019 Through 13-09-2019",
year = "2019",
month = oct,
day = "25",
doi = "10.1007/978-3-030-33676-9_23",
language = "English",
isbn = "9783030336752",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Nature",
pages = "331--344",
booktitle = "Pattern Recognition",
address = "United States",

}


TY - GEN

T1 - Exploiting Attention for Visual Relationship Detection

AU - Hu, Tongxin

AU - Liao, Wentong

AU - Yang, Michael Ying

AU - Rosenhahn, Bodo

PY - 2019/10/25

Y1 - 2019/10/25

N2 - Visual relationship detection targets on predicting categories of predicates and object pairs, and also locating the object pairs. Recognizing the relationships between individual objects is important for describing visual scenes in static images. In this paper, we propose a novel end-to-end framework on the visual relationship detection task. First, we design a spatial attention model for specializing predicate features. Compared to a normal ROI-pooling layer, this structure significantly improves Predicate Classification performance. Second, for extracting relative spatial configuration, we propose to map simple geometric representations to a high dimension, which boosts relationship detection accuracy. Third, we implement a feature embedding model with a bi-directional RNN which considers subject, predicate and object as a time sequence. We evaluate our method on three tasks. The experiments demonstrate that our method achieves competitive results compared to state-of-the-art methods.

AB - Visual relationship detection targets on predicting categories of predicates and object pairs, and also locating the object pairs. Recognizing the relationships between individual objects is important for describing visual scenes in static images. In this paper, we propose a novel end-to-end framework on the visual relationship detection task. First, we design a spatial attention model for specializing predicate features. Compared to a normal ROI-pooling layer, this structure significantly improves Predicate Classification performance. Second, for extracting relative spatial configuration, we propose to map simple geometric representations to a high dimension, which boosts relationship detection accuracy. Third, we implement a feature embedding model with a bi-directional RNN which considers subject, predicate and object as a time sequence. We evaluate our method on three tasks. The experiments demonstrate that our method achieves competitive results compared to state-of-the-art methods.

UR - http://www.scopus.com/inward/record.url?scp=85076175014&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-33676-9_23

DO - 10.1007/978-3-030-33676-9_23

M3 - Conference contribution

AN - SCOPUS:85076175014

SN - 9783030336752

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 331

EP - 344

BT - Pattern Recognition

PB - Springer Nature

CY - Cham

T2 - 41st DAGM German Conference on Pattern Recognition, DAGM GCPR 2019

Y2 - 10 September 2019 through 13 September 2019

ER -
