Introduction to multimodal scene understanding

Michael Ying Yang; Bodo Rosenhahn; Vittorio Murino

doi:10.1016/B978-0-12-817358-9.00007-X

Details

Original language	English
Title of host publication	Multimodal Scene Understanding
Subtitle	Algorithms, Applications and Deep Learning
Editors	Michael Ying Yang, Bodo Rosenhahn, Vittorio Murino
Publisher	Elsevier
Chapter	1
Pages	1-7
Number of pages	7
ISBN (electronic)	9780128173589
Publication status	Published - 2 Aug 2019

Abstract

A fundamental goal of computer vision is to discover the semantic information within a given scene, commonly referred to as scene understanding. The overall goal is to find a mapping to derive semantic information from sensor data, which is an extremely challenging task, partially due to the ambiguities in the appearance of the data. However, the majority of the scene understanding tasks tackled so far are mainly involving visual modalities only. In this book, we aim at providing an overview of recent advances in algorithms and applications that involve multiple sources of information for scene understanding. In this context, deep learning models are particularly suitable for combining multiple modalities and, as a matter of fact, many contributions are dealing with such architectures to take benefit of all data streams and obtain optimal performances. We conclude this book’s introduction by a concise description of the rest of the chapters therein contained. They are focused at providing an understanding of the state-of-the-art, open problems, and future directions related to multimodal scene understanding as a scientific discipline.

ASJC Scopus subject areas

Computer Science (all)
General Computer Science

Cite this

Introduction to multimodal scene understanding. / Yang, Michael Ying; Rosenhahn, Bodo; Murino, Vittorio.
Multimodal Scene Understanding: Algorithms, Applications and Deep Learning. ed. / Michael Ying Yang; Bodo Rosenhahn; Vittorio Murino. Elsevier, 2019. p. 1-7.

Research output: Chapter in book/report/conference proceeding › Contribution to book/anthology › Research › peer review

Yang, MY, Rosenhahn, B & Murino, V 2019, Introduction to multimodal scene understanding. in MY Yang, B Rosenhahn & V Murino (eds), Multimodal Scene Understanding: Algorithms, Applications and Deep Learning. Elsevier, pp. 1-7. https://doi.org/10.1016/B978-0-12-817358-9.00007-X

Yang, M. Y., Rosenhahn, B., & Murino, V. (2019). Introduction to multimodal scene understanding. In M. Y. Yang, B. Rosenhahn, & V. Murino (Eds.), Multimodal Scene Understanding: Algorithms, Applications and Deep Learning (pp. 1-7). Elsevier. https://doi.org/10.1016/B978-0-12-817358-9.00007-X

Yang MY, Rosenhahn B, Murino V. Introduction to multimodal scene understanding. In Yang MY, Rosenhahn B, Murino V, editors, Multimodal Scene Understanding: Algorithms, Applications and Deep Learning. Elsevier. 2019. p. 1-7. doi: 10.1016/B978-0-12-817358-9.00007-X

Yang, Michael Ying ; Rosenhahn, Bodo ; Murino, Vittorio. / Introduction to multimodal scene understanding. Multimodal Scene Understanding: Algorithms, Applications and Deep Learning. ed. / Michael Ying Yang ; Bodo Rosenhahn ; Vittorio Murino. Elsevier, 2019. p. 1-7.

@inbook{ada09e5d73254181bd8a138977f4087c,

title = "Introduction to multimodal scene understanding",

abstract = "A fundamental goal of computer vision is to discover the semantic information within a given scene, commonly referred to as scene understanding. The overall goal is to find a mapping to derive semantic information from sensor data, which is an extremely challenging task, partially due to the ambiguities in the appearance of the data. However, the majority of the scene understanding tasks tackled so far are mainly involving visual modalities only. In this book, we aim at providing an overview of recent advances in algorithms and applications that involve multiple sources of information for scene understanding. In this context, deep learning models are particularly suitable for combining multiple modalities and, as a matter of fact, many contributions are dealing with such architectures to take benefit of all data streams and obtain optimal performances. We conclude this book{\textquoteright}s introduction by a concise description of the rest of the chapters therein contained. They are focused at providing an understanding of the state-of-the-art, open problems, and future directions related to multimodal scene understanding as a scientific discipline.",

keywords = "Computer vision, Deep learning, Multimodality, Scene understanding",

author = "Yang, {Michael Ying} and Bodo Rosenhahn and Vittorio Murino",

year = "2019",

month = aug,

day = "2",

doi = "10.1016/B978-0-12-817358-9.00007-X",

language = "English",

pages = "1--7",

editor = "Yang, {Michael Ying} and Bodo Rosenhahn and Vittorio Murino",

booktitle = "Multimodal Scene Understanding",

publisher = "Elsevier",

address = "Netherlands",

}

TY - CHAP

T1 - Introduction to multimodal scene understanding

AU - Yang, Michael Ying

AU - Rosenhahn, Bodo

AU - Murino, Vittorio

PY - 2019/8/2

Y1 - 2019/8/2

N2 - A fundamental goal of computer vision is to discover the semantic information within a given scene, commonly referred to as scene understanding. The overall goal is to find a mapping to derive semantic information from sensor data, which is an extremely challenging task, partially due to the ambiguities in the appearance of the data. However, the majority of the scene understanding tasks tackled so far are mainly involving visual modalities only. In this book, we aim at providing an overview of recent advances in algorithms and applications that involve multiple sources of information for scene understanding. In this context, deep learning models are particularly suitable for combining multiple modalities and, as a matter of fact, many contributions are dealing with such architectures to take benefit of all data streams and obtain optimal performances. We conclude this book’s introduction by a concise description of the rest of the chapters therein contained. They are focused at providing an understanding of the state-of-the-art, open problems, and future directions related to multimodal scene understanding as a scientific discipline.

KW - Computer vision

KW - Deep learning

KW - Multimodality

KW - Scene understanding

UR - http://www.scopus.com/inward/record.url?scp=85082082135&partnerID=8YFLogxK

U2 - 10.1016/B978-0-12-817358-9.00007-X

DO - 10.1016/B978-0-12-817358-9.00007-X

M3 - Contribution to book/anthology

AN - SCOPUS:85082082135

SP - 1

EP - 7

BT - Multimodal Scene Understanding

A2 - Yang, Michael Ying

A2 - Rosenhahn, Bodo

A2 - Murino, Vittorio

PB - Elsevier

ER -

By the same authors

Robust Shape Fitting for 3D Scene Abstraction

Quantum normalizing flows for anomaly detection

A variational autoencoder trained with priors from canonical pathways increases the interpretability of transcriptome data

Segment Any Object Model (SAOM): Real-To-Simulation Fine-Tuning Strategy For Multi-Class Multi-Instance Segmentation

Indoor Scene Change Understanding (SCU): Segment, Describe, and Revert Any Change
