Details
Original language | English |
---|---|
Title of host publication | Multimodal Scene Understanding |
Subtitle of host publication | Algorithms, Applications and Deep Learning |
Editors | Michael Ying Yang, Bodo Rosenhahn, Vittorio Murino |
Publisher | Elsevier |
Chapter | 1 |
Pages | 1-7 |
Number of pages | 7 |
ISBN (electronic) | 9780128173589 |
Publication status | Published - 2 Aug 2019 |
Abstract
A fundamental goal of computer vision is to discover the semantic information within a given scene, commonly referred to as scene understanding. The overall goal is to find a mapping that derives semantic information from sensor data, which is an extremely challenging task, partly due to ambiguities in the appearance of the data. However, most of the scene understanding tasks tackled so far involve visual modalities only. In this book, we aim to provide an overview of recent advances in algorithms and applications that involve multiple sources of information for scene understanding. In this context, deep learning models are particularly suitable for combining multiple modalities, and many contributions use such architectures to take advantage of all data streams and obtain the best possible performance. We conclude this book’s introduction with a concise description of the remaining chapters. They focus on providing an understanding of the state of the art, open problems, and future directions related to multimodal scene understanding as a scientific discipline.
Keywords
- Computer vision, Deep learning, Multimodality, Scene understanding
ASJC Scopus subject areas
- Computer Science(all)
- General Computer Science
Cite this
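Introduction to multimodal scene understanding. / Yang, Michael Ying; Rosenhahn, Bodo; Murino, Vittorio. Multimodal Scene Understanding: Algorithms, Applications and Deep Learning. ed. / Michael Ying Yang; Bodo Rosenhahn; Vittorio Murino. Elsevier, 2019. p. 1-7.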
Research output: Chapter in book/report/conference proceeding › Contribution to book/anthology › Research › peer review
TY - CHAP
T1 - Introduction to multimodal scene understanding
AU - Yang, Michael Ying
AU - Rosenhahn, Bodo
AU - Murino, Vittorio
PY - 2019/8/2
Y1 - 2019/8/2
AB - A fundamental goal of computer vision is to discover the semantic information within a given scene, commonly referred to as scene understanding. The overall goal is to find a mapping that derives semantic information from sensor data, which is an extremely challenging task, partly due to ambiguities in the appearance of the data. However, most of the scene understanding tasks tackled so far involve visual modalities only. In this book, we aim to provide an overview of recent advances in algorithms and applications that involve multiple sources of information for scene understanding. In this context, deep learning models are particularly suitable for combining multiple modalities, and many contributions use such architectures to take advantage of all data streams and obtain the best possible performance. We conclude this book’s introduction with a concise description of the remaining chapters. They focus on providing an understanding of the state of the art, open problems, and future directions related to multimodal scene understanding as a scientific discipline.
KW - Computer vision
KW - Deep learning
KW - Multimodality
KW - Scene understanding
UR - http://www.scopus.com/inward/record.url?scp=85082082135&partnerID=8YFLogxK
U2 - 10.1016/B978-0-12-817358-9.00007-X
DO - 10.1016/B978-0-12-817358-9.00007-X
M3 - Contribution to book/anthology
AN - SCOPUS:85082082135
SP - 1
EP - 7
BT - Multimodal Scene Understanding
A2 - Yang, Michael Ying
A2 - Rosenhahn, Bodo
A2 - Murino, Vittorio
PB - Elsevier
ER -