Details
Original language | English |
---|---|
Title of host publication | 2024 IEEE 26th International Workshop on Multimedia Signal Processing |
Subtitle of host publication | MMSP 2024 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Number of pages | 6 |
ISBN (electronic) | 9798350387254 |
ISBN (print) | 979-8-3503-8726-1 |
Publication status | Published - 2 Oct 2024 |
Event | 26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024 - West Lafayette, United States; Duration: 2 Oct 2024 → 4 Oct 2024 |
Abstract
This paper aims to delve into the rate-distortion-complexity trade-offs of modern neural video coding. Recent years have witnessed much research effort being focused on exploring the full potential of neural video coding. Conditional auto encoders have emerged as the mainstream approach to efficient neural video coding. The central theme of conditional auto encoders is to leverage both spatial and temporal information for better conditional coding. However, a recent study indicates that conditional coding may suffer from information bottlenecks, potentially performing worse than traditional residual coding. To address this issue, recent conditional coding methods incorporate a large number of high-resolution features as the condition signal, leading to a considerable increase in the number of multiply-accumulate operations, memory footprint, and model size. Taking DCVC as the common code base, we investigate how the newly proposed conditional residual coding, an emerging new school of thought, and its variants may strike a better balance among rate, distortion, and complexity.
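The contrast the abstract draws between residual coding, conditional coding, and conditional residual coding can be sketched in entropy terms. This is a textbook-style illustration, not the paper's own derivation. The symbols are assumptions introduced here: $x_t$ is the current frame, $x_c$ a temporal predictor (for example, a motion-compensated reference), $r_t$ the residual, and $f(\cdot)$ a lossy, bottlenecked condition pathway. The signals are treated as discrete.

```latex
\begin{align*}
r_t &= x_t - x_c \\
H(x_t \mid x_c) &= H(r_t \mid x_c) \le H(r_t)
  && \text{(conditional coding, ideal condition)} \\
H(x_t \mid f(x_c)) &\ge H(x_t \mid x_c)
  && \text{(bottlenecked condition; may exceed } H(r_t)\text{)} \\
H(r_t \mid f(x_c)) &\le H(r_t)
  && \text{(conditional residual coding)}
\end{align*}
```

Under these assumptions, ideal conditional coding never costs more than residual coding, but that guarantee is lost once the condition passes through a bottleneck. Coding the residual under the same bottlenecked condition restores the bound, because conditioning cannot increase entropy.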
Keywords
- conditional coding, conditional residual coding, Learned video compression
ASJC Scopus subject areas
- Computer Science(all)
  - Signal Processing
  - Artificial Intelligence
  - Computer Networks and Communications
  - Computer Vision and Pattern Recognition
- Engineering(all)
  - Media Technology
Cite this
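Chen, Yi Hsin; Ho, Kuan Wei; Benjak, Martin; Ostermann, Jorn; Peng, Wen Hsiao. On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. In: 2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc., 2024.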
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding
AU - Chen, Yi Hsin
AU - Ho, Kuan Wei
AU - Benjak, Martin
AU - Ostermann, Jorn
AU - Peng, Wen Hsiao
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024/10/2
Y1 - 2024/10/2
N2 - This paper aims to delve into the rate-distortion-complexity trade-offs of modern neural video coding. Recent years have witnessed much research effort being focused on exploring the full potential of neural video coding. Conditional auto encoders have emerged as the mainstream approach to efficient neural video coding. The central theme of conditional auto encoders is to leverage both spatial and temporal information for better conditional coding. However, a recent study indicates that conditional coding may suffer from information bottlenecks, potentially performing worse than traditional residual coding. To address this issue, recent conditional coding methods incorporate a large number of high-resolution features as the condition signal, leading to a considerable increase in the number of multiply-accumulate operations, memory footprint, and model size. Taking DCVC as the common code base, we investigate how the newly proposed conditional residual coding, an emerging new school of thought, and its variants may strike a better balance among rate, distortion, and complexity.
KW - conditional coding
KW - conditional residual coding
KW - Learned video compression
UR - http://www.scopus.com/inward/record.url?scp=85211325094&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2410.03898
DO - 10.48550/arXiv.2410.03898
M3 - Conference contribution
AN - SCOPUS:85211325094
SN - 979-8-3503-8726-1
BT - 2024 IEEE 26th International Workshop on Multimedia Signal Processing
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024
Y2 - 2 October 2024 through 4 October 2024
ER -