On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Yi Hsin Chen
  • Kuan Wei Ho
  • Martin Benjak
  • Jorn Ostermann
  • Wen Hsiao Peng

External Research Organisations

  • National Yang Ming Chiao Tung University (NSTC)

Details

Original language: English
Title of host publication: 2024 IEEE 26th International Workshop on Multimedia Signal Processing
Subtitle of host publication: MMSP 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 6
ISBN (electronic): 979-8-3503-8725-4
ISBN (print): 979-8-3503-8726-1
Publication status: Published - 2 Oct 2024
Event: 26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024 - West Lafayette, United States
Duration: 2 Oct 2024 - 4 Oct 2024

Abstract

This paper aims to delve into the rate-distortion-complexity trade-offs of modern neural video coding. Recent years have witnessed much research effort devoted to exploring the full potential of neural video coding. Conditional autoencoders have emerged as the mainstream approach to efficient neural video coding. The central theme of conditional autoencoders is to leverage both spatial and temporal information for better conditional coding. However, a recent study indicates that conditional coding may suffer from information bottlenecks, potentially performing worse than traditional residual coding. To address this issue, recent conditional coding methods incorporate a large number of high-resolution features as the condition signal, leading to a considerable increase in the number of multiply-accumulate operations, memory footprint, and model size. Taking DCVC as the common code base, we investigate how the recently proposed conditional residual coding, an emerging school of thought, and its variants may strike a better balance among rate, distortion, and complexity.

Keywords

    conditional coding, conditional residual coding, Learned video compression

Cite this

On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. / Chen, Yi Hsin; Ho, Kuan Wei; Benjak, Martin et al.
2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc., 2024.

Chen, YH, Ho, KW, Benjak, M, Ostermann, J & Peng, WH 2024, On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. in 2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc., 26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024, West Lafayette, United States, 2 Oct 2024. https://doi.org/10.48550/arXiv.2410.03898, https://doi.org/10.1109/MMSP61759.2024.10743250
Chen, Y. H., Ho, K. W., Benjak, M., Ostermann, J., & Peng, W. H. (2024). On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. In 2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.48550/arXiv.2410.03898, https://doi.org/10.1109/MMSP61759.2024.10743250
Chen YH, Ho KW, Benjak M, Ostermann J, Peng WH. On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. In 2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc. 2024 doi: 10.48550/arXiv.2410.03898, 10.1109/MMSP61759.2024.10743250
Chen, Yi Hsin ; Ho, Kuan Wei ; Benjak, Martin et al. / On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. 2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc., 2024.
@inproceedings{875bc22741c144ee8b0664b7433443d4,
title = "On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding",
abstract = "This paper aims to delve into the rate-distortion-complexity trade-offs of modern neural video coding. Recent years have witnessed much research effort being focused on exploring the full potential of neural video coding. Conditional auto encoders have emerged as the mainstream approach to efficient neural video coding. The central theme of conditional auto encoders is to leverage both spatial and temporal information for better conditional coding. However, a recent study indicates that conditional coding may suffer from information bottlenecks, potentially performing worse than traditional residual coding. To address this issue, recent conditional coding methods incorporate a large number of high-resolution features as the condition signal, leading to a considerable increase in the number of multiply-accumulate operations, memory footprint, and model size. Taking DCVC as the common code base, we investigate how the newly proposed conditional residual coding, an emerging new school of thought, and its variants may strike a better balance among rate, distortion, and complexity.",
keywords = "conditional coding, conditional residual coding, Learned video compression",
author = "Chen, {Yi Hsin} and Ho, {Kuan Wei} and Benjak, Martin and Ostermann, Jorn and Peng, {Wen Hsiao}",
note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; 26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024 ; Conference date: 02-10-2024 Through 04-10-2024",
year = "2024",
month = oct,
day = "2",
doi = "10.48550/arXiv.2410.03898",
language = "English",
isbn = "979-8-3503-8726-1",
booktitle = "2024 IEEE 26th International Workshop on Multimedia Signal Processing",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",
}

TY - GEN
T1 - On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding
AU - Chen, Yi Hsin
AU - Ho, Kuan Wei
AU - Benjak, Martin
AU - Ostermann, Jorn
AU - Peng, Wen Hsiao
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024/10/2
Y1 - 2024/10/2
AB - This paper aims to delve into the rate-distortion-complexity trade-offs of modern neural video coding. Recent years have witnessed much research effort being focused on exploring the full potential of neural video coding. Conditional auto encoders have emerged as the mainstream approach to efficient neural video coding. The central theme of conditional auto encoders is to leverage both spatial and temporal information for better conditional coding. However, a recent study indicates that conditional coding may suffer from information bottlenecks, potentially performing worse than traditional residual coding. To address this issue, recent conditional coding methods incorporate a large number of high-resolution features as the condition signal, leading to a considerable increase in the number of multiply-accumulate operations, memory footprint, and model size. Taking DCVC as the common code base, we investigate how the newly proposed conditional residual coding, an emerging new school of thought, and its variants may strike a better balance among rate, distortion, and complexity.

KW - conditional coding
KW - conditional residual coding
KW - Learned video compression
UR - http://www.scopus.com/inward/record.url?scp=85211325094&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2410.03898
DO - 10.48550/arXiv.2410.03898
M3 - Conference contribution
AN - SCOPUS:85211325094
SN - 979-8-3503-8726-1
BT - 2024 IEEE 26th International Workshop on Multimedia Signal Processing
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024
Y2 - 2 October 2024 through 4 October 2024
ER -
