On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding

Yi Hsin Chen; Kuan Wei Ho; Martin Benjak; Jorn Ostermann; Wen Hsiao Peng

doi:10.48550/arXiv.2410.03898

Details

Original language	English
Title of host publication	2024 IEEE 26th International Workshop on Multimedia Signal Processing
Subtitle	MMSP 2024
Publisher	Institute of Electrical and Electronics Engineers Inc.
Number of pages	6
ISBN (electronic)	9798350387254
ISBN (print)	979-8-3503-8726-1
Publication status	Published - 2 Oct 2024
Event	26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024 - West Lafayette, United States. Duration: 2 Oct 2024 → 4 Oct 2024

Abstract

This paper aims to delve into the rate-distortion-complexity trade-offs of modern neural video coding. Recent years have witnessed much research effort being focused on exploring the full potential of neural video coding. Conditional auto encoders have emerged as the mainstream approach to efficient neural video coding. The central theme of conditional auto encoders is to leverage both spatial and temporal information for better conditional coding. However, a recent study indicates that conditional coding may suffer from information bottlenecks, potentially performing worse than traditional residual coding. To address this issue, recent conditional coding methods incorporate a large number of high-resolution features as the condition signal, leading to a considerable increase in the number of multiply-accumulate operations, memory footprint, and model size. Taking DCVC as the common code base, we investigate how the newly proposed conditional residual coding, an emerging new school of thought, and its variants may strike a better balance among rate, distortion, and complexity.

ASJC Scopus subject areas

Computer Science (all)
Signal Processing
Engineering (all)
Media Technology
Computer Science (all)
Artificial Intelligence
Computer Science (all)
Computer Networks and Communications
Computer Science (all)
Computer Vision and Pattern Recognition

Cite this

On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. / Chen, Yi Hsin; Ho, Kuan Wei; Benjak, Martin et al.
2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc., 2024.

Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed

Chen, YH, Ho, KW, Benjak, M, Ostermann, J & Peng, WH 2024, On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. in 2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc., 26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024, West Lafayette, United States, 2 Oct 2024. https://doi.org/10.48550/arXiv.2410.03898, https://doi.org/10.1109/MMSP61759.2024.10743250

Chen, Y. H., Ho, K. W., Benjak, M., Ostermann, J., & Peng, W. H. (2024). On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. In 2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.48550/arXiv.2410.03898, https://doi.org/10.1109/MMSP61759.2024.10743250

Chen YH, Ho KW, Benjak M, Ostermann J, Peng WH. On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. in 2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc. 2024 doi: 10.48550/arXiv.2410.03898, 10.1109/MMSP61759.2024.10743250

Chen, Yi Hsin ; Ho, Kuan Wei ; Benjak, Martin et al. / On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding. 2024 IEEE 26th International Workshop on Multimedia Signal Processing: MMSP 2024. Institute of Electrical and Electronics Engineers Inc., 2024.

@inproceedings{875bc22741c144ee8b0664b7433443d4,

title = "On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding",

abstract = "This paper aims to delve into the rate-distortion-complexity trade-offs of modern neural video coding. Recent years have witnessed much research effort being focused on exploring the full potential of neural video coding. Conditional auto encoders have emerged as the mainstream approach to efficient neural video coding. The central theme of conditional auto encoders is to leverage both spatial and temporal information for better conditional coding. However, a recent study indicates that conditional coding may suffer from information bottlenecks, potentially performing worse than traditional residual coding. To address this issue, recent conditional coding methods incorporate a large number of high-resolution features as the condition signal, leading to a considerable increase in the number of multiply-accumulate operations, memory footprint, and model size. Taking DCVC as the common code base, we investigate how the newly proposed conditional residual coding, an emerging new school of thought, and its variants may strike a better balance among rate, distortion, and complexity.",

keywords = "conditional coding, conditional residual coding, Learned video compression",

author = "Chen, {Yi Hsin} and Ho, {Kuan Wei} and Martin Benjak and Jorn Ostermann and Peng, {Wen Hsiao}",

note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; 26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024 ; Conference date: 02-10-2024 Through 04-10-2024",

year = "2024",

month = oct,

day = "2",

doi = "10.48550/arXiv.2410.03898",

language = "English",

isbn = "979-8-3503-8726-1",

booktitle = "2024 IEEE 26th International Workshop on Multimedia Signal Processing",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

address = "United States",

}

TY - GEN

T1 - On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding

AU - Chen, Yi Hsin

AU - Ho, Kuan Wei

AU - Benjak, Martin

AU - Ostermann, Jorn

AU - Peng, Wen Hsiao

PY - 2024/10/2

Y1 - 2024/10/2

N2 - This paper aims to delve into the rate-distortion-complexity trade-offs of modern neural video coding. Recent years have witnessed much research effort being focused on exploring the full potential of neural video coding. Conditional auto encoders have emerged as the mainstream approach to efficient neural video coding. The central theme of conditional auto encoders is to leverage both spatial and temporal information for better conditional coding. However, a recent study indicates that conditional coding may suffer from information bottlenecks, potentially performing worse than traditional residual coding. To address this issue, recent conditional coding methods incorporate a large number of high-resolution features as the condition signal, leading to a considerable increase in the number of multiply-accumulate operations, memory footprint, and model size. Taking DCVC as the common code base, we investigate how the newly proposed conditional residual coding, an emerging new school of thought, and its variants may strike a better balance among rate, distortion, and complexity.

AB - This paper aims to delve into the rate-distortion-complexity trade-offs of modern neural video coding. Recent years have witnessed much research effort being focused on exploring the full potential of neural video coding. Conditional auto encoders have emerged as the mainstream approach to efficient neural video coding. The central theme of conditional auto encoders is to leverage both spatial and temporal information for better conditional coding. However, a recent study indicates that conditional coding may suffer from information bottlenecks, potentially performing worse than traditional residual coding. To address this issue, recent conditional coding methods incorporate a large number of high-resolution features as the condition signal, leading to a considerable increase in the number of multiply-accumulate operations, memory footprint, and model size. Taking DCVC as the common code base, we investigate how the newly proposed conditional residual coding, an emerging new school of thought, and its variants may strike a better balance among rate, distortion, and complexity.

KW - conditional coding

KW - conditional residual coding

KW - Learned video compression

UR - http://www.scopus.com/inward/record.url?scp=85211325094&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2410.03898

DO - 10.48550/arXiv.2410.03898

M3 - Conference contribution

AN - SCOPUS:85211325094

SN - 979-8-3503-8726-1

BT - 2024 IEEE 26th International Workshop on Multimedia Signal Processing

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024

Y2 - 2 October 2024 through 4 October 2024

ER -

By the same authors

Acoustic Emission Detection in Noisy Environments using Linear Prediction

Genie: the first open-source ISO/IEC encoder for genomic data

Matched Filter for Acoustic Emission Monitoring in Noisy Environments: Application to Wire Break Detection

Self-supervised domain adaptation for machinery remaining useful life prediction

MaskCRT: Masked Conditional Residual Transformer for Learned Video Compression
