An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data.

Publication: Contribution to journal, Article, Research, Peer-reviewed

Authors

Jan Voges, Mikel Hernaez, Marco Mattavelli, Jörn Ostermann

External organizations

  • University of Illinois Urbana-Champaign (UIUC)
  • University of Navarra
  • Eidgenössische Technische Hochschule Lausanne (ETHL)

Details

Original language: English
Article number: 9455132
Pages (from-to): 1607-1622
Number of pages: 16
Journal: Proc. IEEE
Volume: 109
Issue number: 9
Publication status: Published - Sept. 2021

Abstract

The development and progress of high-throughput sequencing technologies have transformed the sequencing of DNA from a scientific research challenge to practice. With the release of the latest generation of sequencing machines, the cost of sequencing a whole human genome has dropped to less than $600. Such achievements open the door to personalized medicine, where it is expected that genomic information of patients will be analyzed as a standard practice. However, the associated costs, related to storing, transmitting, and processing the large volumes of data, are already comparable to the costs of sequencing. To support the design of new and interoperable solutions for the representation, compression, and management of genomic sequencing data, the Moving Picture Experts Group (MPEG), jointly with working group 5 of ISO/TC276 'Biotechnology', has started to produce the ISO/IEC 23092 series, known as MPEG-G. MPEG-G not only offers higher levels of compression compared with the state of the art but also provides new functionalities, such as built-in support for random access in the compressed domain, support for data protection mechanisms, flexible storage, and streaming capabilities. MPEG-G only specifies the decoding syntax of compressed bitstreams, as well as a file format and a transport format. This allows for the development of new encoding solutions with higher degrees of optimization while maintaining compatibility with any existing MPEG-G decoder.

Cite this

An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data. / Voges, Jan; Hernaez, Mikel; Mattavelli, Marco et al.
In: Proc. IEEE, Vol. 109, No. 9, 9455132, 09.2021, pp. 1607-1622.

Voges J, Hernaez M, Mattavelli M, Ostermann J. An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data. Proc. IEEE. 2021 Sep;109(9):1607-1622. 9455132. doi: 10.1109/JPROC.2021.3082027
Voges, Jan; Hernaez, Mikel; Mattavelli, Marco et al. / An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data. In: Proc. IEEE. 2021; Vol. 109, No. 9. pp. 1607-1622.
@article{fd403181a252400b8cc2c37215f0a746,
title = "An Introduction to MPEG-G: The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data.",
keywords = "Bioinformatics, computational biology, data compression, DNA, genomics, standardization",
author = "Jan Voges and Mikel Hernaez and Marco Mattavelli and J{\"o}rn Ostermann",
note = "Acknowledgment: The development of the MPEG-G specification is a collaborative effort. The following people contributed to the actual MPEG-G development: Junaid J. Ahmad, Claudio Alberti, Simone Casale-Brunet, Patrick Cheung, Jaime Delgado, Jan Fostier, Silvia Llorente, Liudmila S. Mainzer, Fabian M{\"u}ntefering, Daniel Naro, Ibrahim Numanagi{\'c}, Idoia Ochoa, Tom Paridaens, Massimo Ravasi, Daniele Renzi, Paolo Ribeca, and Giorgio Zoia. MPEG received additional input from other experts, including Bonnie Berger, Noah Daniels, Nicolas Guex, Christian Iseli, Raymond Krasinski, Christian Rohlfing, S. Cenk Sahinalp, and Ioannis Xenarios.",
year = "2021",
month = sep,
doi = "10.1109/JPROC.2021.3082027",
language = "English",
volume = "109",
pages = "1607--1622",
journal = "Proc. IEEE",
issn = "1558-2256",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "9",

}

TY - JOUR

T1 - An Introduction to MPEG-G

T2 - The First Open ISO/IEC Standard for the Compression and Exchange of Genomic Sequencing Data.

AU - Voges, Jan

AU - Hernaez, Mikel

AU - Mattavelli, Marco

AU - Ostermann, Jörn

N1 - Acknowledgment: The development of the MPEG-G specification is a collaborative effort. The following people contributed to the actual MPEG-G development: Junaid J. Ahmad, Claudio Alberti, Simone Casale-Brunet, Patrick Cheung, Jaime Delgado, Jan Fostier, Silvia Llorente, Liudmila S. Mainzer, Fabian Müntefering, Daniel Naro, Ibrahim Numanagić, Idoia Ochoa, Tom Paridaens, Massimo Ravasi, Daniele Renzi, Paolo Ribeca, and Giorgio Zoia. MPEG received additional input from other experts, including Bonnie Berger, Noah Daniels, Nicolas Guex, Christian Iseli, Raymond Krasinski, Christian Rohlfing, S. Cenk Sahinalp, and Ioannis Xenarios.

PY - 2021/9

Y1 - 2021/9

KW - Bioinformatics

KW - computational biology

KW - data compression

KW - DNA

KW - genomics

KW - standardization

UR - http://www.scopus.com/inward/record.url?scp=85112220633&partnerID=8YFLogxK

U2 - 10.1109/JPROC.2021.3082027

DO - 10.1109/JPROC.2021.3082027

M3 - Article

VL - 109

SP - 1607

EP - 1622

JO - Proc. IEEE

JF - Proc. IEEE

SN - 1558-2256

IS - 9

M1 - 9455132

ER -
