Details
Original language | English |
---|---|
Pages (from-to) | 2275-2277 |
Number of pages | 3 |
Journal | BIOINFORMATICS |
Volume | 36 |
Issue number | 7 |
Early online date | 12 Dec 2019 |
Publication status | Published - 1 Apr 2020 |
Abstract
Motivation: In an effort to provide a response to the ever-expanding generation of genomic data, the International Organization for Standardization (ISO) is designing a new solution for the representation, compression and management of genomic sequencing data: the Moving Picture Experts Group (MPEG)-G standard. This paper discusses the first implementation of an MPEG-G compliant entropy codec: GABAC. GABAC combines proven coding technologies, such as context-adaptive binary arithmetic coding, binarization schemes and transformations, into a straightforward solution for the compression of sequencing data. Results: We demonstrate that GABAC outperforms well-established (entropy) codecs in a significant set of cases and thus can serve as an extension for existing genomic compression solutions, such as CRAM.
ASJC Scopus subject areas
- Mathematics(all)
- Statistics and Probability
- Biochemistry, Genetics and Molecular Biology(all)
- Biochemistry
- Biochemistry, Genetics and Molecular Biology(all)
- Molecular Biology
- Computer Science(all)
- Computer Science Applications
- Computer Science(all)
- Computational Theory and Mathematics
- Mathematics(all)
- Computational Mathematics
Cite this
In: BIOINFORMATICS, Vol. 36, No. 7, 01.04.2020, p. 2275-2277.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - GABAC
T2 - An arithmetic coding solution for genomic data
AU - Voges, Jan
AU - Paridaens, Tom
AU - Müntefering, Fabian
AU - Mainzer, Liudmila S.
AU - Bliss, Brian
AU - Yang, Mingyu
AU - Ochoa, Idoia
AU - Fostier, Jan
AU - Ostermann, Jörn
AU - Hernaez, Mikel
N1 - Funding information: This work has been partially supported by grants 2018-182798 and 2018-182799 from the Chan Zuckerberg Initiative DAF, a donor advised fund of the Silicon Valley Community Foundation, a Strategic Research Initiative from UIUC and the Mayo Clinic Center for Individualized Medicine, and the Todd and Karen Wanek Program for Hypoplastic Left Heart Syndrome. This work was a part of the Mayo Clinic and Illinois Strategic Alliance for Technology-Based Healthcare.
PY - 2020/4/1
Y1 - 2020/4/1
AB - Motivation: In an effort to provide a response to the ever-expanding generation of genomic data, the International Organization for Standardization (ISO) is designing a new solution for the representation, compression and management of genomic sequencing data: the Moving Picture Experts Group (MPEG)-G standard. This paper discusses the first implementation of an MPEG-G compliant entropy codec: GABAC. GABAC combines proven coding technologies, such as context-adaptive binary arithmetic coding, binarization schemes and transformations, into a straightforward solution for the compression of sequencing data. Results: We demonstrate that GABAC outperforms well-established (entropy) codecs in a significant set of cases and thus can serve as an extension for existing genomic compression solutions, such as CRAM.
UR - http://www.scopus.com/inward/record.url?scp=85083073632&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btz922
DO - 10.1093/bioinformatics/btz922
M3 - Article
C2 - 31830243
AN - SCOPUS:85083073632
VL - 36
SP - 2275
EP - 2277
JO - BIOINFORMATICS
JF - BIOINFORMATICS
SN - 1367-4803
IS - 7
ER -