Details
Original language | English |
---|---|
Article number | 296 |
Number of pages | 15 |
Journal | BMC BIOINFORMATICS |
Volume | 25 |
Issue number | 1 |
Early online date | 10 Sept. 2024 |
Publication status | Published - 2024 |
Abstract
Background: Chromosome organization plays an important role in biological processes such as replication, regulation, and transcription. One way to study the relationship between chromosome structure and its biological functions is through Hi-C studies, a genome-wide method for capturing chromosome conformation. Such studies generate vast amounts of data. The problem is exacerbated by the fact that chromosome organization is dynamic, requiring snapshots at different points in time, further increasing the amount of data to be stored. We present a novel approach called the High-Efficiency Contact Matrix Compressor (HiCMC) for efficient compression of Hi-C data. Results: By modeling the underlying structures found in the contact matrix, such as compartments and domains, HiCMC outperforms the state-of-the-art method CMC by approximately 8% and the other state-of-the-art methods cooler, LZMA, and bzip2 by over 50% across multiple cell lines and contact matrix resolutions. In addition, HiCMC integrates domain-specific information into the compressed bitstreams that it generates, and this information can be used to speed up downstream analyses. Conclusion: HiCMC is a novel compression approach that utilizes intrinsic properties of the contact matrix, such as compartments and domains. It achieves better compression than the state-of-the-art methods. HiCMC is available at https://github.com/sXperfect/hicmc.
ASJC Scopus subject areas
- Biochemistry, Genetics and Molecular Biology (all)
- Structural Biology
- Biochemistry, Genetics and Molecular Biology (all)
- Biochemistry
- Biochemistry, Genetics and Molecular Biology (all)
- Molecular Biology
- Computer Science (all)
- Computer Science Applications
- Mathematics (all)
- Applied Mathematics
Cite this
In: BMC BIOINFORMATICS, Vol. 25, No. 1, 296, 2024.
Publication: Contribution to journal › Article › Research › Peer-reviewed
TY - JOUR
T1 - HiCMC
T2 - High-Efficiency Contact Matrix Compressor
AU - Adhisantoso, Yeremia Gunawan
AU - Körner, Tim
AU - Müntefering, Fabian
AU - Ostermann, Jörn
AU - Voges, Jan
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024
Y1 - 2024
KW - 3C
KW - Compression
KW - Contact matrix
KW - Hi-C
UR - http://www.scopus.com/inward/record.url?scp=85203473959&partnerID=8YFLogxK
U2 - 10.1186/s12859-024-05907-2
DO - 10.1186/s12859-024-05907-2
M3 - Article
C2 - 39256681
AN - SCOPUS:85203473959
VL - 25
JO - BMC BIOINFORMATICS
JF - BMC BIOINFORMATICS
SN - 1471-2105
IS - 1
M1 - 296
ER -