Lossy compression of quality scores in differential gene expression: A first assessment and impact analysis

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Ana A. Hernandez-Lopez
  • Jan Voges
  • Claudio Alberti
  • Marco Mattavelli
  • Jörn Ostermann

Research Organisations

External Research Organisations

  • École polytechnique fédérale de Lausanne (EPFL)

Details

Original language: English
Title of host publication: Proceedings - DCC 2018
Subtitle of host publication: 2018 Data Compression Conference
Editors: Ali Bilgin, James A. Storer, Joan Serra-Sagrista, Michael W. Marcellin
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 167-176
Number of pages: 10
ISBN (electronic): 9781538648834
Publication status: Published - Jul 2018
Event: 2018 Data Compression Conference, DCC 2018 - Snowbird, United States
Duration: 27 Mar 2018 to 30 Mar 2018

Publication series

Name: Data Compression Conference Proceedings
Volume: 2018-March
ISSN (Print): 1068-0314

Abstract

High-throughput sequencing of RNA molecules has enabled the quantitative analysis of gene expression at the expense of storage space and processing power. To alleviate these problems, methods for lossy compression of the quality scores associated with RNA sequencing data have recently been proposed, and the evaluation of their impact on downstream analyses is gaining attention. In this context, this work presents a first assessment of the impact of lossily compressed quality scores in RNA sequencing data on the performance of some of the most recent tools used for differential gene expression analysis.

Keywords

    Gene expression, Lossy compression, Quality scores, RNA seq

Cite this

Lossy compression of quality scores in differential gene expression: A first assessment and impact analysis. / Hernandez-Lopez, Ana A.; Voges, Jan; Alberti, Claudio et al.
Proceedings - DCC 2018: 2018 Data Compression Conference. ed. / Ali Bilgin; James A. Storer; Joan Serra-Sagrista; Michael W. Marcellin. Institute of Electrical and Electronics Engineers Inc., 2018. p. 167-176 (Data Compression Conference Proceedings; Vol. 2018-March).

Hernandez-Lopez, AA, Voges, J, Alberti, C, Mattavelli, M & Ostermann, J 2018, Lossy compression of quality scores in differential gene expression: A first assessment and impact analysis. in A Bilgin, JA Storer, J Serra-Sagrista & MW Marcellin (eds), Proceedings - DCC 2018: 2018 Data Compression Conference. Data Compression Conference Proceedings, vol. 2018-March, Institute of Electrical and Electronics Engineers Inc., pp. 167-176, 2018 Data Compression Conference, DCC 2018, Snowbird, United States, 27 Mar 2018. https://doi.org/10.1109/DCC.2018.00025
Hernandez-Lopez, A. A., Voges, J., Alberti, C., Mattavelli, M., & Ostermann, J. (2018). Lossy compression of quality scores in differential gene expression: A first assessment and impact analysis. In A. Bilgin, J. A. Storer, J. Serra-Sagrista, & M. W. Marcellin (Eds.), Proceedings - DCC 2018: 2018 Data Compression Conference (pp. 167-176). (Data Compression Conference Proceedings; Vol. 2018-March). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/DCC.2018.00025
Hernandez-Lopez AA, Voges J, Alberti C, Mattavelli M, Ostermann J. Lossy compression of quality scores in differential gene expression: A first assessment and impact analysis. In Bilgin A, Storer JA, Serra-Sagrista J, Marcellin MW, editors, Proceedings - DCC 2018: 2018 Data Compression Conference. Institute of Electrical and Electronics Engineers Inc. 2018. p. 167-176. (Data Compression Conference Proceedings). doi: 10.1109/DCC.2018.00025
Hernandez-Lopez, Ana A. ; Voges, Jan ; Alberti, Claudio et al. / Lossy compression of quality scores in differential gene expression : A first assessment and impact analysis. Proceedings - DCC 2018: 2018 Data Compression Conference. editor / Ali Bilgin ; James A. Storer ; Joan Serra-Sagrista ; Michael W. Marcellin. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 167-176 (Data Compression Conference Proceedings).
@inproceedings{34c270571ded415aa94ad60f5ff0d5a9,
title = "Lossy compression of quality scores in differential gene expression: A first assessment and impact analysis",
abstract = "High-throughput sequencing of RNA molecules has enabled the quantitative analysis of gene expression at the expense of storage space and processing power. To alleviate these problems, lossy compression methods of the quality scores associated to RNA sequencing data have recently been proposed, and the evaluation of their impact on downstream analyses is gaining attention. In this context, this work presents a first assessment of the impact of lossily compressed quality scores in RNA sequencing data on the performance of some of the most recent tools used for differential gene expression.",
keywords = "Gene expression, Lossy compression, Quality scores, RNA seq",
author = "Hernandez-Lopez, {Ana A.} and Jan Voges and Claudio Alberti and Marco Mattavelli and J{\"o}rn Ostermann",
year = "2018",
month = jul,
doi = "10.1109/DCC.2018.00025",
language = "English",
series = "Data Compression Conference Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "167--176",
editor = "Ali Bilgin and Storer, {James A.} and Joan Serra-Sagrista and Marcellin, {Michael W.}",
booktitle = "Proceedings - DCC 2018",
address = "United States",
note = "2018 Data Compression Conference, DCC 2018 ; Conference date: 27-03-2018 Through 30-03-2018",

}

TY - GEN

T1 - Lossy compression of quality scores in differential gene expression

T2 - 2018 Data Compression Conference, DCC 2018

AU - Hernandez-Lopez, Ana A.

AU - Voges, Jan

AU - Alberti, Claudio

AU - Mattavelli, Marco

AU - Ostermann, Jörn

PY - 2018/7

Y1 - 2018/7

N2 - High-throughput sequencing of RNA molecules has enabled the quantitative analysis of gene expression at the expense of storage space and processing power. To alleviate these problems, lossy compression methods of the quality scores associated to RNA sequencing data have recently been proposed, and the evaluation of their impact on downstream analyses is gaining attention. In this context, this work presents a first assessment of the impact of lossily compressed quality scores in RNA sequencing data on the performance of some of the most recent tools used for differential gene expression.

AB - High-throughput sequencing of RNA molecules has enabled the quantitative analysis of gene expression at the expense of storage space and processing power. To alleviate these problems, lossy compression methods of the quality scores associated to RNA sequencing data have recently been proposed, and the evaluation of their impact on downstream analyses is gaining attention. In this context, this work presents a first assessment of the impact of lossily compressed quality scores in RNA sequencing data on the performance of some of the most recent tools used for differential gene expression.

KW - Gene expression

KW - Lossy compression

KW - Quality scores

KW - RNA seq

UR - http://www.scopus.com/inward/record.url?scp=85050981624&partnerID=8YFLogxK

U2 - 10.1109/DCC.2018.00025

DO - 10.1109/DCC.2018.00025

M3 - Conference contribution

AN - SCOPUS:85050981624

T3 - Data Compression Conference Proceedings

SP - 167

EP - 176

BT - Proceedings - DCC 2018

A2 - Bilgin, Ali

A2 - Storer, James A.

A2 - Serra-Sagrista, Joan

A2 - Marcellin, Michael W.

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 27 March 2018 through 30 March 2018

ER -
