PrediTALE: A novel model learned from quantitative data allows for new perspectives on TALE targeting

Research output: Contribution to journal, Article, Research, peer review

Authors

  • Annett Erkes
  • Stefanie Mücke
  • Maik Reschke
  • Jens Boch
  • Jan Grau

Research Organisations

External Research Organisations

  • Martin Luther University Halle-Wittenberg

Details

Original language: English
Article number: e1007206
Journal: PLoS Computational Biology
Volume: 15
Issue number: 7
Publication status: Published - 11 Jul 2019

Abstract

Plant-pathogenic Xanthomonas bacteria secrete transcription activator-like effectors (TALEs) into host cells, where they act as transcriptional activators on plant target genes to support bacterial virulence. TALEs have a unique modular DNA-binding domain composed of tandem repeats. Two amino acids within each tandem repeat, termed repeat-variable diresidues (RVDs), bind to contiguous nucleotides on the DNA sequence and determine target specificity. In this paper, we propose a novel approach for TALE target prediction to identify potential virulence targets. Our approach accounts for recent findings concerning TALE targeting, including frame-shift binding by repeats of aberrant lengths and the flexible strand orientation of target boxes relative to the transcription start of the downstream target gene. The computational model can account for dependencies between adjacent RVD positions. Model parameters are learned from the wealth of quantitative data that have been generated in recent years. We benchmark the novel approach, termed PrediTALE, using RNA-seq data after Xanthomonas infection in rice, and find an overall improvement in prediction performance compared with previous approaches. Using PrediTALE, we are able to predict several novel putative virulence targets. However, we also observe that, for several TALEs, no target genes are predicted by any prediction tool; we therefore term these orphan TALEs. We postulate that one explanation for orphan TALEs is incomplete gene annotation and, hence, propose replacing promoterome-wide scans with genome-wide scans for target boxes. We demonstrate that known targets from promoterome-wide scans may be recovered by genome-wide scans, whereas the latter, combined with RNA-seq data, are able to detect putative targets independently of existing gene annotations.
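
The idea of scanning for target boxes in either strand orientation can be illustrated with a minimal Python sketch. This is not the PrediTALE model: the score table below is made up for illustration (it only reflects the commonly cited preferences HD to C, NG to T, NI to A, NN to G or A), each RVD is scored independently, and the sketch ignores dependencies between adjacent RVD positions, frame-shift binding by aberrant repeats, and the nucleotide preceding the box. All function and variable names are hypothetical.

```python
# Toy per-RVD scores (higher = better match). These numbers are
# illustrative assumptions, NOT parameters learned by PrediTALE.
RVD_SCORES = {
    "HD": {"A": -2.0, "C": 1.0, "G": -2.0, "T": -1.5},
    "NG": {"A": -2.0, "C": -1.5, "G": -2.0, "T": 1.0},
    "NI": {"A": 1.0, "C": -1.5, "G": -2.0, "T": -2.0},
    "NN": {"A": 0.5, "C": -2.0, "G": 1.0, "T": -2.0},
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def score_box(rvds, box):
    """Sum independent per-position scores of an RVD list against a box."""
    return sum(RVD_SCORES[rvd][nt] for rvd, nt in zip(rvds, box))


def scan_strand(rvds, seq):
    """Yield (start, score) for every window of len(rvds) in seq."""
    n = len(rvds)
    for start in range(len(seq) - n + 1):
        yield start, score_box(rvds, seq[start:start + n])


def best_hit(rvds, seq):
    """Return (score, start, strand) of the best box on either strand.

    `start` is always given in forward-strand coordinates of `seq`.
    """
    n = len(rvds)
    hits = [(score, start, "+") for start, score in scan_strand(rvds, seq)]
    rc = reverse_complement(seq)
    # Map reverse-strand window starts back to forward-strand coordinates.
    hits += [(score, len(seq) - start - n, "-")
             for start, score in scan_strand(rvds, rc)]
    return max(hits)


if __name__ == "__main__":
    tale = ["HD", "HD", "NG", "NI"]  # hypothetical four-repeat TALE
    print(best_hit(tale, "AAACCTAAAA"))
    print(best_hit(tale, reverse_complement("AAACCTAAAA")))
```

Applying the same scan to a whole genome rather than to annotated promoter regions changes only the input sequence, which is the sense in which a genome-wide scan does not depend on existing gene annotations.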

Keywords

    Computational Biology, Genes, Plant, Genome, Plant, Host Microbial Interactions/genetics, Models, Biological, Oryza/genetics, Plant Diseases/genetics, Tandem Repeat Sequences, Transcription Activator-Like Effectors/genetics, Transcription Initiation Site, Virulence/genetics, Xanthomonas/genetics

Cite this

PrediTALE: A novel model learned from quantitative data allows for new perspectives on TALE targeting. / Erkes, Annett; Mücke, Stefanie; Reschke, Maik et al.
In: PLoS Computational Biology, Vol. 15, No. 7, e1007206, 11.07.2019.

Erkes A, Mücke S, Reschke M, Boch J, Grau J. PrediTALE: A novel model learned from quantitative data allows for new perspectives on TALE targeting. PLoS Computational Biology. 2019 Jul 11;15(7):e1007206. doi: 10.1371/journal.pcbi.1007206, 10.15488/10460
Erkes, Annett ; Mücke, Stefanie ; Reschke, Maik et al. / PrediTALE: A novel model learned from quantitative data allows for new perspectives on TALE targeting. In: PLoS Computational Biology. 2019 ; Vol. 15, No. 7.
@article{d91b46622fbe4da3b8dd9ef671fbd52b,
title = "PrediTALE: A novel model learned from quantitative data allows for new perspectives on TALE targeting",
abstract = "Plant-pathogenic Xanthomonas bacteria secrete transcription activator-like effectors (TALEs) into host cells, where they act as transcriptional activators on plant target genes to support bacterial virulence. TALEs have a unique modular DNA-binding domain composed of tandem repeats. Two amino acids within each tandem repeat, termed repeat-variable diresidues, bind to contiguous nucleotides on the DNA sequence and determine target specificity. In this paper, we propose a novel approach for TALE target prediction to identify potential virulence targets. Our approach accounts for recent findings concerning TALE targeting, including frame-shift binding by repeats of aberrant lengths, and the flexible strand orientation of target boxes relative to the transcription start of the downstream target gene. The computational model can account for dependencies between adjacent RVD positions. Model parameters are learned from the wealth of quantitative data that have been generated over the last years. We benchmark the novel approach, termed PrediTALE, using RNA-seq data after Xanthomonas infection in rice, and find an overall improvement of prediction performance compared with previous approaches. Using PrediTALE, we are able to predict several novel putative virulence targets. However, we also observe that no target genes are predicted by any prediction tool for several TALEs, which we term orphan TALEs for this reason. We postulate that one explanation for orphan TALEs are incomplete gene annotations and, hence, propose to replace promoterome-wide by genome-wide scans for target boxes. We demonstrate that known targets from promoterome-wide scans may be recovered by genome-wide scans, whereas the latter, combined with RNA-seq data, are able to detect putative targets independent of existing gene annotations.",
keywords = "Computational Biology, Genes, Plant, Genome, Plant, Host Microbial Interactions/genetics, Models, Biological, Oryza/genetics, Plant Diseases/genetics, Tandem Repeat Sequences, Transcription Activator-Like Effectors/genetics, Transcription Initiation Site, Virulence/genetics, Xanthomonas/genetics",
author = "Annett Erkes and Stefanie M{\"u}cke and Maik Reschke and Jens Boch and Jan Grau",
note = "Funding: This work was supported by grants from the Deutsche Forschungsgemeinschaft (http://www.dfg.de) (BO 768 1496/8-1 to JB and GR 4587/1-1 to JG) and by the COST actions FA1208",
year = "2019",
month = jul,
day = "11",
doi = "10.1371/journal.pcbi.1007206",
language = "English",
volume = "15",
journal = "PLoS Computational Biology",
issn = "1553-734X",
publisher = "Public Library of Science",
number = "7",

}

TY - JOUR

T1 - PrediTALE: A novel model learned from quantitative data allows for new perspectives on TALE targeting

AU - Erkes, Annett

AU - Mücke, Stefanie

AU - Reschke, Maik

AU - Boch, Jens

AU - Grau, Jan

N1 - Funding: This work was supported by grants from the Deutsche Forschungsgemeinschaft (http://www.dfg.de) (BO 768 1496/8-1 to JB and GR 4587/1-1 to JG) and by the COST actions FA1208

PY - 2019/7/11

Y1 - 2019/7/11

AB - Plant-pathogenic Xanthomonas bacteria secrete transcription activator-like effectors (TALEs) into host cells, where they act as transcriptional activators on plant target genes to support bacterial virulence. TALEs have a unique modular DNA-binding domain composed of tandem repeats. Two amino acids within each tandem repeat, termed repeat-variable diresidues, bind to contiguous nucleotides on the DNA sequence and determine target specificity. In this paper, we propose a novel approach for TALE target prediction to identify potential virulence targets. Our approach accounts for recent findings concerning TALE targeting, including frame-shift binding by repeats of aberrant lengths, and the flexible strand orientation of target boxes relative to the transcription start of the downstream target gene. The computational model can account for dependencies between adjacent RVD positions. Model parameters are learned from the wealth of quantitative data that have been generated over the last years. We benchmark the novel approach, termed PrediTALE, using RNA-seq data after Xanthomonas infection in rice, and find an overall improvement of prediction performance compared with previous approaches. Using PrediTALE, we are able to predict several novel putative virulence targets. However, we also observe that no target genes are predicted by any prediction tool for several TALEs, which we term orphan TALEs for this reason. We postulate that one explanation for orphan TALEs are incomplete gene annotations and, hence, propose to replace promoterome-wide by genome-wide scans for target boxes. We demonstrate that known targets from promoterome-wide scans may be recovered by genome-wide scans, whereas the latter, combined with RNA-seq data, are able to detect putative targets independent of existing gene annotations.

KW - Computational Biology

KW - Genes, Plant

KW - Genome, Plant

KW - Host Microbial Interactions/genetics

KW - Models, Biological

KW - Oryza/genetics

KW - Plant Diseases/genetics

KW - Tandem Repeat Sequences

KW - Transcription Activator-Like Effectors/genetics

KW - Transcription Initiation Site

KW - Virulence/genetics

KW - Xanthomonas/genetics

UR - http://www.scopus.com/inward/record.url?scp=85070485246&partnerID=8YFLogxK

U2 - 10.1371/journal.pcbi.1007206

DO - 10.1371/journal.pcbi.1007206

M3 - Article

C2 - 31295249

AN - SCOPUS:85070485246

VL - 15

JO - PLoS Computational Biology

JF - PLoS Computational Biology

SN - 1553-734X

IS - 7

M1 - e1007206

ER -