Genomic splice site prediction algorithm based on nucleotide sequence pattern for RNA viruses

Kun Nan Tsai, Shu Hung Lin, Shin Ru Shih, Jhih Siang Lai, Chung Ming Chen*

*Corresponding author for this work

Research output: Contribution to journal › Journal Article › peer-review

8 Scopus citations

Abstract

Splice site prediction for RNA viruses faces two difficulties that seriously degrade the performance of most conventional splice site predictors. One is the limited number of strains available for a virus species; the other is the diversity of sequence patterns around the splice sites caused by the high mutation frequency. To overcome these two difficulties, a new algorithm, called the Genomic Splice Site Prediction (GSSP) algorithm, was proposed for splice site prediction of RNA viruses. The key idea of the GSSP algorithm was to characterize the interdependency among the nucleotides and base positions based on eigen-patterns. Identified by a sequence pattern mining technique, each eigen-pattern specified a unique composition of base positions and the nucleotides occurring at those positions. To remedy the problem of insufficient training data due to the limited number of strains for an RNA virus, a cross-species strategy was employed in this study. The GSSP algorithm was shown to be effective and superior to two conventional methods in predicting the splice sites of five RNA virus species in the Orthomyxoviridae family. The sensitivity and specificity achieved by the GSSP algorithm were higher than 99% and 94%, respectively, for the donor sites, and higher than 96% and 92%, respectively, for the acceptor sites. Supplementary data associated with this work are freely available for academic use at http://homepage.ntu.edu.tw/~d91548013/.
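The abstract does not give the mining procedure or the scoring rule used by GSSP, so the following is only an illustrative sketch, not the authors' method. It shows what it means for a pattern to specify base positions together with the nucleotides at those positions, and how sensitivity and specificity are computed from predictions. The `Pattern` type, the `matches` helper, the toy pattern, and the toy windows are all hypothetical and exist only for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical representation (not from the paper): a pattern maps
# zero-based offsets within a sequence window to the required nucleotide.
Pattern = Dict[int, str]


def matches(window: str, pattern: Pattern) -> bool:
    """Return True if the window has every nucleotide the pattern requires."""
    return all(
        0 <= pos < len(window) and window[pos] == nt
        for pos, nt in pattern.items()
    )


def sensitivity_specificity(
    labels: List[bool], predictions: List[bool]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for l, p in zip(labels, predictions) if l and p)
    fn = sum(1 for l, p in zip(labels, predictions) if l and not p)
    tn = sum(1 for l, p in zip(labels, predictions) if not l and not p)
    fp = sum(1 for l, p in zip(labels, predictions) if not l and p)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data, invented for illustration: windows paired with a label saying
# whether the window is centred on a true splice site.
toy_pattern: Pattern = {3: "G", 4: "T"}
toy_windows = [
    ("CAGGTAAG", True),
    ("AAGGTGAG", True),
    ("CAGGCAAG", True),   # true site that the toy pattern misses
    ("TTTACGTA", False),
    ("ACCGTTTA", False),  # non-site that the toy pattern matches
]

labels = [is_site for _, is_site in toy_windows]
predictions = [matches(seq, toy_pattern) for seq, _ in toy_windows]
sens, spec = sensitivity_specificity(labels, predictions)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A single fixed pattern like the one above is exactly the kind of rigid rule that diversified sequences around splice sites would defeat; the abstract's point is that GSSP instead relies on a set of mined eigen-patterns capturing interdependencies among positions.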

Original language: English
Pages (from-to): 171-175
Number of pages: 5
Journal: Computational Biology and Chemistry
Volume: 33
Issue number: 2
State: Published - April 2009

Keywords

  • Cross-species strategy
  • Eigen-pattern
  • Orthomyxovirus
  • RNA virus
  • Splice site prediction
