Abstract
The human genome consists of 98.5% non-coding DNA sequences, most of which have no known function, yet a majority of disease-associated variants lie in these regions. Predicting the function of non-coding DNA is therefore critical. We propose NCNet, which integrates deep residual learning and sequence-to-sequence learning networks to predict transcription factor (TF) binding sites, which can in turn be used to predict non-coding function. In NCNet, deep residual learning networks enhance the identification rate of regulatory motif patterns, so that the sequence-to-sequence learning network can make the most of the sequential dependencies between those patterns. With the identity shortcut technique and deep network architectures, NCNet achieves a significant improvement over the original hybrid model in identifying regulatory markers.
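As a rough illustration of the identity shortcut mentioned above, the following sketch one-hot encodes a DNA string and passes it through a single residual block, y = ReLU(x + F(x)), where F is two small 1D convolutions. It is not the NCNet implementation: the function names, kernel width, channel count, and random weights are all assumptions chosen for brevity, and the sequence-to-sequence (LSTM) stage is not shown.

```python
# Illustrative sketch only: not the NCNet implementation.
# All names, sizes, and weights below are hypothetical.
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]

def conv1d_same(x, kernels):
    """1D convolution, zero-padded so output length equals input length.

    x: list of L rows, each with C_in channels.
    kernels: list of C_out kernels, each a list of K rows of C_in weights
             (K is assumed to be odd).
    """
    length, k = len(x), len(kernels[0])
    pad = k // 2
    out = []
    for pos in range(length):
        row = []
        for kernel in kernels:
            total = 0.0
            for offset in range(k):
                idx = pos + offset - pad
                if 0 <= idx < length:  # positions outside the sequence count as zero
                    total += sum(w * v for w, v in zip(kernel[offset], x[idx]))
            row.append(total)
        out.append(row)
    return out

def relu(x):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in x]

def residual_block(x, kernels_a, kernels_b):
    """Return relu(x + F(x)), where F is conv -> relu -> conv.

    The addition of x is the identity shortcut; it requires F(x) to have
    the same shape as x, so C_out must equal C_in here.
    """
    fx = conv1d_same(relu(conv1d_same(x, kernels_a)), kernels_b)
    return relu([[a + b for a, b in zip(rx, rf)] for rx, rf in zip(x, fx)])

def random_kernels(rng, c_out, k, c_in):
    """Small random weights, standing in for learned parameters."""
    return [[[rng.uniform(-0.1, 0.1) for _ in range(c_in)]
             for _ in range(k)] for _ in range(c_out)]

rng = random.Random(0)
x = one_hot("ACGTTGCA")
features = residual_block(x,
                          random_kernels(rng, 4, 3, 4),
                          random_kernels(rng, 4, 3, 4))
```

Because the shortcut adds the input back unchanged, a block whose convolution weights are all zero simply passes its (non-negative) input through, which is one common intuition for why stacking many such blocks is easier to train than stacking plain convolutions.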
Original language | English |
---|---|
Article number | 432 |
Journal | Frontiers in Genetics |
Volume | 10 |
Issue number | MAY |
DOIs | |
State | Published - 2019 |
Bibliographical note
Publisher Copyright: Copyright © 2019 Zhang, Hung, Liu, Hu and Lin.
Keywords
- Deep learning
- LSTM
- Non-coding DNA
- Residual learning
- Sequence to sequence learning