TY - JOUR
T1 - ARGNet
T2 - using deep neural networks for robust identification and classification of antibiotic resistance genes from sequences
AU - Pei, Yao
AU - Shum, Marcus Ho Hin
AU - Liao, Yunshi
AU - Leung, Vivian W.
AU - Gong, Yu Nong
AU - Smith, David K.
AU - Yin, Xiaole
AU - Guan, Yi
AU - Luo, Ruibang
AU - Zhang, Tong
AU - Lam, Tommy Tsan Yuk
N1 - © 2024. The Author(s).
PY - 2024/5/9
Y1 - 2024/5/9
N2 - BACKGROUND: Emergence of antibiotic resistance in bacteria is an important threat to global health. Antibiotic resistance genes (ARGs) are some of the key components to define bacterial resistance and their spread in different environments. Identification of ARGs, particularly from high-throughput sequencing data of the specimens, is the state-of-the-art method for comprehensively monitoring their spread and evolution. Current computational methods to identify ARGs mainly rely on alignment-based sequence similarities with known ARGs. Such approaches are limited by choice of reference databases and may potentially miss novel ARGs. The similarity thresholds are usually simple and could not accommodate variations across different gene families and regions. It is also difficult to scale up when sequence data are increasing. RESULTS: In this study, we developed ARGNet, a deep neural network that incorporates an unsupervised learning autoencoder model to identify ARGs and a multiclass classification convolutional neural network to classify ARGs that do not depend on sequence alignment. This approach enables a more efficient discovery of both known and novel ARGs. ARGNet accepts both amino acid and nucleotide sequences of variable lengths, from partial (30-50 aa; 100-150 nt) sequences to full-length protein or genes, allowing its application in both target sequencing and metagenomic sequencing. Our performance evaluation showed that ARGNet outperformed other deep learning models including DeepARG and HMD-ARG in most of the application scenarios especially quasi-negative test and the analysis of prediction consistency with phylogenetic tree. ARGNet has a reduced inference runtime by up to 57% relative to DeepARG. CONCLUSIONS: ARGNet is flexible, efficient, and accurate at predicting a broad range of ARGs from the sequencing data. ARGNet is freely available at https://github.com/id-bioinfo/ARGNet, with an online service provided at https://ARGNet.hku.hk.
AB - BACKGROUND: Emergence of antibiotic resistance in bacteria is an important threat to global health. Antibiotic resistance genes (ARGs) are some of the key components to define bacterial resistance and their spread in different environments. Identification of ARGs, particularly from high-throughput sequencing data of the specimens, is the state-of-the-art method for comprehensively monitoring their spread and evolution. Current computational methods to identify ARGs mainly rely on alignment-based sequence similarities with known ARGs. Such approaches are limited by choice of reference databases and may potentially miss novel ARGs. The similarity thresholds are usually simple and could not accommodate variations across different gene families and regions. It is also difficult to scale up when sequence data are increasing. RESULTS: In this study, we developed ARGNet, a deep neural network that incorporates an unsupervised learning autoencoder model to identify ARGs and a multiclass classification convolutional neural network to classify ARGs that do not depend on sequence alignment. This approach enables a more efficient discovery of both known and novel ARGs. ARGNet accepts both amino acid and nucleotide sequences of variable lengths, from partial (30-50 aa; 100-150 nt) sequences to full-length protein or genes, allowing its application in both target sequencing and metagenomic sequencing. Our performance evaluation showed that ARGNet outperformed other deep learning models including DeepARG and HMD-ARG in most of the application scenarios especially quasi-negative test and the analysis of prediction consistency with phylogenetic tree. ARGNet has a reduced inference runtime by up to 57% relative to DeepARG. CONCLUSIONS: ARGNet is flexible, efficient, and accurate at predicting a broad range of ARGs from the sequencing data. ARGNet is freely available at https://github.com/id-bioinfo/ARGNet, with an online service provided at https://ARGNet.hku.hk.
KW - Antibiotic resistance
KW - Antibiotic resistance genes
KW - ARGNet
KW - Autoencoder
KW - Deep learning
KW - Multiclass classification convolutional neural network
KW - Neural Networks, Computer
KW - Bacteria/genetics
KW - Humans
KW - Computational Biology/methods
KW - Anti-Bacterial Agents/pharmacology
KW - Genes, Bacterial/genetics
KW - Deep Learning
KW - High-Throughput Nucleotide Sequencing/methods
KW - Drug Resistance, Bacterial/genetics
KW - Drug Resistance, Microbial/genetics
UR - http://www.scopus.com/inward/record.url?scp=85192569064&partnerID=8YFLogxK
U2 - 10.1186/s40168-024-01805-0
DO - 10.1186/s40168-024-01805-0
M3 - Article
C2 - 38725076
AN - SCOPUS:85192569064
SN - 2049-2618
VL - 12
SP - 84
JO - Microbiome
JF - Microbiome
IS - 1
M1 - 84
ER -