Predicting mutagenicity of aromatic amines by various machine learning approaches

Max K. Leong*, Sheng Wen Lin, Hong Bin Chen, Fu Yuan Tsai

*Corresponding author for this work

Research output: Contribution to journal › Journal Article › peer-review

38 Scopus citations

Abstract

Aromatic amines are widely used in a variety of industries and are ubiquitous in foods and the environment. Many compounds in this class are potentially mutagenic or even carcinogenic, and the assessment and prediction of their mutagenicity are of practical importance because mutagenicity and carcinogenicity are toxicological end points that play major roles in the genesis of cancer and tumors. A quantitative structure-activity relationship for a homogeneous set of mutagenicity data (TA98 + S9), comprehensively compiled from the literature, was developed by four machine learning methods, namely hierarchical support vector regression (HSVR), support vector machine, radial basis function neural networks, and genetic function algorithm. The predictions by these models are in good agreement with the experimental observations for the molecules in the training set (n = 97, r² = 0.78-0.93, q² = 0.64-0.93, root mean square error [RMSE] = 0.51-0.90, SD = 0.34-0.56) and the test set (n = 25, r² = 0.73-0.85, RMSE = 0.65-0.85, SD = 0.33-0.51). In addition, several validation criteria were adopted to verify the generated models, and a set of outliers was deliberately selected to examine the robustness of these four predictive models (n = 14, r² = 0.35-0.84, RMSE = 0.55-1.21, SD = 0.25-0.72). Finally, various cross-comparison schemes, namely forward comparisons, backward comparisons, and most common molecule comparisons, were carried out against assorted published predictive models. Our results indicate that the HSVR model is the most accurate, robust, and consistent of the four and can be employed as a tool for predicting the mutagenicity of aromatic amines.
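For readers unfamiliar with the statistics quoted above, the following is a minimal sketch of how r², RMSE, and a leave-one-out q² are commonly computed. It is not the authors' code: the function names are hypothetical, r² is assumed to be the coefficient of determination (some QSAR studies report the squared correlation coefficient instead), q² is assumed to be 1 - PRESS/SS_tot, and a one-descriptor least-squares line stands in for the actual models.

```python
from math import sqrt


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for a real QSAR model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q_squared_loo(xs, ys):
    """Leave-one-out q2: refit without each sample, then predict that sample."""
    mean_y = sum(ys) / len(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    # Made-up descriptor and activity values, for illustration only.
    descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
    activity = [0.9, 2.1, 2.9, 4.2, 4.8, 6.1]
    slope, intercept = fit_line(descriptor, activity)
    fitted = [slope * x + intercept for x in descriptor]
    print("r2  :", round(r_squared(activity, fitted), 3))
    print("RMSE:", round(rmse(activity, fitted), 3))
    print("q2  :", round(q_squared_loo(descriptor, activity), 3))
```

Because each left-out sample is predicted by a model that never saw it, q² computed this way cannot exceed the r² of the full fit for a least-squares model, which is why a q² close to r² is usually read as a sign of a stable model.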

Original language: English
Pages (from-to): 498-513
Number of pages: 16
Journal: Toxicological Sciences
Volume: 116
Issue number: 2
State: Published - 27 May 2010

Keywords

  • Aromatic amines
  • Genetic function algorithm
  • Hierarchical support vector regression
  • Machine learning
  • Mutagenicity
  • Radial basis function neural networks
  • Support vector machine
