Parameter determination and feature selection for C4.5 algorithm using scatter search approach

Shih Wei Lin*, Shih Chieh Chen

*Corresponding author for this work

Research output: Contribution to journal > Journal Article > peer-review

39 Scopus citations

Abstract

The C4.5 decision tree (DT) can be applied in various fields and discovers knowledge for human […]. Datasets can contain numerous features, but not all features are beneficial for classification with the C4.5 algorithm. Therefore, a novel scatter search-based approach (SS + DT) is proposed to acquire optimal parameter settings and to select the beneficial subset of features that yields better classification results. The proposed SS + DT approach is evaluated on datasets from the UCI (University of California, Irvine) Machine Learning Repository. Experimental results demonstrate that the parameter settings for the C4.5 algorithm obtained by the SS + DT approach are better than those obtained by other approaches. When feature selection is considered, classification accuracy rates on most datasets increase. The proposed approach can therefore be used to effectively identify both the best parameter settings for the C4.5 algorithm and useful features.
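The general idea of searching jointly over parameter settings and a feature subset can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: the solution encoding `(cf, m, mask)`, where `cf` stands for a pruning confidence factor and `m` for a minimum number of instances per leaf; the value ranges; the reference set sizes; and the `toy_fitness` function. In a real wrapper approach the fitness would be the cross-validated accuracy of a C4.5 tree trained with those settings on the selected features. The loop is a simplified scatter search that omits the local improvement step.

```python
import random
from itertools import combinations

# Hypothetical encoding: (cf, m, mask). Assumed ranges: cf in [0.01, 0.5],
# m in 1..50, mask = one 0/1 flag per feature (1 = feature is kept).

def repair(mask, rng):
    """Ensure at least one feature is selected."""
    if any(mask):
        return tuple(mask)
    fixed = list(mask)
    fixed[rng.randrange(len(fixed))] = 1
    return tuple(fixed)

def random_solution(n_features, rng):
    mask = repair([rng.randint(0, 1) for _ in range(n_features)], rng)
    return (rng.uniform(0.01, 0.5), rng.randint(1, 50), mask)

def distance(a, b):
    """Scaled parameter distance plus Hamming distance between masks."""
    hamming = sum(x != y for x, y in zip(a[2], b[2]))
    return abs(a[0] - b[0]) / 0.5 + abs(a[1] - b[1]) / 50 + hamming

def combine(a, b, rng):
    """Convex combination of the parameters, uniform crossover of the masks."""
    w = rng.random()
    cf = w * a[0] + (1 - w) * b[0]
    m = round(w * a[1] + (1 - w) * b[1])
    mask = repair([rng.choice((x, y)) for x, y in zip(a[2], b[2])], rng)
    return (cf, m, mask)

def scatter_search(fitness, n_features, pop_size=30, b1=5, b2=5,
                   iterations=20, seed=0):
    rng = random.Random(seed)
    # Diversification: start from a random pool of candidate solutions.
    pool = [random_solution(n_features, rng) for _ in range(pop_size)]
    pool.sort(key=fitness, reverse=True)
    # Reference set: the b1 best solutions plus the b2 most distant ones.
    refset, rest = pool[:b1], pool[b1:]
    for _ in range(b2):
        far = max(rest, key=lambda s: min(distance(s, r) for r in refset))
        refset.append(far)
        rest.remove(far)
    for _ in range(iterations):
        # Subset generation (all pairs) followed by solution combination.
        trials = [combine(a, b, rng) for a, b in combinations(refset, 2)]
        # Quality-based update: keep the best distinct solutions seen so far.
        merged = list(dict.fromkeys(refset + trials))
        new_ref = sorted(merged, key=fitness, reverse=True)[:b1 + b2]
        if set(new_ref) == set(refset):
            break  # no new solution entered the reference set
        refset = new_ref
    return max(refset, key=fitness)

def toy_fitness(sol):
    """Stand-in objective (assumption): higher is better, with an optimum at
    cf=0.25, m=2 and an arbitrary target mask. A real wrapper would return
    cross-validated classification accuracy instead."""
    cf, m, mask = sol
    target = (1, 0, 1, 0, 0, 1)
    mismatches = sum(x != y for x, y in zip(mask, target))
    return -abs(cf - 0.25) - abs(m - 2) / 50 - mismatches

best = scatter_search(toy_fitness, n_features=6)
print(best, toy_fitness(best))
```

Because each real fitness evaluation would train and validate a decision tree, a practical version would likely cache fitness values rather than recompute them on every sort, as this sketch does.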

Original language: English
Pages (from-to): 63-75
Number of pages: 13
Journal: Soft Computing
Volume: 16
Issue number: 1
State: Published - Jan 2012
Externally published: Yes

Keywords

  • C4.5
  • Decision tree
  • Feature selection
  • Optimization
  • Scatter search
