
Comparison of Feature Selection Methods for Cross-Laboratory Microarray Analysis

  • Hsi Che Liu
  • Pei Chen Peng
  • Tzung Chien Hsieh
  • Ting Chi Yeh
  • Chih Jen Lin
  • Chien Yu Chen
  • Jen Yin Hou
  • Lee Yung Shih
  • Der Cherng Liang
  • Mackay Memorial Hospital Taiwan
  • National Taiwan University
  • Chang Gung Memorial Hospital
  • Chang Gung University

Research output: Contribution to journal, Article, peer-reviewed

17 Citations (Scopus)

Abstract

The volume of microarray gene expression data has grown exponentially. To use these data in large-scale studies, integrated analysis of cross-laboratory (cross-lab) data has become a trend, which makes choosing an appropriate feature selection method essential. This paper focuses on feature selection for Affymetrix (Affy) microarray studies across different labs. We investigate four feature selection methods: t-test, significance analysis of microarrays (SAM), rank products (RP), and random forest (RF). The four methods are applied to Affy data for acute lymphoblastic leukemia, acute myeloid leukemia, breast cancer, and lung cancer, each consisting of three cross-lab data sets. We use a rank-based normalization method to reduce bias across the cross-lab data sets. We consider both training on one data set and training on two combined data sets, testing on the remaining data set(s) in each case. Balanced accuracy is used to evaluate prediction. This study provides comprehensive comparisons of the four feature selection methods in cross-lab microarray analysis. Results show that SAM has the best classification performance. RF also achieves high classification accuracy, but it is not as stable as SAM. The t-test is the most naive method, and its performance is the worst among the four. We further discuss the influence of the number of training samples, the number of selected genes, and the issue of unbalanced data sets.
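As a rough illustration of three ingredients named in the abstract (rank-based normalization, t-test feature selection, and balanced accuracy), the following Python sketch may help. It is not the authors' implementation: the abstract does not specify the normalization details or the t-test variant, so the within-sample ranking with averaged ties, the Welch form of the t statistic, and all function names (`rank_normalize`, `welch_t`, `top_genes_by_t`, `balanced_accuracy`) are assumptions made for this example. Balanced accuracy is taken here in its common binary form, the mean of sensitivity and specificity.

```python
from math import sqrt, inf
from statistics import mean, variance


def rank_normalize(sample):
    """Assumed normalization: replace each value in one sample with its
    rank among that sample's genes (1 = lowest; ties share the average rank)."""
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    ranks = [0.0] * len(sample)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and sample[order[j + 1]] == sample[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def welch_t(a, b):
    """Welch's t statistic for two groups (each needs at least 2 values)."""
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Zero variance in both groups: no spread to scale the difference by.
        return 0.0 if diff == 0 else (inf if diff > 0 else -inf)
    return diff / se


def top_genes_by_t(samples, labels, k):
    """Return indices of the k genes with the largest |t| between classes 1 and 0."""
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        pos = [s[g] for s, y in zip(samples, labels) if y == 1]
        neg = [s[g] for s, y in zip(samples, labels) if y == 0]
        scores.append(abs(welch_t(pos, neg)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)
```

On an unbalanced test set with four positives and one negative, a classifier that always predicts the positive class has a plain accuracy of 0.8 but a balanced accuracy of 0.5, which suggests why a class-balanced metric suits the unbalanced data sets the abstract mentions.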

Original language: English
Article number: 6531614
Pages (from-to): 593-604
Number of pages: 12
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 10
Issue number: 3
DOIs
Publication status: Published - 01 05 2013
Externally published

Bibliographical note

Publisher Copyright:
© 2013 IEEE.

UN SDG

This research output contributes to the following Sustainable Development Goals:

  1. SDG 3 Good Health and Well-being
