Abstract
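As standards for medical research grow more stringent, the level of statistical analysis required rises as well. Univariate analysis alone is often insufficient to reveal the more complex meaning within the data, and multivariate methods are needed. When performing multivariate analysis, many researchers include only the independent variables that reached statistical significance in univariate analysis. This approach has a major drawback: some variables that are not significant in univariate analysis may become significant in multivariate analysis. This article explains four situations in which this can happen: (1) sample sizes differ across combinations of the independent variables; (2) some values are missing, so the univariate and multivariate analyses are not based on exactly the same sample; (3) the within-group variation in the data is too large; and (4) interaction is present. So that readers can test the results themselves, the article describes the entire analysis in detail, and the raw data are available to readers on request. Although only the log-rank test and Cox regression are used as examples, the concepts extend to other multivariate methods such as logistic regression and multiple linear regression. Finally, researchers are advised to handle statistical analysis with care and to examine the data from several angles, so as not to waste hard-won, valuable data.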
Perhaps as a result of higher research standards and advances in computer technology, the amount and level of statistical analysis required by medical journals have become more and more demanding. Researchers now realize that univariate analysis alone may not be sufficient, especially for complex data sets. Additional, and sometimes even contradictory, results may be found using multivariate analysis. During data analysis, a common practice is to include in multivariate analysis only those variables that are statistically significant in univariate analysis. This practice is risky, as some variables not significant in univariate analysis may become significant in multivariate analysis. In this study, we identify, with examples, four possible scenarios in which this can occur: (1) the effect of unbalanced sample size; (2) the influence of missing data; (3) an extremely large within-group variation relative to between-group variation; and (4) the presence of interaction. In addition to detailed analysis steps, the raw data set is also available for readers to verify all the results presented. Although we used only the log-rank test and Cox regression for illustration, the underlying concepts can be applied to other multivariate procedures such as logistic regression and multiple linear regression.
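Scenario (4), the presence of interaction, can be illustrated with a minimal sketch. The data below are hypothetical and are not taken from the study, and the sketch compares plain group means rather than running a log-rank test or Cox regression. It only shows how a variable `a` can have no overall effect when examined alone, yet a clear effect once a second variable `b` is taken into account.

```python
from statistics import mean

# Hypothetical toy data (not from the study): (a, b, outcome).
# The effect of `a` reverses direction between the two levels of `b`.
rows = [
    (0, 0, 9.0), (0, 0, 11.0),
    (1, 0, 19.0), (1, 0, 21.0),
    (0, 1, 19.0), (0, 1, 21.0),
    (1, 1, 9.0), (1, 1, 11.0),
]


def effect_of_a(data):
    """Mean outcome difference (a == 1 minus a == 0) within the given rows."""
    with_a = [y for a, _, y in data if a == 1]
    without_a = [y for a, _, y in data if a == 0]
    return mean(with_a) - mean(without_a)


# Looking at `a` alone, ignoring `b`: the opposite effects cancel out.
marginal = effect_of_a(rows)

# Looking at `a` within each level of `b`: the effect is clearly present.
within_b0 = effect_of_a([r for r in rows if r[1] == 0])
within_b1 = effect_of_a([r for r in rows if r[1] == 1])

print(marginal, within_b0, within_b1)
```

In this constructed example, screening `a` by itself would suggest dropping it, even though it matters strongly within each level of `b`. A real analysis would fit a model with an interaction term instead of comparing raw means.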
| Original language | Chinese (Traditional) |
|---|---|
| Pages (from-to) | 95-101 |
| Journal | 長庚醫學 |
| Volume | 18 |
| Issue number | 2 |
| State | Published - 1995 |