Monday, May 26, 2008

False Discovery Rate (FDR) Reliability Evaluation

In analyses involving multiple tests, we can control the false discovery rate (FDR) with several approaches, such as the Benjamini-Hochberg adjusted p value (BH adj.p value) and the Storey-Tibshirani q value (q value). Either one can be computed from the usual t-test or from the moderated t-test obtained after empirical Bayes adjustment.
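To make the BH adj.p value concrete, here is a minimal Python sketch of the standard Benjamini-Hochberg step-up adjustment. The function name bh_adjust is only illustrative, and this is not the code used for the simulation in this post.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p values, in the input order."""
    m = len(pvalues)
    # Indices of the p values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down to the smallest, so that the
    # adjusted values stay monotone and never exceed 1.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

A gene is then called differentially expressed when its adjusted p value is at or below alpha.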

The question of interest is which method gives better control of the FDR. That is, when we want to control the FDR at level alpha, which method is more reliable, in the sense that the realized FDR does not deviate far from alpha?

To study this problem, we simulated expression for 14,118 genes under two treatments, where the first n genes (n < 14,118) were designed to be truly differentially expressed. We then applied three methods for identifying differentially expressed genes: (1) q value based on the usual t-test, (2) BH adj.p value based on the usual t-test, and (3) q value based on the moderated t-test. The purpose is to examine the precision and accuracy with which these approaches control the FDR. One of the several results obtained looks like this (100 simulations of 14,118 gene expression values under 2 treatments, with the first 3,000 genes truly differentially expressed and the alpha level set at 0.05).
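Because the truly differentially expressed genes are known in a simulation, the realized FDR of each run can be computed directly. The following Python sketch shows one way to do that and to summarize the runs by mean and variance. It assumes that a gene is called when its q value or adjusted p value is at or below alpha, and that the sample variance is used. The post does not state either detail, and the function names are hypothetical.

```python
from statistics import mean, variance


def realized_fdp(adjusted, n_true, alpha=0.05):
    """False discovery proportion for one simulated data set.

    adjusted: q values or adjusted p values, one per gene.
    n_true:   the first n_true genes are the truly differentially
              expressed ones, as in the simulation design above.
    """
    called = [i for i, value in enumerate(adjusted) if value <= alpha]
    if not called:
        # No discoveries, so no false discoveries.
        return 0.0
    false_calls = sum(1 for i in called if i >= n_true)
    return false_calls / len(called)


def summarize(fdps):
    """Mean and sample variance of the realized FDP across simulations."""
    return mean(fdps), variance(fdps)


if __name__ == "__main__":
    # Toy input: 5 genes, the first 3 truly differentially expressed.
    print(realized_fdp([0.01, 0.02, 0.03, 0.2, 0.04], n_true=3))
```

In the toy call, four genes are called and one of them (the last) is a false positive, so the realized proportion is 1/4.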
Summary statistics are listed as follows:

(1) q value based on usual t statistics

mean.q :0.04565834

var.q :6.964762e-05

(2) BH adj.p value based on usual t statistics

mean.bh.p:0.03763231

var.bh.p:6.955762e-05

(3) q value based on moderated t statistics

mean.ebayes.q:0.04702102

var.ebayes.q:3.80829e-05

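The above statistics suggest that the q value based on the empirical Bayes moderated t-test gives the best accuracy and precision in this setting: its mean (0.0470) is the closest to the target alpha of 0.05, and its variance (3.8e-05) is the smallest of the three. The BH adj.p value is the most conservative, with a mean of 0.0376. Similar simulations can be repeated many times to compare and evaluate these FDR control approaches.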
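The comparison can be checked with a few lines of Python. This sketch uses only the summary statistics reported above. It takes accuracy to mean the absolute distance of the mean from alpha and precision to mean the variance, which is one reasonable reading of those terms rather than a definition given in this post.

```python
ALPHA = 0.05

# (mean, variance) pairs copied from the summary statistics above.
results = {
    "q value, usual t": (0.04565834, 6.964762e-05),
    "BH adj.p value, usual t": (0.03763231, 6.955762e-05),
    "q value, moderated t": (0.04702102, 3.80829e-05),
}

for name, (avg, var) in results.items():
    print(f"{name}: |mean - alpha| = {abs(avg - ALPHA):.8f}, variance = {var:.3e}")

# Method whose mean is closest to alpha, and method with the smallest variance.
closest = min(results, key=lambda k: abs(results[k][0] - ALPHA))
tightest = min(results, key=lambda k: results[k][1])
print("closest to alpha:", closest)
print("smallest variance:", tightest)
```

Both criteria pick the q value based on the moderated t statistics.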
