Estimating the proportion of true null hypotheses, pi0, has become a common topic in the recent statistical literature. Many existing methods estimate the proportion of true null p-values; however, most assume independence among genes, which often does not hold. In simulations, these methods often produce estimates with high variance when the test statistics are dependent. In this thesis, we propose three data-driven methods for estimating pi0 that use the distribution pattern of the observed p-values to take into account dependence among test statistics. Specifically, we use a linear fit to estimate the proportion of true null p-values on (lambda, 1] over the whole range [0, 1], rather than using the expected proportion at 1 - lambda. Our simulations show that the proposed estimators significantly decrease the variance under the dependence assumption and perform favorably when compared to estimates produced by other methods.