MATH 675
High Dimension Statistical Inference with Applications to Genomics
(Fall 2003)
Instructor: Gene Hwang
Meeting Time & Room:
Many statistical concepts are useful in genomics.
One particular problem in genomics (e.g. microarray data analysis) is
that the number of populations, or genes, is large. As a result there is
a huge number of hypotheses. How can hypotheses of this type be tested simultaneously?
We will discuss concepts such as the family-wise error rate, the false discovery
rate (FDR) of Benjamini and Hochberg (1995, JRSS B), and Storey's papers
relating to the pFDR. We will also discuss the cornerstone of
multiple testing, the closed testing method. A shortcut algorithm for it is called
stepdown testing; see Westfall and Young (1993).
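As a rough illustration of the false discovery rate idea, here is a minimal Python sketch of the Benjamini-Hochberg step-up procedure, which controls the FDR at level q when the tests are independent. The function name `benjamini_hochberg`, the default level `q`, and the example p-values are hypothetical choices made for this sketch, not course material.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Sketch of the Benjamini-Hochberg step-up rule: find the largest rank k
    with p_(k) <= k * q / m and reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank whose p-value is under its threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Made-up p-values, e.g. one per gene.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042,
             0.060, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example_p, q=0.05))
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected if a larger p-value meets its threshold.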
What other statistical inferential techniques may be useful
for a large number of populations or genes? The traditional one-population
approach, which assumes that all populations are different, is too inefficient.
It seems interesting to discuss techniques that combine the observations
from all populations, so that populations that are similar
"borrow strength" from each other while populations that
are very different go their separate ways. In fact, shrinkage (or empirical
Bayes) techniques, or equivalently the BLUP in a mixed model, can do this.
So the course will spend some time on these techniques, covering both
point estimation and confidence interval construction.
A new approach, called the selected mean approach, looks promising
and will also be discussed.
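To make the "borrowing strength" idea concrete, here is a minimal Python sketch of one standard shrinkage rule: a positive-part James-Stein type estimator that pulls each observed mean toward the grand mean. It assumes one normal observation per population with a common, known variance `sigma2`. The function name and the example data are hypothetical, and this is not necessarily the estimator used in the course.

```python
def shrink_toward_grand_mean(x, sigma2):
    """Positive-part James-Stein type estimates of the population means.

    Assumes x[i] ~ N(theta_i, sigma2) independently, with sigma2 known.
    Estimate: xbar + max(0, 1 - (p - 3) * sigma2 / S) * (x[i] - xbar),
    where S is the sum of squared deviations from the grand mean xbar.
    """
    p = len(x)
    if p < 4:
        raise ValueError("this rule needs at least 4 populations")

    xbar = sum(x) / p
    s = sum((xi - xbar) ** 2 for xi in x)
    if s == 0:
        # All observations coincide, so there is nothing to shrink.
        return list(x)

    # Near 0 when the observations are similar (strong pooling),
    # near 1 when they are widely spread (almost no pooling).
    factor = max(0.0, 1.0 - (p - 3) * sigma2 / s)
    return [xbar + factor * (xi - xbar) for xi in x]


# Similar populations: estimates collapse toward the grand mean.
print(shrink_toward_grand_mean([1.0, 1.2, 0.8, 1.1, 0.9], sigma2=1.0))
# Very different populations: estimates stay close to the raw values.
print(shrink_toward_grand_mean([-10.0, -5.0, 0.0, 5.0, 10.0], sigma2=1.0))
```

The shrinkage factor is estimated from the data themselves, which is what gives the method its empirical Bayes flavor.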
Other topics include permutation tests, bootstrapping, and
QTL identification.
Last modified: August 13, 2003