MATH 675:
Statistical Theory Applicable to Genomics (Fall 2006)
Instructor:
J. T. Gene Hwang
Meeting
Time & Room
Many statistical concepts are useful in genomics. One particular problem
in genomics (e.g., microarray data analysis) is that the number of
populations, or genes, is large. As a result, there are a huge number of
hypotheses. How can these hypotheses be tested simultaneously? We will
discuss concepts such as the family-wise error rate, the false discovery
rate (FDR) of Benjamini and Hochberg (1995, JRSS B), and Storey's papers
relating to pFDR. We will also discuss the fundamental cornerstone of
multiple testing, the closed testing method, and its shortcut algorithm,
stepdown testing. See Westfall and Young (1993).
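
To make the difference between the two error rates concrete, here is a minimal Python sketch (not course material). It assumes a list of p-values from independent tests. The function names `benjamini_hochberg` and `holm` are illustrative. Holm's procedure is used as a simple example of a stepdown method (it is the shortcut of closed testing with Bonferroni local tests); it is not the resampling-based method of Westfall and Young.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Step-up FDR procedure: return True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def holm(p_values, alpha=0.05):
    """Stepdown FWER procedure: return True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    # Start from the smallest p-value and stop at the first non-rejection.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha / (m - rank + 1):
            rejected[idx] = True
        else:
            break
    return rejected


# Made-up p-values, for illustration only.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(benjamini_hochberg(p)), sum(holm(p)))
```

On these made-up p-values the FDR procedure rejects one more hypothesis than the stepdown FWER procedure, which reflects the weaker error criterion it controls.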
What other statistical inferential techniques may be useful for a large
number of populations or genes? The traditional one-population approach,
which assumes that all populations are different, is too inefficient. It
is interesting and important to have techniques that combine the
observations from all populations, so that when the populations are
similar they "borrow strength" from each other, and when the populations
are very different they go their separate ways. In fact, the shrinkage
(or empirical Bayes) technique, or equivalently the BLUP in mixed models,
can do this, so the course will spend some time discussing these
techniques. We will discuss point estimation and confidence interval
construction. A new approach, called the selected mean approach, appears
promising and will also be discussed.
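
As an illustration of "borrowing strength", the following Python sketch (not course material) implements one classical shrinkage estimator: the positive-part James-Stein estimator that shrinks toward the grand mean. It assumes one observation per population, each normally distributed with a known common variance `sigma2`. The function name `shrink_toward_grand_mean` is illustrative, and this is not the selected mean approach mentioned above.

```python
def shrink_toward_grand_mean(x, sigma2):
    """Positive-part James-Stein estimates, shrinking toward the grand mean.

    Assumes x[i] ~ N(theta_i, sigma2) independently, with sigma2 known.
    """
    k = len(x)
    if k < 4:
        raise ValueError("this estimator needs at least 4 populations")
    grand_mean = sum(x) / k
    # Spread of the observed means around the grand mean.
    s = sum((xi - grand_mean) ** 2 for xi in x)
    if s == 0.0:
        return [grand_mean] * k
    # Small spread relative to sigma2 gives a factor near 0 (heavy pooling);
    # large spread gives a factor near 1 (little pooling).
    factor = max(0.0, 1.0 - (k - 3) * sigma2 / s)
    return [grand_mean + factor * (xi - grand_mean) for xi in x]


# Made-up data, for illustration only.
similar = shrink_toward_grand_mean([1.0, 1.1, 0.9, 1.0, 1.0], sigma2=1.0)
different = shrink_toward_grand_mean([-10.0, -5.0, 0.0, 5.0, 10.0], sigma2=1.0)
print(similar)
print(different)
```

With the similar observations, every estimate is pulled all the way to the grand mean; with the widely separated observations, the estimates stay close to the raw values.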
Other topics may include permutation tests and QTL identification, if
time allows. This course is mainly about (mathematical) statistical
theory, and hence in many lectures the focus will be on proving theorems.
It is recommended that you have taken some statistics courses, such as
OR&IE 670 or MATH 674.
Last modified:
April 25, 2006