SNP's in the Human Genome

Rick Durrett and Vlada Limic

Abstract. Single nucleotide polymorphisms (SNPs) are single nucleotides (i.e., the A's, T's, G's, C's that make up the genome) that are polymorphic, i.e., the most common allele has frequency less than 99%. They are useful markers for locating genes since they occur throughout the human genome and thousands can be scored at once using DNA microarrays. Here we use branching processes and coalescent theory to show that if one uses Kruglyak's (1999) model of the growth of the human population and assumes an average mutation rate of 1 x 10^{-8} per nucleotide per generation, then there are about 2.8 million SNPs in the human genome, or one every 529 base pairs. We also obtain results for the number of SNPs that will be found in samples. When n = 5, which roughly corresponds to Celera's sequencing of the human genome, an average of 3.1 million nucleotides will be variable in the sample. However, only about 70% of these cases, or about 2.3 million, will be polymorphic. This is very close to the 2.4 million Celera claims to have found.
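To give a feel for the kind of coalescent calculation involved, here is a minimal sketch of the classical expected number of variable sites in a sample of n chromosomes under a constant-size population (Watterson's formula). This is not the Kruglyak growth model used in the paper, so it will not reproduce the figures quoted in the abstract. The effective population size (10,000) and genome length (3 x 10^9 nucleotides) are illustrative assumptions that do not come from the abstract; only the mutation rate and n = 5 are taken from it.

```python
from fractions import Fraction


def harmonic_sum(n):
    """Return 1 + 1/2 + ... + 1/(n-1) as an exact fraction."""
    return sum((Fraction(1, i) for i in range(1, n)), Fraction(0))


def expected_variable_sites(n, mu, effective_size, genome_length):
    """Expected number of variable sites in a sample of n chromosomes.

    Constant-size diploid population, infinite-sites mutation:
    E[S] = L * theta * sum_{i=1}^{n-1} 1/i, with theta = 4 * N_e * mu per site.
    """
    theta_per_site = 4 * effective_size * mu
    return genome_length * theta_per_site * float(harmonic_sum(n))


if __name__ == "__main__":
    # mu and n come from the abstract; the other two values are assumptions.
    mu = 1e-8                 # mutations per nucleotide per generation
    n = 5                     # sample size
    effective_size = 10_000   # ASSUMED effective population size
    genome_length = 3e9       # ASSUMED genome length in nucleotides
    expected = expected_variable_sites(n, mu, effective_size, genome_length)
    print(f"Expected variable sites (constant-size sketch): {expected:,.0f}")
```

Under these assumed values the constant-size formula gives roughly 2.5 million variable sites. The paper's own counts depend on the population growth model and on the 99% frequency cutoff that separates polymorphic sites from merely variable ones, neither of which this sketch captures.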


