We derive the exact one-step transition probabilities of the number of lineages that are ancestral to a random sample from the current generation of a bi-parental population evolving under the discrete Wright-Fisher model with $n$ diploid individuals. Our model allows for a per-generation recombination probability of $r$. When $r=1$, our model is equivalent to Chang's model for the zygotic pedigree (Adv. in Appl. Probab., 31(4), 1999). When $r=0$, our model is equivalent to Kingman's coalescent model (Jnl. of Appl. Prob., 19, 1982) for the cytoplasmic, mitochondrial or sub-karyotic tree defined by a DNA locus that is free of intra-locus recombination. When $0<r<1$, our model can be viewed as tracking either the cytoplasmic pedigree with paternal leakage probability $r$ or a sub-karyotic pedigree defined by a haploid DNA locus from an autosomal chromosome with intra-locus recombination probability $r$. Thus, our discrete-time Markov chain model is an $r$-parameterized family that contains Chang's model for the zygotic pedigree, Kingman's discrete coalescent model for the cytoplasmic tree, and the discrete sub-karyotic pedigree model that may be approximated by the ancestral recombination graph (ARG) of Hudson (Theor. Popn. Biol., 23, 1983) and Griffiths (IMS Lecture Notes, v. 18, 1989).
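To make the $r$-parameterized family concrete, the following Python sketch computes one-step transition probabilities under a simplified mechanism that is an assumption of this illustration, not necessarily the exact model derived in the paper: each of $k$ ancestral lineages independently draws two parents with probability $r$ and one parent otherwise, and every draw is uniform with replacement from the $n$ individuals of the previous generation. The function names `surjections`, `occupancy` and `transition_prob` are hypothetical.

```python
from fractions import Fraction
from math import comb


def surjections(m, j):
    """Number of maps from m labelled draws onto exactly j labelled parents
    (inclusion-exclusion)."""
    return sum((-1) ** i * comb(j, i) * (j - i) ** m for i in range(j + 1))


def occupancy(n, m, j):
    """P(exactly j distinct parents | m uniform draws with replacement from n)."""
    return Fraction(comb(n, j) * surjections(m, j), n ** m)


def transition_prob(n, r, k, j):
    """P(k lineages -> j lineages) in one generation.

    ASSUMPTION (illustrative only): each lineage draws two parents with
    probability r and one parent with probability 1 - r, all draws uniform
    with replacement from the n individuals of the previous generation.
    """
    r = Fraction(r)
    total = Fraction(0)
    for b in range(k + 1):  # b = number of lineages that draw two parents
        weight = comb(k, b) * r ** b * (1 - r) ** (k - b)
        total += weight * occupancy(n, k + b, j)
    return total
```

Under these assumptions, `transition_prob(4, 0, 2, 1)` equals `Fraction(1, 4)`, the familiar $1/n$ probability that two lineages pick the same single parent, while setting `r` to 1 makes every lineage draw two parents, as in a bi-parental pedigree. Exact rational arithmetic via `Fraction` is used so that each row of the transition matrix sums to exactly 1.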
We provide the first explicit transition probabilities of this discrete-time Markov chain of the number of ancestors of a random sample from the present time and study its stationary distribution by explicit counts of appropriate bipartite graphs. We study three properties of this $r$-specific ancestral size Markov chain: the time to the most recent common ancestor (MRCA) of the population, the time at which all individuals are either common ancestors of all present-day individuals or ancestral to none of them, and the fraction of common ancestors at this time.
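The three quantities named above can be explored by direct simulation. The sketch below is a Monte Carlo illustration rather than the exact analysis of the paper, and it assumes a simplified mechanism: every individual draws two parents with probability $r$ and one parent otherwise, uniformly with replacement from the $n$ individuals of the previous generation. The function name `simulate_ancestry` and its parameters are hypothetical.

```python
import random


def simulate_ancestry(n, r, rng, max_generations=100_000):
    """Simulate one pedigree backwards in time.

    Returns (t_mrca, t_ident, fraction_common), where t_mrca is the first
    generation containing a common ancestor of all n present-day individuals,
    t_ident is the first generation in which every individual is ancestral to
    all present-day individuals or to none, and fraction_common is the
    fraction of common ancestors in that generation.

    ASSUMPTION (illustrative only): each individual has two parents with
    probability r and one parent otherwise, drawn uniformly with replacement.
    """
    full = frozenset(range(n))
    # desc[i] = set of present-day individuals descended from individual i
    desc = [frozenset([i]) for i in range(n)]
    t_mrca = None
    for t in range(1, max_generations + 1):
        parent_desc = [set() for _ in range(n)]
        for child in range(n):
            n_parents = 2 if rng.random() < r else 1
            for _ in range(n_parents):
                # the chosen parent inherits all of the child's descendants
                parent_desc[rng.randrange(n)] |= desc[child]
        desc = [frozenset(s) for s in parent_desc]
        if t_mrca is None and any(s == full for s in desc):
            t_mrca = t
        if all(s == full or not s for s in desc):
            fraction = sum(s == full for s in desc) / n
            return t_mrca, t, fraction
    raise RuntimeError("no identical-ancestors generation reached")
```

A call such as `simulate_ancestry(50, 1.0, random.Random(1))` returns one realization; averaging over many seeds gives empirical estimates. With `r = 0.0` every individual has exactly one parent, so the two times coincide and exactly one individual is a common ancestor, which gives a simple sanity check on the sketch.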
These results generalize the two main results of Chang's model and allow for the existence of limiting combinatorial structures that are more realistic and general than the ARG that has come to dominate modern population genetic analyses. Our probabilistic structures are a prerequisite for statistically consistent inference using population genomic data from arbitrarily sampled individuals from a bi-parental recombining population.