M.Sc Student | Granot Yaron |
---|---|
Subject | The Equidistance Index of Population Structure |
Department | Department of Medicine |
Supervisor | Professor Karl Skorecki |


Measures of differentiation are
central to population genetics, with applications ranging from phylogenetics
and conservation biology to medical genomics. One of the most widely used
statistics for summarizing population differentiation is F_{ST},
derived from the ratio of genetic diversity within and between populations.
However, because F_{ST} treats populations as points (means) rather
than ranges (deviations from the mean), it does not directly reflect the
strength of separation between population clusters. In order to obtain a
measure that is more consistent with clustering and classification, we propose
replacing the *mean* in the F_{ST} equation with the *standard
deviation*, thereby deriving a novel measure of population *separability*,
denoted E_{ST}.

Conceptually, E_{ST} is
formulated in three steps: (1) Population cluster tightness is defined in terms
of departures from random mating or *panmixia*. (2) Panmixia is defined in
terms of pairwise *equidistance* between individuals. (3) Departures from
equidistance are defined in terms of the *standard deviation* of pairwise
distances. E_{ST} reflects the decrease in panmixia when subpopulations
are pooled. The general formula is: E_{ST} = 1 - SD_{S}/SD_{T},
where SD_{S} and SD_{T} represent the standard deviations of
pairwise distances in subpopulations and in the total (pooled) population.
Unlike F_{ST}, which is weighed down by *diversity* within
populations, E_{ST} is weighed down by *structure* or *stratification*
within populations. Since diversity is usually much greater than
stratification, E_{ST} is usually much higher than F_{ST}. Both
metrics typically range from 0 to 1. High E_{ST} values (approaching 1)
indicate strong population separation, even when the vast majority of
genetic variation is shared across populations (that is, when F_{ST} is
close to zero).
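
The formula above can be illustrated with a minimal Python sketch. Several
choices here are assumptions made for illustration and are not taken from the
thesis: SD_{S} is computed as the simple average of the per-subpopulation
standard deviations, the population (not sample) standard deviation is used,
and the mean-based quantity `f_like` (1 - mean_S/mean_T) is only a
distance-based analogue of F_{ST} that mirrors the mean-for-standard-deviation
substitution described earlier. The function name `separation_indices` and
the toy distance matrix are hypothetical.

```python
import numpy as np


def pair_values(dist, idx):
    """Distances for all unordered pairs among the individuals in idx."""
    sub = dist[np.ix_(idx, idx)]
    i, j = np.triu_indices(len(idx), k=1)  # upper triangle, no diagonal
    return sub[i, j]


def separation_indices(dist, labels):
    """Return (E_ST, mean-based analogue) from a pairwise distance matrix.

    Assumption: SD_S and mean_S are plain averages over subpopulations;
    the source does not specify how subpopulations are weighted.
    """
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    groups = [np.flatnonzero(labels == p) for p in np.unique(labels)]

    total = pair_values(dist, np.arange(len(labels)))  # pooled population
    within = [pair_values(dist, g) for g in groups]    # each subpopulation

    sd_s = np.mean([w.std() for w in within])
    mean_s = np.mean([w.mean() for w in within])

    e_st = 1.0 - sd_s / total.std()        # E_ST = 1 - SD_S / SD_T
    f_like = 1.0 - mean_s / total.mean()   # illustrative analogue only
    return e_st, f_like


# Toy data: two subpopulations of three individuals each (made-up distances).
block = np.array([[0.0, 0.9, 1.0],
                  [0.9, 0.0, 1.1],
                  [1.0, 1.1, 0.0]])
between = np.full((3, 3), 1.2)  # every between-population pair is 1.2 apart
dist = np.block([[block, between], [between, block]])
labels = ["A"] * 3 + ["B"] * 3

e_st, f_like = separation_indices(dist, labels)
print(f"E_ST = {e_st:.3f}, mean-based analogue = {f_like:.3f}")
```

For this toy matrix the sketch gives E_{ST} of about 0.26 against about 0.11
for the mean-based analogue: the between-population distances exceed the
within-population mean by only a modest fraction, yet they add considerably
to the spread of pairwise distances once the subpopulations are pooled.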

We ranked 60 human population pairs from the Human Genome Diversity Project
(HGDP) based on F_{ST} and E_{ST}. F_{ST} ranged from 0
to 0.3, averaging 0.07, while E_{ST} spanned nearly the entire 0-1
range, with an average of 0.66. F_{ST} values corresponding to E_{ST}
≈ 0.5 range from 0.005 in large panmictic populations to 0.1 in small
isolated populations. The wide range of F_{ST} values for a given level
of E_{ST} implies that the two metrics capture different aspects of
differentiation, a discrepancy that is most striking between the two Amazonian
tribes. In terms of F_{ST}, the Karitiana are as diverged from the
nearby Surui (F_{ST}=0.13) as they are from the Mongola on the other
side of the world. By contrast, E_{ST} ranking indicates a far greater
divergence between the Karitiana and Mongola (0.87) than between the Karitiana
and Surui (0.58), which seems more consistent with classification based on
phenotypic, linguistic, or geographic criteria. Thus, E_{ST} may
at times outperform F_{ST} in identifying evolutionarily significant
differentiation.