M.Sc Thesis
M.Sc Student: Granot Yaron
The Equidistance Index of Population Structure
Department of Medicine
Professor Karl Skorecki

Abstract

Measures of differentiation are central to population genetics, with applications ranging from phylogenetics and conservation biology to medical genomics. One of the most widely used statistics for summarizing population differentiation is FST, derived from the ratio of genetic diversity within and between populations. However, because FST treats populations as points (means) rather than ranges (deviations from the mean), it does not directly reflect the strength of separation between population clusters. To obtain a measure that is more consistent with clustering and classification, we propose substituting the mean in the FST equation with the standard deviation, thereby deriving a novel measure of population separability, denoted EST.

Conceptually, EST is formulated in three steps: (1) Population cluster tightness is defined in terms of departures from random mating, or panmixia. (2) Panmixia is defined in terms of pairwise equidistance between individuals. (3) Departures from equidistance are defined in terms of the standard deviation of pairwise distances. EST reflects the decrease in panmixia when subpopulations are pooled. The general formula is: EST = 1 - SDS/SDT, where SDS and SDT represent the standard deviations of pairwise distances in the subpopulations and in the total (pooled) population, respectively. Unlike FST, which is weighed down by diversity within populations, EST is weighed down by structure or stratification within populations. Since diversity is usually much greater than stratification, EST is usually much higher than FST. Both metrics typically range from 0 to 1, and high EST values (approaching 1) indicate strong population separation even when the vast majority of genetic variation is shared across populations (that is, when FST is close to zero).
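
The formula above can be illustrated with a minimal Python sketch. It is not the thesis implementation, and it rests on assumptions that the abstract does not specify: individuals are encoded as tuples of allele counts (0, 1, or 2 per locus), the distance between two individuals is the sum of absolute differences across loci, and SDS is taken as the unweighted average of the per-subpopulation standard deviations. The function names and the toy genotypes are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def pairwise_distances(individuals):
    """Distances between all pairs of individuals.

    Assumption: each individual is a tuple of allele counts and the
    distance is the sum of absolute differences across loci.
    """
    return [
        sum(abs(a - b) for a, b in zip(x, y))
        for x, y in combinations(individuals, 2)
    ]


def est(subpopulations):
    """EST = 1 - SDS/SDT for a list of subpopulations.

    Assumption: SDS is the unweighted average of the standard deviations
    of pairwise distances within each subpopulation. SDT is the standard
    deviation of pairwise distances in the pooled population.
    """
    sd_s = mean([pstdev(pairwise_distances(pop)) for pop in subpopulations])
    pooled = [ind for pop in subpopulations for ind in pop]
    sd_t = pstdev(pairwise_distances(pooled))
    return 1 - sd_s / sd_t


# Hypothetical toy data: two tight clusters that differ at every locus.
pop_a = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0)]
pop_b = [(2, 2, 2, 2), (2, 2, 2, 1), (2, 2, 1, 2)]
print(est([pop_a, pop_b]))
```

In this toy case, individuals within each cluster are nearly equidistant, so SDS is small. Pooling mixes short within-cluster distances with long between-cluster distances, which inflates SDT and pushes EST toward 1.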

We ranked 60 human (HGDP) population pairs based on FST and EST. FST ranged from 0 to 0.3, averaging 0.07, while EST spanned nearly the entire 0-1 range, with an average of 0.66. FST values corresponding to EST ≈ 0.5 range from 0.005 in large panmictic populations to 0.1 in small isolated populations. The wide range of FST values for a given level of EST implies that the two metrics capture different aspects of differentiation, a discrepancy that is most striking for the two Amazonian tribes. In terms of FST, the Karitiana are as diverged from the nearby Surui (FST=0.13) as they are from the Mongola on the other side of the world. By contrast, EST ranking indicates a far greater divergence between the Karitiana and Mongola (0.87) than between the Karitiana and Surui (0.58), which seems more consistent with classification based on phenotypic, linguistic, or geographic criteria. Thus, EST may at times outperform FST in identifying evolutionarily significant differentiation.