M.Sc Thesis

M.Sc Student: Damti Yanir
Subject: Adaptive Methods for Computing and Comparing Evolutionary
Department: Department of Computer Science
Supervisors: Professor Emeritus Shlomo Moran
Dr. Ilan Gronau
Prof. Irad Yavneh


Distance-based methods for reconstructing phylogenetic trees consist of two independent parts: first, inter-species distances are estimated under some stochastic model of sequence evolution; then the inferred distances are used to construct the tree. This research focuses on the first part: increasing the accuracy of distance estimation. We extend the adaptive distances approach introduced in a recent line of work to make it more applicable to actual phylogenetic reconstruction problems. Previous work characterized the family of valid distance functions for the assumed evolutionary model (substitution rate functions) and showed that deliberate selection of a distance function, adapted to the input data, significantly improves the accuracy of distance estimates. This was demonstrated by developing a concrete adaptive method for the commonly used evolutionary model known as Kimura’s two-parameter model (K2P). Distance-based reconstruction methods rely heavily on comparisons of the inferred distances and of linear combinations thereof. Consequently, the quality of reconstruction depends on the relations between distances rather than on their exact estimated values. In the main part of this thesis we show how to compare two evolutionary distances accurately and efficiently, and lay the groundwork for more complicated comparisons (e.g., comparisons of sums of distances, as used in the four-point method for quartet reconstruction). We present a method for comparing two distances in the K2P model that attains the accuracy of maximum-likelihood-based methods while being more computationally efficient and avoiding certain assumptions on the model (homogeneity). This thesis has two further objectives: (1) to extend the adaptive approach to additional evolutionary models, for which we develop concrete adaptive methods for the F84 and Tamura-Nei models, both of which extend K2P; and (2) to incorporate the adaptive methods into existing phylogeny software.
We chose to extend the widely used open-source phylogeny software PHYLIP (the PHYLogeny Inference Package). Among its other phylogeny inference capabilities, PHYLIP computes evolutionary distances between DNA sequences. In this thesis, we added to the existing framework the ability to compute distances adaptively for the K2P model.
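
For orientation, the following is a minimal sketch of the standard (non-adaptive) K2P distance estimate between two aligned DNA sequences, which is the baseline that an adaptive method would refine. It is not the adaptive method developed in the thesis and not PHYLIP code; the function name k2p_distance and the handling of gaps and saturated sequence pairs are illustrative assumptions. With P the observed fraction of transitions and Q the observed fraction of transversions, the classical estimate is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Standard K2P distance between two aligned DNA sequences.

    Hypothetical helper for illustration only. Sites where either
    sequence has a character other than A, C, G, T are skipped
    (an assumption, not a rule taken from the thesis).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous characters
        sites += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if sites == 0:
        raise ValueError("no comparable sites")

    p = transitions / sites
    q = transversions / sites
    a = 1.0 - 2.0 * p - q
    b = 1.0 - 2.0 * q
    if a <= 0.0 or b <= 0.0:
        # Saturated pair: the logarithms are undefined, so report infinity.
        return math.inf
    return -0.5 * math.log(a) - 0.25 * math.log(b)
```

Comparing two such estimates directly (for example, k2p_distance(s1, s2) < k2p_distance(s3, s4)) is the naive form of the comparison problem the thesis addresses; the abstract's point is that the choice of distance function affects how reliable such comparisons are.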