Technion - Israel Institute of Technology, Graduate School
M.Sc Thesis
M.Sc Student: Noam Kremen
Subject: Predicting Protein Function Based on the Local Differential Geometric Properties of its Molecular Surface
Department: Department of Biology
Supervisor: Full Professor Mandel-Gutfreun Yael


Abstract

The ability to sequence entire genomes has revolutionized biology. Yet as more genomic sequence data accumulate, new questions arise regarding the function of the genes in these genomes and the mechanisms by which the genes and their protein products work, re-emphasizing the need for efficient methods for the functional annotation of genes and proteins. Owing to a number of crystallography-focused scientific initiatives, there are to date more than 100,000 protein and nucleic acid structures stored in the Protein Data Bank (PDB), providing insight into the three-dimensional arrangement of these molecules and their cellular function.

Structure-similarity (homology) based methods have been the most commonly used approaches for annotating new protein structures. However, as the number of structures grew, so did the fraction of proteins with no previously annotated structural homologues. Because these structures lack both sequence and structure similarity to any previously discovered protein, methods that do not rely on homology are required for their functional annotation. This thesis introduces a structure-based approach for the functional annotation of proteins that does not rely on homology. By analyzing the curvature of the protein’s surface, we aim to create a new way to describe functional similarity that relies on neither the sequence nor the secondary or tertiary structural similarity of the proteins. Specifically, we calculate the curvature of the proteins’ molecular surface and show that the curvature distribution is not strongly correlated with the protein’s three-dimensional fold. We then trained a machine learning algorithm to distinguish nucleic-acid binding proteins from protein binding proteins. Moreover, when concentrating on DNA binding proteins, we could accurately distinguish regulatory DNA binding proteins from DNA binding proteins that have enzymatic activity. Overall, this work demonstrates that geometric features of the protein surface can help annotate protein function without relying on known evolutionary relationships, thus enabling the functional annotation of completely novel proteins given their three-dimensional structure.
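As an illustration of how a surface curvature distribution could be turned into a fixed-length input for a classifier, the following is a minimal sketch and not the procedure used in the thesis. It assumes that per-point mean and Gaussian curvature values have already been computed on the molecular surface by some external tool. The function names, the bin count, and the clipping range are hypothetical choices made for this example, and the sign convention of the mean curvature depends on the tool that produced it.

```python
import numpy as np

def principal_curvatures(mean_curv, gauss_curv):
    """Recover principal curvatures from mean (H) and Gaussian (K) curvature.

    Uses the identity k1, k2 = H +/- sqrt(H^2 - K), so that k1 >= k2.
    """
    H = np.asarray(mean_curv, dtype=float)
    K = np.asarray(gauss_curv, dtype=float)
    # Clip small negative discriminants caused by numerical noise.
    disc = np.sqrt(np.clip(H**2 - K, 0.0, None))
    return H + disc, H - disc

def curvature_histogram(values, bins=10, value_range=(-1.0, 1.0)):
    """Fraction of surface points per curvature bin (assumes non-empty input).

    Values outside value_range are clipped into the outermost bins.
    The range and bin count are illustrative assumptions.
    """
    clipped = np.clip(values, *value_range)
    counts, _ = np.histogram(clipped, bins=bins, range=value_range)
    return counts / counts.sum()

def surface_descriptor(mean_curv, gauss_curv, bins=10):
    """Concatenate the k1 and k2 histograms into one feature vector."""
    k1, k2 = principal_curvatures(mean_curv, gauss_curv)
    return np.concatenate([
        curvature_histogram(k1, bins),
        curvature_histogram(k2, bins),
    ])
```

A descriptor of this kind has the same length for every protein regardless of surface size, so vectors from many proteins could be stacked into a matrix and passed to any standard classifier. It also uses no sequence or fold information, which is the property the abstract emphasizes.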