Ph.D Thesis

Ph.D Student: Lipson Doron
Subject: Computational Aspects of DNA Copy Number Measurement
Department: Department of Computer Science
Supervisor: Associate Prof. Zohar Yakhini


Alterations in DNA copy number are characteristic of many cancer types and are known to drive some cancer pathogenesis processes. These alterations include large chromosomal gains and losses as well as smaller-scale amplifications and deletions. Mapping regions of genomic aberration can provide insight into cancer pathogenesis and lead to the discovery of cancer-related genes and the mechanisms by which they drive the disease.

High-resolution array comparative genomic hybridization (aCGH) is a recently developed technology for mapping copy number changes in genomic DNA. In this thesis, I present the work I have done, together with several collaborators, on the development of computational tools and methods for the design of aCGH arrays and the analysis of DNA copy number data.

The design of CGH arrays is a multi-parameter optimization problem in which the set of selected probes is optimized under constraints of specificity, sensitivity and coverage. Here I describe the computational aspects of the design of one of the first oligonucleotide-based CGH arrays put into practice. Methods for optimizing probe coverage, such as the ones described here, allow mapping of genomic breakpoints at exon-level accuracy and support obtaining high-resolution information on new genomic constructs.

Analysis of aCGH data involves tasks related to identification of the genomic aberration structure of a measured sample, based on the CGH signal, and to interpreting the biological functions that are affected by genomic alterations. Here I describe Stepgram, a method for detecting genomic aberrations based on a statistical interval score, considered to be one of the most efficient algorithms for this task and implemented in several software packages. Stepgram also plays an important role in a new algorithm for normalization of aCGH data. In addition, I present a new algorithm (CoCoA) for detecting genomic aberrations that are common to multiple cancer samples in an aCGH dataset. Detection of common recurring aberrations allows focusing on events that may have an important role in carcinogenesis.
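To make the idea of an interval score concrete, the following is a minimal sketch and not the Stepgram implementation. The abstract does not define the score, so the form used here (the sum of the log-ratio values in an interval divided by the square root of its length) is an assumption, and the function name `best_interval` is hypothetical. The exhaustive search is quadratic in the number of probes and serves only as an illustration; it says nothing about how Stepgram achieves its efficiency.

```python
import math


def best_interval(values):
    """Return (score, start, end) for the half-open interval values[start:end]
    with the largest absolute score.

    Assumed score (not taken from the abstract): sum(values[start:end]) / sqrt(end - start).
    A positive score suggests a gain, a negative score a loss.
    """
    # Prefix sums let each interval sum be computed in constant time.
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)

    best = (0.0, 0, 0)
    n = len(values)
    for start in range(n):
        for end in range(start + 1, n + 1):
            score = (prefix[end] - prefix[start]) / math.sqrt(end - start)
            if abs(score) > abs(best[0]):
                best = (score, start, end)
    return best


if __name__ == "__main__":
    # Hypothetical log-ratio signal with an amplified stretch in the middle.
    signal = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0]
    print(best_interval(signal))
```

Dividing by the square root of the length rather than the length itself rewards longer intervals that are consistently shifted, so a three-probe gain outscores any single probe within it.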

Finally, I describe recent work applying some of these methods to a panel of 60 cancer cell lines (NCI-60), and integrating the DNA copy number data with related expression profiles and drug sensitivity profiles. Preliminary results show new correlations between genomic aberrations and sensitivity to specific chemical compounds, suggesting causal relations that may be of importance in developing cancer therapeutics. In addition, I describe the use of aCGH analysis tools in unveiling the replication timing pattern of the mouse genome at high temporal and genomic resolution.