|Ph.D Student||Cohen Rami|
|Subject||Internet Topology: From the Discovery Process to the Real|
|Department||Department of Computer Science||Supervisor||Professor Dan Raz|
|Full Thesis text|
In this research we study the topological structure of the Internet infrastructure in the Autonomous System (AS) level. First, we question the basic problem: how big is the Internet. In the AS level this means: how many peering relations exist between ASes. We present strong evidence to the fact that a considerable amount (at least 35%) of the links in the AS level are still to be unveiled and our findings indicate that almost all these missing links are of type peer-peer. We also examine the vertex degree distribution of the AS connectivity map, and show that while the customer-provider subgraph follow the power law, the peer-peer subgraph behave differently. Next, we study, the Type of Relationship (ToR) problem. It turns out that in the original ToR problem hierarchical cycles may be formed, while this is impossible in the real internet. To address this point, we define a new problem called Acyclic Type of Relationship, AToR. We present an efficient algorithm determining whether an acyclic solution without invalid paths exists, and we show that the general decision version of the problem is NP-hard. We also present 2/3-approximation algorithm for a maximum version of the problem, and consider practical aspects of inferring the actual type of relationships between ASes. This includes heuristics to infer also peer-peer and sibling-sibling relationships. We support our approach by experimental and simulation results showing that our algorithms classify the type of relationship between ASes much better than all previous algorithms. Finally, we study the path inflation phenomenon in which, due to BGP routing policy, routing between ASes is not necessarily done along shortest paths. To overcome this problem and to route packets over shortest paths after all, we consider an intermediate routing scheme. We study this problem as an optimization problem where the objective target is to enable routing through shortest paths with a minimum number of relay servers. We show that this problem is NP-hard, present a O(log(n)) approximation algorithm for it, and show that this is the best approximation one can get. We examine the practical aspects of the scheme by evaluating the gain one can get over real up-to-date data reflecting the current BGP routing policy in the Internet. We show that a relative small number of relay servers are sufficient to enable routing over the shortest paths between almost all considered nodes.