M.Sc Thesis

M.Sc StudentNabwani Najeeb
SubjectLearning by Sampling: A Deep Learning Approach to the
Plannted Clique Problem with Unlimited Sampling
DepartmentDepartment of Computer Science
Supervisor PROF. Assaf Schuster
Full Thesis textFull thesis text - English Version


The planted clique problem plays an important role in the field of average-case complexity. The problem involves detecting graphs that contain a hidden clique of size k, which was planted in graphs from the Erdos-Renyi model Gn,0.5. The planted clique conjecture suggests that there is no polynomial-time algorithm that can solve the planted clique problem when 2log2(n) k n. This conjecture has served as the underlying hardness assumption in several different works, in fields such as cryptography and machine learning.

We propose a deep learning framework to build a model that can solve the planted

clique problem with a high probability. Due to the well-defined random distribution of the planted clique problem, when certain conditions are met, we are able to generate unlimited amounts of training data with constant-time labelling. Our framework leverages this ability and uses polynomial-sized feed-forward neural networks for training on this data. We evaluate this framework empirically on multiple different planted clique distributions and manage to achieve high accuracy scores that surpass 90% in some cases.

We also address the possibility of expanding this framework to the maximum clique

problem. The set of graphs with n nodes is broken down into smaller subsets, and

for each subset, we train a network to detect graphs that contain a maximum clique

of at least size k. We assume that for each subset, an oracle machine exists that can

label some of the graphs in that subset in polynomial time. By conducting empirical

evaluations on small graphs, we prove this concept using an exponential-time algorithm to simulate the oracle machine. The results show that if we can approximate such an oracle machine in polynomial time, then the maximum clique problem can be effectively learned from the smaller subsets.