Exact Learning of Juntas from Membership Queries
M.Sc. Student: Areej Costa
Advisor: Professor Nader Bshouty
Department of Computer Science

Abstract

This work addresses the problem of learning Boolean functions in the presence of irrelevant attributes. The learning model is the model of exact learning from membership queries. A function f is said to be a Boolean function on n variables if f : {0,1}^n → {0,1}. A membership query a is a Boolean assignment of length n.

In this model we have a teacher and a learner. The teacher holds a hidden target function f from a specific known class of Boolean functions C. The learning process is done by asking the teacher membership queries. The teacher answers each membership query a by f(a). The learner’s goal is to learn the exact target function f, using a minimum number of queries and a small time complexity.
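The teacher–learner protocol above can be sketched in a few lines. This is a minimal illustration, not from the thesis: the class names and the trivial learner (which simply queries all 2^n assignments to recover the truth table) are assumptions chosen for clarity.

```python
from itertools import product


class Teacher:
    """Holds a hidden target function f and answers membership
    queries; counts how many queries were asked (illustrative
    sketch, not the thesis's formal model)."""

    def __init__(self, f):
        self._f = f          # hidden target, never exposed directly
        self.queries = 0

    def membership_query(self, a):
        """Answer the membership query a by f(a)."""
        self.queries += 1
        return self._f(a)


def exhaustive_learner(teacher, n):
    """Trivial exact learner: ask every assignment in {0,1}^n and
    return the full truth table. Uses 2^n membership queries, so it
    is exact but far from query-efficient."""
    return {a: teacher.membership_query(a)
            for a in product((0, 1), repeat=n)}


# The learner recovers AND on 2 variables with 2^2 = 4 queries.
teacher = Teacher(lambda a: a[0] & a[1])
table = exhaustive_learner(teacher, 2)
print(table[(1, 1)], teacher.queries)  # 1 4
```

The point of the thesis's algorithms is to do far better than this 2^n baseline when the target is a d-junta.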

For an assignment a ∈ {0,1}^n, i ∈ {1,...,n} and ξ ∈ {0,1} we denote by a|xi←ξ the assignment (a1,...,ai−1,ξ,ai+1,...,an). We say that the variable xi is relevant in f if there is an assignment a such that f(a|xi←0) ≠ f(a|xi←1). d-Junta is the set of all Boolean functions on n variables with at most d relevant variables. In our research, we focus on the fundamental problem of learning d-Junta. We distinguish in our work between non-adaptive and adaptive learning of d-Junta. We also distinguish between deterministic and randomized learning.
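The definition of a relevant variable can be made concrete with a brute-force check: x_i is relevant iff flipping bit i changes f on some assignment. This is a minimal sketch under the assumption that f is given as a Python callable on 0/1 tuples; it takes 2^n · n evaluations and is meant only to illustrate the definition, not any of the thesis's algorithms.

```python
from itertools import product


def relevant_variables(f, n):
    """Return the set of indices i such that x_i is relevant in f,
    i.e. f(a|x_i<-0) != f(a|x_i<-1) for some assignment a.
    Brute force over all of {0,1}^n."""
    relevant = set()
    for a in product((0, 1), repeat=n):
        for i in range(n):
            if i in relevant:
                continue  # already witnessed as relevant
            flipped = a[:i] + (1 - a[i],) + a[i + 1:]
            if f(a) != f(flipped):
                relevant.add(i)
    return relevant


# f = x0 XOR x2 on n = 4 variables is a 2-junta:
f = lambda a: a[0] ^ a[2]
print(sorted(relevant_variables(f, 4)))  # [0, 2]
```

A function is in d-Junta exactly when this set has size at most d.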

We use new techniques (such as Yao's minimax principle) to find new bounds, narrow some of the gaps between the lower and upper bounds, and find new non-adaptive and adaptive algorithms with small query and time complexities, both for the deterministic case and the randomized case. Some of the bounds are tight in the sense that finding better ones either gives a breakthrough result in some long-standing combinatorial open problem or needs a new technique that is beyond the existing ones. We present our main results in the following.

For deterministic non-adaptive learning, we give a new n^{O(d)} time algorithm that improves previous results and provides a new upper bound as well. We also give a new algorithm that runs in time poly(d^d, n). For poly(2^d, n) time algorithms, we describe an algorithm with an improved query complexity. For randomized non-adaptive learning, we obtain the same lower bound as in the deterministic case. We also introduce new randomized algorithms with poly(2^d, n, log 1/δ) time complexity, where δ is the failure probability. In addition, we give a new reduction for randomized non-adaptive algorithms. We use this reduction to improve our results, but it can also be applied to other problems.

For deterministic adaptive learning, we give a new result for poly(d^d, n) time algorithms with improved query complexity. We also introduce a new lower bound for randomized adaptive learning. Furthermore, we describe three new randomized adaptive algorithms. The first one runs in poly(d^d, n, log 1/δ) time. The second and third algorithms run in poly(2^d, n, log 1/δ) time.