|M.Sc Student||Amit Asaf|
|Subject||Learning to Cooperate with Application to Bridge|
|Department||Department of Computer Science||Supervisor||Professor Shaul Markovitch|
This work presents a new model-based framework for
decision making and learning in multi-agent systems in partially observable
environments. A common way to decide which action to take is to perform a
look-ahead search and choose the action with the highest expected utility.
There are, however, two major problems that need to be solved when applying
look-ahead search in such environments: the future actions of the other agents
are not known, and the effect of each action cannot be predicted due to the
partial observability of the environment.
The thesis introduces a new general model-based decision-making algorithm that tackles these problems. The models are used to simulate the other agents, thus predicting their action selection. The models are also used to narrow the set of full states to those that are consistent with the agents' observed actions. This reduced set is then used to perform Monte-Carlo sampling for evaluating the expected utility of each action.
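The decision procedure described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`consistent_states`, `choose_action`), the representation of a model as a function from a full state to an action, and the toy card game are all assumptions made for illustration. The sketch filters candidate full states by checking them against the modelled agents' observed actions, samples from the reduced set, and picks the action with the highest average utility. In a fuller version, the `utility` callback is where a look-ahead that simulates the other agents with their models would go.

```python
import random


def consistent_states(candidates, history, models):
    """Keep the full states in which each modelled agent would have
    chosen the action that was actually observed (hypothetical helper)."""
    return [
        s for s in candidates
        if all(models[agent](s) == action for agent, action in history)
    ]


def choose_action(actions, candidates, history, models, utility,
                  n_samples=200, seed=0):
    """Monte-Carlo estimate of expected utility over the reduced state set."""
    rng = random.Random(seed)
    pool = consistent_states(candidates, history, models)
    if not pool:
        # Assumption: if the models rule out every state, fall back to all.
        pool = list(candidates)
    samples = [rng.choice(pool) for _ in range(n_samples)]

    def expected_utility(action):
        return sum(utility(s, action) for s in samples) / n_samples

    return max(actions, key=expected_utility)


# Toy setting (assumed): the hidden state is the partner's card, 1 to 6.
CARDS = [1, 2, 3, 4, 5, 6]

# Assumed partner model: signals "high" exactly when holding 4 or more.
MODELS = {"partner": lambda card: "high" if card >= 4 else "low"}


def toy_utility(card, action):
    """Raising pays off only when the partner's card is 4 or more."""
    if action == "pass":
        return 0.0
    return 1.0 if card >= 4 else -1.0


after_high = choose_action(["raise", "pass"], CARDS,
                           [("partner", "high")], MODELS, toy_utility)
after_low = choose_action(["raise", "pass"], CARDS,
                          [("partner", "low")], MODELS, toy_utility)
print(after_high, after_low)
```

In this toy setting the partner's signal removes half of the candidate states, so every sampled state agrees on which action is better. Without the model-based filtering, the samples would mix high and low cards and the estimated utility of raising would hover near zero.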
The thesis also presents a learning framework for co-training of cooperative agents that use the above decision-making algorithm. The agents refine their selection strategies during training interactions and continuously exchange their refined strategies. The refinement is based on inductive learning applied to examples accumulated for classes of states with conflicting actions.
We demonstrate the utility of this framework by applying it to the difficult problem of bridge bidding. During an auction, each pair of bridge players is required to cooperate in order to compete with the opponent pair. The decision about which action to take is based on partial knowledge about the distribution of the cards.
Applying the co-training framework to this problem demonstrated the effectiveness of the method. The pair of agents that co-trained significantly improved their bidding performance, reaching a level better than that of the current state-of-the-art bidding algorithm.