Technion - Israel Institute of Technology, Graduate School
M.Sc Thesis
M.Sc Student: Guy Hochman
Subject: Partial Reinforcement Extinction Effect: Boundaries and Limitations
Department: Department of Industrial Engineering and Management
Supervisor: Full Professor Erev Ido


Abstract

The partial reinforcement extinction effect (PREE) is one of the best examples of a basic behavioral phenomenon, detected in the laboratory, that has nontrivial practical implications. It implies that learning under partial reinforcement is more robust than learning under full reinforcement. Unfortunately, field studies fail to support this interesting prediction. The current research tries to resolve this apparent inconsistency between basic laboratory research and field research. We believe it stems from insufficient attention to the payoff variability effect, which is theoretically similar but opposite in its practical consequences. That is, outcome variability has two opposing effects: the payoff variability effect (a negative effect, which slows training) and the PREE (a positive effect, which slows extinction). Thus, a better understanding of the relative importance of these two effects would support better use of reinforcement schedules to maximize people's performance. Four experiments were designed to address this goal. The first experiment replicates the classical demonstration of the PREE. In addition, it demonstrates that the effect of the reinforcement schedule is highly sensitive to the evaluation criteria. The remaining experiments compare alternative hypotheses concerning the conditions that determine the relative importance of the two effects. The results support a “conditional confusion” hypothesis: subjects tend to prefer the behavior that led to the best outcome in similar conditions in the past. This finding implies that partial reinforcement is effective only when the advantage of the promoted behavior is large enough, that is, when the promoted behavior is distinctly more attractive than the alternative behaviors.
Thus, the conditional confusion hypothesis refines the proposed solution: it implies a cognitive mechanism that produces the two opposing effects of outcome variability. These results can be captured with a simple learning model that assumes conditional probability matching. This model highlights the boundaries of the overall effect of the reinforcement schedule.
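The abstract does not specify the learning model, so the following Python sketch is only one possible reading of "conditional probability matching", not the model used in the thesis. It assumes that each past trial is stored with a condition label (here, hypothetically, whether the previous trial was reinforced) and the payoff of each behavior, and that the choice probability of a behavior equals the share of past trials under the same condition in which that behavior gave the best outcome. The function name, the condition labels, and the payoff values are all illustrative assumptions.

```python
from collections import Counter


def choice_probabilities(history, condition):
    """Illustrative conditional probability matching (an assumption, not the thesis model).

    history:   list of (condition, payoffs) pairs, where payoffs maps each
               behavior to the payoff it yielded on that trial.
    condition: label of the current condition.
    Returns a dict mapping each behavior to its choice probability.
    """
    # Keep only past trials observed under the same condition.
    similar = [payoffs for cond, payoffs in history if cond == condition]
    # Assumption: with no matching experience, fall back on all past trials.
    if not similar:
        similar = [payoffs for _, payoffs in history]

    options = sorted({opt for payoffs in similar for opt in payoffs})
    wins = Counter()
    for payoffs in similar:
        best = max(payoffs.values())
        winners = [opt for opt, value in payoffs.items() if value == best]
        # Ties split the "best outcome" credit equally.
        for opt in winners:
            wins[opt] += 1 / len(winners)

    total = sum(wins.values())
    # Probability matching: choose each behavior as often as it was best.
    return {opt: wins[opt] / total for opt in options}


# Hypothetical history: the promoted behavior is reinforced only on some trials.
history = [
    ("reinforced", {"promoted": 1.0, "alternative": 0.5}),
    ("reinforced", {"promoted": 0.0, "alternative": 0.5}),
    ("reinforced", {"promoted": 1.0, "alternative": 0.5}),
    ("reinforced", {"promoted": 1.0, "alternative": 0.5}),
    ("not_reinforced", {"promoted": 1.0, "alternative": 0.5}),
]

print(choice_probabilities(history, "reinforced"))
```

Under this reading, the promoted behavior is chosen reliably only when it was usually the best option in similar past conditions, which is in the spirit of the boundary described above: partial reinforcement helps only when the promoted behavior has a clear advantage.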