Date of Award
Doctor of Philosophy
An n-armed bandit task was used to investigate the trade-off between exploratory choice (choosing lesser-known options) and exploitive choice (choosing options with the greatest probability of reinforcement) in a human trial-and-error learning problem. In Experiment 1, a different probability of reinforcement was assigned to each of 8 response options using random-ratio (RR) schedules, and participants chose by using a computer mouse to click buttons in a circular display on a computer screen. Relative frequency thresholds (ranging from .10 to 1.0) were randomly assigned to participants and acted as task constraints, limiting the proportion of total responses that could be attributed to any one response option. Participants preferred the richer keys, and those with greater constraints explored more and earned less reinforcement. Those with the highest constraints showed no preference, distributing their responses among the options with equal probability. In Experiment 2, the payoff probabilities changed partway through the task: for some participants the leanest options became the richest (increase participants), and for others the richest became the leanest (decrease participants). When the RRs changed, decrease participants with moderate and low constraints showed immediate increases in exploration and a shift in preference to the new richest keys, whereas increase participants showed no increase in exploration and more gradual changes in preference. In Experiment 3, the constraint was held constant at .85, and the payoff probabilities of the two richest options were decreased midway through the task by varying amounts (0 to .60). Participants in all but the smallest decrease conditions detected the decreases early, and exploration increased.
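The threshold constraint in Experiment 1 can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the abstract: the enforcement rule (a response can be reinforced only while the option's relative frequency is at or below the threshold), the payoff probabilities, and the always-exploit chooser are all hypothetical and stand in for the actual procedure and for human choice.

```python
import random


def relative_frequency(counts, option):
    """Share of all responses so far that went to `option` (0.0 before any response)."""
    total = sum(counts)
    return counts[option] / total if total else 0.0


def respond(counts, option, payoff_probs, threshold, rng):
    """Record one response and return True if it is reinforced.

    Assumption: a response can pay off only while the option's relative
    frequency is at or below `threshold`. The abstract does not say how
    the constraint was enforced.
    """
    eligible = relative_frequency(counts, option) <= threshold
    counts[option] += 1
    return eligible and rng.random() < payoff_probs[option]


def simulate_exploiter(payoff_probs, threshold, trials, seed=0):
    """Hypothetical chooser that always picks the richest currently eligible option."""
    rng = random.Random(seed)
    counts = [0] * len(payoff_probs)
    earned = 0
    for _ in range(trials):
        eligible = [i for i in range(len(counts))
                    if relative_frequency(counts, i) <= threshold]
        # If no option is eligible, fall back to the least-chosen one.
        pool = eligible or [min(range(len(counts)), key=counts.__getitem__)]
        choice = max(pool, key=payoff_probs.__getitem__)
        earned += respond(counts, choice, payoff_probs, threshold, rng)
    return counts, earned


if __name__ == "__main__":
    # Hypothetical payoff probabilities for the 8 options.
    probs = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8]
    for threshold in (1.0, 0.25):
        counts, earned = simulate_exploiter(probs, threshold, trials=400)
        print(threshold, counts, earned)
```

Under these assumptions, a threshold of 1.0 leaves the chooser free to respond only on the richest option, while a lower threshold forces responses onto leaner options and lowers the total reinforcement earned, the same direction of effect the abstract reports for more constrained participants.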
This dissertation is only available for download to the SIUC community. Others should contact the interlibrary loan department of their local library or contact ProQuest's Dissertation Express service.