The Multi-Arm Bandit
You are given two biased coins, but you do not know the bias of either coin. Each flip that lands heads earns a reward, and the size of that reward decreases with each successive flip. This Demonstration shows the outcomes of a series of flips, each made by selecting a coin with one of the two choice buttons. The outcomes determine the total reward and also provide information about the biases of the two coins, which can guide future choices. A good coin-picking strategy accumulates the greatest reward by balancing exploration (learning which coin is more likely to land heads) against exploitation (flipping the coin that seems best so far).
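The trade-off between exploring and exploiting can be illustrated with a short simulation. The sketch below is not the Demonstration's own code. It assumes a geometric reward schedule (heads on flip `t` pays `discount ** t`) and uses an epsilon-greedy rule as one simple example of a strategy. The function name `simulate` and the parameters `discount`, `epsilon`, and `seed` are hypothetical and chosen only for illustration.

```python
import random

def simulate(biases, n_flips=50, discount=0.9, epsilon=0.1, seed=0):
    """Play a coin bandit with an epsilon-greedy strategy.

    Assumption: heads on flip t (counting from 0) pays discount ** t.
    Returns (total_reward, heads_per_coin, flips_per_coin).
    """
    rng = random.Random(seed)
    heads = [0] * len(biases)   # heads observed so far, per coin
    flips = [0] * len(biases)   # flips made so far, per coin
    total = 0.0
    for t in range(n_flips):
        if rng.random() < epsilon:
            # Explore: pick a coin uniformly at random.
            coin = rng.randrange(len(biases))
        else:
            # Exploit: pick the coin with the highest estimated bias.
            # The (h + 1) / (n + 2) estimate starts every coin at 0.5.
            estimates = [(h + 1) / (n + 2) for h, n in zip(heads, flips)]
            coin = max(range(len(biases)), key=lambda i: estimates[i])
        flips[coin] += 1
        if rng.random() < biases[coin]:
            heads[coin] += 1
            total += discount ** t   # the reward shrinks with each flip
    return total, heads, flips

if __name__ == "__main__":
    # The true biases are hidden from the strategy; it sees only outcomes.
    total, heads, flips = simulate([0.3, 0.7])
    print("total reward:", round(total, 3))
    print("heads per coin:", heads, "flips per coin:", flips)
```

Because early flips are worth the most under this assumed schedule, a larger `epsilon` spends valuable early flips on exploration, while a smaller one risks settling on the worse coin. Varying `epsilon` and `discount` shows how the balance shifts.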