The Missing Link Between Memory and Reinforcement Learning
Reinforcement learning systems usually assume that a value function is defined over all states (or state-action pairs) that can immediately give the value of a particular state or action. These values are used by a selection mechanism to decide which action to take. In contrast, when humans and animals make decisions, they collect evidence for different alternatives over time and take action only
