
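Reward magnitude determines the efficiency of reinforcement learning. This finding comes from the team of Luke T. Coddington at the Howard Hughes Medical Institute in the United States. The study was published on May 21, 2026 in the leading international journal Science.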
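The team investigated how reward magnitude influences initial learning in naïve mice across five behavioral paradigms. Especially large rewards could substantially improve learning efficiency through dissociable effects on within-session learning, across-session learning, and task engagement. The duration and magnitude of ventral striatal dopamine release scaled with reward size, and prolonged optogenetic enhancement of dopamine reward responses reproduced much, but not all, of the learning benefit produced by outsized rewards. These findings indicate that the reinforcement learning efficiency of animals has traditionally been underestimated, and that dopamine signaling of rewards mediates task engagement in proportion to absolute reward magnitude.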
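By way of background, standard animal learning studies minimize individual reward magnitudes in order to maximize the number of repetitions of reinforced behaviors.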
Appendix: original English text
Title: Reward magnitude determines reinforcement learning efficiency
Author: Sheng Gong, Alyssa Martell, Joshua T. Dudman, Luke T. Coddington
Issue&Volume: 2026-05-21
Abstract: Standard animal learning studies minimize individual reward magnitudes to maximize the repetitions of reinforced behaviors. We investigated how reward magnitude influences initial learning across five behavioral paradigms in naïve mice. Especially large rewards could substantially improve learning efficiency through dissociable effects on within- and across-session learning and task engagement. The duration and magnitude of ventral striatal dopamine release scaled with reward sizes, and prolonged optogenetic enhancement of dopamine reward responses also reproduced much, but not all, of the benefits to learning produced by outsized rewards. These findings indicate that the reinforcement learning efficiency of animals has traditionally been underestimated and that dopamine signaling of rewards mediates task engagement in proportion to absolute reward magnitude.
DOI: 10.1126/science.aeb0813
Source: https://www.science.org/doi/10.1126/science.aeb0813
Journal information
Science: founded in 1880. Published by the American Association for the Advancement of Science (AAAS). Latest IF: 63.714