AI paper: Learning Independently-Obtainable Reward Functions

Posted by zwb521 on 2019-1-28 12:14:13
We present a novel method for learning a set of disentangled reward functions that sum to the original environment reward and are constrained to be independently achievable. We define independent achievability in terms of value functions with respect to achieving one learned reward while pursuing another learned reward. Empirically, we illustrate that our method can learn meaningful reward decompositions in a variety of domains and that these decompositions exhibit some form of generalization performance when the environment's reward is modified. Theoretically, we derive results about the effect of maximizing our method's objective on the resulting reward functions and their corresponding optimal policies.
Abstract page: https://arxiv.org/abs/1901.08649
PDF download: https://arxiv.org/pdf/1901.08649
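To make the two properties named in the abstract a bit more concrete, here is a minimal toy sketch in Python. It is not the paper's method: the decomposition is written by hand rather than learned, and the chain environment, the names `r_left`, `r_right`, `greedy_policy` and `evaluate`, and the discount of 0.9 are all assumptions made up for illustration. It only shows (a) two reward functions that sum to the environment reward and (b) one plausible reading of "achieving one reward while pursuing another", namely the return under reward i of a policy that is optimal for reward j.

```python
# Toy illustration only: hand-written decomposition on a 5-state chain.
# Nothing here is taken from the paper; all names and numbers are assumptions.

GAMMA = 0.9
N_STATES = 5
TERMINAL = {0, N_STATES - 1}   # both ends of the chain end the episode
ACTIONS = (-1, +1)             # move left or move right


def step(state, action):
    """Deterministic transition, clipped to the chain."""
    return min(max(state + action, 0), N_STATES - 1)


def r_left(next_state):
    """Hypothetical sub-reward: paid on reaching the left end."""
    return 1.0 if next_state == 0 else 0.0


def r_right(next_state):
    """Hypothetical sub-reward: paid on reaching the right end."""
    return 1.0 if next_state == N_STATES - 1 else 0.0


def r_env(next_state):
    """Environment reward, defined here as the sum of the two sub-rewards."""
    return r_left(next_state) + r_right(next_state)


def greedy_policy(reward, sweeps=50):
    """Value iteration for one reward, then the greedy policy for it."""
    values = [0.0] * N_STATES

    def backup(s, a):
        nxt = step(s, a)
        return reward(nxt) + GAMMA * values[nxt]

    for _ in range(sweeps):
        for s in range(N_STATES):
            if s not in TERMINAL:
                values[s] = max(backup(s, a) for a in ACTIONS)
    return {s: max(ACTIONS, key=lambda a: backup(s, a))
            for s in range(N_STATES) if s not in TERMINAL}


def evaluate(policy, reward, start, max_steps=100):
    """Discounted return of `policy` from `start`, scored under `reward`."""
    total, discount, s = 0.0, 1.0, start
    for _ in range(max_steps):
        if s in TERMINAL:
            break
        s = step(s, policy[s])
        total += discount * reward(s)
        discount *= GAMMA
    return total


pi_left = greedy_policy(r_left)
pi_right = greedy_policy(r_right)
start = 2  # middle of the chain

# Return of each policy under each sub-reward (rows: policy, columns: reward).
cross_values = {
    ("pi_left", "r_left"): evaluate(pi_left, r_left, start),
    ("pi_left", "r_right"): evaluate(pi_left, r_right, start),
    ("pi_right", "r_left"): evaluate(pi_right, r_left, start),
    ("pi_right", "r_right"): evaluate(pi_right, r_right, start),
}
for key, value in cross_values.items():
    print(key, round(value, 3))
```

On this toy chain each policy collects a return of 0.9 under its own sub-reward and 0 under the other one, so pursuing one sub-reward does not incidentally earn the other. How the paper actually turns this kind of cross-value into a training objective is described in the PDF linked above, not in this sketch.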