Machine learning paper: Simulating Emergent Properties of Human Driving Behavior Using Multi-Agent Reward Augmented Imitation Learning

bigrc posted on 2019-3-15 13:19:45
Recent developments in multi-agent imitation learning have shown promising results for modeling the behavior of human drivers. However, it is challenging to capture emergent traffic behaviors that are observed in real-world datasets. Such behaviors arise due to the many local interactions between agents that are not commonly accounted for in imitation learning. This paper proposes Reward Augmented Imitation Learning (RAIL), which integrates reward augmentation into the multi-agent imitation learning framework and allows the designer to specify prior knowledge in a principled fashion. We prove that convergence guarantees for the imitation learning process are preserved under the application of reward augmentation. This method is validated in a driving scenario, where an entire traffic scene is controlled by driving policies learned using our proposed algorithm. Further, we demonstrate improved performance in comparison to traditional imitation learning algorithms both in terms of the local actions of a single agent and the behavior of emergent properties in complex, multi-agent settings.
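The abstract does not spell out how the reward is augmented, so the following is only a rough sketch of the general idea, not the paper's actual formulation. It assumes a GAIL-style surrogate reward computed from a discriminator score and subtracts a fixed penalty for undesirable events. The function names, the event flags (`collided`, `off_road`), and the penalty value are all hypothetical and chosen for illustration.

```python
import math


def imitation_reward(discriminator_score: float) -> float:
    """Surrogate imitation reward from a discriminator score in (0, 1).

    Assumption: the score is the discriminator's estimate that a
    (state, action) pair came from the expert, and the reward uses the
    -log(1 - D) convention. Other conventions exist.
    """
    eps = 1e-8  # avoid log(0) when the score is exactly 1
    return -math.log(max(1.0 - discriminator_score, eps))


def augmented_reward(
    discriminator_score: float,
    collided: bool,
    off_road: bool,
    penalty: float = 2.0,  # hypothetical value, not taken from the paper
) -> float:
    """Imitation reward minus a penalty for each undesirable event.

    The penalties are where a designer could encode prior knowledge,
    e.g. that collisions and leaving the road should be discouraged.
    """
    reward = imitation_reward(discriminator_score)
    if collided:
        reward -= penalty
    if off_road:
        reward -= penalty
    return reward


if __name__ == "__main__":
    # Same discriminator score, with and without a collision.
    print(augmented_reward(0.5, collided=False, off_road=False))
    print(augmented_reward(0.5, collided=True, off_road=False))
```

In a multi-agent setting, a reward like this would be computed per agent at each step and fed to the policy optimizer in place of the plain imitation reward. How the paper chooses and scales its penalties is described in the paper itself, not in this post.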
URL: https://arxiv.org/abs/1903.05766
PDF download: https://arxiv.org/pdf/1903.05766