AI paper: Audiovisual Speaker Tracking using Nonlinear Dynamical Systems with Dynamic Stream Weights

hjrinfo posted on 2019-3-15 12:49:22
Data fusion plays an important role in many technical applications that require efficient processing of multimodal sensory observations. A prominent example is audiovisual signal processing, which has gained increasing attention in automatic speech recognition, speaker localization and related tasks. If appropriately combined with acoustic information, additional visual cues can help to improve performance in these applications, especially under adverse acoustic conditions. A dynamic weighting of acoustic and visual streams based on instantaneous sensor reliability measures is an efficient approach to data fusion in this context. This paper presents a framework that extends the well-established theory of nonlinear dynamical systems with the notion of dynamic stream weights for an arbitrary number of sensory observations. It comprises a recursive state estimator based on the Gaussian filtering paradigm, which incorporates dynamic stream weights into a framework closely related to the extended Kalman filter. Additionally, a convex optimization approach to estimate oracle dynamic stream weights in fully observed dynamical systems utilizing a Dirichlet prior is presented. This serves as a basis for a generic parameter learning framework of dynamic stream weight estimators. The proposed system is application-independent and can be easily adapted to specific tasks and requirements. A study using audiovisual speaker tracking tasks is considered as an exemplary application in this work. An improved tracking performance of the dynamic stream weight-based estimation framework over state-of-the-art methods is demonstrated in the experiments.
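To make the idea of stream-weighted fusion concrete, here is a minimal sketch of a single measurement update for a scalar state (for example, a speaker's azimuth) observed by an acoustic and a visual sensor. This is not the paper's estimator: it assumes linear, direct observations of the state with Gaussian noise, and it assumes each stream's likelihood is raised to the power of its weight, which for a Gaussian amounts to dividing that stream's noise variance by the weight. The function name, the parameters, and the requirement that weights are non-negative and sum to one are all illustrative assumptions.

```python
def weighted_fusion_update(prior_mean, prior_var, observations, noise_vars, weights):
    """One Gaussian measurement update with per-stream weights (illustrative sketch).

    Assumptions (not taken from the paper):
      - scalar state, each sensor observes the state directly plus Gaussian noise
      - each stream likelihood is raised to the power of its weight
      - weights are non-negative and sum to one
    Returns the posterior mean and variance.
    """
    if not (len(observations) == len(noise_vars) == len(weights)):
        raise ValueError("observations, noise_vars and weights must have equal length")
    if any(w < 0.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")

    # Work in information form: precisions add, weighted by the stream weights.
    precision = 1.0 / prior_var
    information = prior_mean / prior_var
    for y, r, w in zip(observations, noise_vars, weights):
        precision += w / r          # a weight of 0 ignores the stream entirely
        information += w * y / r

    post_var = 1.0 / precision
    post_mean = information * post_var
    return post_mean, post_var
```

With a prior of mean 0 and variance 1, observations 1.0 (audio) and 3.0 (video), and unit noise variances, weights of (1, 0) give a posterior mean of 0.5 and weights of (0.5, 0.5) give 1.0, so shifting weight toward a stream pulls the estimate toward that stream's observation.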
URL: https://arxiv.org/abs/1903.06031
PDF download: https://arxiv.org/pdf/1903.06031