Machine learning paper: Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability

Posted by jyzzx on 2019-1-28 11:30:44
The anomaly detection (AD) task corresponds to identifying the true anomalies from a given set of data instances. AD algorithms score the data instances and produce a ranked list of candidate anomalies, which are then analyzed by a human to discover the true anomalies. However, this process can be laborious for the human analyst when the number of false positives is very high. Therefore, in many real-world AD applications, including computer security and fraud prevention, the anomaly detector must be configurable by the human analyst to minimize the effort spent on false positives.

In this paper, we study the problem of active learning to automatically tune an ensemble of anomaly detectors to maximize the number of true anomalies discovered. We make four main contributions towards this goal. First, we present an important insight that explains the practical successes of AD ensembles and how ensembles are naturally suited for active learning. Second, we present several algorithms for active learning with tree-based AD ensembles. These algorithms help us to improve the diversity of discovered anomalies, generate rule sets for improved interpretability of anomalous instances, and adapt to streaming data settings in a principled manner. Third, we present a novel algorithm called GLocalized Anomaly Detection (GLAD) for active learning with generic AD ensembles. GLAD allows end-users to retain the use of simple and understandable global anomaly detectors by automatically learning their local relevance to specific data instances using label feedback. Fourth, we present extensive experiments to evaluate our insights and algorithms. Our results show that, in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, our active learning algorithms under the streaming-data setup are competitive with the batch setup.
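The feedback loop the abstract describes (score instances with an ensemble, show the top-ranked one to an analyst, use the label to retune the ensemble) can be sketched roughly as below. This is only a minimal illustration and not the paper's algorithm: the three toy detectors in `member_scores`, the synthetic data from `make_data`, the simple additive weight update, and the `lr` parameter are all assumptions made up for this example.

```python
import random


def make_data(n_nominal=200, n_anomaly=10, seed=0):
    """Synthetic 2-D data (assumption): anomalies are shifted along x only."""
    rng = random.Random(seed)
    points, labels = [], []
    for _ in range(n_nominal):
        points.append((rng.gauss(0, 1), rng.gauss(0, 1)))
        labels.append(0)
    for _ in range(n_anomaly):
        points.append((rng.gauss(4, 0.5), rng.gauss(0, 1)))
        labels.append(1)
    return points, labels


def member_scores(point):
    """Scores from three hypothetical ensemble members (higher = more anomalous)."""
    x, y = point
    return [abs(x), abs(y), abs(x - y)]


def combined(weights, scores):
    """Ensemble score: weighted sum of the member scores."""
    return sum(w * s for w, s in zip(weights, scores))


def active_loop(points, oracle, budget, lr=0.1):
    """Query the top-ranked unlabeled instance, then nudge the member weights."""
    all_scores = [member_scores(p) for p in points]
    n_members = len(all_scores[0])
    weights = [1.0 / n_members] * n_members  # start from uniform weights
    unlabeled = set(range(len(points)))
    queried, found = [], 0
    for _ in range(budget):
        if not unlabeled:
            break
        # Show the analyst the instance the current ensemble ranks highest.
        top = max(sorted(unlabeled), key=lambda i: combined(weights, all_scores[i]))
        unlabeled.remove(top)
        queried.append(top)
        is_anomaly = bool(oracle(top))
        found += int(is_anomaly)
        # Assumed update rule: reward members that scored a true anomaly
        # highly, penalize members that scored a false positive highly.
        sign = 1.0 if is_anomaly else -1.0
        weights = [max(0.0, w + sign * lr * s)
                   for w, s in zip(weights, all_scores[top])]
        total = sum(weights)
        if total > 0:
            weights = [w / total for w in weights]
        else:
            weights = [1.0 / n_members] * n_members
    return weights, queried, found


points, labels = make_data()
weights, queried, found = active_loop(points, lambda i: labels[i] == 1, budget=20)
print(f"anomalies found in {len(queried)} queries: {found}")
print("member weights after feedback:", [round(w, 3) for w in weights])
```

Here the oracle is just a lookup into the synthetic labels; in a real deployment it would be the human analyst. The tree-based and GLAD algorithms in the paper use their own scoring and update rules, so see the paper itself for those.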
URL: https://arxiv.org/abs/1901.08930
PDF download: https://arxiv.org/pdf/1901.08930