Machine learning paper: Building a Reproducible Machine Learning Pipeline

kkkgame posted on 2018-10-11 09:02:51
Reproducibility of modeling is a problem that exists for any machine learning practitioner, whether in industry or academia. The consequences of an irreproducible model can include significant financial costs, lost time, and even loss of personal reputation (if results prove unable to be replicated). This paper will first discuss the problems we have encountered while building a variety of machine learning models, and subsequently describe the framework we built to tackle the problem of model reproducibility. The framework is comprised of four main components (data, feature, scoring, and evaluation layers), which are themselves comprised of well-defined transformations. This enables us to not only exactly replicate a model, but also to reuse the transformations across different models. As a result, the platform has dramatically increased the speed of both offline and online experimentation while also ensuring model reproducibility.
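The abstract does not show how the framework is implemented, so the following is only a rough sketch of the idea it describes: four layers (data, feature, scoring, evaluation), each made of named, deterministic transformations that can be reused across models. Every class, function, and parameter name here (Transformation, Layer, fingerprint, run_pipeline, the toy transformations, the seed) is a hypothetical choice for illustration, not the paper's API.

```python
import hashlib
import json
import random
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Transformation:
    """A named, versioned, deterministic step (hypothetical structure)."""
    name: str
    version: str
    fn: Callable[[Any], Any]


@dataclass(frozen=True)
class Layer:
    """One of the four layers: data, feature, scoring, or evaluation."""
    name: str
    transformations: Tuple[Transformation, ...]

    def run(self, value: Any) -> Any:
        # Apply the layer's transformations in their declared order.
        for t in self.transformations:
            value = t.fn(value)
        return value


def sample_rows(seed: int):
    # Toy data layer: all randomness comes from the explicit seed.
    rng = random.Random(seed)
    rows = []
    for _ in range(100):
        x = rng.uniform(-1.0, 1.0)
        rows.append((x, 1 if x > 0 else 0))
    return rows


def add_square(rows):
    # Toy feature layer: derive (x, x^2) from the raw value.
    return [((x, x * x), label) for x, label in rows]


def linear_score(rows):
    # Toy scoring layer: fixed weights stand in for a trained model.
    return [(0.8 * f[0] + 0.2 * f[1], label) for f, label in rows]


def accuracy(scored):
    # Toy evaluation layer: fraction of rows classified correctly at threshold 0.
    hits = sum(1 for s, label in scored if (s > 0) == (label == 1))
    return hits / len(scored)


def build_pipeline() -> Tuple[Layer, ...]:
    return (
        Layer("data", (Transformation("sample_rows", "1", sample_rows),)),
        Layer("feature", (Transformation("add_square", "1", add_square),)),
        Layer("scoring", (Transformation("linear_score", "1", linear_score),)),
        Layer("evaluation", (Transformation("accuracy", "1", accuracy),)),
    )


def fingerprint(layers: Tuple[Layer, ...], seed: int) -> str:
    """Hash the pipeline definition (layer and step names, versions, seed)."""
    spec = {
        "seed": seed,
        "layers": [
            [layer.name, [[t.name, t.version] for t in layer.transformations]]
            for layer in layers
        ],
    }
    encoded = json.dumps(spec, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def run_pipeline(layers: Tuple[Layer, ...], seed: int) -> Any:
    # The seed enters at the data layer; each layer feeds the next.
    value: Any = seed
    for layer in layers:
        value = layer.run(value)
    return value
```

Calling run_pipeline(build_pipeline(), 7) twice returns the same metric, and fingerprint(...) gives an identifier that could be stored next to a result to record which transformations and seed produced it. Because each Transformation is a standalone object, a step such as add_square could be placed in another model's feature layer unchanged, which is the kind of reuse the abstract mentions.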
URL: https://arxiv.org/abs/1810.04570
PDF download: https://arxiv.org/pdf/1810.04570