Open-source paper code: SNIPER: Efficient Multi-Scale Training

Posted by admin on 2018-6-24 05:44:49
The code for this paper is open-sourced on GitHub and is mostly written in Python. The paper's abstract follows.
We present SNIPER, an algorithm for performing efficient multi-scale training in instance-level visual recognition tasks. Instead of processing every pixel in an image pyramid, SNIPER processes context regions around ground-truth instances (referred to as chips) at the appropriate scale. For background sampling, these context regions are generated using proposals extracted from a region proposal network trained with a short learning schedule. Hence, the number of chips generated per image during training adaptively changes based on the scene complexity. SNIPER only processes 30% more pixels compared to the commonly used single-scale training at 800x1333 pixels on the COCO dataset. But it also observes samples from extreme resolutions of the image pyramid, like 1400x2000 pixels. As SNIPER operates on resampled low-resolution chips (512x512 pixels), it can have a batch size as large as 20 on a single GPU even with a ResNet-101 backbone. Therefore it can benefit from batch normalization during training without the need for synchronizing batch-normalization statistics across GPUs. SNIPER brings training of instance-level recognition tasks like object detection closer to the protocol for image classification and suggests that the commonly accepted guideline that it is important to train on high-resolution images for instance-level visual recognition tasks might not be correct. Our implementation based on Faster-RCNN with a ResNet-101 backbone obtains an mAP of 47.6% on the COCO dataset for bounding box detection and can process 5 images per second with a single GPU. The code is available at https://github.com/MahyarNajibi/SNIPER
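To make the chip idea in the abstract a bit more concrete, here is a rough Python sketch of how 512x512 chips could be placed around ground-truth boxes at several scales. This is not the authors' implementation. The scales, the size ranges in `SCALE_RANGES`, the greedy placement, and the names `Chip`, `chips_for_scale`, and `generate_chips` are all assumptions made up for illustration; only the 512-pixel chip size comes from the abstract. Background chips built from region-proposal-network proposals are not modeled here.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in original-image pixels

CHIP_SIZE = 512.0  # chip side length, as stated in the abstract

# Hypothetical (scale, min_side, max_side) triples: a box is assigned to a
# scale only if its longer side, after resizing, falls inside the range.
SCALE_RANGES = [(2.0, 0.0, 128.0), (1.0, 64.0, 320.0), (0.5, 256.0, 512.0)]


@dataclass(frozen=True)
class Chip:
    scale: float  # resize factor applied to the image before cropping
    x: float      # left edge of the chip in the resized image
    y: float      # top edge of the chip in the resized image


def _clamp(value: float, low: float, high: float) -> float:
    return min(max(value, low), high)


def chips_for_scale(boxes: Sequence[Box], img_w: float, img_h: float,
                    scale: float, min_side: float, max_side: float) -> List[Chip]:
    """Greedily place chips over the boxes that suit this scale (assumed scheme)."""
    resized_w, resized_h = img_w * scale, img_h * scale

    # Keep only boxes whose resized longer side is in range and fits in a chip.
    pending = []
    for x1, y1, x2, y2 in boxes:
        rb = (x1 * scale, y1 * scale, x2 * scale, y2 * scale)
        longer_side = max(rb[2] - rb[0], rb[3] - rb[1])
        if min_side <= longer_side <= min(max_side, CHIP_SIZE):
            pending.append(rb)

    chips: List[Chip] = []
    while pending:
        x1, y1, x2, y2 = pending[0]
        # Center a chip on the first uncovered box, then clamp it into the image.
        cx = _clamp((x1 + x2) / 2 - CHIP_SIZE / 2, 0.0, max(resized_w - CHIP_SIZE, 0.0))
        cy = _clamp((y1 + y2) / 2 - CHIP_SIZE / 2, 0.0, max(resized_h - CHIP_SIZE, 0.0))
        chips.append(Chip(scale, cx, cy))
        # Drop the box just handled plus every other box fully inside this chip.
        pending = [
            b for b in pending[1:]
            if not (cx <= b[0] and cy <= b[1]
                    and b[2] <= cx + CHIP_SIZE and b[3] <= cy + CHIP_SIZE)
        ]
    return chips


def generate_chips(boxes: Sequence[Box], img_w: float, img_h: float,
                   scale_ranges=SCALE_RANGES) -> List[Chip]:
    """Collect chips over all scales; the chip count grows with scene complexity."""
    chips: List[Chip] = []
    for scale, min_side, max_side in scale_ranges:
        chips.extend(chips_for_scale(boxes, img_w, img_h, scale, min_side, max_side))
    return chips
```

In this sketch, small boxes are picked up only at the upsampled scale and large boxes only at the downsampled scale, so an image with a few clustered objects yields very few chips while a cluttered image yields more. That is the sense in which the number of chips adapts to the scene.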
Paper URL: https://arxiv.org/abs/1805.09300v2
PDF download: https://arxiv.org/pdf/1805.09300v2
GitHub download: https://github.com/MahyarNajibi/SNIPER
Note: the framework may be TensorFlow, PyTorch, or Keras. The poster has not fully tested which one it is, so check the repository README to confirm.