A Year in Computer Vision
The M Tank
Website: http://themtank/
Contact: info@themtank
Note: This document is intended for educational purposes only. Any information
contained within is representative of the editors’ professional views. This piece
draws on a number of academic publications, for which references are provided where
appropriate.
Edited for The M Tank by
Benjamin F. Duffy
&
Daniel R. Flynn
A Year in Computer Vision: The M Tank, 2017
Table of Contents
Introduction
Part One: Classification/Localisation, Object Detection, Object Tracking
Classification/Localisation
Object Detection
Object Tracking
Part Two: Segmentation, Super-res/Colourisation/Style Transfer, Action Recognition
Segmentation
Super-resolution, Style Transfer & Colourisation
Action Recognition
Part Three: Toward a 3D understanding of the world
Other uncategorised 3D
In summation
Part Four: ConvNet Architectures, Datasets, Ungroupable Extras
ConvNet Architectures
Datasets
Ungroupable extras and interesting trends
Conclusion
Introduction
Computer Vision typically refers to the scientific discipline of giving machines the ability
of sight, or perhaps more colourfully, enabling machines to visually analyse their
environments and the stimuli within them. This process typically involves the evaluation
of an image, images or video. The British Machine Vision Association (BMVA) defines
Computer Vision as “the automatic extraction, analysis and understanding of useful
information from a single image or a sequence of images.”1
The term ‘understanding’ provides an interesting counterpoint to an otherwise
mechanical definition of vision, one which serves to demonstrate both the significance
and complexity of the Computer Vision field. True understanding of our environment is
not achieved through visual representations alone. Rather, visual cues travel through
the optic nerve to the primary visual cortex and are interpreted by the brain, in a highly
stylised sense. The interpretations drawn from this sensory information encompass the
near-totality of our natural programming and subjective experiences, i.e. how evolution
has wired us to survive and what we learn about the world throughout our lives.
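In this respect, vision only relates to the transmission of images for interpretation,
while computing said images is more analogous to thought or cognition, drawing on a
multitude of the brain’s faculties. Hence, many believe that Computer Vision, a true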
understanding of visual environments and their contexts, paves the way for future
iterations of Strong Artificial Intelligence, due to its cross-domain mastery.
However, put down the pitchforks as we’re still very much in the embryonic stages of
this fascinating field. This piece simply aims to shed some light on 2016’s biggest
Computer Vision advancements and, hopefully, to ground some of these advancements in
a healthy mix of expected near-term societal interactions and, where applicable,
tongue-in-cheek prognostications of the end of life as we know it.
While our work is always written to be as accessible as possible, sections within this
particular piece may be oblique at times due to the subject matter. We do provide
rudimentary definitions throughout; however, these only convey a facile understanding
of key concepts. In keeping our focus on work produced in 2016, omissions are often
made in the interest of brevity.
One such glaring omission relates to the functionality of Convolutional Neural Networks
(hereafter CNNs or ConvNets), which are ubiquitous within the field of Computer Vision.
1 British Machine Vision Association (BMVA). 2016. What is computer vision? [Online] Available: bmva/visionoverview [Accessed: 21/12/2016]
The success of AlexNet in 2012, a CNN architecture which blindsided ImageNet 2
competitors, proved to be the instigator of a de facto revolution within the field, with numerous
researchers adopting neural network-based approaches as part of Computer Vision’s
new period of ‘normal science’.3
Over four years later, CNN variants still make up the bulk of new neural network
architectures for vision tasks, with researchers reconstructing them like Lego bricks, a
working testament to the power of both open source information and Deep Learning. However,
an explanation of CNNs could easily span several postings and is best left to those with
deeper expertise on the subject and an affinity for making the complex
understandable.
For casual readers who wish to gain a quick grounding before proceeding we
recommend the first two resources below. For those who wish to go further still, we
have ordered the resources below to facilitate that:
● What a Deep Neural Network thinks about your #selfie from Andrej Karpathy
is one of our favourites for helping people understand the applications and
functionalities behind CNNs.4
● Quora: “What is a convolutional neural network?” has no shortage of great
links and explanations. It is particularly suited to those with no prior understanding.5
● CS231n: Convolutional Neural Networks for Visual Recognition from
Stanford University is an excellent resource for more depth.6
● Deep Learning (Goodfellow, Bengio & Courville, 2016) provides detailed
explanations of CNN features and functionality in Chapter 9. The textbook has
been kindly made available for free in HTML format by the authors.7
For those wishing to understand more about Neural Networks and Deep Learning in
general we suggest:
2 Krizhevsky, A., Sutskever, I. and Hinton, G. E. 2012. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012: Neural Information Processing Systems. Available: cs.toronto/~kriz/imagenet_classification_with_deep_convolutional.pdf
3 Kuhn, T. S. 1962. The Structure of Scientific Revolutions. University of Chicago Press.
4 Karpathy, A. 2015. What a Deep Neural Network thinks about your #selfie. [Blog] Andrej Karpathy Blog. Available: http://karpathy.github.io/2015/10/25/selfie/ [Accessed: 21/12/2016]
5 Quora. 2016. What is a convolutional neural network? [Online] Available: https://quora/What-is-a-convolutional-neural-network [Accessed: 21/12/2016]
6 Stanford University. 2016. Convolutional Neural Networks for Visual Recognition. [Online] CS231n. Available: http://cs231n.stanford/ [Accessed: 21/12/2016]
7 Goodfellow et al. 2016. Deep Learning. MIT Press. [Online] [Accessed: 21/12/2016] Note: Chapter 9, Convolutional Networks [Available: deeplearningbook/contents/convnets.html]