“The many ways to understand the pixels”
The field of
computer vision is arguably seeing one of its most transformative changes in
recent history. Convolutional neural networks (CNNs) have revolutionized the
field, reaching super-human performance on some long-standing computer vision
tasks, such as image classification. The success of these networks is fueled by
massive amounts of human-labeled data. However, this paradigm does not scale to
a deeper and more detailed understanding of images, as it is simply too hard to
collect enough human-labeled data. The issue is not that we humans don't
understand the image, but that we often struggle to convey enough information to
successfully supervise a vision system.
In this talk, I show
how computer vision can go beyond massive human supervision. This involves
designing better models that deal with fewer labels, exploiting easier and more
intuitive annotations, or coming up with novel optimizations to train deep
architectures with far fewer human annotations, or even without any at all.
I'll focus on three long-standing computer vision problems: semantic
segmentation, intrinsic image decomposition, and dense semantic correspondences.
Speaker:
Philipp Krähenbühl is
a postdoctoral researcher at UC Berkeley. He received a B.S. in Computer
Science from ETH Zurich in 2009, and a PhD in Computer Science from Stanford
University in 2014. Philipp's research interests lie in computer vision,
machine learning, and computer graphics. He is particularly interested in deep
learning, efficient optimization techniques, and structured output prediction.