While the complexity of the datasets used in computer vision has steadily increased over the years, the supervision that vision systems receive from their human teachers has remained limited. In recent years, the vision community has seen increased interest in approaches that enrich the interaction between humans and machines for computer vision tasks, allowing communication beyond labels. Researchers have explored the role of the human teacher in visual recognition and search, and have expanded the channel over which human users can "teach" visual learning systems. For instance, in recognition, a human can give supervision or feedback to the system so that it improves its predictions. There is also great value in enhancing the system-to-human direction of human-machine communication for visual recognition. Researchers have recently studied how to form sentences that explain images, and how to describe items in a way that is most natural to human users. There has also been work on using computer vision to show a human user how the system perceives an image.

The goal of this workshop is to study approaches that allow humans to provide richer supervision to visual learning systems, and to interactively give feedback to the system so it can learn better models for recognition or make more accurate predictions at test time. We are also interested in strategies for making the work of vision systems more interpretable to their human users. Related topics include:


We invite 4-page extended abstracts that study strategies for improving the communication (broadly defined) between a computer vision system and a human user, with applications to recognition and image/video retrieval.

We encourage submissions both on new, unpublished work and on work previously published in a conference (including ECCV 2014) or a journal. We require 4-page abstracts in ECCV format by the submission deadline. Reviewing will be double-blind, or single-blind in the case of previously published work. We will give a best paper award to one of the accepted abstracts, and that work will be presented as an oral talk. Other accepted papers will be presented as posters with 3-minute spotlights. There will be no proceedings.

Submission link: Please submit your extended abstracts here by July 15.

Important Dates

July 15, 2014: Abstract submission deadline
July 28, 2014: Acceptance notification
September 7, 2014: Workshop

Tentative Program

9:00 - 9:10     Introduction
9:10 - 9:40     James Hays (Brown University)       [slides]
9:40 - 10:10    Ashish Kapoor (Microsoft Research)
10:10 - 10:30   Best abstract oral: Interactive Image Annotation with Visual Feedback, by Julia Moehrmann and Gunther Heidemann
10:30 - 11:00   Coffee break
11:00 - 11:40   Poster spotlights (2.5 min per poster)
11:40 - 12:40   Posters
12:40 - 14:00   Lunch
14:00 - 14:30   Serge Belongie (Cornell Tech)       [slides]
14:30 - 15:00   Larry Zitnick (Microsoft Research)       [slides]
15:00 - 15:30   David Forsyth (UIUC)
15:30 - 16:00   Coffee break
16:00 - 16:30   Vlad Morariu (University of Maryland)
16:30 - 17:00   Discussion and closing


Accepted Papers

1 Interactive Image Annotation with Visual Feedback
Julia Moehrmann and Gunther Heidemann
extended abstract
2 Towards Transparent Systems: Semantic Characterization of Failure Modes
Aayush Bansal, Ali Farhadi, and Devi Parikh
extended abstract      |      full paper
3 Transformative Crowdsourcing: Harnessing the Power of Crowdsourced Imagery to Simplify and Accelerate Investigations
Minwoo Park, TaeEun Choe, Andrew Scanlon, and Allison Beach
extended abstract
4 Learning Localized Perceptual Similarity Metrics for Interactive Categorization
Catherine Wah, Subhransu Maji, and Serge Belongie
extended abstract
5 Interactive Visualization based Active Learning
Mohammadreza Babaee, Stefanos Tsoukalas, Gerhard Rigoll, and Mihai Datcu
extended abstract
6 Attributes Make Sense on Segmented Objects
Zhenyang Li, Efstratios Gavves, Thomas Mensink, and Cees Snoek
full paper
7 Seeing What You're Told: Sentence-Guided Activity Recognition In Video
Siddharth Narayanaswamy, Andrei Barbu, and Jeffrey Siskind
full paper
8 Learning High-level Judgments of Urban Perception
Vicente Ordonez and Tamara Berg
full paper
9 Active Annotation Translation
Steven Branson and Pietro Perona
full paper
10 Interactive Object Counting
Carlos Arteta, Victor Lempitsky, Alison Noble, and Andrew Zisserman
full paper
11 Interactively Guiding Semi-Supervised Clustering via Attribute-based Explanations
Shrenik Lad and Devi Parikh
full paper
12 Zero-Shot Learning via Visual Abstraction
Stanislaw Antol, C. L. Zitnick, and Devi Parikh
full paper
13 Decorrelating Semantic Visual Attributes by Resisting the Urge to Share
Dinesh Jayaraman, Fei Sha, and Kristen Grauman
full paper
14 Attribute Adaptation for Personalized Image Search
Adriana Kovashka and Kristen Grauman
full paper
15 Inferring the Why in Images
Hamed Pirsiavash, Carl Vondrick, and Antonio Torralba
full paper


Program Committee