Meets: Wednesdays 1-4 pm in GDC 2.502
Instructor: Kristen Grauman
Office: GDC 4.726
Office hours: by appointment (send email)
TA: Kai-Yang Chiang
Office: GDC 4.802D
Office hours: Thursday 10:30 am-12:30 pm
Please use Piazza for assignment questions.
Topic:
This is a graduate seminar course in computer
vision. We will survey and discuss current
vision papers relating to visual recognition
(primarily of objects and object
categories), auto-annotation of images, and scene
understanding. The goals of the course will be to
understand current approaches to some important problems,
to actively analyze their strengths and weaknesses, and to
identify interesting open questions and possible
directions for future research.
See the syllabus for an outline
of the main topics we'll be covering.
Requirements:
Students will be responsible for:
- writing two paper reviews each week
- posting two short review summaries/discussion points on the course discussion board (Piazza)
- participating in discussions during class
- completing two programming assignments
- presenting ~twice in class (details depending on final enrollment)
- completing a project with a partner
Note that presentations are due one week before the slot in which
your presentation is scheduled. This means you will need to read
the papers, prepare experiments, create slides, etc. more than
one week before the date you are signed up for.
The idea is to meet and discuss ahead of time, so that we
can iterate as needed during the week leading up to your
presentation. Please coordinate
in advance with the other student presenters on your day
to ensure that no single paper receives two experiments or
two paper presentations.
More details on the requirements
and grading breakdown are here.
Prereqs:
Courses in computer vision and/or machine learning (CS 376 Computer Vision and/or
CS 391 Machine Learning, or similar); ability to understand
and analyze conference papers in this area; programming
required for experiment presentations and projects.
Please talk to me if you are unsure whether the course is a good match for
your background. I generally recommend scanning through
a few papers on the syllabus to gauge what kind of background
is expected. I don't assume you are already familiar
with every single algorithm/tool/image feature a given paper
mentions, but you should feel comfortable following the key
ideas.
Date |
Topics |
Papers and links |
Presenters/slides |
Items due |
Jan 20 |
Course intro |
slides |
||
Jan 27 |
No class |
Topic preferences due via email to Kai by Wed Jan 27. Write "CS381V" in the subject line. |
||
Feb 3 |
Instance recognition: invariant local features, local feature matching, instance recognition, visual vocabularies and bag-of-words, large-scale mining. Image credit: Andrea Vedaldi and Andrew Zisserman |
|
slides, outline |
Coding assignment 1 out, due Friday Feb 19. |
Feb 10 |
Category recognition: image descriptors, classifiers, support vector machines, nearest neighbors, convolutional neural networks, large-scale image collections. Image credit: ImageNet |
|
slides 1: Intro to categorization and case studies of discriminative models; slides 2 handout, slides 2 with links: Guest lecture on CNNs, Dinesh Jayaraman |
Monday Feb 15, 5-7 pm: Hands-on CNN/Caffe tutorial, by Dinesh Jayaraman and Yu-Chuan Su. GDC 4.302 (not the usual classroom). Tutorial slides, tutorial code |
Feb 17 |
Mid-level representations: segmentation into regions, contours, grouping, video segmentation, category-independent object proposals, 3D structure. Image credit: Pablo Arbelaez et al. |
|
slides, Paper-Chun-Chen Kuo, Paper-Andrew Sharp, Expt-Kim Houck, Expt-Chad Voegele |
Coding assignment 2 out Monday Feb 22, due Wed March 9 (with follow-up due Thurs March 10) |
Feb 24 |
Object detection: localizing objects within an image, efficient search, part-based models, semantic segmentation, voting, context, objects in scenes. Image credit: Felzenszwalb et al. |
|
slides, Paper-Richard Teammco, Paper-Huihuang Zheng, Expt-Adam Allevato, Expt-William Xie |
Tuesday March 1, 11 am: UTCS Distinguished Lecture by Prof. Jim Rehg, Georgia Tech. GDC Auditorium |
Mar 2 |
Attributes and parts: visual properties, learning from natural language descriptions, intermediate shared representations. Image credit: Lampert et al. |
|
Paper-Ruohan Gao, Paper-Akanksha Saran, Paper-Zhuode Liu, Expt-Aishwarya Padmakumar, Expt-Abhishek Sinha, Expt-Ashwini Venkatesh |
Thursday March 4, 11 am: Talk by Aditya Khosla, MIT. GDC Auditorium. Tuesday March 8, 11 am: Talk by Philipp Krahenbuhl, UC Berkeley. GDC Auditorium |
Mar 9 |
Language and vision. Image credit: Antol et al. |
|
Paper-Tyler Folkman, Paper-Edward Banner, Paper-Surbhi Goel, Expt-Huihuang Zheng, Expt-Kunal Lad. Guest speaker: Subha Venugopalan |
Project proposal and paper guidelines. Tuesday March 22, 11 am: Talk by David Fouhey, CMU. GDC Auditorium |
Mar 16 |
No class - spring break |
|||
Mar 23 |
Low-supervision learning: feature learning, semantics learning. Leveraging free or nearly free cues for supervision. Internet data, video, egomotion, context... Image credit: X. Chen et al. |
|
Paper-Hilgad Montelo, Paper-Chad Voegele, Paper-Bo Xiong, Expt-Ashish Bora, Expt-Ruohan Gao |
|
Mar 30 |
Great outdoors: linking and visualizing multi-view data from tourist photos, image-based geolocalization, natural scene text detection, discovering correlated non-visual properties in street-side imagery. Image credit: T-Y. Lin et al. |
|
Paper-Manu Agarwal, Paper-Kunal Lad, Expt-Zhuode Liu, Expt-Ruohan Zhang, Expt-Richard Teammco |
|
April 6 |
3D scenes and objects: 3D structure (single views, panoramas, RGBD) and scene layout for visual recognition. Image credit: Y. Xiang et al. |
|
Paper-Adam Allevato, Paper-William Xie, Expt-Hilgad Montelo, Expt-Chun-Chen Kuo, Expt-Andrew Sharp |
|
April 13 |
Recognition in action: learning how to move for recognition, manipulation. 3D objects and the next best view. Image credit: Malmir et al. |
|
Paper-Aishwarya Padmakumar, Paper-Ruohan Zhang, Paper-Abhishek Sinha, Expt-Manu Agarwal, Expt-Yinan Zhao |
|
April 20 |
Noticing and remembering: predicting what gets noticed or remembered in images and video. Saliency, importance, memorability, photography biases. Image credit: T. Liu et al. |
|
Paper-Kim Houck, Paper-Ashish Bora, Expt-Bo Xiong, Expt-Akanksha Saran, Expt-Tyler Folkman |
|
April 27 |
Social signals: cues from people in images: body pose, social groups and roles, attention, gaze following, scene structure. Image credit: Khosla et al. |
|
Paper-Yinan Zhao, Paper-Ashwini Venkatesh, Expt-Surbhi Goel, Expt-Edward Banner |
Note April 27/29 deadlines for free poster printing at UTCS. See Piazza post for details |
May 4 |
Final project presentations in class |
See poster presentation instructions on Piazza. |
Final papers and poster reviews due Friday May 6 |