CS 376: Computer Vision
    Spring 2018

        Tues/Thurs 3:30-5:00 pm
        GDC 2.216 (auditorium)


Instructor: Kristen Grauman
Office location: GDC 4.726
Office hours: Tues 2:30-3:30 pm and by appointment

TA: Thomas Crosley
Office location: TA Station #4, GDC
Office hours: Mon 3:30-4:30 pm and Wed 9-10 am

TA: Kapil Krishnakumar
Office location: TA Stations
Office hours: Mon 4:30-5:30 pm and Fri 3-4 pm

TA: Shubham Sharma
Office location: TA Station #4, GDC 1.302
Office hours: Tues/Thurs 5-6 pm

Please use Piazza for assignment help.


Course description

This is an intro course in computer vision.  It is intended for upper-level undergraduate students.

Billions of images are on the web---how can you find the ones you are interested in?  How could photo collections on social media be indexed automatically by the people or events they contain?  How could we interact with a computer using natural gestures or facial expressions?  How can a robot identify objects in complex environments, or navigate uncharted territory?  After capturing video with a wearable camera for days on end, how can we determine which snapshots are worth keeping?  How can we develop augmented reality systems that overlay visualizations relevant to the real-world content in sight, e.g., a menu for the restaurant you just passed on the street, or a field guide entry for the unusual insect you encountered while hiking?

All such questions demand high-level computer vision.  In computer vision, the goal is to develop methods that enable a machine to “understand” or analyze images and videos.  In this introductory vision course, we will explore fundamental topics in the field ranging from low-level feature extraction to high-level visual recognition.

After covering the fundamentals for image processing, grouping, and multiple views, we will emphasize machine learning-based methods, especially for supervised learning and classification.  While we will motivate the concepts from the vision problems, the learning algorithms we will study are also useful tools for other domains in AI and beyond. 


A high-level summary of the syllabus is as follows:
I. Features and filters: low-level vision
II. Grouping and fitting: mid-level vision
III. Multiple views
IV.  Recognition: high-level vision

The course textbook is:

    Computer Vision: Algorithms and Applications, by Rick Szeliski.

It is freely available online or may be purchased in hardcopy.  Course lecture slides will be posted below and are also a useful reference.

You may also find the following books useful.

Schedule (cumulative to date)

Readings and links

Thurs Jan 18
Course intro

Textbook Sec 1.1-1.3

Course requirements

UTCS account setup

Basic Matlab tutorial

Running Matlab at UT
A0 out, due Tues Jan 23

See optional Latex info

Tues Jan 23
Features and filters
Sec 3.1.1-2, 3.2 Linear filters
slides pdf
slides ppt

Thurs Jan 25
Sec 3.2.3, 4.2

Seam carving paper

Seam carving video
Gradients, edges

slides pdf
slides ppt
A1 out, due Fri Feb 9

See optional Latex templates

Tues Jan 30

Sec 3.3.2-4 Binary image analysis

slides pdf
slides ppt

Thurs Feb 1
Sec 10.5

Texture Synthesis 

Texture synthesis by non-parametric sampling, Efros & Leung

Video textures

Style transfer for video

Evaluating texture synthesis

slides pdf
slides pptx

Tues Feb 6
Sec 8.4 (up until 8.4.1)

Motion magnification
Optical flow

slides pdf
slides ppt

Thurs Feb 8
Grouping and fitting
Sec 4.3.2

Hough transform line video
Hough transform

slides pdf
slides ppt

Tues Feb 13

Hough transform

slides pdf
slides ppt

Thurs Feb 15
Sec 5.1.1 Deformable contours

slides pdf
slides ppt
A2 out, due Friday Mar 2

Tues Feb 20
Sec 5.2-5.4 Segmentation

slides pdf
slides ppt

Thurs Feb 22

Vision/graphics talk by Katie Bouman, MIT: "Imaging the Invisible".  GDC auditorium, Tues Feb 27, 11 am.

Tues Feb 27
Multiple views
Sec 4.1 Local invariant features: detection

slides pdf
slides ppt

Thurs Mar 1
Lowe SIFT paper

VLFeats library

Affine covariant features code
Local invariant features: description and matching

slides pdf
slides ppt
Practice midterm handout in class

Tues Mar 6
Sec 2.1.1, 2.1.2, 6.1.1, 6.1.4


slides pdf
slides ppt

Thurs Mar 8

Midterm exam

Tues Mar 20
Sec 3.6.1

HP frames video 1
HP frames video 2
Homography and image warping

slides pdf
slides ppt
A3 posted, Mar 19

Vision job talk, Tues 11 am in GDC auditorium, Shuran Song, Princeton,
Seeing the Unseen: Data-Driven 3D Scene Understanding for Robot Vision

Thurs Mar 22
Sec 11.1.1, 11.2-11.5 Stereo, part 1

slides pdf
slides ppt

Tues Mar 27
Audio camera

Graph cuts stereo matching demo

Middlebury stereo database

DeepStereo, Flynn et al.

Object labeling in RGB-D videos, Lai et al.

Body shape and pose from RGBD - Bogo et al.
Stereo, part 2

slides pdf
slides ppt

Thurs Mar 29
Synthesis Ch 4, 5, 6 (pdf on Canvas)

Szeliski 14.3

Video Google demo by Sivic et al., paper

David Lowe's SIFT and Generalized Hough approach (Lowe, IJCV 2004)

Google Goggles
Instance recognition

slides pdf
slides ppt

Tues April 3
Stanford Mobile Visual Search Data Set, Chandrasekhar et al.


Astrometry.net: Blind astrometric calibration of arbitrary astronomical images.  Lang et al.
Instance recognition

slides pdf
slides ppt
A4 out, due April 17

Vision job talk: Saurabh Gupta, UC Berkeley, 11 am GDC auditorium; "Visual Perception and Navigation in 3D Scenes"

Thurs April 5
Sequence to sequence: video to text, Venugopalan et al.
Guest lecture, Prof. Ray Mooney

Language + vision: Video captioning

Tues April 10
Synthesis (pdf on Canvas)

Geometric Min-Hash, Chum et al.
Mining for objects

slides pdf
slides ppt

Thurs April 12

Viola-Jones face detection paper
Intro to category recognition

Face detection with boosting

slides pdf
slides ppt

Tues April 17
Burges SVM tutorial

Dalal-Triggs pedestrian detection paper
Classifier cascades

Object proposals

slides pdf
slides ppt
A4 due

A5 out, due Wed May 2

Vision job talk, Hanbyul Joo, CMU: Social signal processing: A computational approach to sensing, reconstructing, and understanding social interaction.  11 am in GDC auditorium.

Thurs April 19
Burges SVM tutorial

Hays-Efros im2gps paper

Lazebnik et al. Spatial pyramids paper

Vondrick et al. Hoggles paper

Dalal-Triggs pedestrian detection paper

SVM demo
Support vector machines

Nearest neighbor

Evaluation metrics

slides pdf
slides ppt

Tues April 24
Convolutional neural networks for visual recognition (Stanford)

Krizhevsky et al. Imagenet classification with deep convolutional neural networks paper

Clarifai demo
Neural networks


slides pdf
slides ppt

Thurs April 26
Relative attributes

Visual recognition with humans in the loop

UT Zappos 50K Dataset

WhittleSearch project

Visipedia project

Merlin bird app
iNaturalist app


Fine-grained recognition

slides pdf
slides ppt

Tues May 1
Fully convolutional networks for semantic segmentation

FusionSeg for video segmentation

Click Carving for interactive video segmentation
Guest lecture, Dr. Suyog Jain

Semantic segmentation

Interactive image and video segmentation

CNNs for segmentation

slides pdf
slides ppt

Thurs May 3

Course wrap-up and applications

Final exam is Thurs May 10, 2-5 pm


Prerequisites
Basic knowledge of probability and linear algebra; data structures, algorithms; programming experience.  Previous experience with image processing will be useful but is not assumed.

Assignments will consist largely of Matlab programming problems.  There will be a warm-up assignment to get familiar with basic Matlab commands.  We will recommend useful functions to check out per assignment.  However, students are expected to practice and pick up Matlab on their own in order to complete the assignments.  The instructor and TAs are happy to help with Matlab issues during office hours and via Piazza. 

If you are unsure if your background is a good match for this course, please come talk to the instructor.

Course requirements
Assignments:  Assignments will be given approximately every two weeks.  The programming problems will provide hands-on experience working with techniques covered in or related to the lectures.  All code and written responses must be completed individually.  Most assignments will take significant time to complete.  Please start early, and use Piazza and/or see us during office hours for help if needed.    Please follow instructions in each assignment carefully regarding what to submit and how to submit it.

Extension policy: If you turn in your assignment late, expect points to be deducted. Extensions will be considered on a case-by-case basis, but in most cases they will not be granted.  The greater the advance notice of a need for an extension, the greater the likelihood of leniency.  For programming assignments, by default, 10 points (out of 100) will be deducted for lateness for each day late.  We will use the submission program timestamp to determine time of submission.  One day late = from 1 minute to 24 hours past the deadline.  Two days late = from 24 hours and 1 minute to 48 hours past the deadline.  We will not accept assignments more than 4 days late, or once solutions have been discussed in class, whichever is sooner.
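As an unofficial illustration of the default lateness arithmetic above, here is a minimal Python sketch.  The function names and the example deadline time are hypothetical, and the sketch assumes that any amount of time past the deadline counts as the first day late.  It does not model extensions or the rule about solutions being discussed in class; the submission program timestamp and the course staff remain the authority.

```python
import math
from datetime import datetime, timedelta

POINTS_PER_DAY = 10   # default deduction (out of 100) per day late
MAX_DAYS_LATE = 4     # submissions later than this are not accepted


def days_late(deadline, submitted):
    """Number of days late, with any partial day counted as a full day."""
    late = submitted - deadline
    if late <= timedelta(0):
        return 0
    # 1 minute to 24 hours -> 1 day; 24 hours and 1 minute to 48 hours -> 2 days
    return math.ceil(late / timedelta(days=1))


def late_penalty(deadline, submitted):
    """Points deducted, or None if the submission is too late to be accepted."""
    n = days_late(deadline, submitted)
    if n > MAX_DAYS_LATE:
        return None
    return POINTS_PER_DAY * n


if __name__ == "__main__":
    # Hypothetical deadline time; the real one is stated in each assignment.
    deadline = datetime(2018, 2, 9, 23, 59)
    submitted = deadline + timedelta(hours=30)
    print(late_penalty(deadline, submitted))
```

Under these assumptions, a submission 30 hours past the deadline falls in the "two days late" bracket.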

Exams:  There is an in-class midterm and a comprehensive final exam.  Both exams will be offered at the listed time only.  The registrar will set our final exam date, which according to the published UT academic calendar could be as late as May 15 this year.  Please account for this when making your summer plans.  Neither exam will be offered at a different time to accommodate personal travel plans, internship start dates, interviews, etc.

Participation/attendance:  Regular attendance is expected.  If for whatever reason you are absent, it is your responsibility to find out what you missed that day.  Note that attendance does factor into the final grade.  (See Section II of the UTCS Code of Conduct regarding attendance expectations.) 

General responsibilities:  Beyond the above, your responsibilities in the class are:

Important Dates

Please note the following important dates and deadlines.

Assignments are due about every two weeks.  The assignment deadlines below are tentative and are provided to help your planning.  They are subject to minor shifts if the lecture plan needs to be adjusted slightly according to our pace in class.

Grading Policy

Grades will be determined as follows.  You can check your current grades online using Canvas.

Academic Dishonesty Policy

You are encouraged to discuss the readings and concepts with classmates. However, all written work and code must be your own. All work ideas, quotes, and code fragments that originate from elsewhere must be cited according to standard academic practice.

Students caught cheating will automatically fail the course.  The case will also be reported to the Office of the Dean of Students, which may institute its own disciplinary measures. If in doubt, look at the departmental guidelines and/or ask.

Notice about Students with Disabilities

The University of Texas at Austin provides upon request appropriate academic accommodations for qualified students with disabilities. To determine if you qualify, please contact the Dean of Students at 471-6529; 471-4641 TTY. If they certify your needs, I will work with you to make appropriate arrangements.

Notice about Missed Work Due to Religious Holy Days

A student who misses an examination, work assignment, or other project due to the observance of a religious holy day will be given an opportunity to complete the work missed within a reasonable time after the absence, provided that he or she has properly notified the instructor. It is the policy of the University of Texas at Austin that the student must notify the instructor at least fourteen days prior to the classes scheduled on dates he or she will be absent to observe a religious holy day. For religious holy days that fall within the first two weeks of the semester, the notice should be given on the first day of the semester. The student will not be penalized for these excused absences, but the instructor may appropriately respond if the student fails to complete satisfactorily the missed assignment or examination within a reasonable time after the excused absence.

Latex Templates for Assignment Write-ups (Optional)

You may use any tool you like to prepare assignment write-ups, so long as the result is organized and clear.  Typically we ask for a mix of descriptions/explanations and embedded figures composed of images and/or plots produced in Matlab.

Below we provide some info about using Overleaf, a free online editor for Latex.  Overleaf provides various Latex templates and compiles your edited .tex files into a pdf automatically.  The basics:

    1) go to overleaf.com
    2) sign up/sign in
    3) click new project on the left
    4) scroll down to "Homework Assignment" and click on "more homework assignment templates"
    5) choose whichever template you feel comfortable with and click "open as template"
    6) start editing
    7) once you are done editing, click "PDF" in the panel above. A pdf file will be generated and downloaded automatically.

Here are instructions about inserting images.

How to position images.

Captioning, scaling, resizing.