Computer Vision and Deep Learning

(2018 Undergraduate Course)

Instructor: Prof. Mu Yadong (email: myd@pku.edu.cn)
Location: Room 319, Building 2
Time: Friday 15:10 - 17:00 (every week)
TA: Chi Lu (chilu@pku.edu.cn), Liu Chenchen (liuchenchen@pku.edu.cn), Weng Xinyu (wengxy@pku.edu.cn)
Office Hours: Thursday 4-6 pm, Friday 10 am-12 pm, ICST Peking University (128 Zhong-Guan-Cun North Road)
March 2, 2018 Introduction
  • Course Introduction
  • Overview of Computer Vision and Deep Learning
March 9, 2018 Topic: Visual Recognition
  • Visual Recognition: Task Definition and Challenges
  • Visual Features: Harris Corner, SIFT, MSER, HOG etc.
  • Bag-of-words Models
  • Spatial Pyramid Matching
  • Pyramid Match Kernel
  • Vocabulary Tree
  • Sparse Coding
March 16, 2018 Topic: Visual Recognition
  • Deep Learning for Visual Recognition: LeNet-5, AlexNet, VGG-16, GoogLeNet, ResNet
  • Network Visualization
March 23, 2018 Topic: Object Detection
  • V-J Face Detector (Integral Image, AdaBoost, Cascade)
  • HOG+SVM with NMS
  • Deformable Part Model (DPM) for Pedestrian Detection
  • Assignment 1: Image Classification (deadline: 4/23 midnight)
March 30, 2018 Topic: Object Detection
  • R-CNN
  • Fast R-CNN
  • Faster R-CNN
  • R-FCN
  • Multi-Scale R-CNN
  • Feature Pyramid Network
April 6, 2018 Topic: Pixel Computing
  • Pixel Labeling: Segmentation, Matting, Parsing
  • Unsupervised Image Segmentation: K-means, Mean-Shift, Normalized Cut
  • Interactive Object Cutout: GraphCut, GrabCut, LazySnapping
  • Image Matting: Poisson Matting, Closed-Form Matting, Robust Color Sampling
  • Image Co-segmentation
  • Image Inpainting / Image Completion
  • Scene Parsing: Sparse Coding
April 13, 2018 Topic: Pixel Computing
  • Deep Pixel Labeling: FCN, DeepLab, SegNet, CNN-as-RNN
  • Human Pose Estimation: Bottom-Up and Top-Down
  • Assignment 2: Image Segmentation (deadline: 5/13 midnight)
  • Note on Survey Paper (deadline: TBD)
  • Final Course Project: Send Team Information and Topic (deadline: 4/27 midnight)
April 27, 2018 Topic: Video Computing
  • Introduction to Video Computing Tasks
  • Video Features (STIP, Deep Video, C3D, Trajectory Feature)
  • Deep Learning for Video Classification (multi-stream fusion techniques)
  • Video Event Detection and Action Detection
  • An Illustrative System for Video Classification
May 11, 2018 Topic: Recurrent Deep Networks
  • Unrolling Computational Graph
  • RNN variants (recurrent through output, sequence-input-single-output, teacher forcing, encoder-decoder, bi/quad-directional RNN, etc.)
  • Generative RNN modeling
  • Backpropagation through time (BPTT)
  • Long-term dependencies in RNNs: the problem and its remedies
  • Long short-term memory (LSTM)
  • Applications (image captioning, convLSTM for rainfall prediction, social LSTM)
May 18, 2018 Topic: Autonomous Driving
  • Past, Present, and Prospect
  • Funding, Challenges, and Benchmarks
  • DeepLanes
  • CCF-UISEE Traffic Sign Detection
  • Autonomous Steering
  • Deep Driving
  • Driver Maneuver Prediction
  • Localization via Visual Odometry
May 25, 2018 Topic: Invited Talks
  • Wang Jindong (MSRA), Interleaved Group Convolutions for Efficient Deep Neural Networks.
  • Dai Jifeng (MSRA), Spatial Transformers and Deformable CNNs.
June 1, 2018 Topic: Subspace Learning and Image Hashing
  • Dimensionality Reduction: PCA, CCA, Fisher LDA, EigenFace, LDA-Face
  • Nonlinear Methods: MDS, ISOMAP, LLE
  • LPP, graph embedding
  • Locality-Sensitive Hashing: the concept and the proof of sublinear complexity in the STOC '98 paper
  • LSH schemes for cosine similarity, Jaccard index (minHash)
  • Applications of LSH in computer vision
June 8, 2018 Course Project Presentation Schedule [pdf]
June 15, 2018 Course Project Presentation Schedule [pdf]