Advanced Topics in Computer Vision

(2018 Graduate Course)

Instructor: Prof. Mu Yadong
Location: Room 406, Building 3
Time: Tuesday 10:10am - 12:00pm (biweekly), Thursday 10:10am - 12:00pm (weekly)
TAs: Chi Lu, Liu Chenchen, Weng Xinyu
Office Hours: TBD
March 1, 2018 Introduction
  • Course Introduction
  • Basics of Deep Neural Networks
March 6, 2018 Topic: Visual Recognition
  • Visual Recognition: Task Definition and Challenges
  • Bag-of-words Models
  • Spatial Pyramid Matching & Pyramid Match Kernel
March 8, 2018 Topic: Visual Recognition
  • Vocabulary Tree, Sparse Coding
  • Deep Learning for Visual Recognition: LeNet-5, AlexNet, VGG-16, GoogLeNet, ResNet, DenseNet, DualPathNet
March 15, 2018 No Class  
March 20, 2018 Topic: Object Detection
  • V-J Face Detector (Integral Image, AdaBoost, Cascade)
  • HOG+SVM with NMS
  • Deformable Part Model (DPM) for Pedestrian Detection
March 22, 2018 Topic: Object Detection
  • R-CNN
  • Fast R-CNN
  • Faster R-CNN
  • R-FCN
  • Multi-Scale R-CNN
  • Feature Pyramid Network
March 29, 2018 Topic: Pixel Labeling
  • Pixel Labeling: Segmentation, Matting, Parsing
  • Unsupervised Image Segmentation: K-means, Mean-Shift, Normalized Cut
  • Interactive Object Cutout: GraphCut, GrabCut, LazySnapping
  • Image Matting: Poisson Matting, Closed-Form Matting, Robust Color Sampling
  • Image Co-segmentation
  • Image Inpainting / Image Completion
  • Scene Parsing: Sparse Coding
April 3, 2018 Topic: Pixel Labeling
  • Deep Pixel Labeling: FCN, DeepLab, SegNet, CNN-as-RNN
  • Human Pose Estimation: Bottom-Up and Top-Down
April 5, 2018 No Class  
April 12, 2018 Topic: Large-Scale Image Search
  • Dimensionality Reduction: PCA, CCA, Fisher LDA
  • Nonlinear Methods: MDS, ISOMAP, LLE
  • LPP, graph embedding
  • Johnson-Lindenstrauss lemma
  • The magic of hash collisions: Bloom Filter
April 17, 2018 Topic: Large-Scale Image Search
  • Locality-Sensitive Hashing: the concept and proof of sublinear complexity in the STOC98 paper
  • LSH schemes for Hamming space, cosine similarity, Jaccard index (minHash)
  • Spectral Hashing
  • ITQ
  • Semi-Supervised Hashing
  • Supervised Hashing with Kernels
  • Deep Hashing
  • Discrete Hashing
  • Hashing for Large-Scale Machine Learning
April 19, 2018 Paper Presentation
  • Li Xiangtai, Towards High Performance Video Object Detection, CVPR 2018.
  • Xiong Yifan, Appearance-and-Relation Networks for Video Classification, CVPR 2018.
  • Li Sheng, Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition, AAAI 2018.
  • Zhang Weidong, Channel Pruning For Accelerating Very Deep Neural Networks, CVPR 2017.
  • Liu Chenchen, Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks.
April 26, 2018 Paper Presentation
  • Yuan Bin, You Only Look Once: Unified, Real-Time Object Detection
  • Zheng Yajing, Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition, ICCV 2017.
  • Weng Xinyu, Leveraging Unlabeled Data for Crowd Counting by Learning to Rank, CVPR 2018.
  • Wu Jingyi, PixelGAN Autoencoders, NIPS 2017.
  • Wang Ce, Self-Supervised Intrinsic Image Decomposition, NIPS 2017.
May 1, 2018 No Class  
May 3, 2018 No Class  
May 10, 2018 Paper Presentation
  • Xue Lantian, Harmonious Attention Network for Person Re-Identification, CVPR 2018.
  • Xing Yajie, Context Encoding for Semantic Segmentation, CVPR 2018.
  • Chen Ziqian, Local Binary Convolutional Neural Network, CVPR 2017.
  • Chi Lu, Rethinking Feature Distribution for Loss Functions in Image Classification
  • Li Zongxian, Diversity Regularized Spatiotemporal Attention for Video-based Person Re-identification, CVPR 2018.
May 15, 2018 Topic: Video Computing
  • Introduction to Video Computing Tasks
  • Video Features (STIP, Deep Video, C3D, Trajectory Feature)
  • Deep Learning for Video Classification (multi-stream fusion techniques)
  • An Illustrative System for Video Classification
  • TRECVID Tasks (MED and Instance Search)
May 17, 2018 Topic: Recurrent Deep Networks
  • Unrolling Computational Graph
  • RNN variants (recurrence through output, sequence-input-single-output, teacher forcing, encoder-decoder, bi/quad-directional RNNs, etc.)
  • Generative RNN modeling
  • Backpropagation through time (BPTT)
  • Long-term dependencies in RNNs: the problem and remedies
  • Long short-term memory (LSTM)
  • Applications (image captioning, convLSTM for rainfall prediction, social LSTM)
May 25, 2018 Invited Speech
  • Two speakers from MSRA
May 29, 2018 No Class


May 31, 2018 Topic: Autonomous Vehicle
  • Past, Present, and Prospects
  • Funding, Challenges, and Benchmarks
  • DeepLanes
  • CCF-UISEE Traffic Sign Detection
  • End-to-End Driving Model
  • Deep Driving
  • Localization by Visual Odometry
  • HMM-Based Driver Maneuver Prediction
June 7, 2018 No Class  
June 14, 2018 Course Project Presentation
  • Wu Jingyi, Cell Segmentation and Lineage Tracing
  • Xiong Yifan, Action Prediction.
  • Weng Xinyu, Video Steganography.
  • Yuan Bin, Object Detection.
  • Chi Lu, Large-Scale Video Classification.
  • Zheng Yajing, Fine-Grained Image Recognition on CUB-200-2011.
June 21, 2018 Course Project Presentation
  • Zhang Weidong, Li Xiangtai, Semantic Image Segmentation.
  • Wang Ce, Chen Ziqian, Deep Network Compression.
  • Xue Lantian, Li Zongxian, Li Sheng, Content-based Video Relevance Prediction Challenge (ACM Multimedia Workshop Grand Challenge).
  • Xing Yajie, Adaptive Global Context Inferring for Semantic Segmentation.
  • Liu Chenchen, Fashion AI Competition.