Chen-Hsuan Lin

Hsuan is pronounced like "shoo-en" with a quick transition.
The first name is Chen-Hsuan (neither just Chen nor Hsuan).

I am a 4th-year Ph.D. student in the Robotics Institute at Carnegie Mellon University, working with Prof. Simon Lucey on computer vision & artificial intelligence research. Before that, I received my M.S. in Robotics from CMU and B.S. in Electrical Engineering from National Taiwan University.

I am interested in solving 3D reconstruction problems using self-supervised learning techniques. My long-term research goal is to automate geometric factorization and reconstruction of the 3D world from image/video data, in turn improving the learning efficiency of visual recognition systems. My research has been supported by the NVIDIA Graduate Fellowship.

I have been fortunate to collaborate during internships with Kaiming He, Georgia Gkioxari, and Justin Johnson at Facebook AI Research, as well as Oliver Wang, Eli Shechtman, Bryan Russell, Vladimir Kim, Matthew Fisher, and Ersin Yumer at Adobe Research.

Contact email: chlin (at) cmu (dot) edu

Updates

10/2020 The code and short talk of our NeurIPS 2020 paper are online! Check out the project page.
09/2020 I have two papers accepted to NeurIPS 2020 and 3DV 2020!
04/2019 I will be joining Facebook AI Research (FAIR) for an internship this summer.
02/2019 I have one paper accepted to CVPR 2019!
12/2018 I received the NVIDIA Graduate Fellowship (10 selected worldwide). Thanks NVIDIA!
04/2018 I will be joining Adobe Research for a second internship this summer.
older updates...

Research

SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images

Chen-Hsuan Lin, Chaoyang Wang, and Simon Lucey
Advances in Neural Information Processing Systems (NeurIPS), 2020
paper arXiv preprint project page code BibTeX
@inproceedings{lin2020sdfsrn,
  title={SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images},
  author={Lin, Chen-Hsuan and Wang, Chaoyang and Lucey, Simon},
  booktitle={Advances in Neural Information Processing Systems ({NeurIPS})},
  year={2020}
}
3D shape reconstruction as signed distance functions (SDFs) trained from static images. We establish the geometric connection of 3D SDFs with distance-transformed silhouettes for learning from single images without 3D supervision.
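For intuition about the representation used here: a signed distance function maps a 3D point to its distance from the surface, with negative values inside the shape. A minimal NumPy sketch for an analytic sphere (illustrative only; the paper learns the SDF with a neural network):

```python
import numpy as np

def sphere_sdf(points, center=np.zeros(3), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the
    surface, positive outside."""
    return np.linalg.norm(points - center, axis=-1) - radius

# Query the origin (inside), a surface point, and an outside point.
queries = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [2.0, 0.0, 0.0]])
print(sphere_sdf(queries))  # [-1.  0.  1.]
```

The surface is recovered as the zero level set of this function, which is what makes the connection to 2D distance-transformed silhouettes possible.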

Deep NRSfM++: Towards Unsupervised 2D-3D Lifting in the Wild

Chaoyang Wang, Chen-Hsuan Lin, and Simon Lucey
IEEE International Conference on 3D Vision (3DV), 2020 (oral presentation)
paper (coming soon!) arXiv preprint BibTeX
@inproceedings{wang2020deep,
  title={Deep NRSfM++: Towards Unsupervised 2D-3D Lifting in the Wild},
  author={Wang, Chaoyang and Lin, Chen-Hsuan and Lucey, Simon},
  booktitle={IEEE International Conference on 3D Vision ({3DV})},
  year={2020}
}
A self-supervised framework to learn 3D structure recovery from 2D landmarks, closely related to hierarchical block-sparse coding in NRSfM. Our method handles both weak and strong perspective cameras and missing data.

Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction

Chen-Hsuan Lin, Oliver Wang, Bryan C. Russell, Eli Shechtman, Vladimir G. Kim, Matthew Fisher, and Simon Lucey
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
paper arXiv preprint project page code BibTeX
@inproceedings{lin2019photometric,
  title={Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction},
  author={Lin, Chen-Hsuan and Wang, Oliver and Russell, Bryan C and Shechtman, Eli and Kim, Vladimir G and Fisher, Matthew and Lucey, Simon},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition ({CVPR})},
  year={2019}
}
A 3D mesh reconstruction method from RGB videos using multi-view geometry with learned shape priors. We pose the problem as piecewise 2D image alignment, enabling meshes to align against 2D videos without depth or masks.

ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing

Chen-Hsuan Lin, Ersin Yumer, Oliver Wang, Eli Shechtman, and Simon Lucey
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
paper arXiv preprint project page code BibTeX
@inproceedings{lin2018stgan,
  title={ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing},
  author={Lin, Chen-Hsuan and Yumer, Ersin and Wang, Oliver and Shechtman, Eli and Lucey, Simon},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition ({CVPR})},
  year={2018}
}
A GAN architecture that learns to correct the geometry of objects, creating realistic image composites. Our method predicts geometric corrections driven solely by appearance realism, where ground-truth supervision is unavailable.

Deep-LK for Efficient Adaptive Object Tracking

Chaoyang Wang, Hamed Kiani Galoogahi, Chen-Hsuan Lin, and Simon Lucey
IEEE International Conference on Robotics and Automation (ICRA), 2018
paper arXiv preprint BibTeX
@inproceedings{wang2018deeplk,
  title={Deep-LK for Efficient Adaptive Object Tracking},
  author={Wang, Chaoyang and Galoogahi, Hamed Kiani and Lin, Chen-Hsuan and Lucey, Simon},
  booktitle={IEEE International Conference on Robotics and Automation ({ICRA})},
  year={2018}
}
A Siamese regression network with alignment-based structured optimization for object tracking. We learn a feature representation whose regression parameters adapt online to the tracked template images.

Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction

Chen-Hsuan Lin, Chen Kong, and Simon Lucey
AAAI Conference on Artificial Intelligence (AAAI), 2018 (oral presentation)
paper arXiv preprint project page code BibTeX
@inproceedings{lin2018learning,
  title={Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction},
  author={Lin, Chen-Hsuan and Kong, Chen and Lucey, Simon},
  booktitle={AAAI Conference on Artificial Intelligence ({AAAI})},
  year={2018}
}
A simple 2D convolutional network to efficiently reconstruct 3D point cloud shapes. We design a novel differentiable point cloud renderer to approximate the rasterization of dense point clouds to depth images for training.
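The core operation being approximated, rasterizing a point cloud into a depth image, can be sketched with a plain (non-differentiable) z-buffer; the paper's contribution is a differentiable approximation of this step, and the names below are illustrative:

```python
import numpy as np

def points_to_depth(points, H, W):
    """Rasterize 3D points (x, y in pixel coords, z = depth) into an
    H x W depth image, keeping the nearest point per pixel (z-buffer)."""
    depth = np.full((H, W), np.inf)
    for x, y, z in points:
        u, v = int(round(x)), int(round(y))
        if 0 <= v < H and 0 <= u < W:
            depth[v, u] = min(depth[v, u], z)
    return depth

# Two points land on the same pixel; the nearer one (z = 2) wins.
pts = np.array([[1.0, 1.0, 2.0], [1.0, 1.0, 5.0], [0.0, 0.0, 3.0]])
d = points_to_depth(pts, 4, 4)
print(d[1, 1], d[0, 0])  # 2.0 3.0
```

Pixels hit by no point stay at infinity (background), which is why rendering to depth images gives a dense training signal for point cloud generation.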

Object-Centric Photometric Bundle Adjustment with Deep Shape Prior

Rui Zhu, Chaoyang Wang, Chen-Hsuan Lin, Ziyan Wang, and Simon Lucey
IEEE Winter Conference on Applications of Computer Vision (WACV), 2018
paper arXiv preprint extension paper BibTeX
@inproceedings{zhu2017object,
  title={Object-Centric Photometric Bundle Adjustment with Deep Shape Prior},
  author={Zhu, Rui and Wang, Chaoyang and Lin, Chen-Hsuan and Wang, Ziyan and Lucey, Simon},
  booktitle={IEEE Winter Conference on Applications of Computer Vision ({WACV})},
  year={2018}
}
Neural networks can be utilized as 3D shape priors for photometric bundle adjustment. We demonstrate the use of 3D point cloud generators to align shapes to video sequences within the optimization-based inference framework.

Inverse Compositional Spatial Transformer Networks

Chen-Hsuan Lin and Simon Lucey
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 (oral presentation)
paper arXiv preprint project page presentation code BibTeX
@inproceedings{lin2017inverse,
  title={Inverse Compositional Spatial Transformer Networks},
  author={Lin, Chen-Hsuan and Lucey, Simon},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition ({CVPR})},
  year={2017}
}
A significant improvement over Spatial Transformer Networks inspired by the Lucas-Kanade algorithm. Our method learns recurrent spatial transformations that resolve large geometric redundancies for efficient visual recognition.
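The recurrent idea, in the spirit of inverse compositional Lucas-Kanade, is to predict small warp updates and compose them rather than regressing one large warp in a single shot. A toy NumPy sketch with 2x3 affine warps (the composed translations accumulate; the network that predicts each update is omitted):

```python
import numpy as np

def compose(A, B):
    """Compose two 2x3 affine warps acting on homogeneous 2D points."""
    A3 = np.vstack([A, [0.0, 0.0, 1.0]])
    B3 = np.vstack([B, [0.0, 0.0, 1.0]])
    return (A3 @ B3)[:2]

identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
# Two small predicted updates, composed recurrently into one warp.
step1 = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]])   # shift x by +2
step2 = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]])  # shift y by -1
warp = compose(step2, compose(step1, identity))
print(warp[:, 2])  # [ 2. -1.]
```

Because only the composed warp parameters propagate between iterations, the recognition network never has to re-learn appearance under large geometric variation.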

Using Locally Corresponding CAD Models for Dense 3D Reconstructions from a Single Image

Chen Kong, Chen-Hsuan Lin, and Simon Lucey
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
paper BibTeX
@inproceedings{kong2017using,
  title={Using Locally Corresponding CAD Models for Dense 3D Reconstructions from a Single Image},
  author={Kong, Chen and Lin, Chen-Hsuan and Lucey, Simon},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition ({CVPR})},
  year={2017}
}
A 3D shape reconstruction method that retrieves the nearest-neighbor CAD model whose projection best matches the given image, where local landmark correspondences allow for sparse linear combinations over a CAD model dictionary.

The Conditional Lucas & Kanade Algorithm

Chen-Hsuan Lin, Rui Zhu, and Simon Lucey
European Conference on Computer Vision (ECCV), 2016
paper arXiv preprint project page code BibTeX
@inproceedings{lin2016conditional,
  title={The Conditional Lucas \& Kanade Algorithm},
  author={Lin, Chen-Hsuan and Zhu, Rui and Lucey, Simon},
  booktitle={European Conference on Computer Vision ({ECCV})},
  pages={793--808},
  year={2016},
  organization={Springer International Publishing}
}
An image alignment algorithm inspired by the seminal Lucas-Kanade algorithm, internally learned with a conditional objective function. Our learning-based method significantly improves over the classical synthesis-based objective.

Teaching

Visual Learning and Recognition (CMU 16-824), Spring 2019
Teaching Assistant / Graduate Student Instructor (with Prof. Abhinav Gupta)
(Lectures: 3D Vision & 3D Reasoning and Semantic Segmentation & Pixel Labeling)
Computer Vision (CMU 16-720 A/B), Fall 2017
Head Teaching Assistant (with Prof. Srinivasa Narasimhan, Prof. Simon Lucey, Prof. Yaser Sheikh)
Designing Computer Vision Apps (CMU 16-423), Fall 2015
Teaching Assistant (with Prof. Simon Lucey)

Academic Projects

Towards a More Curious Agent

CMU 10-703 Deep Reinforcement Learning & Control
paper
We model the intrinsic reward of policy-network agents with the causal distribution of visual observations, solving navigation problems with very sparse rewards.

3D Facial Model Fitting from 2D Videos

CMU CI2CV Lab
video
A dense 3D reconstruction system for metric-scale faces from self-captured 2D videos, first solving for sparse 3D facial landmarks and then fitting a dense mesh.

Perceptual Rate-Distortion Optimization of Motion Estimation

NTU Multimedia Processing & Communications Lab
paper
An optimization framework in video compression to use perceptual metrics to find an optimal tradeoff between encoding bitrate and decoding image distortion.

Disentangler Networks with Absolute and Relative Attributes

CMU 16-824 Visual Learning & Recognition
paper
A generative network to disentangle image embeddings into controllable attributes for image manipulation. Our network learns from relative rankings of attributes.

Video Summarization via Convolutional Neural Networks

CMU 10-701 Machine Learning
report
We use a novel loss function to learn image features for video summarization; at test time, summaries are produced by K-means clustering of the input video frames in the latent space.

Virtual Piano Keyboard System

NTU Digital Circuit Design Lab
presentation demo
A sensor-based virtual instrument system using only a paper keyboard, with real-time fingertip detection and keyboard recognition on raw images from CCD sensors.



© designed by Chen-Hsuan Lin.