Chen-Hsuan Lin

The first name is Chen-Hsuan (neither just Chen nor Hsuan).
Hsuan is pronounced like "shoo-en" with a quick transition.
I am a research scientist at NVIDIA Research, working on computer vision, computer graphics, and artificial intelligence. I received my Ph.D. in Robotics from Carnegie Mellon University, where I was advised by Simon Lucey and supported by the NVIDIA Graduate Fellowship. During my Ph.D., I also interned at Facebook AI Research and Adobe Research. Before that, I received my M.S. in Robotics from CMU and my B.S. in Electrical Engineering from National Taiwan University.
I am interested in solving 3D reconstruction and view synthesis problems using neural rendering and self-supervised learning techniques. My research goal is to empower AI systems with dense 3D perception and imagination abilities by learning from visual data in the wild, in order to advance towards the next level of visual and 3D spatial artificial intelligence.
Email:  chenhsuanl (at) nvidia (dot) com




BARF: Bundle-Adjusting Neural Radiance Fields

Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba, and Simon Lucey
IEEE International Conference on Computer Vision (ICCV), 2021 (oral presentation)
Neural Radiance Fields can be trained from unknown camera poses! Inspired by classical image alignment, we show that coarse-to-fine optimization is simple yet effective for joint registration and reconstruction on 3D scene representations.

SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images

Chen-Hsuan Lin, Chaoyang Wang, and Simon Lucey
Advances in Neural Information Processing Systems (NeurIPS), 2020
Implicit 3D shape reconstruction can be trained from static image collections without multi-view supervision! We establish the geometric connection of 2D silhouettes to 3D SDF shapes for scalable single-view training on real-world image data.

Deep NRSfM++: Towards Unsupervised 2D-3D Lifting in the Wild

Chaoyang Wang, Chen-Hsuan Lin, and Simon Lucey
IEEE International Conference on 3D Vision (3DV), 2020 (oral presentation)
A self-supervised framework for 3D structure and pose recovery from 2D landmarks, closely related to hierarchical block-sparse coding in non-rigid structure from motion. Our method can handle perspective camera models and missing data.

Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction

Chen-Hsuan Lin, Oliver Wang, Bryan C. Russell, Eli Shechtman, Vladimir G. Kim, Matthew Fisher, and Simon Lucey
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
3D mesh reconstruction from RGB videos using photometric optimization with learned shape priors. This allows 3D object meshes to deform in a learned shape space while being pixel-aligned against RGB videos without depth or silhouettes.

ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing

Chen-Hsuan Lin, Ersin Yumer, Oliver Wang, Eli Shechtman, and Simon Lucey
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
GANs can learn to correct the geometry of objects and create realistic image composites! Our method discovers plausible geometric configurations of objects driven solely by appearance realism, where ground-truth supervision is unavailable.

Deep-LK for Efficient Adaptive Object Tracking

Chaoyang Wang, Hamed Kiani Galoogahi, Chen-Hsuan Lin, and Simon Lucey
IEEE International Conference on Robotics and Automation (ICRA), 2018
We can use Siamese neural networks to learn general object tracking by optimizing an alignment-based objective function. The learned feature representations adapt to the regression parameters online with respect to the tracked templates.

Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction

Chen-Hsuan Lin, Chen Kong, and Simon Lucey
AAAI Conference on Artificial Intelligence (AAAI), 2018 (oral presentation)
We design a novel differentiable point cloud renderer to approximate the rasterization of dense 3D point clouds generated by a 2D convolutional neural network, so that the generated point clouds can be supervised from training depth images.

Object-Centric Photometric Bundle Adjustment with Deep Shape Prior

Rui Zhu, Chaoyang Wang, Chen-Hsuan Lin, Ziyan Wang, and Simon Lucey
IEEE Winter Conference on Applications of Computer Vision (WACV), 2018
3D shape prediction networks can be utilized as a strong semantic prior for object-centric photometric bundle adjustment. We use pretrained 3D point cloud generators to align shapes to videos within an optimization-based inference framework.

Inverse Compositional Spatial Transformer Networks

Chen-Hsuan Lin and Simon Lucey
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 (oral presentation)
A redesign of Spatial Transformer Networks inspired by the Lucas-Kanade algorithm. With the same network architecture, our method learns recurrent spatial transformations to resolve geometric redundancies for efficient visual recognition.

Using Locally Corresponding CAD Models for Dense 3D Reconstructions from a Single Image

Chen Kong, Chen-Hsuan Lin, and Simon Lucey
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
A 3D shape reconstruction method based on CAD model retrieval. By matching keypoint projections from the input image, we can use local landmark correspondences to solve for sparse linear combinations of a prebuilt CAD model dictionary.

The Conditional Lucas & Kanade Algorithm

Chen-Hsuan Lin, Rui Zhu, and Simon Lucey
European Conference on Computer Vision (ECCV), 2016
A learning-based image registration method inspired by the seminal Lucas-Kanade algorithm. With structured optimization and a conditional loss, our method significantly improves over classical synthesis-based optimization objective functions.

Ph.D. Dissertation

Learning 3D Registration and Reconstruction from the Visual World

Chen-Hsuan Lin
Carnegie Mellon University, 2021


NVIDIA Research, 2021 – present
Research Scientist
Research in dense 3D reconstruction, self-supervised learning, and neural rendering.
Carnegie Mellon University, 2014 – 2021
Graduate Research Assistant (with Simon Lucey)
Research in geometric image registration, dense 3D reconstruction, and self-supervised learning.
Facebook AI Research, 2019
Research Intern (with Kaiming He, Georgia Gkioxari, and Justin Johnson)
Learning 3D-aware feature representations for improving standard 2D object detection systems.
Adobe Research, 2018
Research Intern (with Oliver Wang, Bryan Russell, Eli Shechtman, Vladimir Kim, and Matthew Fisher)
Photometric optimization of 3D object meshes for shape reconstruction aligned to RGB videos.
Adobe Research, 2017
Research Intern (with Eli Shechtman, Oliver Wang, and Ersin Yumer)
Learning geometric corrections of composited objects in images driven by appearance realism.
National Taiwan University, 2011 – 2013
Undergraduate Research Assistant (with Homer H. Chen)
Designing rate-distortion optimization for video compression based on perceptual quality metrics.


Visual Learning and Recognition (CMU 16-824), Spring 2019
Teaching Assistant / Graduate Student Instructor (with Abhinav Gupta)
(Lectures: 3D Vision & 3D Reasoning, Semantic Segmentation & Pixel Labeling)
Computer Vision (CMU 16-720 A/B), Fall 2017
Head Teaching Assistant (with Srinivasa Narasimhan, Simon Lucey, and Yaser Sheikh)
Designing Computer Vision Apps (CMU 16-423), Fall 2015
Teaching Assistant (with Simon Lucey)

Academic Projects

Towards a More Curious Agent

CMU 10-703 Deep Reinforcement Learning & Control (2018)
We model intrinsic rewards of agents with the causal distribution of visual observations for policy networks, solving navigation problems with very sparse rewards.

Deep 3D Photoshop

CMU 16-824 Visual Learning & Recognition (2016)
We design a photo-editing system that can generate a target object with different 3D poses and styles, while the background is inpainted by training a GAN.

3D Facial Model Fitting from 2D Videos

CMU CI2CV Computer Vision Lab (2015)
A 3D reconstruction system of metric-scale faces from self-captured 2D videos by solving for sparse 3D facial landmarks followed by dense 3D mesh fitting.

Video Summarization via Convolutional Neural Networks

CMU 10-701 Machine Learning (2015)
We design a new objective for end-to-end learning of video summarization, which allows K-means clustering of input video frames in the latent space at test time.

Perceptual Rate-Distortion Optimization of Motion Estimation

NTU Multimedia Processing & Communications Lab (2013)
A simple mechanism for selecting the optimal tradeoff factor between the encoding bitrate and the decoding distortion for video coding using perceptual metrics.

Virtual Piano Keyboard System

NTU Digital Circuit Design Lab (2012)
A sensor-based virtual instrument system using only a paper keyboard, built on real-time fingertip and keyboard pattern recognition over raw CCD sensory input data.

© designed by Chen-Hsuan Lin.