Features in Extra Dimensions: Spatial and Temporal Scene Representations

August 2022

Authors:

Zhaoyuan Fang

Abstract:

Computer vision models have made great progress in featurizing the pixels of images. However, an image is only a 2D projection of the actual 3D scene, so it is subject to occlusions and perspective distortions. To arrive at a better representation of the scene itself, extra dimensions are needed to learn spatial or temporal priors.

In this thesis, we propose two methods that introduce extra dimensions for modelling scene space and time. The first method lifts features from the image plane onto the bird's eye view (BEV) plane for perception in autonomous driving. Features over the scene space enable our models to handle occlusion better, producing accurate BEV semantic representations. The second method introduces extra dimensions for modelling time, enabling better geometry-free point tracking. We track points through partial or full occlusions, using components that drive the current state of the art in flow and object tracking, such as learned temporal priors, iterative optimization, and appearance updates. Features allocated over timesteps enable our models to track over long horizons and through occlusions, outperforming previous feature-matching and optical flow methods.

Notes:

@mastersthesis{Fang-2022-133140,
author = {Zhaoyuan Fang},
title = {Features in Extra Dimensions: Spatial and Temporal Scene Representations},
year = {2022},
month = {August},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-22-39},
keywords = {3D Vision, BEV Perception, Tracking},
}
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.