This thesis studies the problems of feature tracking and motion estimation and
presents an application of these concepts to human-computer interaction. The
presentation is divided into three parts.
The first part addresses feature tracking in a multi-scale context. Features
in an image appear at different scales, and these scales can be expected to
change over time due to the size variations that occur when objects move relative
to the camera. A scheme for feature tracking is presented that incorporates
a mechanism for automatic scale selection, and it is argued that
such a mechanism is necessary to handle size variations over time. Experiments
demonstrate that the proposed scheme is robust to size variations in
situations where a traditional fixed-scale tracker fails. This leads to extended
feature trajectories, which are valuable for motion and structure estimation.
It is also shown how an object representation suitable for tracking can be
built in a conceptually simple way as a multi-scale feature hierarchy with
qualitative relations between features at different scales. Experiments illustrate
the capability of the proposed hierarchy to handle occlusions and semirigid
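To make the idea of automatic scale selection concrete, the following is a minimal sketch of one common form of it: choosing, at a given image point, the scale that maximizes the magnitude of a scale-normalized Laplacian response. This is an illustration only and not necessarily the mechanism used in the thesis. The function names (`gaussian_kernel`, `smooth`, `select_scale`), the candidate scale range and the synthetic blob are all assumptions introduced here.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Sampled 1D Gaussian, truncated at 4 sigma and normalized to sum to 1."""
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def smooth(image, sigma):
    """Separable Gaussian smoothing (zero padding at the borders)."""
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(np.convolve, 1, image, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="same")

def normalized_laplacian_at(image, sigma, row, col):
    """Scale-normalized Laplacian, sigma^2 * (Lxx + Lyy), at one interior pixel."""
    L = smooth(image, sigma)
    lap = (L[row - 1, col] + L[row + 1, col]
           + L[row, col - 1] + L[row, col + 1] - 4.0 * L[row, col])
    return sigma**2 * lap

def select_scale(image, row, col, sigmas):
    """Return the candidate sigma with the strongest normalized response."""
    responses = [abs(normalized_laplacian_at(image, s, row, col)) for s in sigmas]
    return float(sigmas[int(np.argmax(responses))])

# Hypothetical test image: a single Gaussian blob of known size.
size, blob_sigma = 81, 4.0
yy, xx = np.mgrid[:size, :size]
c = size // 2
blob = np.exp(-((xx - c)**2 + (yy - c)**2) / (2.0 * blob_sigma**2))

sigmas = np.arange(1.0, 8.5, 0.5)   # assumed candidate scales
selected = select_scale(blob, c, c, sigmas)
print(selected)
```

For an isolated Gaussian blob the selected scale should lie close to the blob's own standard deviation, so if the blob grows or shrinks between frames the selected scale follows it. A fixed-scale tracker has no counterpart to this step.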
The second part of the thesis develops a geometric framework for computing
estimates of 3D structure and motion from sparse feature correspondences
in monocular sequences. A tool called the centered affine trifocal
tensor is presented for motion estimation from three affine views. Moreover, a factorization
approach is developed that simultaneously handles point and line
correspondences in multiple affine views. Experiments show the influence of
several factors on the accuracy of the structure and motion estimates, including
noise in the feature localization, perspective effects and the number of feature
correspondences. This motion estimation framework is also applied to
feature correspondences obtained from the above-mentioned feature tracker.
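As a rough illustration of the factorization idea for multiple affine views, the sketch below implements the classical points-only rank-3 factorization of a centered measurement matrix. It is a simplification: the approach described above also handles line correspondences, which this sketch does not attempt. The function name `affine_factorization`, the array layout and the synthetic data are assumptions made for the example.

```python
import numpy as np

def affine_factorization(tracks):
    """Points-only affine factorization.

    tracks: array of shape (n_views, n_points, 2) with image coordinates.
    Returns (motion, structure, centroids) such that, for noise-free affine
    data, motion @ structure + centroids reproduces the measurement matrix.
    """
    n_views, n_points, _ = tracks.shape
    # Measurement matrix: rows are x and y coordinates of each view in turn.
    W = tracks.transpose(0, 2, 1).reshape(2 * n_views, n_points)
    # Centering removes the translation component of each affine camera.
    centroids = W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W - centroids, full_matrices=False)
    # Keep the three dominant components (rank-3 constraint).
    root = np.sqrt(s[:3])
    motion = U[:, :3] * root            # shape (2 * n_views, 3)
    structure = root[:, None] * Vt[:3]  # shape (3, n_points)
    return motion, structure, centroids

# Synthetic, noise-free example with hypothetical random affine cameras.
rng = np.random.default_rng(0)
points_3d = rng.normal(size=(3, 10))
tracks = np.stack([
    (rng.normal(size=(2, 3)) @ points_3d + rng.normal(size=(2, 1))).T
    for _ in range(4)
])

motion, structure, centroids = affine_factorization(tracks)
measured = tracks.transpose(0, 2, 1).reshape(8, 10)
reprojected = motion @ structure + centroids
print(np.abs(reprojected - measured).max())
```

The recovered motion and structure are determined only up to an invertible 3x3 transformation; the metric upgrade that resolves this ambiguity is omitted here. With noisy correspondences, the residual printed above gives a simple measure of how well the affine model fits.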
The last part integrates the functionalities from the first two parts into a
pre-prototype system which explores new principles for human-computer interaction.
The idea is to transfer 3D orientation to a computer using no other
equipment than the operator’s hand.
Stockholm: KTH, 1999. v, 153 p.