Understanding and interpreting dynamic human actions is an important area of research in computer vision and robotics. In robotics, it is closely related to task programming. Traditionally, robot task programming has required an experienced programmer and tedious work. By contrast, Programming by Demonstration is an intuitive method that allows a robot to be programmed in a very flexible way. The programmer demonstrates how a particular task is performed, and the robot learns in an efficient and natural manner how to imitate or reproduce the human actions. Here, we develop a general policy for learning the relevant features of a demonstrated activity, and we restrict our study to the imitation of object manipulation activities. A Nest of Birds magnetic tracker is used for activity recognition, and two different dimensionality reduction techniques are applied.
The first technique uses linear dimensionality reduction to find the underlying structure of the data. In particular, Principal Component Analysis (PCA) is used to learn a set of principal components (PCs) that characterize the data. The main problem with PCA is that linear PCs cannot represent the non-linear nature of human motion. The second technique uses non-linear dimensionality reduction. Specifically, spatio-temporal Isomap (ST-Isomap) is applied to uncover the intrinsic non-linear geometry of the data, which is captured by computing the geodesic manifold distances between all pairs of data points.
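The geodesic distances mentioned above can be illustrated with a minimal sketch. It shows only the distance step of plain Isomap: a k-nearest-neighbour graph is built and shortest paths through it approximate distances along the manifold. The temporal neighbourhood weighting that distinguishes ST-Isomap is deliberately left out, and the function names, the choice of `k` and the half-circle test data are assumptions made for illustration, not taken from the thesis.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def geodesic_distances(points, k):
    """Approximate manifold distances via shortest paths in a k-NN graph.

    Plain Isomap distance step only; no temporal weighting (assumption).
    """
    n = len(points)
    inf = float("inf")
    dist = [[inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        # The k nearest neighbours of point i, excluding i itself.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: euclidean(points[i], points[j]),
        )[:k]
        for j in neighbours:
            d = euclidean(points[i], points[j])
            # Keep the graph undirected.
            dist[i][j] = min(dist[i][j], d)
            dist[j][i] = min(dist[j][i], d)
    # Floyd-Warshall: all-pairs shortest paths through the neighbour graph.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


# Hypothetical data: nine samples along a half circle of radius 1.
arc = [(math.cos(i * math.pi / 8), math.sin(i * math.pi / 8)) for i in range(9)]
geo = geodesic_distances(arc, k=2)
```

For the two end points of the arc, the straight-line distance is 2, whereas the graph distance follows the curve and is therefore longer. This is the kind of structure a linear projection cannot express.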
For classification purposes, both PCA and ST-Isomap can be viewed as a preprocessing step. When the dimensionality of the input data is so high that it becomes intractable, most classification methods suffer, and may even fail, because of their sensitivity to the dimensionality of the input data. Fortunately, high dimensional data often represent phenomena that are intrinsically low dimensional. Thus, the problem of high dimensional data classification can be solved by first mapping the original data into a lower dimensional space using a dimensionality reduction method such as PCA or ST-Isomap, and then applying K-nearest neighbors (K-NN), radial basis functions (RBF) or any other classification method to classify the query sequence.
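The second half of this two-step scheme can be sketched as follows, assuming the data have already been mapped to a low dimensional space. The sketch is a generic K-NN majority vote; the two-dimensional coordinates and the activity labels `"pick"` and `"push"` are hypothetical and do not come from the thesis.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, train_points, train_labels, k=3):
    """Label a query by majority vote among its k nearest training points."""
    ranked = sorted(
        range(len(train_points)),
        key=lambda i: euclidean(query, train_points[i]),
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical low dimensional embeddings of two activities.
train_points = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3),   # activity "pick"
    (2.0, 2.0), (2.1, 1.8), (1.9, 2.2),   # activity "push"
]
train_labels = ["pick", "pick", "pick", "push", "push", "push"]

predicted = knn_classify((0.1, 0.1), train_points, train_labels, k=3)
```

Because the vote is taken in the reduced space, the cost of each distance computation depends on the embedding dimension rather than on the dimension of the raw tracker data.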
In the first stage of our work, PCA combined with k-means clustering is applied.
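As an illustration of the clustering half of this stage, the following is a minimal sketch of k-means (Lloyd's algorithm) applied to points that are assumed to be PCA projections already. The initialisation from the first `k` points, the fixed iteration count and the sample coordinates are simplifying assumptions, not details reported in the thesis.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with a deterministic start from the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
    return centroids, labels


# Hypothetical two-dimensional PCA projections forming two groups.
projected = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1), (5.1, 4.9), (0.1, 0.2), (4.9, 5.2)]
centroids, labels = kmeans(projected, k=2)
```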
In the second stage of our work, spatio-temporal Isomap (ST-Isomap) combined with Shepard’s interpolation is applied.
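Shepard's interpolation is inverse distance weighting. One plausible use in this setting, stated here as an assumption rather than as the procedure of the thesis, is to place a new high dimensional sample in an existing embedding by averaging the embedded coordinates of the training samples, each weighted by the inverse of its distance to the new sample. The function name, the `power` parameter and the sample data below are illustrative.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shepard_interpolate(query, inputs, embeddings, power=2.0):
    """Inverse distance weighted estimate of the embedding of `query`.

    inputs:     training samples in the original space
    embeddings: their known low dimensional coordinates
    """
    weights = []
    for x, y in zip(inputs, embeddings):
        d = euclidean(query, x)
        if d == 0.0:
            # The query coincides with a training sample: return it exactly.
            return list(y)
        weights.append(1.0 / d ** power)
    total = sum(weights)
    dim = len(embeddings[0])
    return [
        sum(w * y[c] for w, y in zip(weights, embeddings)) / total
        for c in range(dim)
    ]


# Hypothetical samples in three dimensions with two-dimensional embeddings.
inputs = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
embeddings = [(0.0, 0.0), (1.0, 0.0)]
midpoint = shepard_interpolate((1.0, 0.0, 0.0), inputs, embeddings)
```

A query that lies equally far from both training samples receives the average of their embedded coordinates, and a query that matches a training sample receives that sample's coordinates unchanged.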
For classification purposes, simple Euclidean distances are used.
The experimental evaluation shows that a linear dimensionality reduction technique cannot find the intrinsic structure of human motions because of their non-linear nature. In contrast, a non-linear technique such as spatio-temporal Isomap is able to uncover a low dimensional space in which the data lie, which facilitates the classification step far better than PCA does.
2006. 163 p.