In robotics, recognition of human activity has been used extensively for robot task learning through imitation and demonstration. However, little work has addressed the modeling and recognition of activities that involve object manipulation and grasping. In this work, we deal with single arm/hand actions that are very similar to each other in terms of arm/hand motions. The approach is based on the hypothesis that actions can be represented as sequences of motion primitives. Under this hypothesis, a set of five manipulation actions of varying complexity is investigated. To model the process, we use a combination of discriminative support vector machines and generative hidden Markov models. The experimental evaluation, performed with 10 people, investigates both the definition and structure of primitive motions and the validity of the modeling approach taken.
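
To make the primitive-sequence hypothesis concrete, the following is a minimal sketch, not the implementation evaluated in this work. It assumes that a frame-level classifier (for instance, a support vector machine) has already mapped each frame to an integer primitive index, and it scores the resulting sequence against one discrete hidden Markov model per action using the scaled forward algorithm. The action names `action_a` and `action_b`, the number of states and primitives, and all probabilities are hypothetical values chosen for illustration only.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm).

    obs   : non-empty list of primitive indices (assumed classifier output)
    start : start[i]    = P(first state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(primitive k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow and accumulate the log of the scale.
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def classify(obs, models):
    """Pick the action whose HMM assigns obs the highest log-likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Hypothetical two-state, left-to-right models over three primitives (0, 1, 2).
START = [1.0, 0.0]
TRANS = [[0.7, 0.3], [0.0, 1.0]]
MODELS = {
    # Mostly primitive 0 followed by primitive 1.
    "action_a": (START, TRANS, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    # Mostly primitive 0 followed by primitive 2.
    "action_b": (START, TRANS, [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]),
}

if __name__ == "__main__":
    print(classify([0, 0, 1, 1, 1], MODELS))
    print(classify([0, 0, 2, 2], MODELS))
```

Both hypothetical actions begin with the same primitive and differ only in what follows, which mirrors the setting of actions with very similar arm/hand motions: the discriminative stage labels short segments, while the generative stage captures their ordering.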