Understanding and interpreting dynamic scenes and activities is a challenging problem. In this paper we present a system capable of learning robot tasks from demonstration. Classical robot task programming requires an experienced programmer and considerable tedious work. Programming by Demonstration, in contrast, is a flexible framework that reduces the complexity of programming robot tasks and allows end-users to demonstrate tasks instead of writing code. We present our recent steps towards this goal: a system for learning pick-and-place tasks from manual demonstrations. Each demonstrated task is described by an abstract model that captures which object is moved, where it is placed, and which grasp type is used to move it.
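To make the abstract task model concrete, the following is a minimal sketch of such a representation, assuming a task is reduced to three attributes: the object moved, the place it is moved to, and the grasp type. All names (`PickAndPlaceTask`, `GraspType`, the grasp categories) are illustrative, not taken from the paper.

```python
from dataclasses import dataclass
from enum import Enum

class GraspType(Enum):
    # Hypothetical grasp taxonomy; the paper's actual grasp
    # categories may differ.
    POWER = "power"
    PRECISION = "precision"
    PINCH = "pinch"

@dataclass
class PickAndPlaceTask:
    """Abstract model of one demonstrated pick-and-place task."""
    obj: str          # which object is moved
    target: str       # where the object is placed
    grasp: GraspType  # which grasp type was used in the demonstration

# A single demonstration would then be summarized as:
demo = PickAndPlaceTask(obj="cup", target="shelf", grasp=GraspType.POWER)
```

Representing each demonstration this compactly is what allows the system to generalize: two demonstrations that differ in trajectory but share the same object, target, and grasp map to the same abstract task.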