Object Tracking

Fig. 1: Example iterations that show how a point cloud aligns with its counterpart in the environment.


Our research focuses on the robustness of the Iterative Closest Point (ICP) algorithm for object tracking in dynamic, large, cluttered environments, as well as on multi-object tracking.


Our research focuses on tracking individual objects in a large point cloud, which requires robustness improvements to deal with large, cluttered scenes. The ICP algorithm was initially introduced by Besl and McKay1) and Chen and Medioni2) to align scanned objects with 3D models. We changed the use case, adapted the algorithm, and added new parts to be able to track individual objects. In our ICP solution, a reference model literally snaps onto the object of interest in a point cloud and continuously follows this object from camera frame to camera frame. Environment point clouds, denoted M, are obtained from RGB-D cameras such as the Kinect and the Structure Sensor. The object of interest, its point cloud X in particular, can be derived from a CAD model.


In detail, ICP is an iterative algorithm whose iterations are implemented as a closed loop. One iteration includes three steps: 1) find point-pairs between X and M, 2) calculate the transformation [R|t], and 3) determine the error using a least-squares formula. The error value is minimized over several iterations, yielding [R|t]. Termination criteria for the closed loop are a minimum error value, a maximum number of iterations, or a minimum improvement between successive steps.
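The three steps and the termination criteria can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the lab's implementation: the brute-force nearest-neighbor search, the SVD-based (Kabsch) solution for [R|t], the function names, and all default thresholds are assumptions chosen for readability.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (SVD/Kabsch)."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t


def icp(X, M, max_iterations=50, min_error=1e-8, min_improvement=1e-10):
    """Align model X (N x 3) with environment M (K x 3); returns R, t, error.

    All three termination thresholds are illustrative assumptions.
    """
    R_total, t_total = np.eye(3), np.zeros(3)
    X_current = X.copy()
    previous_error = np.inf
    error = np.inf
    for _ in range(max_iterations):
        # 1) Point-pairs: nearest neighbor in M for every point of X (brute force).
        dists = np.linalg.norm(X_current[:, None, :] - M[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        # 2) Transformation [R|t] for the current point-pairs.
        R, t = best_rigid_transform(X_current, M[nearest])
        X_current = X_current @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        # 3) Least-squares error (mean squared pair distance) after the update.
        error = np.mean(np.sum((X_current - M[nearest]) ** 2, axis=1))
        if error < min_error or previous_error - error < min_improvement:
            break
        previous_error = error
    return R_total, t_total, error
```

The brute-force pairing costs O(N·K) per iteration; a spatial index such as a k-d tree would be the usual replacement for large environment point clouds.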



Object tracking with ICP. The piston motor is tracked while the user assembles parts of the motor.


False point-pairs are a major challenge for the ICP algorithm. A typical solution is to reject false point-pairs as outliers, for example by verifying the compatibility of color information, distance, or normal vector alignment. Several such approaches have been introduced, each with documented performance and robustness improvements, and their results can be used in many use cases. However, part of our research focuses on object tracking for assembly assistance, maintenance, and similar situations in which the operator, as part of his or her work, continuously covers the object to track (Figure 2). Hands and tools that are visible in the image cause mismatches, which cannot always be rejected successfully. Unfortunately, the number of those points is so high that they can hardly be called outliers anymore. They cause false point associations and, as a consequence, reduce the accuracy we can achieve in an AR application, which is noticeable as misaligned virtual objects.
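A conventional outlier test of the kind described above can be sketched as follows. The example checks only distance and normal vector alignment; the function name and both threshold values are hypothetical and chosen purely for illustration.

```python
import numpy as np


def reject_point_pairs(x_pts, m_pts, x_normals, m_normals,
                       max_distance=0.05, max_normal_angle_deg=30.0):
    """Return a boolean mask of point-pairs that pass both compatibility tests.

    Row i of x_pts and m_pts forms one candidate pair; normals are unit vectors.
    Both thresholds are illustrative assumptions, not values from the text.
    """
    distances = np.linalg.norm(x_pts - m_pts, axis=1)
    close_enough = distances <= max_distance
    # Cosine of the angle between paired normals (dot product of unit vectors).
    cos_angle = np.sum(x_normals * m_normals, axis=1)
    aligned = cos_angle >= np.cos(np.deg2rad(max_normal_angle_deg))
    return close_enough & aligned
```

Only the pairs for which the mask is true would enter the transformation estimate. With fixed thresholds like these, a hand that covers large parts of the object can still pass the test, which is the limitation discussed above.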


Fig. 2: Object tracking for assembly assistance. The piston motor is tracked.


In summary, the robustness of ICP relies a) on correct point-pair associations and b) on the stability of the point selection. False point-pairs result in a false transformation; too many false transformations in successive camera frames literally move the registered object away, which is considered a loss of tracking. Unstable point clouds cause a similar effect. Unstable here means that a large number of points suddenly appear or disappear, which may have an impact on the transformation's delta. The hands of a user moving quickly towards an object cause this effect.


We work on point-pair rejection mechanisms that dynamically adapt a distance threshold with respect to the stability of the point cloud set. The stability can be represented as an expected point cloud shape, which can be further reduced to an expected distance per point. This results in a linear equation whose parameters have been optimized using the shape of the object to track (as the expected outcome). The results indicate an improvement: we can better track the object even if a user covers major parts of it. Misalignments are still noticeable; however, we could significantly improve the tracking accuracy for an AR application.
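The text does not give the linear equation itself, so the following is only a guess at its general form: a per-point threshold that grows linearly with the expected distance. The function names and the `slope` and `offset` values are hypothetical placeholders, not the optimized parameters mentioned above.

```python
import numpy as np


def adaptive_threshold(expected_distance, slope=1.5, offset=0.005):
    """Hypothetical linear rule: threshold = slope * expected_distance + offset."""
    return slope * np.asarray(expected_distance, dtype=float) + offset


def keep_pairs(pair_distances, expected_distance, slope=1.5, offset=0.005):
    """Mask of point-pairs whose distance stays below their per-point threshold."""
    thresholds = adaptive_threshold(expected_distance, slope, offset)
    return np.asarray(pair_distances, dtype=float) <= thresholds
```

In such a scheme, points whose measured distance deviates strongly from what the object's shape predicts, for example points on a hand in front of the object, would fall above the threshold and be rejected.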


Transferring our tracking approach from a tracking area limited to the camera's field of view to large environments is the second research focus. Large scenes are, for instance, room-sized scenes in which we are able to find objects (Figure 3). Although major parts of those scenes may not be visible in a camera frame at a given time, they are required to maintain tracking of an object. For instance, if a user moves the camera focus away from an object of interest, we would lose tracking. In the next second, the user swings the camera back. At this point, our tracking application has to re-initialize tracking. Maintaining tracking even if a part is out of the field of view helps to speed up this process. Large scene tracking is also required if an object, an airplane for instance, does not fit into a camera frame. Once we have obtained a major fraction of the object and represented it as a point cloud, we are literally able to navigate on the surface of this model and can maintain a stable camera position.

Fig. 3: We work with large point cloud scenes with the goal of identifying objects in the environment.


The ICP algorithm cannot work directly with a point cloud such as the one shown in Fig. 3, since ICP requires an initial model overlap: the object to track and its counterpart in the environment model need to be close to each other.

A method to establish this initial overlap is 3D feature matching. A 3D feature descriptor describes the vicinity of a point as a set of characteristic values, such as the angles and distances between neighboring points. One may call it a fingerprint; instead of points, we initially match feature descriptors to establish an initial overlap. However, feature descriptors require more computation than ICP: the data set that describes a feature is larger than the data set that stores a point, so the matching process requires more time. For that reason, once we have established an initial start position and X and M are in the vicinity of each other, we switch to point-pairs and use ICP to further follow the object.
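The idea of matching fingerprints first and points later can be illustrated with a deliberately simple stand-in descriptor. The sketch below describes each point by the sorted distances to its k nearest neighbors, which is invariant to rotation and translation. It is not the descriptor used in our work; the function names and `k` are assumptions, and the toy only works when X and M contain the same surface points.

```python
import numpy as np


def describe(points, k=4):
    """Toy descriptor: sorted distances from each point to its k nearest neighbors."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    return np.sort(d, axis=1)[:, 1:k + 1]  # column 0 is the zero self-distance


def match_descriptors(desc_x, desc_m):
    """For each descriptor of X, the index of the closest descriptor of M."""
    d = np.linalg.norm(desc_x[:, None, :] - desc_m[None, :, :], axis=2)
    return d.argmin(axis=1)


def rigid_transform(src, dst):
    """Least-squares R, t mapping src onto dst for known pairs (SVD/Kabsch)."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - src_mean).T @ (dst - dst_mean))
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dst_mean - R @ src_mean


def initial_alignment(X, M, k=4):
    """Coarse pose of X in M from descriptor matches, independent of the start pose."""
    matches = match_descriptors(describe(X, k), describe(M, k))
    return rigid_transform(X, M[matches])
```

The returned pose would serve as the start position for ICP. Even this toy descriptor stores k values per point and compares every descriptor of X with every descriptor of M, which hints at why descriptor matching is more expensive than matching plain points.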

Currently, we are able to identify certain objects in room-sized models. We work on a better description of objects with feature descriptors and on real-time performance.



1) P. Besl and N. McKay. A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):239–256, Feb. 1992.

2) Y. Chen and G. Medioni. Object modeling by registration of multiple range images. In Proceedings of the 1991 IEEE International Conference on Robotics and Automation, vol. 3, pages 2724–2729, Apr. 1991.



About Us



The Augmented Reality Lab explores augmented reality (AR) technology and its capabilities for engineering applications.


+1 (515) 294-7044

Iowa State University

1620 Howe Hall

Ames, IA 50011-2161



© 2014 All Rights Reserved