last update 04/12/2001

2D/3D Real Time Object Tracking

Many imaging projects require tracking objects, or markers on objects, in 2- or 3-dimensional space at high speed. The picCOLOR Real Time Position Tracking extension module was developed to meet these requirements. Its functions allow the definition of markers and the setup of camera locations and transformation equations for the reconstruction of 3-dimensional coordinates from two camera views.
Analysis speed is usually a central concern. The functions of the extension module are optimized to run at the highest possible speed for real-time applications. Depending on the type of video camera, high tracking frequencies can be realised. With a normal CCIR camera running at 25 Hz, the current software version achieves a 3-dimensional tracking frequency of 12.5 Hz, i.e. every other image can be analysed. For higher speed requirements, special cameras can be used at tracking speeds of up to 40 Hz for 3-dimensional tracking or even 90 Hz for 2-dimensional tracking. A future version of the program will support full video frequency, i.e. up to 25 Hz with regular cameras or higher with special cameras. Of course, the functions of the extension module can also be used for post-processing of already loaded or recorded video sequences or single frames.

A few words on "real time": "real time" is a widely used - or misused - term in modern high-speed computing. What does it mean? Does it mean finishing a certain calculation extremely fast, or analysing extremely quick events? Not at all: "real time" simply means analysing an event at exactly the time it is taking place, be it slow or fast. This makes it possible to react to a special action or to control a process. Therefore, the first step is to determine how fast a process happens and how fast a reaction must be to perform any control function. A commonly used definition is "video real time". Regular video frequency is 25 Hz for the European CCIR video standard. If a process cannot be resolved at this frequency, as for instance a high-frequency flutter problem of an aircraft wing model, a special high-speed camera has to be used. On the other hand, there are many processes that are much slower than video frequency. An example is the global adjustment of the angle of attack of an aircraft model in the wind tunnel. Analysis at video real time would normally be pointless for such tasks; an analysis rate of one frame per second is sufficient. Still, this is a "real time" control task. Usually, however, real-time tasks demand extremely high computing power and optimal programming: all functions have to be optimized for the task at hand. Please call the picCOLOR development team for information on special functions and solutions.

Marker Tracking

Markers can be any distinguishable areas on the surface that are detectable by their gray level, dark or bright. These may be small pieces of paint or adhesive tape, small light bulbs, or LEDs. The center of each marker is determined with sub-pixel accuracy by measuring the center of gravity of its pixel area. The markers should therefore not change their geometry too much when viewed from different angles. An accuracy of 1/10 pixel can be achieved when the markers have a diameter of at least 10 pixels. Smaller marker diameters increase the processing speed, while larger markers yield a higher resolution of the detected coordinates.
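The center-of-gravity measurement described above can be sketched as an intensity-weighted centroid over all pixels above a threshold. This is a minimal illustration, not the module's actual implementation; the function name and threshold handling are assumptions:

```python
import numpy as np

def marker_centroid(image, threshold):
    """Intensity-weighted center of gravity of all pixels above threshold.

    Returns (x, y) in pixel units with sub-pixel precision, or None if
    no pixel exceeds the threshold (no marker found).
    """
    ys, xs = np.nonzero(image > threshold)
    weights = image[ys, xs].astype(float)
    total = weights.sum()
    if total == 0.0:
        return None
    return xs @ weights / total, ys @ weights / total

# A synthetic marker: two equally bright pixels in columns 4 and 5 of row 4.
# The centroid lands between them, demonstrating sub-pixel output.
img = np.zeros((9, 9))
img[4, 4] = img[4, 5] = 200.0
print(marker_centroid(img, 100))  # → (4.5, 4.0)
```

With real markers of 10 pixels diameter or more, averaging over the many weighted pixels is what pushes the accuracy to the 1/10-pixel level quoted in the text.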
The resolution of the tracking depends on the object/marker size, on the camera resolution, and on the camera arrangement. Regular CCD video cameras have a resolution of 768*576 pixels (CCIR). At the optimum sub-pixel resolution of 1/10 pixel, approximately 7680 units per image are possible in the horizontal direction. Higher-resolution cameras can be used, for example 1280*1024 for approximately 12800 units. For 3-dimensional tracking the resolution also depends on the arrangement of the cameras: a large stereo angle gives a higher depth resolution. The actual resolution follows from converting the pixel units to real-space dimensions. If, for instance, images covering 1000 mm horizontally are acquired, the horizontal 2-dimensional resolution with the regular video camera would be approximately 0.13 mm.
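The resolution figures above follow from a simple conversion, shown here for the worked example in the text (variable names are ours):

```python
# Horizontal measurement resolution for the example in the text:
sensor_px = 768          # CCIR horizontal resolution in pixels
subpixel = 10            # 1/10-pixel centroid accuracy
field_mm = 1000.0        # horizontal extent of the imaged area in mm

units = sensor_px * subpixel           # 7680 resolvable units per image
resolution_mm = field_mm / units       # real-space resolution per unit
print(units, round(resolution_mm, 2))  # → 7680 0.13
```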
Setting up the measurement system is very simple: set up one or two cameras for 2D- or 3D-measurement, define some reference positions using a known set of reference points, let the system calculate the transformation matrices for 3-dimensional reconstruction, check the reconstruction against the known reference points, and start the measurement. If 6 reference points are known, not even the camera positions have to be determined: the system can derive them from the 12 images of these points in the two camera views. Results can be output as 3D-coordinates of all marker points or as translations and rotations of the complete object as defined by the markers.
The detected positions of the markers can be used to control any hardware. This could be a model support control unit in a wind tunnel or any other device that is controllable by a computer. Data transfer to other programs uses the Windows DDE protocol.

Marker Tracking Parameters

Marker tracking parameters can be set up in a dialog box with the following selections:


Fig.1,2: Example: Tracking of the joint positions of large mammals for investigation of motion physics


Fig.3,4: Motion of joint positions during fast walking / Hip joint motion during foot lift off

During tracking, an error code is determined for every marker, showing the status of the tracking. Depending on the error code, the marker position may be unreliable or completely wrong. A control program receiving the positions can react appropriately by evaluating the error code. The following codes are implemented in the current software version:


If the positions of the two cameras and their optical characteristics are known exactly, a 3-dimensional reconstruction is straightforward. From the two camera views of a marker in 3-dimensional space, its location can be determined by regarding the images as the result of certain translations and rotations followed by a final projection onto the image plane. After calculating the transformation matrices for both cameras, the transformation equations for each marker image can be constructed, giving an over-determined equation system that can be solved by a least-squares method. A disadvantage of this direct method is that the camera positions are normally not known very exactly. In particular, the viewing angles and the rotations about the optical axes of the cameras can only be measured approximately.
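The least-squares reconstruction from two views can be sketched as follows. Each camera is represented by a 3x4 transformation matrix; stacking the projection equations of both views gives the over-determined homogeneous system mentioned above, solved here via SVD (a standard least-squares approach; the module's exact solver is not documented here):

```python
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Least-squares 3D reconstruction of one marker from two views.

    P1, P2  : 3x4 camera transformation (projection) matrices
    uv1, uv2: (u, v) image coordinates of the marker in each view
    """
    rows = []
    for P, (u, v) in ((P1, uv1), (P2, uv2)):
        rows.append(u * P[2] - P[0])   # u * (3rd row) = 1st row, rearranged
        rows.append(v * P[2] - P[1])   # v * (3rd row) = 2nd row, rearranged
    A = np.array(rows)                 # 4 equations, 4 homogeneous unknowns
    _, _, vt = np.linalg.svd(A)        # least-squares solution: smallest
    X = vt[-1]                         # right singular vector
    return X[:3] / X[3]                # de-homogenise to (x, y, z)

# Two toy cameras looking along z, the second shifted 1 unit in x:
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
point = np.array([0.5, 0.2, 4.0])
uv1 = point[:2] / point[2]
uv2 = (point[:2] + np.array([-1.0, 0.0])) / point[2]
print(triangulate(P1, P2, uv1, uv2))  # ≈ [0.5  0.2  4. ]
```

With noisy image coordinates the four equations are no longer exactly consistent, and the SVD yields the least-squares compromise between the two rays.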
In this case a different approach can be used. The transformation matrices can be determined without knowing the camera positions if 6 spatial reference points are known and their images can be detected in both camera views. For each camera a homogeneous system of 12 equations in 12 unknowns can be defined, from which an inhomogeneous system can be derived and solved by a Gauss algorithm with post-iteration.
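One common way to derive the inhomogeneous system is to fix the last matrix element to 1, leaving 11 unknowns against the 12 equations from the 6 reference points. The sketch below uses NumPy's least-squares routine rather than the Gauss algorithm with post-iteration named in the text; the scaling convention and function names are our assumptions:

```python
import numpy as np

def calibrate_dlt(world_pts, image_pts):
    """Estimate a 3x4 camera matrix from 6 (or more) known reference points.

    Each point (x, y, z) with image (u, v) contributes two equations of the
    form  u * (row3 . X) = row1 . X  and  v * (row3 . X) = row2 . X,
    with the matrix element P[2,3] fixed to 1 to make the system
    inhomogeneous.
    """
    A, b = [], []
    for (x, y, z), (u, v) in zip(world_pts, image_pts):
        A.append([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z])
        b.append(u)
        A.append([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z])
        b.append(v)
    p, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(p, 1.0).reshape(3, 4)

# Demo: project 6 synthetic reference points with a known camera matrix
# (last element already 1), then recover the matrix from the images alone.
P_true = np.array([[1.0, 0, 0, 0.5], [0, 1.0, 0, 0.2], [0, 0, 0.1, 1.0]])
world = [(0, 0, 2), (1, 0, 3), (0, 1, 4), (2, 2, 5), (1, 2, 2), (2, 1, 6)]
image = [(P_true @ np.append(p, 1.0))[:2] / (P_true @ np.append(p, 1.0))[2]
         for p in world]
P_est = calibrate_dlt(world, image)
```

Running this once per camera yields the two transformation matrices needed for the reconstruction step, without ever measuring the camera positions directly.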
After the transformation matrices have been calculated successfully, the 3-dimensional coordinates of any other spatial point can be determined if its images can be found (tracked) in both camera views. A set of points can be defined as an object, and the motion of this object (translation and rotation) can be determined. The normal procedure of calibration and initialization is outlined here:
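Determining the translation and rotation of an object defined by a set of marker points is a rigid-body fit between the reference marker positions and the currently reconstructed ones. The text does not specify the method used by the module; a standard choice is the Kabsch algorithm, sketched here:

```python
import numpy as np

def object_motion(ref_pts, cur_pts):
    """Best-fit rotation R and translation t with cur ≈ R @ ref + t.

    Kabsch algorithm: SVD of the cross-covariance of the centered point
    sets, with a determinant correction to guarantee a proper rotation.
    """
    ref = np.asarray(ref_pts, float)
    cur = np.asarray(cur_pts, float)
    rc, cc = ref.mean(axis=0), cur.mean(axis=0)
    H = (ref - rc).T @ (cur - cc)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                     # proper rotation (det = +1)
    t = cc - R @ rc
    return R, t
```

Given at least three non-collinear markers on the object, this recovers the full translation and rotation even when individual marker coordinates carry some reconstruction noise.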


Fig.5: Sketch of the 3D-Camera-Position dialog box

3D-Parameter Set-Up: Camera positions known

3D-Parameter Set-Up: Camera positions not known - using 6 reference points


Fig.6: Sample object (parallelepiped) with 7 bright LEDs at known 3D-points for reference and calibration


Fig.7: The same parallelepiped with the 7 known points after binarization, in two stereometric views


Copyright © 2001 The FIBUS Research Institute, Dr. Reinert H. G. Mueller;