last update 04/12/2001
2D/3D Real Time Object Tracking
Many imaging projects require tracking objects, or markers on objects, in 2- or 3-
dimensional space at high speed. The picCOLOR Real Time Position Tracking extension
module was developed to meet these requirements. Its functions
allow the definition of markers and the setup of camera locations and transformation
equations for reconstructing 3-dimensional coordinates from two camera views.
Analysis speed is normally a key concern. The functions of the
extension module are optimized to run at the highest possible speed for real-time applications.
Depending on the type of video camera, high tracking frequencies can be achieved. With a
standard CCIR camera running at 25 Hz, a 3-dimensional tracking frequency of 12.5 Hz is
possible with the current software version, i.e. every other image can be analysed. For higher
speed requirements, special cameras can be used at tracking speeds of up to 40 Hz for
3-dimensional tracking or even 90 Hz for 2-dimensional tracking. A future version of
the program is planned to support full video frequency, i.e. up to 25 Hz with regular cameras or
higher with special cameras. The functions of the extension module can also be
used for post-processing of already loaded or recorded video sequences or single frames.
A few words on "real time": "real time" is a widely used - or misused - term in modern high
speed computing. What does it mean? Does it mean finishing a certain calculation extremely
fast, or analysing extremely quick events? Not at all: "real time" just means analysing an
event at exactly the time it is taking place, be it slow or fast. This makes it possible to react to
a particular action or to control a process. Therefore, the first thing to do is to define or find out
how fast a process is going to happen and how fast a reaction must be in order to perform
any control function. An often used definition is "video real time". Regular video frequency is
25 Hz for the European CCIR video standard. If a process cannot be resolved at this frequency,
for instance a high frequency flutter problem on an aircraft wing model, a special high speed
camera has to be used. On the other hand, there are many processes that are a lot slower
than video frequency. An example of this is the global adjustment of the angle of attack of
an aircraft model in the wind tunnel. An analysis in video real time would normally make
no sense for such tasks. Instead, an analysis of one frame per second seems sufficient. This
would still be a "real time" control task. Usually, however, real time tasks require
extremely high computing power and optimal programming: all functions have to be
optimized for the specific task. Please call the picCOLOR development team for information on
special functions and solutions.
Markers can be any distinguishable areas on the surface that are detectable by their gray
level, dark or bright. These may be small spots of paint or pieces of adhesive tape, or small light bulbs
or LEDs. The center of each marker is determined with sub-pixel accuracy by measuring
the center of gravity of its pixel area. The markers should not change their
apparent geometry too much when viewed from different angles. An accuracy of 1/10 pixel length can
be achieved when the markers have a diameter of at least 10 pixels. Smaller marker
diameters increase the processing speed, while larger markers result in higher resolution of
the detected coordinates.
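The center-of-gravity measurement can be sketched as follows. This is a minimal illustration, not the picCOLOR implementation: the function name, the fixed gray-level threshold, and the choice to weight each pixel by its distance from the threshold are all assumptions made for the example.

```python
import numpy as np

def marker_centroid(gray, threshold, bright=True):
    """Return the sub-pixel (x, y) center of gravity of a marker, or None.

    gray      : 2D array of gray levels containing a single marker
    threshold : gray level separating marker from background (assumed known)
    bright    : True for bright markers on dark background, False otherwise
    """
    img = np.asarray(gray, dtype=float)
    mask = img > threshold if bright else img < threshold
    if not mask.any():
        return None
    ys, xs = np.nonzero(mask)
    # Assumption: weight pixels by their contrast against the threshold, so
    # partially covered edge pixels pull the centroid only slightly.
    w = np.abs(img[ys, xs] - threshold)
    return float((xs * w).sum() / w.sum()), float((ys * w).sum() / w.sum())
```

For a marker of about 10 pixels in diameter, roughly 80 pixels contribute to the average, which is why the centroid can be located more finely than a single pixel.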
The resolution of the tracking depends on the object/marker size, on the camera resolution, and
on the camera arrangement. Regular CCD video cameras have a resolution of 768 x 576 pixels
(CCIR). At the optimum sub-pixel resolution of 1/10 pixel, a resolution of approximately 7680
units per image is possible in the horizontal direction. Higher resolution cameras can be used, for
example 1280 x 1024 pixels for a resolution of approximately 12800 units. For 3-dimensional tracking the
resolution also depends on the arrangement of the cameras: a large stereo angle gives
higher depth resolution. The actual resolution can be determined by converting the pixel
units to real space dimensions. If, for instance, images with a horizontal extent of 1000 mm are
acquired, the horizontal 2-dimensional resolution would be approximately 0.13 mm
(1000 mm / 7680 units) using the regular video camera.
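The resolution estimate can be reproduced with a short calculation. The function and parameter names below are illustrative assumptions, and the factor of 10 steps per pixel is the optimum sub-pixel accuracy quoted in the text, not a guaranteed value.

```python
def tracking_resolution(field_width, pixels, steps_per_pixel=10):
    """Smallest resolvable displacement, in the unit of field_width.

    field_width     : real-space width covered by the image (e.g. in mm)
    pixels          : number of camera pixels across that width
    steps_per_pixel : sub-pixel steps per pixel (10 means 1/10 pixel)
    """
    units = pixels * steps_per_pixel
    return field_width / units

# 1000 mm wide image on a 768-pixel wide CCIR camera
print(round(tracking_resolution(1000.0, 768), 2))   # 0.13
# Same field of view on a 1280-pixel wide camera
print(round(tracking_resolution(1000.0, 1280), 3))  # 0.078
```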
Setting up the measurement system is simple: set up one or two cameras
for 2D or 3D measurement, define some reference positions using a known set of
reference points, let the system calculate the transformation matrices for a 3-dimensional
reconstruction, check the reconstruction against the known reference points, and start the
measurement. If 6 reference points are known, the camera positions do not even have to
be measured, as the system can determine them from the 12 images of these points in the
two camera views. Results can be output as 3D coordinates of all marker points or as
translations and rotations of the complete object as defined by the markers.
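One common way to derive an object translation and rotation from a set of 3D marker coordinates is a least-squares rigid-body fit based on a singular value decomposition (the Kabsch method). The text does not state which method picCOLOR uses, so the sketch below is only an illustration under that assumption; the function name is hypothetical.

```python
import numpy as np

def rigid_motion(ref, cur):
    """Fit rotation R and translation t so that cur ~= ref @ R.T + t.

    ref, cur : (N, 3) arrays of marker coordinates in the reference pose
               and the current pose, in the same marker order (N >= 3,
               markers not all on one line).
    """
    ref = np.asarray(ref, dtype=float)
    cur = np.asarray(cur, dtype=float)
    c_ref, c_cur = ref.mean(axis=0), cur.mean(axis=0)
    # Cross-covariance of the centered point sets
    H = (ref - c_ref).T @ (cur - c_cur)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant -1) in the fitted matrix
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_cur - R @ c_ref
    return R, t
```

The fit uses all markers at once, so noise on individual marker positions is averaged out in the resulting object motion.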
The detected positions of the markers can be used to control any hardware. This could be a
model support control unit in a wind tunnel or any other device that can be controlled by a
computer. Data are transferred to other programs using the Windows DDE protocol.
Marker Tracking Parameters
Marker tracking parameters can be set up in a dialog box with the following selections:
- Set a useful length unit to calibrate the system, such as meters [m] or inches [in].
- Acquire or load a reference image (or two images) with the markers in their reference positions.
- Estimate the approximate diameter of the markers (in pixels) with the mouse pointer.
- Open the "Object Tracking Parameters" dialog box and define the following settings:
- Number of markers to be searched (count all marker images in the case of two camera views).
- Approximate diameter of the markers (all markers should have a similar diameter in this software version).
- Select a simple 2-dimensional detection or a 3-dimensional reconstruction. In case of a 3-
dimensional reconstruction select the split-screen mode or the two-screen mode. Depending
on the hardware, the screen can be split vertically or horizontally, as defined in the "Split-
Screen"-menu in the "Acquire"-menu. For two-screen mode select an additional image buffer
for the second view.
- Define the maximum allowable pixel shift of the markers from one image to the next in
the sequence. If a marker moves by more than this value within one frame time step,
it cannot be found anymore and an error condition will be shown.
- Set marker validation: This is used to ensure that the correct marker images are recognized.
Recognition is based on the overall pixel area of the markers. Markers are rejected if their area is
larger or smaller than expected by more than a factor of two. For very small markers (less than 3 or 4 pixels
in diameter) this technique no longer works and should be switched off.
- Lost marker extrapolation: Markers can be tracked even if they are covered by other
objects for short time intervals. To do this, the last reliably determined marker speed and
direction are used. When the marker reappears, reattachment is very likely for
steadily moving markers. Turn this off for chaotically moving markers.
- Now define the marker positions by clicking the center of each marker with the left mouse
button. For a 3-dimensional reconstruction the markers in both partial images have to be
defined in the same order. Otherwise the corresponding markers in the two images cannot be
matched. After this initialization a 2-dimensional tracking can be started. For a 3-
dimensional tracking the reconstruction parameters have to be initialized before any
measurement as explained below.
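How the maximum pixel shift, the area validation, and the lost marker extrapolation interact in a single tracking step can be sketched as follows. This is a simplified illustration for one marker on an already binarized image; the function name, its arguments, and the square search window are assumptions, not the picCOLOR interface.

```python
import numpy as np

def track_step(binary, last_pos, velocity, max_shift, ref_area,
               validate=True, extrapolate=True):
    """Track one marker into the next frame.

    binary    : 2D bool array, True where marker pixels were detected
    last_pos  : (x, y) position in the previous frame
    velocity  : (dx, dy) last reliably measured shift per frame
    max_shift : maximum allowed shift in pixels (half the search window)
    ref_area  : marker area in pixels in the reference image
    Returns (position, velocity, found).
    """
    h, w = binary.shape
    cx, cy = last_pos
    # Search only inside a window of +/- max_shift around the last position
    x_lo, x_hi = max(int(cx - max_shift), 0), min(int(cx + max_shift) + 1, w)
    y_lo, y_hi = max(int(cy - max_shift), 0), min(int(cy + max_shift) + 1, h)
    ys, xs = np.nonzero(binary[y_lo:y_hi, x_lo:x_hi])
    area = xs.size
    found = area > 0
    if found and validate:
        # Reject blobs larger or smaller than the reference by a factor > 2
        found = ref_area / 2.0 <= area <= ref_area * 2.0
    if found:
        pos = (x_lo + xs.mean(), y_lo + ys.mean())
        return pos, (pos[0] - cx, pos[1] - cy), True
    if extrapolate:
        # Marker lost: continue with the last reliable speed and direction
        return (cx + velocity[0], cy + velocity[1]), velocity, False
    return last_pos, velocity, False
```

With extrapolation enabled, a marker that is hidden for a few frames keeps moving along its last known direction, so the search window is still close to the true position when the marker reappears.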
Fig.1,2: Example: Tracking of the joint positions of large mammals for investigation of motion physics
Fig.3,4: Motion of joint positions during fast walking / Hip joint motion during foot lift off
During tracking, an error code is determined showing the status of the tracking for all
markers. Depending on the error code, the marker position may be unreliable or completely
wrong. The control program receiving the positions and the error code can react appropriately
by evaluating the error code. The following codes are implemented in the current software:
- 0: marker correctly detected, positions are reliable.
- 1: marker detected close to the image border and may leave the image very soon, positions are reliable.
- 2: marker too large or too small (by more than a factor of 2), or touches the border of the
search rectangle, i.e. the marker has moved too much within one frame time step. Positions are unreliable but may still be correct.
- 3: marker not found. Do not use positions.
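A receiving control program might evaluate these codes along the following lines. The numeric values are the ones listed above; the enumeration names, the helper function, and the assumption that one code is delivered per marker are illustrative only.

```python
from enum import IntEnum

class MarkerStatus(IntEnum):
    OK = 0           # correctly detected, position reliable
    NEAR_BORDER = 1  # reliable, but the marker may leave the image soon
    UNCERTAIN = 2    # size or search rectangle check failed
    NOT_FOUND = 3    # position must not be used

def position_usable(code, accept_uncertain=False):
    """Decide whether a marker position should be passed on to a controller."""
    status = MarkerStatus(code)
    if status in (MarkerStatus.OK, MarkerStatus.NEAR_BORDER):
        return True
    if status is MarkerStatus.UNCERTAIN:
        # Code 2 positions may still be correct; the caller decides the risk
        return accept_uncertain
    return False
```

A safety-critical controller would typically leave `accept_uncertain` off and hold its last output when a position is rejected.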
If the positions of the two cameras and their optical characteristics are known exactly, a
3-dimensional reconstruction is straightforward. From two camera views of a
marker in 3-dimensional space, its location can be determined by regarding each image as
the result of certain translations and rotations and a final projection onto the image plane. After
calculating the transformation matrices for both cameras, the transformation equations for each
marker image can be constructed, giving an overdetermined system of equations that can be
solved by a least-squares method. A disadvantage of this direct method is that
camera positions are normally not known very exactly. In particular, the viewing angles and the
rotations about the optical axis of the cameras can only be measured approximately.
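The overdetermined system for one marker can be sketched as follows, assuming each camera's transformation is expressed as a 3x4 homogeneous projection matrix. That matrix form and the function names are assumptions for this example. Each camera view contributes two linear equations for the three unknown coordinates, so two views give four equations.

```python
import numpy as np

def project(P, xyz):
    """Project a 3D point with a 3x4 matrix P to image coordinates (u, v)."""
    h = P @ np.append(np.asarray(xyz, dtype=float), 1.0)
    return h[:2] / h[2]

def triangulate(P1, P2, uv1, uv2):
    """Least-squares 3D position of a marker seen at uv1 and uv2."""
    rows, rhs = [], []
    for P, (u, v) in ((P1, uv1), (P2, uv2)):
        # u * (p3 . X) = p1 . X  and  v * (p3 . X) = p2 . X, with X = (x, y, z, 1)
        rows.append(u * P[2, :3] - P[0, :3])
        rhs.append(P[0, 3] - u * P[2, 3])
        rows.append(v * P[2, :3] - P[1, :3])
        rhs.append(P[1, 3] - v * P[2, 3])
    xyz, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return xyz
```

With noise-free image coordinates the four equations are consistent and the solution is exact; with measured coordinates the least-squares solution balances the residuals of both views.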
If the camera positions are not known exactly, a different approach can be used. The transformation matrices can be
determined without knowing the camera positions if 6 spatial reference points are known and
their images can be detected in both camera views. For each camera position a homogeneous
system of 12 equations with 12 variables can be defined, from which a system of
inhomogeneous equations can be derived and solved by a Gauss algorithm with post-iteration.
After successfully calculating the transformation matrices, the 3-dimensional coordinates of
any other spatial points can be determined if their images can be found (tracked) in both
camera views. A set of points can be defined as an object, and the motion of this object
(translation and rotation) can be determined. The normal procedure for calibration and
initialization is outlined here:
Fig.5: Sketch of the 3D-Camera-Position dialog box
3D-Parameter Set-Up: Camera positions known
- Input a system calibration unit: [m], [in] or other.
- Open "3D_CameraPosition"-dialog box and select "Set Camera Positions"
- Input x,y,z-positions for both cameras
- Input x,y,z-view point position for both cameras (center of interest): This is the center of the camera image (on the
optical axis) at the plane of focus (the view point plane).
- Input calibration factors (pixel/calibration-unit) for both cameras for the view point plane.
- Input rotation about the optical axis for both cameras [rad], right-handed system.
- Determine the transformation matrices by hitting the "Start"-button
- Check the transformation matrices by plotting some known reference points as 2-
dimensional projections on the screen. They should coincide closely with the acquired
marker images of these known reference points. Errors from setting the view point plane and
camera rotation about the optical axis can be corrected using these reference points.
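The inputs listed above are sufficient to build a pinhole projection matrix for each camera. The sketch below shows one possible construction; the axis conventions (image x to the right, image y downward, image origin at the image center, world z as the "up" hint) and the function name are assumptions and may differ from the conventions used by picCOLOR.

```python
import numpy as np

def camera_matrix(cam_pos, view_point, pixels_per_unit, roll=0.0,
                  up_hint=(0.0, 0.0, 1.0)):
    """3x4 projection matrix from camera position, view point, scale and roll.

    pixels_per_unit : calibration factor at the view point plane
    roll            : rotation about the optical axis [rad]
    up_hint         : assumed world 'up' direction; must not be parallel
                      to the optical axis
    """
    c = np.asarray(cam_pos, dtype=float)
    forward = np.asarray(view_point, dtype=float) - c
    dist = np.linalg.norm(forward)
    forward /= dist
    right = np.cross(forward, up_hint)
    right /= np.linalg.norm(right)
    down = np.cross(forward, right)  # (right, down, forward) is right-handed
    cr, sr = np.cos(roll), np.sin(roll)
    R = np.vstack([cr * right + sr * down,    # image x axis
                   -sr * right + cr * down,   # image y axis
                   forward])                  # optical axis
    # Focal length in pixels so that 1 unit at the view point plane
    # maps to pixels_per_unit pixels
    f = pixels_per_unit * dist
    K = np.diag([f, f, 1.0])
    return K @ np.hstack([R, (-R @ c)[:, None]])
```

Projecting the view point itself with this matrix gives the image center, which is a quick consistency check before plotting the known reference points.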
3D-Parameter Set-Up: Camera positions not known - using 6 reference points
- Input a system calibration unit: [m], [in] or other.
- Define the 12 marker positions of the 6 known spatial points in the two camera views
interactively by using the "Define Marker"-function. (After setting all marker parameters)
- Open "3D_CameraPosition"-dialog box and select "6 Point Reconstruction"
- Input the 6 known spatial points with x,y,z-coordinates in the selected calibration unit.
- Hit the "Transfer 12 Points" radio button.
- Determine the transformation matrices by hitting the "Start"-button
- Check the transformation matrices by plotting the 6 reference points as 2-dimensional
projections on the screen. They must fit exactly on the original marker images.
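The six-point reconstruction for one camera can be sketched as follows. Each reference point contributes two equations in the 12 entries of a 3x4 transformation matrix; fixing the last entry to 1 turns the homogeneous system into an inhomogeneous one with 12 equations and 11 unknowns. The sketch solves it with NumPy's least-squares routine instead of the Gauss algorithm with post-iteration mentioned in the text, and the function names are assumptions.

```python
import numpy as np

def calibrate_six_point(world, image):
    """Estimate a 3x4 matrix from >= 6 reference points and their images.

    world : (N, 3) known spatial coordinates (not all in one plane)
    image : (N, 2) detected marker positions in this camera view
    Assumes the last matrix entry is non-zero so it can be scaled to 1.
    """
    A, b = [], []
    for (X, Y, Z), (u, v) in zip(np.asarray(world, dtype=float),
                                 np.asarray(image, dtype=float)):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z])
        b.append(u)
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z])
        b.append(v)
    p, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(p, 1.0).reshape(3, 4)

def project(P, xyz):
    """Project a 3D point with a 3x4 matrix P to image coordinates (u, v)."""
    h = P @ np.append(np.asarray(xyz, dtype=float), 1.0)
    return h[:2] / h[2]
```

The check described in the last step corresponds to calling `project` on each reference point and comparing the result with the detected marker image; running the calibration once per camera view yields the two matrices needed for 3-dimensional reconstruction.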
Fig.6: Sample object (parallelepiped) with 7 bright LEDs at known 3D-points for reference
Fig.7: Same parallelepiped with 7 known points after binarization, in two stereometric views
Copyright © 2001 The FIBUS Research Institute, Dr. Reinert H. G. Mueller