Background-informed color visual models
When tracking nonrigid objects such as people's bodies or hands, color histograms sampled from elliptical regions provide a convenient way to encode the visual properties of these objects. However, when the color of the background is similar to the color of the tracked object, the classical color-histogram approach fails. This is illustrated below with an image of a person dressed in white against a white background. Note that the measure of presence is severely noisy and multi-modal, so the person's position cannot be determined reliably.
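As a rough illustration, the sketch below samples a normalized RGB histogram from an elliptical region and scores a candidate region against a reference histogram. The bin count, the axis-aligned ellipse, and the Bhattacharyya-style similarity are illustrative assumptions, not the exact parameters of our model.

```python
import numpy as np

def elliptical_mask(h, w, cx, cy, ax, ay):
    """Boolean mask of an axis-aligned ellipse centred at (cx, cy)."""
    ys, xs = np.mgrid[0:h, 0:w]
    return ((xs - cx) / ax) ** 2 + ((ys - cy) / ay) ** 2 <= 1.0

def color_histogram(image, mask, bins=8):
    """Normalised joint RGB histogram sampled from the masked (elliptical) region."""
    pixels = image[mask]                               # (N, 3) array of RGB values
    # Quantise each channel into `bins` levels and build a joint histogram.
    idx = (pixels // (256 // bins)).astype(int)
    flat = idx[:, 0] * bins * bins + idx[:, 1] * bins + idx[:, 2]
    hist = np.bincount(flat, minlength=bins ** 3).astype(float)
    return hist / max(hist.sum(), 1e-12)

def presence(hist_ref, hist_candidate):
    """Bhattacharyya-coefficient-style similarity between two normalised histograms."""
    return np.sum(np.sqrt(hist_ref * hist_candidate))
```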
To improve detection in situations like the one depicted above, we have modified the measure of presence by using an estimate of the background image together with the normalized Hellinger distance. This measure is depicted in the image below.
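A minimal sketch of how a background estimate might enter the measure of presence is given below. It assumes normalized histograms such as those from the previous sketch; the particular way of combining target similarity with background dissimilarity is an assumption for illustration only, not the published formula.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two normalised histograms (0 = identical)."""
    return np.sqrt(max(0.0, 1.0 - np.sum(np.sqrt(p * q))))

def background_informed_presence(hist_target, hist_candidate, hist_background):
    """Illustrative background-informed presence: a candidate region scores high
    only if it resembles the target AND differs from the local background.
    This particular combination is an assumption, not the exact published measure."""
    sim_target = 1.0 - hellinger(hist_target, hist_candidate)
    dist_background = hellinger(hist_candidate, hist_background)
    return sim_target * dist_background
```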
Motion-informed color visual models
The purely intensity-based measure of presence cannot deal with situations in which the tracked object is occluded by another, visually similar object. The cause of failure is the color ambiguity shown in the figure below: the example shows tracking of a hand, along with the color-based likelihood function that corresponds to the histogram of the tracked hand. Note that the likelihood function is bimodal; one mode corresponds to the tracked hand and the other to the second hand. This introduces ambiguity in the hand's position and causes the tracker to fail.
We therefore propose to use optical flow to resolve the color-induced ambiguity. An example is shown below, where we exploit the fact that the tracked hand is moving to the left (as predicted by the tracker) while the other hand is not. By comparing the direction of the hand's motion to the observed optical flow, we can generate a local-motion-based likelihood function. When this function is combined with the color-based likelihood function, only the mode that corresponds to the tracked hand remains, and the ambiguity is resolved.
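The sketch below illustrates this fusion under stated assumptions: dense optical flow (here Farneback flow from OpenCV, one possible estimator) is compared against the velocity predicted by the tracker to obtain a per-pixel motion likelihood, which is then multiplied with the color-based likelihood. The Gaussian agreement model and `sigma` are illustrative choices, not our exact formulation.

```python
import cv2
import numpy as np

def motion_likelihood(prev_gray, curr_gray, predicted_velocity, sigma=1.0):
    """Per-pixel likelihood that the local motion agrees with the tracker's
    predicted velocity (vx, vy). Farneback flow is one possible estimator."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    diff = flow - np.asarray(predicted_velocity, dtype=np.float32)
    return np.exp(-0.5 * np.sum(diff ** 2, axis=2) / sigma ** 2)

def combined_likelihood(color_likelihood, motion_lik):
    """Fuse the two cues multiplicatively; only modes supported by both the
    color model and the expected motion survive."""
    fused = color_likelihood * motion_lik
    return fused / max(fused.sum(), 1e-12)
```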
A two-stage dynamic model
A two-stage dynamic model is a composition of two separate dynamic models: a liberal and a conservative one. The liberal model explicitly models the correlation in the target's velocity by a nonzero-mean Gauss-Markov (GM) process and allows larger perturbations of the target's velocity. The mean of the GM process is then estimated using a conservative estimate of the target's current velocity. A particular composition of the two models allows the particle filter to use an extremely small number of particles while improving performance in comparison with two widely used dynamic models: the random walk and the nearly-constant-velocity model.
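A minimal sketch of one prediction step of such a model is given below, assuming a first-order Gauss-Markov velocity update pulled toward a slowly adapting conservative velocity estimate. The parameter values (`beta`, `q_liberal`, `gain`) are hypothetical and not taken from the published model.

```python
import numpy as np

def two_stage_predict(positions, velocities, v_conservative,
                      beta=0.5, q_liberal=2.0, dt=1.0, rng=None):
    """One prediction step of a two-stage dynamic model (illustrative values).

    Liberal stage: each particle's velocity follows a Gauss-Markov process whose
    mean is the conservative velocity estimate, allowing large perturbations."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(scale=q_liberal, size=velocities.shape)
    # Gauss-Markov velocity update pulled towards the conservative mean.
    velocities = v_conservative + beta * (velocities - v_conservative) + noise
    positions = positions + velocities * dt
    return positions, velocities

def conservative_velocity(prev_estimate, new_measurement, gain=0.1):
    """Conservative stage: a slowly adapting (low-gain) velocity estimate
    obtained from the filtered target positions."""
    return (1.0 - gain) * prev_estimate + gain * new_measurement
```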
An example of tracking a person from a moving camera is shown below. In this experiment we used a bootstrap particle filter, the background-informed color visual model described above, and the two-stage dynamic model. The number of particles in the particle filter was set to only 25.
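For illustration, the sketch below shows how these components could be combined in a single bootstrap particle-filter iteration with 25 particles. It reuses the hypothetical `two_stage_predict` helper from the previous sketch, and `likelihood_fn` stands in for the background-informed presence measure evaluated at a candidate position; none of this is the exact published implementation.

```python
import numpy as np

def bootstrap_pf_step(particles, velocities, v_conservative, likelihood_fn, rng):
    """One bootstrap particle-filter iteration over 25 (or any N) particles,
    combining the assumed two-stage dynamics with a visual likelihood."""
    # Predict with the two-stage dynamic model from the sketch above.
    particles, velocities = two_stage_predict(particles, velocities,
                                              v_conservative, rng=rng)
    # Weight each particle by the visual likelihood and normalise.
    weights = np.array([likelihood_fn(p) for p in particles])
    weights /= max(weights.sum(), 1e-12)
    estimate = np.sum(particles * weights[:, None], axis=0)
    # Multinomial resampling, as in the bootstrap filter.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], velocities[idx], estimate
```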
Multiple interacting targets
In certain setups, e.g., a top-down view, constraints can be imposed on the joint state estimation in particle filters that simplify a tracking iteration. We approach the problem by estimating a partitioning of the field of view into nonoverlapping cells, each cell containing a single target. Given the partitioning, the tracking iteration decomposes into per-target independent trackers; a sketch of this decomposition is given below. The algorithm is composed of a space-partitioning step and a tracking iteration. The visual and dynamic models described above significantly improve this tracker.
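A rough sketch of the partitioning and the decomposed tracking iteration is shown below. The nearest-target (Voronoi-style) pixel assignment and the `update(frame, mask)` tracker interface are assumptions for illustration, not the exact construction used in the algorithm.

```python
import numpy as np

def assign_cells(shape, target_positions):
    """Partition the field of view into nonoverlapping cells by assigning every
    pixel to its nearest predicted target position (a Voronoi-style partition;
    the actual partitioning criterion may differ)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys], axis=-1).astype(float)           # (h, w, 2)
    targets = np.asarray(target_positions, dtype=float)       # (K, 2) as (x, y)
    d2 = np.sum((pix[:, :, None, :] - targets[None, None]) ** 2, axis=-1)
    return np.argmin(d2, axis=-1)                              # (h, w) cell labels

def tracking_iteration(frame, trackers, predicted_positions):
    """Given the partition, the joint problem decomposes: each single-target
    tracker is updated only inside its own cell. `trackers` is a hypothetical
    list of objects exposing an `update(frame, mask)` method."""
    labels = assign_cells(frame.shape[:2], predicted_positions)
    return [trk.update(frame, labels == k) for k, trk in enumerate(trackers)]
```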
Below is the output of one version of this algorithm applied to multiple-target tracking in sports; it is used by the Faculty of sports sciences UNI-LJ for the analysis of various team sports.