January 31, 2013

Ball detection - MEANshift

/Dropbox/MATLAB/detect_meanshift
We have been directed to test a camshift detector for object tracking, which can even deal with changing object scale and orientation. After a first look into it, it became obvious that our application only needs a part of the camshift algorithm, namely the meanshift algorithm. This simplification is possible due to the character of our scene. We need to track a ball whose distance from the camera can be considered constant: the camera is far enough from the surface of the magnetic platform that when the ball moves from the center of the platform to its border, the change in its distance from the camera is negligible with respect to the absolute distance.

MEANshift implementation

The algorithm was implemented in the Simulink model 'detect_s' using a Level-2 MATLAB S-function. At the beginning, the algorithm has to be initialized with histograms of the HSV intensities of the pixels that depict the tracked object. This phase requires user input: single frames are displayed in a loop and the user selects a rectangular area of the image that contains the object. The histograms of HSV intensities are then updated using the pixels from the selected area. We used a square colored object for testing purposes, so the rectangular area was not a problem. For a ball, however, it would be necessary to mask the selection with a circular binary mask that removes the background pixels from the selected rectangle.
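A minimal MATLAB sketch of this initialization phase might look as follows; the function name, the ROI format and the number of bins are illustrative, not the actual implementation:

% Hypothetical sketch of the histogram initialization; names are assumed.
function hists = init_histograms(frame, roi, nbins)
% frame ... RGB image, roi ... [x y w h] rectangle selected by the user
hsv   = rgb2hsv(frame);                        % convert to HSV
patch = hsv(roi(2):roi(2)+roi(4)-1, ...
            roi(1):roi(1)+roi(3)-1, :);        % crop the selected area
hists = zeros(nbins, 3);
edges = linspace(0, 1, nbins+1);
for c = 1:3                                    % one histogram per HSV channel
    h = histc(reshape(patch(:,:,c), [], 1), edges);
    hists(:,c) = h(1:nbins);                   % histc returns one extra bin
end
% (in the real algorithm the histograms are accumulated over several
% selections and normalized after the initialization phase)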

The HSV histograms computed from the user-selected areas are normalized after the initialization phase and are used as discrete probability densities from that point on. Each processed frame is transformed into a probability image, using the discrete HSV probability densities as lookup tables for the HSV intensity values of the pixels. (This is a backprojection of the probability into the original image.) Afterwards, the meanshift algorithm iteratively computes the center of mass of the region of interest in the sense of probability and updates the center of the region of interest. The update is based on the zero- and first-order spatial moments of the probability image, denoted M00, M10 and M01:
$M_{00} = \sum_x\sum_y I(x,y), \quad M_{10} = \sum_x\sum_y x\, I(x,y), \quad M_{01} = \sum_x\sum_y y\, I(x,y),$
where I(x,y) is the probability that the HSV intensities at position (x,y) belong to the tracked object. The coordinates of the center of mass of the region of interest (which are also the updated coordinates of the center of the region of interest) are computed as
$x_c = \frac{M_{10}}{M_{00}}, \qquad y_c = \frac{M_{01}}{M_{00}}.$
After the iterative process has converged, the estimated position of the tracked object is the center of the region of interest.
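A condensed MATLAB sketch of one detection step, combining the backprojection and the iterative update; the variable names, the iteration limit and the convergence tolerance are illustrative:

% frame, the normalized histograms 'hists' and the region of interest
% (cx, cy, w, h) are assumed to come from the initialization phase.
maxiter = 20;                                  % iteration limit (assumed)
hsv   = rgb2hsv(frame);
nbins = size(hists, 1);
idx   = min(floor(hsv * nbins) + 1, nbins);    % histogram bin per channel
sz    = size(idx(:,:,1));
ih = idx(:,:,1);  is_ = idx(:,:,2);  iv = idx(:,:,3);
P  = reshape(hists(ih(:),1), sz) ...           % backprojection: probability
  .* reshape(hists(is_(:),2), sz) ...          % image, channels treated as
  .* reshape(hists(iv(:),3), sz);              % independent

for it = 1:maxiter
    rows = max(1, round(cy-h/2)) : min(size(P,1), round(cy+h/2));
    cols = max(1, round(cx-w/2)) : min(size(P,2), round(cx+w/2));
    R = P(rows, cols);                         % region of interest
    [X, Y] = meshgrid(cols, rows);
    M00 = sum(R(:));                           % zero-order moment
    M10 = sum(sum(X .* R));                    % first-order moments
    M01 = sum(sum(Y .* R));
    if M00 == 0, break; end
    cxn = M10 / M00;  cyn = M01 / M00;         % center of mass
    if hypot(cxn - cx, cyn - cy) < 0.5, break; end   % converged
    cx = cxn;  cy = cyn;                       % shift the region of interest
end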

MATLAB script

/Dropbox/MATLAB/detect_meanshift/MEANshift_test_script.m
First, the algorithm was implemented as a MATLAB script, getting single snapshots from the Creative Live! webcam one by one. (The script itself in fact only adjusts the exposure settings of the camera and then calls the function 'MEANshift_m_function', which implements the meanshift algorithm.) The main advantage and purpose of this approach was the simplicity of a MATLAB script, which made debugging quite comfortable. The main disadvantage was the communication with the webcam, which fell asleep after each frame acquisition, so it took nearly a second until another frame could be acquired. The script also shows how to adjust the exposure settings of the webcam through the Image Acquisition Toolbox interface (a sketch of the camera handling follows below the figure).
Probability image created by backprojection of HSV histograms into original image
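The camera handling in the script might look roughly as follows; the adaptor name, the video format string and the exposure property names depend on the driver, so they are assumptions rather than the actual code:

vid = videoinput('winvideo', 1, 'RGB24_640x480');  % adaptor/format assumed
src = getselectedsource(vid);
set(src, 'ExposureMode', 'manual');   % disable auto-exposure (if supported)
set(src, 'Exposure', -5);             % driver-specific value (assumed)
frame = getsnapshot(vid);             % acquire a single frame
% ... MEANshift_m_function would process the frame here ...
delete(vid);                          % release the camera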

We had a slight problem with the illumination of the scene: the lab is lit by fluorescent tubes that flicker at 50 Hz, the mains frequency. The camera acquires images asynchronously with respect to this frequency, which causes the frames to differ in exposure (some of them are underexposed). The meanshift algorithm is robust enough to deal with these changes, but they have a negative impact on the precision of tracking.
Convergence of the meanshift algorithm when tracking a test object.
The previously detected position is marked with a green cross (previous region of interest with a green rectangle);
the current position is marked with a blue cross (current region of interest with a red rectangle)


Simulink implementation with Level-2 MATLAB S-function block

/Dropbox/MATLAB/detect_meanshift/detect_s.slx
We used the Level-2 MATLAB S-function block in Simulink because, unlike the MATLAB function block, it can store its internal state between iterations. Therefore the whole implementation, including the initialization of the histograms, can be integrated into the S-function block. The S-function is implemented in the file 'MEANshift_s_function'; a skeleton of its structure is sketched below the figure.
Simulink model with Level-2 MATLAB S-function integrates all functionality into the S-function block
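A skeleton of such an S-function, showing how the state is kept in a DWork vector between time steps; the port setup, sample time and initial center are illustrative:

function MEANshift_s_function_sketch(block)
% Level-2 MATLAB S-function skeleton; details are illustrative.
setup(block);

function setup(block)
block.NumInputPorts  = 1;                 % video frame in
block.NumOutputPorts = 1;                 % estimated [xc; yc] out
block.SetPreCompInpPortInfoToDynamic;
block.SetPreCompOutPortInfoToDynamic;
block.InputPort(1).DirectFeedthrough = true;
block.OutputPort(1).Dimensions = 2;
block.SampleTimes = [0.1 0];              % sample time assumed
block.RegBlockMethod('PostPropagationSetup', @DoPostPropSetup);
block.RegBlockMethod('Start',   @Start);
block.RegBlockMethod('Outputs', @Outputs);

function DoPostPropSetup(block)
block.NumDworks = 1;                      % persistent state: ROI center
block.Dwork(1).Name = 'center';
block.Dwork(1).Dimensions = 2;
block.Dwork(1).DatatypeID  = 0;           % double
block.Dwork(1).Complexity  = 'Real';
block.Dwork(1).UsedAsDiscState = true;

function Start(block)
block.Dwork(1).Data = [320; 240];         % initial ROI center (assumed)

function Outputs(block)
frame  = block.InputPort(1).Data;
center = block.Dwork(1).Data;
% ... backprojection and meanshift iterations would go here ...
block.Dwork(1).Data = center;             % store the updated state
block.OutputPort(1).Data = center;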

Simulink implementation with MATLAB function block

/Dropbox/MATLAB/detect_meanshift/detect.slx
The main disadvantage of the Level-2 MATLAB S-function block is that it does not support automatic code generation, which is needed to execute the model in external mode (e.g. using Real Time Toolbox or xPC Target). To use such a block in a model executed in external mode, one has to manually create a TLC file for the S-function; the TLC file tells the compiler how to convert the model into a *.c file. Creating the TLC file would be too complicated, so we transformed the model to use a MATLAB function block, which supports automatic code generation. The initialization of the histograms was moved into the InitFcn callback function of the model, and we also had to add memory storing the current state of the algorithm to the model.
Simulink model with MATLAB function block; the memory of the algorithm is realized by unit delays in a feedback loop
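A hedged sketch of the resulting structure. First the InitFcn callback, which builds the histograms before the simulation starts (the camera handle and the helper functions are hypothetical):

% Model InitFcn callback (sketch)
frame = getsnapshot(vid);                  % vid created beforehand
roi   = select_rectangle(frame);           % hypothetical selection helper
hists = init_histograms(frame, roi, 32);   % see the sketch above
assignin('base', 'hists', hists);          % exposed to the model as a parameter

The MATLAB function block itself then receives the previous state through its ports, wired to Unit Delay blocks in the feedback loop (signature illustrative):

function [xc, yc] = meanshift_step(frame, xc_prev, yc_prev, hists) %#codegen
% one meanshift update per simulation step; run_meanshift is hypothetical
[xc, yc] = run_meanshift(frame, xc_prev, yc_prev, hists);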



Implementation of background subtraction in Simulink with Basler acA2000-340kc camera

/Dropbox/MATLAB/detect_bg_subtraction_cl
The implementation was done in exactly the same way as with the Creative Live! webcam. The reason for making a separate version is the different resolution of the image from this camera. The width and height of the video frames appear as constants in quite a lot of places in the S-function, so it was easier to keep a separate version of the same function than to rewrite the constants in the S-function every time we switch from one camera to the other.
Region of interest with the estimated position of the center of the ball
Screenshot from live video with estimated center of the ball marked with green square

January 30, 2013

Implementation of background subtraction in Simulink with Creative Live! webcam

/Dropbox/MATLAB/detect_bg_subtraction
The background subtraction was implemented as a Simulink model using the Level-2 MATLAB S-function block with the Creative Live! webcam. The implementation uses the Image Acquisition Toolbox.
During the initialization, we compute the mean image. We use the mean instead of the median because, unlike the median, the mean can be computed iteratively without storing all the data in memory (a sketch follows below the figure). The number of frames used for averaging can be set inside the S-function as the constant 'nsamples'. After the initialization, the algorithm is paused and we place the object to be tracked into the field of view (FOV) of the camera. To save some computing power, we only search for the object in a restricted region of interest (ROI). The position of the ROI is stored and updated in every iteration of the algorithm.
Simulink model implementation of background subtraction
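The iterative mean computation itself is simple; a minimal sketch, with the frame size and the camera handle assumed ('nsamples' matches the constant mentioned above):

nsamples = 100;                      % number of averaged frames (assumed)
[h, w] = deal(480, 640);             % frame size assumed
bg = zeros(h, w);
for n = 1:nsamples
    frame = double(getsnapshot(vid));
    bg = bg + (frame - bg) / n;      % running mean, no frame storage needed
end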
We have made a modification to the background subtraction algorithm described in the article 'Ball detection - background subtraction': we have left out the thresholding phase, as it turned out to be unnecessary.
Region of interest with estimated position of the center of the ball
The estimation of the ball position is not very precise. The influence of the shadow cast by the ball is obvious in the picture above.
Screenshot from live video with estimated center of the ball marked with green cross


Ball detection - background subtraction

/Dropbox/MATLAB/detect_thresholding
We have the option to take advantage of the static background of the scene. We can assume that the ball is the only moving object in the scene. As a result, we can compute the mean background and threshold the absolute difference between the mean background and the current image. The difference is thresholded using two thresholds, and every pixel is classified according to the rules
$difference \leq threshold_{low} \dots class = -1$
$threshold_{low} < difference \leq threshold_{high} \dots class = 0$
$difference > threshold_{high} \dots class = 1$
In the next step, each row and column of the image of classified pixels is assigned the sum of the classes of its elements. These sums are filtered with a convolution kernel generated as the orthogonal projection of a circle onto a line, where the projected circle has the same size as the ball in the image. Finally, the horizontal and vertical coordinates of the ball are determined as the points of maximal value of the filtered column and row sums, respectively. A minimal sketch of the procedure follows.
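The thresholds and the ball radius below are illustrative values, not the ones used in the experiments:

% frame ... current grayscale frame, bg ... mean background (both double)
t_low = 10;  t_high = 40;               % thresholds (assumed)
D = abs(frame - bg);                    % difference from the mean background
C = zeros(size(D));
C(D <= t_low)  = -1;                    % class -1
C(D >  t_high) =  1;                    % class  1
% (pixels with t_low < D <= t_high keep class 0)
colsum = sum(C, 1);                     % per-column class sums
rowsum = sum(C, 2)';                    % per-row class sums
r = 20;                                 % ball radius in pixels (assumed)
x = -r:r;
kernel = 2 * sqrt(r.^2 - x.^2);         % chord lengths: circle projected onto a line
[~, xc] = max(conv(colsum, kernel, 'same'));
[~, yc] = max(conv(rowsum, kernel, 'same'));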

Pixel classifications together with the sums of columns and rows
(classification color map: red = 1, green = 0, blue = -1)

When we compare the estimated position of the center of the ball with the real position, it is clear that the estimate is quite biased. The bias is caused by strong reflections on the surface of the ball as well as by the shadows cast by the ball onto the surface of the magnetic platform.

Comparison of estimated and real position of the center of the ball
The tests were done with subsampled image frames acquired with the monochrome Basler acA2000-340km Camera Link camera.

Ball detection - Harris detector

/Dropbox/MATLAB/detect_harris
The idea of trying the Harris detector originated from an attempt to take advantage of the strong reflections on the surface of the ball instead of trying to avoid their influence. The Harris detector is a tool that finds significant points in the image, usually corner points. Even though our object is a ball without any corners, the reflections on its surface form very steep intensity gradients that the Harris detector can find. The background does not have such strong contrasts, so if we set the threshold on the minimal value of local maxima of the cornerness measure high enough, the detector only finds the corner points formed by reflections on the ball. The estimate of the position of the center of the ball is then computed as the median of the coordinates of the detected significant points. The median is used to suppress the influence of the few corner detections that occur in the background, which would strongly bias the estimate if we used the mean instead.
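For illustration, a similar estimate can be obtained with the 'corner' function of the Image Processing Toolbox; whether the original code uses this function or a custom implementation is not stated here, and the parameter values are assumptions:

% I ... grayscale frame
pts = corner(I, 'Harris', 200, 'QualityLevel', 0.2);  % [x y] per detected point
xc = median(pts(:,1));      % median suppresses stray background detections
yc = median(pts(:,2));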
The algorithm was tested using video frames from the monochrome Camera Link camera. The accuracy of the method is strongly dependent on the surroundings and the illumination of the experiment. If the illumination is asymmetrical, the estimate is biased; with symmetrical illumination, the estimate could be quite precise. However, the Harris detector requires the computation of intensity gradients in the image, so it consumes quite a lot of computing power and is therefore not very suitable for our application.
Estimation of ball position based on the significant image points detected by Harris detector
(significant image points marked with green squares, ball position with red markers)

January 29, 2013

Ball detection - intensity thresholding


We need a simple, robust and fast detection method for our application, because the higher the precision and sampling frequency of the detection, the better the control using optical feedback can be. We have a static scene with approximately constant illumination, and our aim is to track a moving steel ball in this scene. The first attempts were implemented and tested using video frames that were transferred to the host PC over the UDP protocol, because it is practically impossible to save a full-resolution video on the local target hard drive. In addition, we also tested the implemented algorithms in Simulink models on a PC using the Image Acquisition Toolbox together with a USB webcam. At the time of these experiments, we only had a monochrome Camera Link camera in our laboratory; however, the principles can later be extended to RGB color video signals as well.

Intensity thresholding

Our very first idea was simple thresholding of the intensity image, which would be the simplest and fastest detection method. This method is, however, too simple for our application: the steel ball has quite strong reflections, and both the ball and the background contain wide ranges of intensities that overlap each other. Therefore the object is not separable from the background using single pixel intensity values alone. A future improvement could be to paint the ball with a color and take advantage of a color camera by, e.g., thresholding a single color channel. In addition, with respect to the future application, it would be useful to have balls of different colors in order to distinguish them.
Even for a human, it is difficult to identify the ball in the image precisely
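For completeness, a minimal sketch of the plain intensity thresholding that proved insufficient here; the threshold value is illustrative and the sketch requires the Image Processing Toolbox:

% I ... grayscale frame
mask  = I > 128;                          % keep the bright pixels only
stats = regionprops(mask, 'Centroid', 'Area');
[~, k] = max([stats.Area]);               % pick the largest connected blob
center = stats(k).Centroid;               % estimated ball position [x y]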