Fast connected components labeling

Labeling connected components in an image is a common operation. But the original algorithm proposed is slow. It works fine if the image is small. But as the image becomes larger, the algorithm slows down really fast. Recently, a few researchers find a simple solution to this. Image that previously took hundreds of seconds would not take only a couple of seconds to get labeled.

The original connected components labeling algorithm

A binary image has only ones and zeroes. The goal of labeling connected components is to identify which ones are “connected”.

For example, if the above image, the goal is to identify that the two circles on the left are connected. Sure, they are separate circles, but they are “connected”. They share the same boundary. Similarly for the two circles on the right.

The original algorithm is a two pass algorithm. The first pass marks labels for each pixel and the second pixel marks corrects invalid labels. You might want to read more about the connected components labeling algorithm. You might also want to check an example on how the connected components labeling algorithm works.

The divide and conquer approach

The main bottleneck of the algorithm is the large equivalence array (the union-find structure). Calculating and resolving equivalent labels for one big image takes a lot of time. However, if the image is divided into several regions (say the image is split into 3×3 parts), then the labels can be computed and resolved much faster.

Spliting an image into multiple=

After splitting, you execute the standard algorithm on each part. Now you need to resolve any anomalies that might exist at the borders of each part.

This resolution is done in three parts:

  1. Check the top-left pixel of each part (this pixel connects three other parts: to the left, above and the one diagonally)
  2. Check the leftmost column of pixels (this column connects the current part to the left)
  3. Check the topmost column of pixels (this column connects the current part to the part above it)

For each case, you fix any wrong labeling that might exist.

The next step is to determine the number of pieces the image is split in. Experimentally, it was found that each piece should be 30*30 pixels – 60*60 pixels in size. That way, the equivalence array is small enough to be computed rapidly.

Results

The researchers did experiments with a 1760*1168 image. This image is HUGE. The standard algorithm would take a lot of time. It simply isn’t feasible to calculate the time it takes.

On dividing, the following times were obtained:

  • 4×4 split: 274.57 seconds
  • 5×5 split: 160.22 seconds
  • 10×10 split: 12.47 seconds
  • 15×15 split: 4.01 seconds
  • 20×20 split: 2.75 seconds
  • 25×25 split: 2.47 seconds

They also compared it against other methods of computing labels. This algorithm clearly surpassed the results any other algorithm could produce.

Summary

You learned about a new connected component labeling algorithm that computes labels for large images really fast. Have a look at the actual paper for pseudo code on how to implement it.

Back to top

9 Comments

  1. ashwini
    Posted October 19, 2010 at 12:28 pm | Permalink

    i’m a pg student. my project is based on segmentation. now i have to label the image. can any one help me pls to write the code in matlab for connected component labelling?

    thanks

    • Posted October 19, 2010 at 4:42 pm | Permalink

      I think Matlab already has some functions that label connected components in an image.

  2. ashwini
    Posted October 20, 2010 at 1:05 pm | Permalink

    hi, thanks for the reply. is there any matlab code to convert an image to a graph. i.,e. i should represent an image interms of a graph. can i do it in matlab?

    thank you

  3. Kishor
    Posted June 15, 2011 at 12:17 pm | Permalink

    Hi Utkarsh!! can u suggest how I can use connected component concept to segment words from a scanned document. (using openCV)

    • Posted June 17, 2011 at 4:28 pm | Permalink

      Not sure what you mean, but you could use connected components to find letters. Everything that is connected definitely belongs to a single letter. Then you can work on characters like ‘i’ so that nearby components are a part of a single letter. Then you can get close by letters into a word.

  4. shilan
    Posted September 2, 2011 at 4:46 pm | Permalink

    sorry but i have a problem, I want to lable 3 component at the same time, not a binary image. what can i do then?

  5. shilan
    Posted September 23, 2011 at 5:53 pm | Permalink

    you know my image contains 0s, 1s and .5s. is it still possible with this algorithm? where is the code of this algorithm?

Post a Comment

Your email is never published nor shared. Required fields are marked *

*
*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="" highlight="">