SuDoKu Grabber with OpenCV

In this post, we’ll look at detecting a SuDoKu puzzle. This includes all the preprocessing done on the image: filtering it so that we’re not affected too much by noise, and segmenting it. I’ve used a weird segmentation approach, so you might want to have a look at that. By the end of this post, you’ll have several possible lines that describe the puzzle grid.

Getting started

Start by creating a new project in your IDE. If you’re not sure how that is done, have a look at the Getting started with OpenCV guide.

Also, I’ll use OpenCV’s C++ interface. So make sure you have at least OpenCV 2.0 installed on your computer.

Link your project to the OpenCV library files and include the following in your main file:

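#include <cv.h>
#include <highgui.h>

// The snippets in this post use Mat, imread, vector, etc. unqualified
using namespace cv;
using namespace std;

int main()
{
    return 0;
}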

For now, we’ll use a static image for detecting a puzzle. So we load an image:

int main()
{
    Mat sudoku = imread("sudoku.jpg", 0);

An image with a sudoku puzzle

Note that we load the image in grayscale mode (the 0 passed to imread). We don’t want to bother with the colour information, so we just skip it. Next, we create a blank image of the same size. This image will hold the actual outer box of the puzzle:

    Mat outerBox = Mat(sudoku.size(), CV_8UC1);

Preprocessing the image

Blur the image a little. This smooths out the noise a bit and makes extracting the grid lines easier.

    GaussianBlur(sudoku, sudoku, Size(11,11), 0);

With the noise smoothed out, we can now threshold the image. The image can have varying illumination levels, so a good choice for a thresholding algorithm would be an adaptive threshold. It calculates a separate threshold level for each pixel, using the mean gray level in a small window around that pixel. So it keeps things illumination independent.

    adaptiveThreshold(sudoku, outerBox, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, 5, 2);

After blurring and thresholding the puzzle

It calculates a mean over a 5×5 window and subtracts 2 from the mean. This is the threshold level for every pixel.
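To make the "mean minus a constant" rule concrete, here is a minimal sketch in plain C++ (no OpenCV). The names `Grid` and `meanAdaptiveThreshold` are made up for this illustration, and clamping at the image border is an assumption of the sketch. It is not OpenCV's actual implementation, just the same idea on a small grid of gray levels.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef std::vector<std::vector<int> > Grid;

// Hypothetical helper (not an OpenCV function): mean adaptive threshold.
// img holds gray levels 0-255, blockSize is the odd window size and c is
// the constant subtracted from the window mean.
Grid meanAdaptiveThreshold(const Grid& img, int blockSize, double c)
{
    assert(blockSize % 2 == 1 && blockSize >= 3);
    const int h = static_cast<int>(img.size());
    const int w = h > 0 ? static_cast<int>(img[0].size()) : 0;
    const int r = blockSize / 2;
    Grid out(h, std::vector<int>(w, 0));

    for (int y = 0; y < h; y++)
    {
        for (int x = 0; x < w; x++)
        {
            double sum = 0;
            for (int dy = -r; dy <= r; dy++)
            {
                for (int dx = -r; dx <= r; dx++)
                {
                    // Assumption: clamp to the nearest edge pixel at the border
                    int yy = std::min(std::max(y + dy, 0), h - 1);
                    int xx = std::min(std::max(x + dx, 0), w - 1);
                    sum += img[yy][xx];
                }
            }
            // Per-pixel threshold: window mean minus the constant
            double threshold = sum / (blockSize * blockSize) - c;
            out[y][x] = (img[y][x] > threshold) ? 255 : 0;
        }
    }
    return out;
}
```

Calling `meanAdaptiveThreshold(img, 5, 2)` mirrors the 5×5 window and the constant 2 used above: a pixel only turns black if it is noticeably darker than its own neighbourhood, whatever the overall brightness is.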

Since we’re interested in the borders, and they are black, we invert the image outerBox. Then, the borders of the puzzles are white (along with other noise).

    bitwise_not(outerBox, outerBox);

This thresholding operation can disconnect certain connected parts (like lines). So dilating the image once will fill up any small “cracks” that might have crept in.

    Mat kernel = (Mat_<uchar>(3,3) << 0,1,0,1,1,1,0,1,0);
    dilate(outerBox, outerBox, kernel);

Note that I’ve used a plus shaped structuring element here (the kernel matrix).
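If you’re wondering what the plus shape actually does, here is a small sketch of dilation written in plain C++. `Grid` and `dilatePlus` are hypothetical names for this illustration only (OpenCV's `dilate` is what the real code uses). Each output pixel becomes the maximum of itself and its four direct neighbours, so diagonal neighbours are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef std::vector<std::vector<int> > Grid;

// Hypothetical helper: one pass of dilation with a plus-shaped 3x3
// structuring element (centre, up, down, left, right).
Grid dilatePlus(const Grid& img)
{
    const int h = static_cast<int>(img.size());
    assert(h > 0);
    const int w = static_cast<int>(img[0].size());
    const int dx[5] = {0, 0, 0, -1, 1};
    const int dy[5] = {0, -1, 1, 0, 0};
    Grid out(h, std::vector<int>(w, 0));

    for (int y = 0; y < h; y++)
    {
        for (int x = 0; x < w; x++)
        {
            int best = 0;
            for (int k = 0; k < 5; k++)
            {
                int nx = x + dx[k];
                int ny = y + dy[k];
                // Neighbours outside the image are simply skipped
                if (nx >= 0 && nx < w && ny >= 0 && ny < h)
                    best = std::max(best, img[ny][nx]);
            }
            out[y][x] = best;
        }
    }
    return out;
}
```

A one pixel gap in a white line gets filled because the gap pixel has white neighbours to its left and right. A full 3×3 square element would also grow the blobs diagonally, which thickens the lines a little more.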

After inverting and dilating the puzzle

Finding the biggest blob

For this project, I didn’t want to use any library for blobs. So I made a little hack for detecting blobs. If you want, you can use cvBlobsLib.

Here’s the technique I use. First, I use the floodFill function. In the C++ interface, it returns the number of pixels it filled, that is, the area of the blob. We’ve assumed the biggest thing in the picture to be the puzzle. So the biggest blob should be the puzzle, and it will have the biggest area. So we find the blob with the biggest area, and save the location where we did the flood fill.

    int count=0;
    int max=-1;
    Point maxPt;
 
    for(int y=0;y<outerBox.size().height;y++)
    {
        uchar *row = outerBox.ptr(y);
        for(int x=0;x<outerBox.size().width;x++)
        {
            if(row[x]>=128)
            {
                 int area = floodFill(outerBox, Point(x,y), CV_RGB(0,0,64));
 
                 if(area>max)
                 {
                     maxPt = Point(x,y);
                     max = area;
                 }
            }
        }
    }

Flood filling each blob (in progress)

We iterate through the image. The >=128 condition is to ensure that only the white parts are flooded. Whenever we encounter such a part, we flood it with a dark gray colour (gray level 64). So in the future, we won’t be reflooding these blobs. And whenever we encounter a big blob, we note the current point and the area it has.
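The whole "flood everything gray, remember the biggest, then clean up" idea can be sketched without OpenCV. In the sketch below, `floodFill4` is a hypothetical stand-in for `cv::floodFill` (4-connected, returns the number of repainted pixels) and `keepLargestBlob` is a made-up name for the three passes described in this section. Treat it as an illustration under those assumptions, not as the code used in the post.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

typedef std::vector<std::vector<int> > Grid;

// Hypothetical stand-in for cv::floodFill: repaints the 4-connected region
// around (sx, sy) with newVal and returns how many pixels were repainted.
int floodFill4(Grid& img, int sx, int sy, int newVal)
{
    const int h = static_cast<int>(img.size());
    const int w = h > 0 ? static_cast<int>(img[0].size()) : 0;
    assert(sx >= 0 && sx < w && sy >= 0 && sy < h);
    const int oldVal = img[sy][sx];
    if (oldVal == newVal) return 0;

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    std::queue<std::pair<int, int> > q;
    img[sy][sx] = newVal;
    q.push(std::make_pair(sx, sy));
    int area = 0;
    while (!q.empty())
    {
        std::pair<int, int> p = q.front();
        q.pop();
        area++;
        for (int k = 0; k < 4; k++)
        {
            int nx = p.first + dx[k];
            int ny = p.second + dy[k];
            if (nx >= 0 && nx < w && ny >= 0 && ny < h && img[ny][nx] == oldVal)
            {
                img[ny][nx] = newVal;   // mark before pushing, so no duplicates
                q.push(std::make_pair(nx, ny));
            }
        }
    }
    return area;
}

// Keeps only the largest white blob; returns its seed point as (x, y),
// or (-1, -1) if the image has no white pixels.
std::pair<int, int> keepLargestBlob(Grid& img)
{
    const int h = static_cast<int>(img.size());
    const int w = h > 0 ? static_cast<int>(img[0].size()) : 0;
    int maxArea = -1;
    std::pair<int, int> maxPt(-1, -1);

    // Pass 1: flood every white blob with gray (64), remember the biggest
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            if (img[y][x] >= 128)
            {
                int area = floodFill4(img, x, y, 64);
                if (area > maxArea) { maxArea = area; maxPt = std::make_pair(x, y); }
            }

    // Pass 2: turn the biggest blob white again
    if (maxArea > 0) floodFill4(img, maxPt.first, maxPt.second, 255);

    // Pass 3: whatever is still gray belongs to a smaller blob, so hide it
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            if (img[y][x] == 64) floodFill4(img, x, y, 0);

    return maxPt;
}
```

Since every pixel gets repainted only a handful of times, the cost stays roughly proportional to the image size, even though a flood fill is started inside a double loop.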

Now, we have several blobs filled with a dark gray colour (level 64). And we also know the point that produces the blob with maximum area. So we floodfill that point with white:

    floodFill(outerBox, maxPt, CV_RGB(255,255,255));

Now, the biggest blob is white. We need to turn the other blobs black. We do that here:

    for(int y=0;y<outerBox.size().height;y++)
    {
        uchar *row = outerBox.ptr(y);
        for(int x=0;x<outerBox.size().width;x++)
        {
            // The biggest blob is already white (255), so any pixel
            // still at gray level 64 belongs to some other blob
            if(row[x]==64)
            {
                floodFill(outerBox, Point(x,y), CV_RGB(0,0,0));
            }
        }
    }

Wherever a dark gray point is encountered, it is flooded with black, effectively “hiding” it.

Because we had dilated the image earlier, we’ll “restore” it a bit by eroding it:

    erode(outerBox, outerBox, kernel);
    imshow("thresholded", outerBox);

The biggest blob, after morphological erosion

Detecting lines

At this point, we have a single blob. Now it’s time to find lines. This is done with the Hough transform. OpenCV comes with it. So a line of code is all that’s needed:

    vector<Vec2f> lines;
    HoughLines(outerBox, lines, 1, CV_PI/180, 200);
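In the call above, 1 is the distance resolution in pixels, CV_PI/180 is the angle resolution (one degree) and 200 is the minimum number of votes a line needs. To show what "votes" means, here is a rough sketch of the voting idea in plain C++. `houghLinesSketch` is a hypothetical name, and this is far simpler than what OpenCV actually does; it only assumes a rho resolution of 1 pixel and `thetaSteps` evenly spaced angles in [0, π).

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

typedef std::vector<std::vector<int> > Grid;

// Hypothetical sketch of Hough voting. Every white pixel votes for each
// (rho, theta) pair that passes through it; pairs with at least
// `threshold` votes are returned as (rho, theta).
std::vector<std::pair<double, double> > houghLinesSketch(
    const Grid& img, int thetaSteps, int threshold)
{
    assert(thetaSteps > 0);
    const int h = static_cast<int>(img.size());
    const int w = h > 0 ? static_cast<int>(img[0].size()) : 0;
    const double pi = std::acos(-1.0);
    // rho can lie anywhere in [-maxRho, maxRho]; one bin per pixel
    const int maxRho = static_cast<int>(
        std::ceil(std::sqrt(static_cast<double>(w * w + h * h))));
    std::vector<std::vector<int> > acc(
        thetaSteps, std::vector<int>(2 * maxRho + 1, 0));

    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
        {
            if (img[y][x] < 128) continue;   // only white pixels vote
            for (int t = 0; t < thetaSteps; t++)
            {
                double theta = t * pi / thetaSteps;
                double rho = x * std::cos(theta) + y * std::sin(theta);
                int bin = static_cast<int>(std::floor(rho + 0.5)) + maxRho;
                acc[t][bin]++;
            }
        }

    std::vector<std::pair<double, double> > lines;
    for (int t = 0; t < thetaSteps; t++)
        for (int bin = 0; bin <= 2 * maxRho; bin++)
            if (acc[t][bin] >= threshold)
                lines.push_back(std::make_pair(
                    static_cast<double>(bin - maxRho), t * pi / thetaSteps));
    return lines;
}
```

Even a perfectly thin line usually clears the threshold at a few neighbouring angles, so a sketch like this tends to return more than one (rho, theta) pair for a single physical line.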

For now, we’ll draw each line, just to see if the results up to now are good enough or not:

    for(int i=0;i<lines.size();i++)
    {
        drawLine(lines[i], outerBox, CV_RGB(0,0,128));
    }

Where, the drawLine function is:

void drawLine(Vec2f line, Mat &img, Scalar rgb = CV_RGB(0,0,255))
{
    if(line[1]!=0)
    {
        float m = -1/tan(line[1]);
        float c = line[0]/sin(line[1]);
 
        cv::line(img, Point(0, c), Point(img.size().width, m*img.size().width+c), rgb);
    }
    else
    {
        cv::line(img, Point(line[0], 0), Point(line[0], img.size().height), rgb);
    }
}

This function takes a line in the normal form (a distance from the origin and the angle its normal makes with the x-axis). If the line isn’t vertical, it finds two points on the line (at the left and right edges of the image) and draws a line between them. If the line is vertical (infinite slope), it draws a vertical line at the appropriate x position.
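The slope and intercept come straight from the normal form: a point (x, y) is on the line when x·cos θ + y·sin θ = ρ, so y = (−1/tan θ)·x + ρ/sin θ. Here is the same conversion as a small standalone sketch. `Segment` and `lineEndpoints` are hypothetical names, and testing sin θ against a small epsilon (instead of checking θ != 0 exactly) is my own assumption for the sketch.

```cpp
#include <cassert>
#include <cmath>

// Two endpoints of a drawable segment
struct Segment { double x1, y1, x2, y2; };

// Hypothetical helper: converts a line in normal form (rho, theta) into
// two endpoints that span an image of the given width and height.
Segment lineEndpoints(double rho, double theta, int width, int height)
{
    assert(width > 0 && height > 0);
    Segment s;
    if (std::fabs(std::sin(theta)) > 1e-6)
    {
        // Not vertical: y = m*x + c, evaluated at x = 0 and x = width
        double m = -std::cos(theta) / std::sin(theta);
        double c = rho / std::sin(theta);
        s.x1 = 0;     s.y1 = c;
        s.x2 = width; s.y2 = m * width + c;
    }
    else
    {
        // Vertical: x stays constant from the top to the bottom of the image
        double x = rho / std::cos(theta);
        s.x1 = x; s.y1 = 0;
        s.x2 = x; s.y2 = height;
    }
    return s;
}
```

For example, θ = π/2 gives a horizontal line at y = ρ, and θ = 0 gives a vertical line at x = ρ.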

Lines detected by the Hough transform

As you can see, each physical line has several possible approximations. This is usually because the physical line is thick. So, just these lines aren’t enough for figuring out where the puzzle is located. We’ll have to do some math with these lines. But we’ll do that in the next post. I think this one has been long enough.

Summary

Today, we implemented the first half of the SuDoKu grabber. We’ve been able to detect the physical borders of the puzzle till now, but the results aren’t usable directly. We’ll do a little math with them and fix that next.



Summary

  • We're working on something cool!
  • To make this work, we're making a few (reasonable) assumptions
  • There are two key tasks - Extracting the grid and recognizing the digits

59 Comments

  1. Posted August 25, 2010 at 6:17 pm | Permalink

    What’s the motive behind it ? Just learning to perform basic tasks ?
    What will be direct application ?

    I mean to initiate discussion here basically… !!!

    • Posted August 25, 2010 at 6:34 pm | Permalink

      Hi! Motive? Well.. I’ve seen a few people ask for it on forums. So thought I’d put up my take on the problem.

      Also, doing things that one can relate to are often much better for learning. Take line detection for example. Tell a person about line detection and he won’t be impressed. Relate it to a sudoku puzzle and they’ll remember it forever!

      I think I’ll end this series with a SuDoKu solver. Take a snap of a puzzle in your newspaper and it shows your the solution :)

  2. kay_hao
    Posted August 25, 2010 at 7:16 pm | Permalink

    Wonderful idea!
    I will follow u!

  3. Ankit Malpani
    Posted August 26, 2010 at 1:54 pm | Permalink

    innovative idea man !

    problems : what if there are 2 squares surrounding the actual puzzle, i think in sudoku’s which come in hindu, there are 2 enclosing squares. So, in step 2- it may detect the outer square and then dividing into 9 lines horizontally and vertically may not give you the actual grid right?

    waise, once this is done , how bout adding some more code to solve the puzzle ? [atleast the easy ones for a start]

    • Posted August 28, 2010 at 8:30 am | Permalink

      Hmm.. Those might be a problem.. Will have to think of how to work with such pictures. Got any ideas? Yes, the app will solve a puzzle once it has successfully recognized it!

  4. champie
    Posted September 4, 2010 at 6:12 pm | Permalink

    Hey you have a nice blog!
    Can you give me an Ideas and what do i need in detecting available parking slots??? and object differencing( car, people and other objects) some functions/methods??.

    Thank you so much , you’re so genius!!

  5. Ymehdi
    Posted September 15, 2010 at 5:41 pm | Permalink

    Hello Boss,
    It’s so useful blog, I thank you for that,
    the problem of merging line poste above is so interesing. may I have the complete source code for meging lines closed by ?
    your help is greatly appreciated
    thanks alot
    Ymehdi

    • Posted September 20, 2010 at 12:24 am | Permalink

      Hi! The code for merging close by lines is right here :P

  6. Y.M
    Posted October 12, 2010 at 3:46 am | Permalink

    Great work boss

    I’ll steal time to look at this valuable blog

    bravo

  7. Abhi
    Posted October 12, 2010 at 8:28 am | Permalink

    Hi Utkarsh,
    I find your posts really innovative and informative.
    Please keep up the good work. Your thought process and the clarity of your explanation is worth admiring!
    Cheers!!

  8. arash
    Posted December 26, 2010 at 8:23 pm | Permalink

    thank you so much for your very useful blog and tutorial :)
    could you say more about training procedure?
    in this line of your code i got error.
    i should change the path “D:/Test/Character Recognition/train-images.idx3-ubyte”, “D:/Test/Character Recognition/train-labels.idx1-ubyte” to what?

  9. Posted December 27, 2010 at 1:27 am | Permalink

    Hi, Your tutorials are great. Thanks a lot. I am developing an ancient coins recognition system using opencv for my undergraduate final year project these days. I have a small question on this tutorial.
    Under “Finding the biggest blob” sub title

    uchar *row = outerBox.ptr(y);

    in this code, how did you get the ptr?

    Thanks

    • Posted January 11, 2011 at 7:38 pm | Permalink

      Hi! I’m not sure if I understood your question. What do you mean by get the ptr? Its a function in the new C++ interface.

      • Nadeeshani
        Posted January 12, 2011 at 10:11 pm | Permalink

        Do I have to use any include directive to use ptr? for the variable outerBox, ptr is not listed. When building the project… I get this error “struct_iplimage has no member named ‘ptr’ ”
        for

        uchar *row = dilImg->ptr(y);
        • Nadeeshani
          Posted January 12, 2011 at 10:43 pm | Permalink

          Still I couldnt find how to use new c++ interface.

          That line

          uchar *row = dilImg->ptr(y);

          is equal to this right?

          uchar *row = (uchar*)(dilImg->imageData + y*dilImg->widthStep);

          One more problem is cvFloodFill doesnt return a int value. It’s a void function. But this is something you have written

          int area = floodFill(outerBox, Point(x,y), CV_RGB(0,0,0));

          Can we return an int if we use new c++ interface?

  10. arash
    Posted December 27, 2010 at 6:42 pm | Permalink

    thank you again:)the result of the code you have written is something like this:
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0
    could you please explain?

    • Posted January 11, 2011 at 8:19 pm | Permalink

      Hmm… It seems like its not recognizing the digits properly.

  11. Ashwin
    Posted January 6, 2011 at 4:48 am | Permalink

    Hi,

    Good stuff, I was planning to do something similar for Android phone i.e. take a snapshot of puzzle and we would provide the solution for it.
    Would definitely use your idea for my application.

    cheers!!

  12. Posted January 11, 2011 at 6:20 am | Permalink
    • Posted January 11, 2011 at 9:37 am | Permalink

      Yup! In fact, there are dozens of similar apps for android and iPhone out there.

  13. Raingo
    Posted January 15, 2011 at 1:43 pm | Permalink

    Thank you very much for so many tutorials in this website, and it’s really useful to read through every tiny project.

    Just one advice for the “Finding the approximate bounding box” stage, I find one bug in the algorithms.

    when traverse through the image from the center in four directions, I think it is not an appropriate way to sum whole row or whole col, or this process will have no effect on the cell whose edges is full of original edge lines.

    Maybe the differences between your algorithm and my propose will illustrate this problem:

            int sumtemp=0;	
    	if(rowBottom==-1)
    	{
    	    sumtemp=0;
    	    for(int j(img.cols-i);j<i;++j)
    		sumtemp+=img.at(i,j);
    	    if(sumtemp<thresholdBottom || i==img.rows-1)
    		rowBottom=i;
    	}
     
    	if(rowTop==-1)
    	{
    	    sumtemp=0;
    	    for(int j(img.cols-i);j<i;++j)
    		sumtemp+=img.at(img.rows-i,j);
    	    if(sumtemp<thresholdTop || i==img.rows-1)
    		rowTop=img.rows-i;
    	} 
     
            if(colRight==-1)
            {
    	    sumtemp=0;
    	    for(int j(img.rows-i);j<i;++j)
    		sumtemp+=img.at(j,i);
                if(sumtemp < thresholdRight|| i==img.cols-1)
                    colRight = i;
            }
     
            if(colLeft==-1)
            {
    	    sumtemp=0;
    	    for(int j(img.rows-i);j<i;++j)
    		sumtemp+=img.at(j,img.cols-i);
                if(sumtemp < thresholdLeft|| i==img.cols-1)
                    colLeft = img.cols-i;
            }

    however, my propose leads to a more difficult problem, since it does not works for some certain digits like 7 or 5 for their specific shape!

    Maybe the search algorithm is not proper in this case!

    Thank you very much again!

  14. cloutak
    Posted January 17, 2011 at 4:27 pm | Permalink

    Sir can you please help me,im gettiing a lot of errors when compiling the different parts from the sudoku grabber explanation of program.
    can you please upload the complete version in a zip or a rar file and post it here ?

  15. Posted February 15, 2011 at 11:59 pm | Permalink

    I guess the new version of Google Goggles on Android detects and solves the Sudoku. I guess you can as well add that feature :p. I love the work that you are doing by sharing loads of stuff. I am working on computer vision problems, so I find your stuff very relevant to my area of research. Good job!

    • Posted February 16, 2011 at 7:47 am | Permalink

      Glad you liked the site! Though I think the Google people copied the sudoku thing from here :P

  16. René
    Posted March 4, 2011 at 9:20 am | Permalink

    Thanks a lot, I’ve learned so much from your articles, I followed carefully every explanation, and thinks worked (was much better than I was expecting for).
    http://www.cec.uchile.cl/~rene.tapia/images/runningProgram.png
    Again, you did a very good work, congratulations.

  17. Posted March 4, 2011 at 7:27 pm | Permalink

    Excellent tutorial, thanks. Many of these techniques would also to apply to my problem domain, which is scanning barcode images. Although I can think of many more interesting problems to solve this way!

    Thanks for the great tutorial.

  18. arkiazm
    Posted March 7, 2011 at 10:09 pm | Permalink

    hi Utkarsh
    urs is really a nice blog… with excellent tutorials… thanks a lot…
    i got one doubt regarding finding biggest blob and floodfill…
    i am using python and opencv… not c++..
    it takes a lot of time for finding biggest blog…
    any way to reduce it?
    thanks in advance

    • Posted March 11, 2011 at 4:10 pm | Permalink

      Try this – find contours in the thresholded image. The contour with the largest bounding box is the biggest box. I’m guessing this will be a lot faster. Let me know if this works!

      • Linus
        Posted April 23, 2011 at 3:42 am | Permalink

        Would you mind writing a short example using cvFindContours?

        • Posted April 29, 2011 at 9:10 pm | Permalink

          Already did. Search for “introduction to contours” on AI Shack

  19. Richárd Szabó
    Posted April 13, 2011 at 7:52 pm | Permalink

    It would be a good idea to make your final source code available.

    Anyway it is an excellent tutorial.

    • Posted April 29, 2011 at 9:23 pm | Permalink

      I have that on my todo list. I’ve just been procrastinating too much lately. Should be up in sometime though :P

  20. Richárd Szabó
    Posted April 14, 2011 at 5:21 pm | Permalink

    “Then we convert it into IplImage to use cvSum. (For some reason, there is no C++ version for this function on my system)”

    There is a sum function in C++ version: http://opencv.willowgarage.com/documentation/cpp/core_operations_on_arrays.html?highlight=sum#sum

    • Posted April 29, 2011 at 9:27 pm | Permalink

      Perfect – you can use this. While writing this tutorial, I couldn’t find this. Thanks for pointing this out!

  21. Palmendieb
    Posted April 18, 2011 at 1:49 am | Permalink

    Hi Utkarsh

    First off all, I just need to gratulate to this grade Blogs. I just tried yout tutorial for sudoku recognisation and solving. I have some trouble withe that. I use OpenCV 2.1.0 and it seems there are some problems with the header inclusions and the cvcore source? So could you please have a log for that?

    Thanks for that…

    regards

    Palmendieb

    • Posted April 29, 2011 at 9:30 pm | Permalink

      Well, look at the OpenCV site. I think they have a setup procedure. Use that and the installation procedure on AI Shack. You should be able to get it to work.

  22. Donny
    Posted April 26, 2011 at 7:59 pm | Permalink

    hi.thank you for your article. currently i’m working on pattern recognition as a newbie.this tutorial does help me in understand it.thanks.
    great work you have there.

  23. mala
    Posted May 30, 2011 at 3:53 pm | Permalink

    really great tutorial!
    ..but would you please make the source code available here??

  24. Hadi
    Posted June 1, 2011 at 11:35 pm | Permalink

    Hi
    Nice tutorial !! i really enjoyed

  25. Nguyen Xuan Hoc
    Posted June 25, 2011 at 6:47 am | Permalink

    Hi, the first, thanks a lot. I find very helpful from your website.
    I’m a new meb in openCV . I have some question about this article.

    1,What is the line contain?

    2,how you can calculate m and c?
    float m = -1/tan(line[1]);
    float c = line[0]/sin(line[1]);

    3, what is the relation of line[0] and line[1]

    Thanks in advance!

    • Posted June 25, 2011 at 9:46 pm | Permalink

      line[0] is \rho and line[1] is \theta. \rho and \theta describe a line (that was detected). Does it make sense now?

      • Nguyen Xuan Hoc
        Posted June 27, 2011 at 10:48 am | Permalink

        Thanks you for your support!

  26. Sai
    Posted July 3, 2011 at 12:25 pm | Permalink

    How about setting the border of the Sudoku puzzle via ROI technique? What is the disadvantage compared to this method?

    • Posted July 4, 2011 at 12:39 am | Permalink

      What do you mean the ROI technique? From what I guess, you’re making a “mask” and using it again and again to “select” cells from the grid. That way, you’d have to do math because the cells aren’t in perfect shape.

  27. Sai
    Posted July 5, 2011 at 1:08 am | Permalink

    Instead of finding a largest blob, and then finding the individual grid lines, why not set a region of interest and then find the individual grid lines?

    • Posted July 13, 2011 at 9:09 am | Permalink

      How would you set the ROI? How do you figure out the coordinates of the grid?

  28. kasun
    Posted July 5, 2011 at 8:29 am | Permalink

    Hi,
    I red many of u’r tutorials and you are doing a grate job. Thankz for helping us through this blog. Do u know how to measure the orientation of an object from opencv. As an example, assume that there is a black colour mark is on the robot top surface. so i need to measure the angle of the robot. Do u have a method or a sample code for this??
    thank you

  29. Franz Wong
    Posted August 8, 2011 at 2:08 pm | Permalink

    Nice article. As a beginner, Sudoku is a good example for me to learn computer vision.

    I changed the logic a little bit. I used cvFindContour to get the points of largest blob. And then found out the corner points which were nearest the max/min of x and y of the points. :)

  30. Gabor
    Posted July 29, 2011 at 4:56 pm | Permalink

    Hi,

    I fixed it, just switched back from openCV 2.2 to openCV 2.0, and that was it.
    Matrix declaration itself did not want to work, by “Mat example;”, i got an error in mat.cpp, which i couldnt debug.
    But now it works! :-)

One Trackback

  1. By How I’ll be wasting my time… on March 14, 2012 at 4:42 pm

    [...] quick search reveals all manner of interesting projects and sample code hosted online ranging from visual Sudoku grid ‘grabbing’ to creative uses of augmented reality that make use of the device’s other forms of input such as [...]
