The University of California, San Diego Data Mining Competition lets you test your data mining skills on a real-world dataset.
The goal is to predict whether a person is a potential customer. There are two datasets: a raw dataset with some 300 variables per person and a transformed dataset with some 2400 variables per person. Using all this data about a person, you must predict whether that person is interested in buying things from you.
The last date for submission of your results is 30th September.
The competition home page: http://mill.ucsd.edu/
In the last post, we found some lines. But those numerous lines were not good enough for detecting the location of the puzzle. So we’ll do some math today and find out exactly where the puzzle is. We’ll also un-distort the puzzle so we have a perfect top-down view of the sudoku puzzle.
Merging lines
Each physical line on the image has several “mathematical” lines associated with it. This is because of its thickness. One way to fix this is to “merge” lines that are close by.

Lines detected by the Hough transform
By merging lines I mean averaging nearby lines. So lines that are within a certain distance will “fuse” together.
We’ll write another function to fuse lines.
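To make the idea of fusing concrete, here is a minimal sketch in Python. It is not the function from this series. It assumes each line is a (rho, theta) pair as a Hough transform typically reports it, and the name `merge_lines` and the two tolerances `rho_tol` and `theta_tol` are made up for illustration. Lines that fall within both tolerances of an existing group are averaged into it.

```python
import math


def merge_lines(lines, rho_tol=20.0, theta_tol=math.radians(10)):
    """Fuse nearby (rho, theta) lines by averaging them.

    rho_tol and theta_tol are illustrative thresholds, not values
    taken from the original post.
    """
    # Each group stores running sums: [sum_rho, sum_theta, count].
    groups = []
    for rho, theta in lines:
        for group in groups:
            mean_rho = group[0] / group[2]
            mean_theta = group[1] / group[2]
            # A line joins a group only if it is close in both rho and theta.
            if abs(rho - mean_rho) < rho_tol and abs(theta - mean_theta) < theta_tol:
                group[0] += rho
                group[1] += theta
                group[2] += 1
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append([rho, theta, 1])
    # One averaged line per group.
    # Limitation: angles near 0 and near pi describe almost the same
    # direction, and this sketch does not handle that wrap-around.
    return [(s_rho / n, s_theta / n) for s_rho, s_theta, n in groups]


if __name__ == "__main__":
    detected = [(100.0, 1.57), (104.0, 1.58), (300.0, 0.0), (302.0, 0.02)]
    for rho, theta in merge_lines(detected):
        print(f"rho={rho:.1f}, theta={theta:.3f}")
```

The thresholds decide how aggressive the fusing is: too small and duplicates survive, too large and neighbouring grid lines of the puzzle collapse into one.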