Implementing SIFT in OpenCV

Okay, now for the coding. I’m assuming you know how SIFT works (if not, check SIFT: Scale Invariant Feature Transform. It’s a series of posts on the SIFT algorithm). I’ll be using C++ and classes to keep things neat and object oriented.

OpenCV doesn’t come with built-in functions for SIFT, so we’ll be writing our own. My code here is based on code by Jun Liu, who implemented SIFT with VXL (another vision library, like OpenCV).

The Code

You can download the code at the bottom. Download it and have a look before going through this post.

In the code, I’ve added a lot of comments throughout. With those comments and knowledge of the algorithm, you should be able to understand what the code does pretty easily.

SIFT.h and SIFT.cpp

These two files implement a class: SIFT. Through this class you do everything: load the image to work on, get descriptors, etc.

The class has a simple interface. You create a new object and call DoSift(). Every step will be done for you! Everything is well explained in the code with comments, so have a look there. It would take another week of posts to explain each line of code, so I’m not doing that. The comments should be enough.
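To give a feel for that interface, here is a skeleton of how such a class might be laid out. This is a sketch based on the description above, not the actual file: the constructor signature and the private stage names are my assumptions.

```cpp
#include <string>

// Hypothetical skeleton of the SIFT class described above. The real
// class works on OpenCV images; plain types stand in here so the
// sketch stays self-contained.
class SIFT {
public:
    // Assumed constructor: the image to process, plus the pyramid shape.
    SIFT(const std::string& imagePath, int octaves, int intervals)
        : m_path(imagePath), m_octaves(octaves),
          m_intervals(intervals), m_done(false) {}

    // One call runs every stage of the algorithm in order.
    void DoSift() {
        BuildScaleSpace();            // Gaussian pyramid
        DetectExtrema();              // DoG maxima/minima
        AssignOrientations();         // orientation histograms
        ExtractKeypointDescriptors(); // 128-element feature vectors
        m_done = true;
    }

    bool Done() const { return m_done; }

private:
    void BuildScaleSpace() {}
    void DetectExtrema() {}
    void AssignOrientations() {}
    void ExtractKeypointDescriptors() {}

    std::string m_path;
    int m_octaves, m_intervals;
    bool m_done;
};
```

Usage is then just a constructor call and `DoSift()`, e.g. `SIFT sift("img.jpg", 4, 5); sift.DoSift();`.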

The descriptor class

This class just holds a keypoint’s fingerprint: it can store the location of the keypoint and the feature vector.
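A minimal sketch of such a class (the member names here are my guesses, not necessarily the ones used in the download):

```cpp
#include <vector>

// Holds a keypoint's "fingerprint": its position in the image plus
// the feature vector (128 elements for standard SIFT) computed there.
class Descriptor {
public:
    float xi, yi;            // location of the keypoint
    std::vector<double> fv;  // the feature vector

    Descriptor() : xi(0.0f), yi(0.0f) {}
    Descriptor(float x, float y, const std::vector<double>& v)
        : xi(x), yi(y), fv(v) {}
};
```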

The keypoint class

This class holds the keypoint information: location (x, y), scale, magnitude and orientation.
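Again as a sketch (the member names are assumptions): a keypoint can be assigned more than one magnitude/orientation pair when its orientation histogram has several strong peaks, so vectors are a natural fit.

```cpp
#include <vector>

// Keypoint information: location, the scale it was detected at, and
// the magnitude/orientation pairs assigned to it. A keypoint with two
// strong histogram peaks simply carries two entries in each vector.
class Keypoint {
public:
    float xi, yi;               // location
    std::vector<double> mag;    // magnitudes of the orientation peaks
    std::vector<double> orien;  // the corresponding orientations
    unsigned int scale;         // scale at which the keypoint was found

    Keypoint() : xi(0.0f), yi(0.0f), scale(0) {}
};
```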

The main function

The small main function in the file uses the SIFT class. It’s just a little demo of how things work.

A learning aid

At several key locations in the code, you’ll notice commented-out code.

Uncomment these lines and you’ll get physical files that you can use to examine the scale space, difference of Gaussians, extrema, etc. That way you’ll actually get a sense of what’s going on. (Again, I’m not going into the details of what these are; I’ve already covered them in the series SIFT: Scale Invariant Feature Transform.) I’ll leave it up to you to find these commented blocks (assignment worth 10%?).
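If you would rather dump intermediates without relying on cvSaveImage, a tiny PGM writer is enough to inspect any single-channel image. This is a sketch of my own, not code from the download:

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Write a single-channel float image (values in [0, 1]) as a binary
// PGM file, so scale space / DoG images open in any image viewer.
bool SavePgm(const char* path, const std::vector<float>& img,
             int width, int height) {
    if ((int)img.size() != width * height) return false;
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    std::fprintf(f, "P5\n%d %d\n255\n", width, height);
    for (size_t i = 0; i < img.size(); i++) {
        float c = std::min(1.0f, std::max(0.0f, img[i]));
        unsigned char b = (unsigned char)(c * 255.0f + 0.5f);
        std::fwrite(&b, 1, 1, f);
    }
    std::fclose(f);
    return true;
}
```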


Over the last week I’ve talked about how the algorithm works, and now we’ve finally implemented the code. So I guess this marks an end to the series! Sometime in the future, we’ll pick up the topic of matching SIFT features across different images.

Anyway, if you’ve got any questions or suggestions about the code, let me know – leave a comment below!

Other articles in the series

  1. SIFT: Scale Invariant Feature Transform
  2. Step 1: Constructing a scale space
  3. Step 2: Laplacian of Gaussian approximation
  4. Step 3: Finding Keypoints
  5. Step 4: Eliminate edges and low contrast regions
  6. Step 5: Assign an orientation to the keypoints
  7. Step 6: Generate SIFT features
  8. Implementing SIFT in OpenCV




  1. Dragos Ciofu
    Posted July 7, 2010 at 2:45 pm | Permalink

    Hi Utkarsh! I’m currently working on a couple of Pioneer robots that should follow one another on the floor. I’ve decided to go along with SIFT for that matter. You were saying earlier in the post that ‘somewhere in the future’ you’re going to continue the SIFT series. Are there any chances you’ve done the mathing of the images too ? Looking forward to reading more. Thanks

    • Posted July 7, 2010 at 3:01 pm | Permalink

      Yes, I’ll definitely be writing about object detection/image matching with SIFT in a week or two. These days I’m busy with some technical aspects of the website (it’s pretty new).

      Why don’t you subscribe via RSS/twitter/email? You’ll get to know the moment it happens :P And can you tell me about your robots in detail? Sounds interesting :)

      Btw, by mathing did you mean matching?

      • Dragos Ciofu
        Posted July 7, 2010 at 4:47 pm | Permalink

        Pardon me, it was a typo. I meant ‘matching’, of course. :)

        About my project: I have an assignment to make two of these follow one another around the room.

        I already have control over the robots via an interface. They are programmed in C++ under Visual Studio 2008, which I access over a remote desktop connection. So the robots are actually running Windows XP. There is one aspect I’m concerned about: the performance of the Intel M processor on the robots; because on my Core2Duo @2.2GHz, determining the keypoints of a 640px image takes 3-4 seconds. If you take the matching part into consideration, a viable result will be obtained in 15 secs (my approximation). And that would be way too much for a ‘real time’ tracking robot.

        I saw some videos on youtube in which the posters have managed to do real time tracking of images acquired via webcam. I’ve also managed that, but only using OpenCV’s template matching, which is way faster than SIFT.

        I’ll subscribe to your feed, and I’m looking forward for your posts and/or suggestions.


        • Posted July 7, 2010 at 5:14 pm | Permalink

          Sounds great! Don’t you have any other way of recognizing the other bot? Both template matching and SIFT are slow. Can’t you use the bot’s colour to follow it? And I just added you on Gtalk! You use it?

  2. Abhimanyu Bhargava
    Posted July 9, 2010 at 3:57 pm | Permalink

    Hi Utkarsh,
    It’s a very informative ‘Shack’. I had some queries regarding image processing in robotics:
    1. I am doing all the image processing tasks off-board on my robot, for which I have to send data wirelessly via the X-bee bluetooth module. This transfer of image frames from robot to PC is very time consuming.
    Is there a method by which I can make this data transfer faster? Or can I use an FPGA (in addition to the ATmega 16) to run the image processing code only? (The FPGA doesn’t have a buffer register or memory.)

    • Posted July 9, 2010 at 5:19 pm | Permalink

      Hi Abhimanyu! Glad you liked this shack :P

      That’s a real problem if you don’t have a processor on the robot itself. The best thing you could do is put a mini-computer on the robot. Have it powered by Intel, running XP/Linux, and your problem will be solved.

      About making the transfer faster, I’m not sure about that. You’ll probably need expensive hardware for that. Can you tell me about the speed you get with the Xbee module?

      • Abhimanyu Bhargava
        Posted July 9, 2010 at 11:52 pm | Permalink

        The maximum limit of the XBee module is 115,200 bits per second. But data in HyperTerminal is received only when we select 9600 bits per second as the baud rate, so 9600 bps is the rate. We cannot have powerful processors on the robots due to power and space constraints.
        In some cases the camera directly transfers the images via FireWire, but that would be a wired connection. :(
        Do you know if there is any alternative to FireWire?

  3. Ryan
    Posted September 11, 2010 at 9:32 pm | Permalink

    hi Utkarsh.

    what do you say about SIFT performance compared to SURF or PCA one? i am currently doing my research in robotics vision which needs a real time computer vision system. currently i’m using SURF because the library availability in openCV which helps a lot. but i found SURF was not too good on descriptors since i found it could not detect an object whether it was captured in the camera or not.

    • Posted September 20, 2010 at 1:33 am | Permalink

      Hi Ryan! SIFT is very robust. If you check the paper that describes SIFT (by Lowe), it has some examples of partly hidden objects. The algorithm successfully detects them!

      I think SURF is a good algorithm too… works faster and is pretty robust. What kind of objects are you trying to detect? Does the image quality vary a lot?

  4. jhon
    Posted September 16, 2010 at 12:35 pm | Permalink

    Hi, I am trying to implement SIFT in Matlab but am having a problem with getting images saved for keypoint detection. In fact, I am using loops to create the DoG, and once the final images are obtained, I want the images to be distinctly recognized for further processing. Any hint to solve the issue?

    • Posted September 20, 2010 at 12:35 am | Permalink

      Hi! I didn’t understand what the problem was… what do you mean by having the image distinctly recognized?

  5. Uni
    Posted October 14, 2010 at 1:22 am | Permalink

    Hey.. !

    Thanks for the tutorial. I followed it and tried to run the program (VC++ 2008 + OpenCV 1.1), but it gives me a ‘fatal’ LINK101 error and says the “cv200d.lib” file was not found. I linked everything and all the includes as well, but I’m still getting this error. The weird part is, I was able to run the programs from your previous tutorials.

    Now what? :S

    • Posted October 17, 2010 at 10:51 am | Permalink

      *200d.lib files are for OpenCV 2.0. Try cv.lib, cxcore.lib, etc. to get it to work with OpenCV 1.1.

      • Uni
        Posted October 18, 2010 at 2:08 am | Permalink

        I did.. and I don’t know why it’s asking for cv200.lib file. It’s completely weird.

  6. Ravaka R.
    Posted October 14, 2010 at 9:57 pm | Permalink

    I cannot wait to see the tutorial about matching SIFT features in different images :)

    Thanks for your hard work.

    • Posted October 17, 2010 at 10:53 am | Permalink

      The SIFT matching features post will come soon!

  7. Posted October 25, 2010 at 3:45 pm | Permalink

    hi utkarsh,
    very interesting tutorial, I love it. I was going through your OpenCV code and I realized that when you check for keypoints and eliminate corners, the determinant of the Hessian is less than 0 and your ratio is less than a threshold. Shouldn’t it be detH > 0 and ratio < contrast_thres instead of detH < 0 and ratio > contrast_thres???

    good job overall
    all the best

  8. Posted October 28, 2010 at 9:32 pm | Permalink

    hi Utkarsh,
    I am a little confused about the parameter “intervals”

    What’s the meaning of that parameter ?

  9. Xi
    Posted November 6, 2010 at 1:05 pm | Permalink

    Hi Utkarsh, I looked at your code for SIFT, and also the research paper by Mr. Lowe. In the paper he mentions that if you want to find an accurate location for a feature point, it is recommended to use a Taylor expansion; I am sure you also noticed that. But it seems like your code still follows the method of Lowe’s 1999 SIFT paper. When reading the paper I was also confused about solving that equation; do you have any ideas?

    • Posted November 12, 2010 at 1:12 am | Permalink

      Yes, the paper does use a Taylor expansion. Adding that would yield better accuracy, but it would slow things down too, so I didn’t add it.

  10. Andrey
    Posted November 9, 2010 at 3:57 pm | Permalink

    else if(k==NUM_BINS-1)
    y1 = hist_orient[NUM_BINS-1];
    y3 = hist_orient[0];

    Need y1 = hist_orient[NUM_BINS-2];

  11. Xi
    Posted November 17, 2010 at 5:39 pm | Permalink

    Hi Utkarsh
    I am trying to use your code to do object matching, but it seems like the descriptor your code generates has some little problems.
    I looked at your code again and again; something that confused me is that in your ExtractKeypointDescriptors() function, it seems you did not rotate the coordinates of the 16×16 window corresponding to the keypoint’s orientation; you just subtract main_orien from each pixel in that window. I am not sure whether I missed some part of your code. Just a friendly reminder.


  12. Richard
    Posted January 11, 2011 at 8:36 am | Permalink

    I can not wait to see the tutorial about matching SIFT features in different images. +1

  13. jayasanker m k
    Posted January 18, 2011 at 4:40 pm | Permalink


    I have downloaded your SIFT implementation in OpenCV. I am a final year CSE student doing my final year project, “railway track maintenance system”. I want to load the actual video and then the faulty video into a database, and compare the two videos to identify the missing couplings on the rail track, by locating the couplings in the original video and keeping a count of them; I want to do RGB value detection too. Finally I want to focus on the coupling positions in the two videos and check for missing and old couplings. I think the SIFT algorithm in OpenCV will help me in doing this. This is the subject of my final year project; please give me support through the mail id provided.

    • Posted January 25, 2011 at 5:27 am | Permalink

      You posted the same question on stackoverflow too.. right? ;)

  14. Richard
    Posted February 7, 2011 at 7:36 pm | Permalink


    I had a memory leak problem when using the webcam.

    Which object do I have to release?

  15. Sándor
    Posted February 8, 2011 at 5:04 pm | Permalink


    Thanks for your SIFT tutorial, it is very helpful. I have some questions for you about the source code.
    1. What happens if cvSmooth(src,dst,…) (line 172) has param1=0, param2=0 and sigma=0.5? (You know, param1×param2 is the kernel size.)
    The OpenCV API says that if param1=param2=0, the kernel size is computed from:
    sigma = (n/2 – 1)*0.3 + 0.8, and if you use sigma=0.5 then you get 0 for the kernel size.

    2. My other question is about the GetKernelSize() function. You use it on line 495 with 1 parameter but implement it on line 947 with 2 parameters.

    I want to make this in java, too that’s why I’m asking.

    Thanks for your reply.
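To make the arithmetic in question 1 concrete, the OpenCV 1.x relation Sándor quotes can be inverted to see what kernel size a given sigma implies. This is a quick sketch of my own, not code from the download:

```cpp
#include <cmath>

// OpenCV 1.x ties an automatically chosen kernel size n to sigma via
//   sigma = (n/2 - 1)*0.3 + 0.8
// Inverting gives the n implied by a chosen sigma (rounded to nearest).
int KernelSizeFor(double sigma) {
    double n = 2.0 * ((sigma - 0.8) / 0.3 + 1.0);
    return (int)std::floor(n + 0.5);
}
```

For sigma = 0.5 this comes out to 0, which is exactly the degenerate case Sándor is pointing at.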

  16. balaji
    Posted February 28, 2011 at 8:36 pm | Permalink


    A gr8 job in explaining SIFT. Thanks a lot for that.

    I have a small issue with using open CV on visual studio 2008
    when i run ur code i get this:

    Unsupported formats or combination of formats in function cvConvertImage,
    PS: I am a noob in using openCV so i need ur help.

    • Posted March 4, 2011 at 9:48 am | Permalink

      What type of images are the parameters of cvConvertImage? Check if they’re allowed. It looks like you’re using formats not supported by that function.

  17. Ramadass
    Posted March 3, 2011 at 11:57 pm | Permalink

    Hi Utkarsh
    Thank you for the great tutorial. I have some doubts about your implementation. I used your code to extract keypoints for an image, and it seems like most of the 128 descriptor values are zero. Can you explain why?

  18. Merwan
    Posted March 14, 2011 at 9:17 pm | Permalink

    hey Utkarsh
    First I want to thank you for explaining the SIFT algorithm. I just want to ask whether you have implemented this algorithm using Matlab or VHDL?

  19. sriram
    Posted March 16, 2011 at 4:15 pm | Permalink

    .. great information and codes. Thank you.

  20. Heng
    Posted March 24, 2011 at 8:16 am | Permalink

    So excellent!
    U did a wonderful job!
    Thk you!

  21. Robert
    Posted March 26, 2011 at 5:53 pm | Permalink

    Hey Utkarsh

    Thanks for doing all the hard work. I’m having a problem compiling your code.
    I’m using VS2008 and OpenCV 2.1; the problem is with the cvGetSize function in the BuildScaleSpace function, the line
    IplImage* imgGray = cvCreateImage(cvGetSize(m_srcImage), IPL_DEPTH_32F , 1);

    gives me the error “Array should be CvMat or IplImage” in the console

    and this error window-
    Unhandled exception at 0x75a69617 in MySIFT.exe: Microsoft C++ exception: cv::Exception at memory location 0x002bfbe8..

    if you know what the problem is then can you please help… i want to use your algorithm on my robotics uni project.


  22. phie
    Posted March 29, 2011 at 2:36 pm | Permalink

    I am having a problem running the application.
    I can compile it, but when I run it there is an error: “The application was unable to start correctly (0xc0150002)”.
    I suspect this is from OpenCV.
    Do you know how to solve this?
    Thank you.

    • Posted April 29, 2011 at 9:45 pm | Permalink

      Most probably the .lib files and the .dll files do not match. Compile OpenCV again and use only the generated .lib and .dll files.

  23. beinshuai
    Posted April 12, 2011 at 8:24 pm | Permalink

    You’ve done a great job! However, I presume there are some problems to be detailed or discussed in the “extracting feature vector” step, since there is no verification. There is an IplImage* imgTemp intended to get the in-between pixels, but you do not use it at all; have you made a mistake here, or am I understanding it wrongly?
    And can you explain the following statement: unsigned int ii = (unsigned int)(kpxi*2) / (unsigned int)(pow(2.0, (double)scale/m_numIntervals)); I think you should write it as “unsigned int ii = (unsigned int)((kpxi*2) / (pow(2.0, (double)scale/m_numIntervals)));”. Meanwhile, can you explain your idea about the parabola estimation in the “assigning feature orientation” step?
    Thank you!
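beinshuai’s point about the casts can be checked in isolation: casting each operand to unsigned int before dividing truncates the divisor (pow(2, scale/intervals) is usually a small non-integer), which is different from dividing in double precision and truncating once. A quick demonstration with made-up values:

```cpp
#include <cmath>

// Original form: both operands truncated to unsigned int first,
// so a divisor like 1.26 collapses to 1 before the division.
unsigned int TruncateEachThenDivide(double kpxi, double divisor) {
    return (unsigned int)(kpxi * 2) / (unsigned int)(divisor);
}

// Suggested form: divide in double precision, truncate once at the end.
unsigned int DivideThenTruncate(double kpxi, double divisor) {
    return (unsigned int)((kpxi * 2) / divisor);
}
```

With kpxi = 10 and divisor = pow(2.0, 1.0/3.0) (about 1.26), the first form yields 20 and the second 15, so the parenthesisation really does change the result.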

  24. Ahmed
    Posted April 13, 2011 at 11:00 am | Permalink

    Hi Utkarsh;
    Thanks for your tutorial!
    Please, do you know of any technique that is faster than SIFT or template matching for detecting objects?
    I’m now using colour segmentation with my robot, but my supervisor asked me to do this a different way, so I’m thinking about template matching or SIFT, but those will be time consuming.
    I implement all my functions on the robot board, so I need something faster and less computationally heavy. With my regards

    • beinshuai
      Posted April 13, 2011 at 4:53 pm | Permalink

      You can try SURF, a fast implementation resembling SIFT.

  25. Posted April 23, 2011 at 1:15 am | Permalink

    can i have SIFT implementation in java ? :D

    • Posted April 29, 2011 at 9:11 pm | Permalink

      Haven’t written one yet. But if you do, send it to me as well!

  26. imeht
    Posted May 5, 2011 at 10:00 am | Permalink

    Hi Utkarsh,
    I tried running the source code you provided. It gives me linker errors of the following type:
    error LNK2019: unresolved external symbol _cvSmooth referenced in function “private: void __thiscall SIFT::BuildScaleSpace(void)” (?BuildScaleSpace@SIFT@@AAEXXZ)
    Can you help out? I’m using OpenCV 2.2.

  27. Chieh Lee
    Posted May 17, 2011 at 7:45 am | Permalink

    Hi Utkarsh,

    Thank you for the lecture, it really helped me a lot.
    However, I have a problem running your code. it is trying to load cv200d.lib, but my current version of openCV is 2.1 and the lib file is cv210d.lib instead. Which part of the code should I modify so it would load cv210d.lib instead?

    • Posted June 17, 2011 at 5:16 pm | Permalink

      You need to change the linker properties. Check your Project Settings for this.

  28. Rosalia
    Posted May 21, 2011 at 12:13 am | Permalink

    imeht: Try linking to opencv_imgproc220d.lib

  29. Enthusiast
    Posted May 24, 2011 at 12:18 pm | Permalink

    Hi Utkarsh,
    Ur effort in keeping this blog should be appreciated.
    But I am seeing that you don’t give answers to a lot of the fundoo questions. Having provided such a beautifully commented code, if you don’t provide answers to these queries, your efforts will almost go to waste, right?
    Enthusiastic engineer

    • Posted June 17, 2011 at 5:17 pm | Permalink

      I try my best to answer queries. If I’m not sure of something, I leave the comments unanswered. Maybe someone who knows an answer will reply.

  30. Posted May 30, 2011 at 9:06 pm | Permalink

    Hi Utkarsh,

    This is great, I learn best when theory and practical implementation is mixed together, and after researching SIFT for a while I think your site explains everything the best.

    I was looking over the code, and one thing caught my eye. In SIFT.cpp line 574

    else if(k==NUM_BINS-1)
    y1 = hist_orient[NUM_BINS-1];
    y3 = hist_orient[0];

    is y1 supposed to equal hist_orient[NUM_BINS - 2] ??


    • Posted June 17, 2011 at 5:21 pm | Permalink

      I think you’re right. Must have been a typo. I’ll fix it.
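For anyone patching their copy before the fix lands, the circular-neighbour handling with the correction applied would look something like this (NUM_BINS and hist_orient as in the quoted snippet; a sketch, not the exact file):

```cpp
// Circular neighbours for the orientation histogram: when fitting a
// parabola through bins (k-1, k, k+1), the histogram edges must wrap.
const int NUM_BINS = 36;
double hist_orient[NUM_BINS];

void HistNeighbours(int k, double& y1, double& y3) {
    if (k == 0) {
        y1 = hist_orient[NUM_BINS - 1];  // left neighbour wraps to the end
        y3 = hist_orient[1];
    } else if (k == NUM_BINS - 1) {
        y1 = hist_orient[NUM_BINS - 2];  // the fix: -2, not -1
        y3 = hist_orient[0];             // right neighbour wraps to the start
    } else {
        y1 = hist_orient[k - 1];
        y3 = hist_orient[k + 1];
    }
}
```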

  31. sotiraw
    Posted June 3, 2011 at 8:14 pm | Permalink

    great code
    Can you tell me how I can make grid SIFT? This means that I will tell the program at which spots I want it to calculate the features.

    • Posted June 17, 2011 at 5:19 pm | Permalink

      No idea what grid sift is. Any references/articles about it?

      • sotiraw
        Posted June 24, 2011 at 3:52 pm | Permalink

        Grid SIFT: this means that I will tell the program at which spots I want it to calculate the features.

        Just like vlfeat’s –read-keypoints option: you give a file with 100.2 120.2 1.7 (they are x, y and scale) and it computes the orientation and the descriptor. So the result is 100.2 120.2 1.7 1.2 5 6 7 0 25 ………

        So SIFT doesn’t calculate the ”salient points”; the user gives the points he likes and it only computes the orientation and descriptor. Can you change your code to do that? This feature is also included in the latest OpenCV 2.2 SIFT and the older SURF, but I can’t make OpenCV’s SIFT or SURF work because I can’t understand how to use the commands.

        I think you should write a tutorial about this too. They tell you this is the class, but not a simple example of use.

  32. Daniel
    Posted June 10, 2011 at 7:00 am | Permalink

    Hi. I’m trying to compile this in Linux. Unfortunately, MonoDevelop won’t run it.

    And compiling via g++ results in:

    stdafx.h:11: fatal error: tchar.h: No such file or directory
    compilation terminated.

    This happens a number of times. Any advice on how I’d go about getting it to run?


    • Posted June 10, 2011 at 10:40 am | Permalink

      Just remove it. It’s a precompiled header thing used by visual studio.

    • Sundar
      Posted July 24, 2011 at 9:13 pm | Permalink

      Hi Utkarsh,

      I’m trying to implement 2D chamfer matching. However I am not getting a proper pseudo code or algorithm that is easier to implement. Kindly help.

  33. fly2pick
    Posted June 14, 2011 at 11:33 pm | Permalink

    hi Utkarsh
    I’m trying to compile this on Vista with OpenCV 2.1; unfortunately, it doesn’t compile.
    It gives me fatal error C1083: Cannot open precompiled header file: ‘Debug\MySIFT.pch’
    Please help me; I added the OpenCV 2.1 libs too.

    • Posted June 17, 2011 at 5:18 pm | Permalink

      Change your project settings. Make it so it doesn’t require a precompiled header.

  34. Salim
    Posted July 6, 2011 at 6:57 am | Permalink

    Hi Utkarsh,

    Thank you for your website. You are really so smart.
    I am doing some research about action recognition. I am a beginner in OpenCV library but I know other languages like C, C++, Visual Basic, etc.
    I would like to use your SIFT code for tracking some specific points during the video stream for example (eyes, nose, etc). Do you think SIFT would help me in eyes tracking in a video?

    Thank you so much for everything.

    • Posted August 9, 2011 at 8:01 pm | Permalink

      For eyes, I think HAAR would be better. You can probably find code for it on the internet somewhere.

  35. Mahshid
    Posted July 15, 2011 at 2:54 am | Permalink

    Thanks Utkarsh. I found your tutorials very useful.
