
Are my findings about tod_detecting correct?

asked 2011-03-25 06:22:51 -0600

Julius

Step 1: Feature detection and extraction -- Features are detected and extracted from the ROI in the 2D gray-scale test image. This produces a single Features2d instance containing the keypoints and their corresponding feature descriptors. This stage does not use pose information.

Step 2: Keypoint matching -- For each object and view in the training base, this stage finds, for each keypoint descriptor in the test image, the k best matches among the feature descriptors of the training images. This stage does not use pose information.
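To make the matching in Step 2 concrete, here is a minimal brute-force sketch of a k-nearest-neighbour descriptor search. It is not the tod_detecting code: the `Descriptor` and `Match` types and the `knnMatch` function are hypothetical names for illustration, and squared L2 distance is an assumed metric.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not the tod_detecting API.
using Descriptor = std::vector<float>;

struct Match {
    std::size_t query_idx;  // index of the test-image descriptor
    std::size_t train_idx;  // index of the training descriptor
    float distance;         // squared L2 distance (assumed metric)
};

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// For every query descriptor, return its k closest training descriptors,
// sorted by increasing distance.
std::vector<std::vector<Match>> knnMatch(const std::vector<Descriptor>& query,
                                         const std::vector<Descriptor>& train,
                                         std::size_t k) {
    std::vector<std::vector<Match>> result;
    result.reserve(query.size());
    for (std::size_t q = 0; q < query.size(); ++q) {
        std::vector<Match> candidates;
        candidates.reserve(train.size());
        for (std::size_t t = 0; t < train.size(); ++t) {
            candidates.push_back({q, t, squaredDistance(query[q], train[t])});
        }
        const std::size_t keep = std::min(k, candidates.size());
        // Only the first `keep` entries need to be sorted.
        std::partial_sort(candidates.begin(), candidates.begin() + keep, candidates.end(),
                          [](const Match& a, const Match& b) { return a.distance < b.distance; });
        candidates.resize(keep);
        result.push_back(candidates);
    }
    return result;
}
```

In this sketch the function would be called once per training view, with `train` holding that view's descriptors. Note that no pose information enters anywhere.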

Step 3: Clustering matches -- For each training object, the best-matching keypoints found in the previous stage are clustered based on the distance between their test image keypoints. This stage does not use pose information.
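The clustering in Step 3 can be pictured with a simple distance-threshold grouping of the matched test-image keypoints. Which clustering algorithm tod_detecting actually uses is not stated here, so the single-linkage approach below, the `radius` parameter, and the names `Point2f` and `clusterKeypoints` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical 2D keypoint location in the test image.
struct Point2f {
    float x;
    float y;
};

// Find the representative of element i, shortening the path as we go.
std::size_t findRoot(std::vector<std::size_t>& parent, std::size_t i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Returns one label per keypoint. Keypoints that are within `radius` of each
// other, directly or through a chain of neighbours, receive the same label.
std::vector<std::size_t> clusterKeypoints(const std::vector<Point2f>& pts, float radius) {
    assert(radius >= 0.0f);
    std::vector<std::size_t> parent(pts.size());
    std::iota(parent.begin(), parent.end(), std::size_t{0});
    const float r2 = radius * radius;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        for (std::size_t j = i + 1; j < pts.size(); ++j) {
            const float dx = pts[i].x - pts[j].x;
            const float dy = pts[i].y - pts[j].y;
            if (dx * dx + dy * dy <= r2) {
                const std::size_t ri = findRoot(parent, i);
                const std::size_t rj = findRoot(parent, j);
                parent[ri] = rj;  // merge the two groups
            }
        }
    }
    std::vector<std::size_t> labels(pts.size());
    for (std::size_t i = 0; i < pts.size(); ++i) {
        labels[i] = findRoot(parent, i);
    }
    return labels;
}
```

Each resulting group is a set of test-image keypoints that lie close together and could therefore belong to one object instance.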

Step 4: Guess generation -- For each match we can find a corresponding 3D point recorded in the Features3d instances. A guess is made when we can find a good projection (i.e. one with many inliers) of those 3D points (coming from different views of the same object) onto the camera plane, such that the projection error with respect to the 2D test image keypoints is minimized.
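The inlier criterion in Step 4 can be sketched as follows. This shows only how one candidate pose might be scored, not how GuessGenerator.cpp finds the pose. The pinhole camera model, the pixel threshold `max_error_px`, and all type and function names are assumptions for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not the tod_detecting API.
struct Point2d { double x; double y; };
struct Point3d { double x; double y; double z; };
struct Pose {
    std::array<double, 9> R;  // row-major rotation, object frame -> camera frame
    std::array<double, 3> t;  // translation, object frame -> camera frame
};
struct Intrinsics { double fx; double fy; double cx; double cy; };

// Apply the rigid transform p_cam = R * p_obj + t.
Point3d toCamera(const Pose& pose, const Point3d& p) {
    return {pose.R[0] * p.x + pose.R[1] * p.y + pose.R[2] * p.z + pose.t[0],
            pose.R[3] * p.x + pose.R[4] * p.y + pose.R[5] * p.z + pose.t[1],
            pose.R[6] * p.x + pose.R[7] * p.y + pose.R[8] * p.z + pose.t[2]};
}

// Count the 3D-2D correspondences whose reprojection error under `pose`
// is at most `max_error_px` pixels.
std::size_t countInliers(const std::vector<Point3d>& object_pts,
                         const std::vector<Point2d>& image_pts,
                         const Pose& pose, const Intrinsics& K,
                         double max_error_px) {
    assert(object_pts.size() == image_pts.size());
    std::size_t inliers = 0;
    for (std::size_t i = 0; i < object_pts.size(); ++i) {
        const Point3d c = toCamera(pose, object_pts[i]);
        if (c.z <= 0.0) continue;  // behind the camera, cannot be observed
        const double u = K.fx * c.x / c.z + K.cx;
        const double v = K.fy * c.y / c.z + K.cy;
        const double du = u - image_pts[i].x;
        const double dv = v - image_pts[i].y;
        if (du * du + dv * dv <= max_error_px * max_error_px) ++inliers;
    }
    return inliers;
}
```

A guess would then be accepted only if the count for its best pose is high enough. That acceptance threshold is again a tuning choice, not something stated in this thread.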

The pose estimate obtained in the training phase is absolutely crucial for bringing these 3D points into the same coordinate system (see GuessGenerator.cpp, line 267 at time of writing).
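To illustrate why the training poses matter, the sketch below merges 3D points recorded in several view-specific camera frames into one object frame. It assumes each training pose maps object coordinates to camera coordinates (p_cam = R * p_obj + t), so the inverse R^T * (p_cam - t) is applied. That convention and the names `RigidTransform`, `cameraToObject` and `mergeViews` are assumptions, not taken from GuessGenerator.cpp.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not the tod_detecting API.
struct Vec3 { double x; double y; double z; };
struct RigidTransform {
    std::array<double, 9> R;  // row-major rotation, object frame -> camera frame
    std::array<double, 3> t;  // translation, object frame -> camera frame
};

// Invert p_cam = R * p_obj + t, i.e. compute p_obj = R^T * (p_cam - t).
Vec3 cameraToObject(const RigidTransform& T, const Vec3& p_cam) {
    const double dx = p_cam.x - T.t[0];
    const double dy = p_cam.y - T.t[1];
    const double dz = p_cam.z - T.t[2];
    return {T.R[0] * dx + T.R[3] * dy + T.R[6] * dz,
            T.R[1] * dx + T.R[4] * dy + T.R[7] * dz,
            T.R[2] * dx + T.R[5] * dy + T.R[8] * dz};
}

// Collect the points of all views in the common object frame.
std::vector<Vec3> mergeViews(const std::vector<RigidTransform>& view_poses,
                             const std::vector<std::vector<Vec3>>& view_points) {
    assert(view_poses.size() == view_points.size());
    std::vector<Vec3> merged;
    for (std::size_t v = 0; v < view_poses.size(); ++v) {
        for (const Vec3& p : view_points[v]) {
            merged.push_back(cameraToObject(view_poses[v], p));
        }
    }
    return merged;
}
```

Without the per-view poses, points from different views would stay in unrelated camera frames and could not be projected jointly into the test image.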

(Using object_detection in SVN revision 50425).


1 Answer


answered 2011-06-09 12:15:06 -0600

Vincent Rabaud

Yes, you are right about all your findings. Most of the code is legacy and uncommented, which is why we do not fully support that stack. We are refactoring a lot of it to get a cleaner API/structure and interoperability with other pipelines. That should be done by the end of August.

The version that also takes a depth image as input does not do step 3. We found empirically that clustering does not add much when 3D data is available.


