Is this the right approach for modelling a simple scene in 3D from 2 images?

I'm new to this field and I'm trying to model a simple scene in 3D from 2D images, and I don't have any information about the cameras. I know that there are 3 options:

• I have two images and I know the model of my camera (intrinsics), loaded from an XML file for instance: loadXMLFromFile() => stereoRectify() => reprojectImageTo3D()

• Or I don't have the intrinsics but I can calibrate my camera: stereoCalibrate() => stereoRectify() => reprojectImageTo3D()

• Or I can't calibrate the camera (this is my case, because I don't have the camera that took the 2 images). Then I need to find pairs of keypoints in both images with SURF or SIFT, for instance (I can actually use any blob detector), compute descriptors for these keypoints, match keypoints between the right and left images according to their descriptors, and then find the fundamental matrix from the matches. The processing is much harder and would look like this:

1. detect keypoints (SURF, SIFT) =>
2. extract descriptors (SURF,SIFT) =>
3. compare and match descriptors (BruteForce, Flann based approaches) =>
4. find fundamental mat (findFundamentalMat()) from these pairs =>
5. stereoRectifyUncalibrated() =>
6. reprojectImageTo3D()

I'm using the last approach and my questions are:

1) Is it right?

2) If it's OK, I have a doubt about the last step, "stereoRectifyUncalibrated() => reprojectImageTo3D()". The signature of the reprojectImageTo3D() function is:

void reprojectImageTo3D(InputArray disparity, OutputArray _3dImage, InputArray Q, bool handleMissingValues=false, int depth=-1 )

cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true) (in my code)


Parameters:

-disparity – Input single-channel 8-bit unsigned, 16-bit signed, 32-bit signed or 32-bit floating-point disparity image.
-_3dImage – Output 3-channel floating-point image of the same size as disparity. Each element of _3dImage(x,y) contains 3D coordinates of the point (x,y) computed from the disparity map.
-Q – 4x4 perspective transformation matrix that can be obtained with stereoRectify().
-handleMissingValues – Indicates whether the function should handle missing values (i.e. points where the disparity was not computed). If handleMissingValues=true, then pixels with the minimal disparity that corresponds to the outliers (see StereoBM::operator() ) are transformed to 3D points with a very large Z value (currently set to 10000).
-ddepth – The optional output array depth. If it is -1, the output image will have CV_32F depth. ddepth can also be set to CV_16S, CV_32S or CV_32F.

How can I get the Q matrix? Is it possible to obtain the Q matrix from F, H1 and H2, or in another way?

3) Is there another way to obtain the xyz coordinates without calibrating the camera?

My code is:

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <stdio.h>
#include <iostream>
#include <vector>
#include <conio.h>
#include <opencv/cv.h>
#include <opencv/cxcore.h>
#include <opencv/cvaux.h>

using namespace cv;
using namespace std;

int main(int argc, char *argv[]){

// check
if (!imgLeft.data || !imgRight.data ...

Why are you mixing opencv and opencv2 headers? I tried this code with a couple of test image pairs: some gave good results and some gave poor ones. Try tuning the parameters to get better results.

Since this is really an OpenCV question, you might want to ask it on their discussion board.
