
1.) Yes, it would be MUCH better to handle the intrinsic calibration of each camera beforehand with a reliable target. If you only have a few manually measured points, you won't get good coverage across the entire image.

2.) Have you considered making a circles grid by gluing targets (I'm assuming they are bright orange or otherwise obviously colored) onto a contrasting background? We know that for your application to work, you need to be able to detect the targets at a distance. If you have a good intrinsic calibration for each camera, you don't need a very large stereo calibration target. You could arrange the targets in a 3x3 or 4x4 grid on a board and move it to various positions/distances. This way you get a better calibration through many board positions despite fewer points per board.

You'll also need to consider the specifics of your application. Ultimately, you need these steps:

  • Good 3D projection model for each individual camera (intrinsic calibration)
  • Calibration of the position and orientations between multiple cameras (extrinsic calibration)
  • Detection of the target in the background scene
  • Optimization across two or more cameras to determine the target's position relative to the camera array.

I don't think that a traditional stereo algorithm will work very well, and if it does, it will be overkill and not easily extensible to multiple cameras.

Likely, you'll want to work in this order:

  • Use camera_calibration to reliably and accurately calibrate each camera.
  • Write a detector for the targets in the original, unrectified images. If you have good targets, this will likely just be a color threshold and finding blob centroids.
  • Perform an extrinsic calibration by detecting a target grid with multiple cameras. This will give you the transform between the cameras. You can use the OpenCV stereo calibration or the PCL SVD transform estimation for this.
  • Now you can work on detecting the 3D position of a single target.
  • Convert the coordinates to the rectified image.
  • Once you know the centroid of the target in rectified coordinates, you can calculate the 3D ray from the camera that goes through the center of the target.
  • Finally, write some sort of optimizer to output a 3D point from the intersection of the N camera rays. The rays will rarely intersect perfectly, so you'll have to converge on something close. You may even be able to estimate the distance to the target from its measured size in pixels if your cameras have sufficiently high resolution.
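For the detector step above, a color threshold plus blob centroid really can be this simple. A numpy sketch on a synthetic frame (the function name and threshold values are mine; tune the thresholds for your actual targets and lighting):

```python
import numpy as np

def detect_orange_centroid(img, r_min=180, g_max=160, b_max=120):
    """Crude color-threshold detector: mask 'orange enough' pixels in an
    RGB image and return the centroid of the mask, or None if empty."""
    mask = (img[..., 0] > r_min) & (img[..., 1] < g_max) & (img[..., 2] < b_max)
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return float(xs.mean()), float(ys.mean())  # (u, v) in pixel coordinates

# Synthetic frame: gray background with one roughly-orange square
img = np.full((120, 160, 3), 90, np.uint8)
img[50:60, 70:80] = (255, 120, 40)
print(detect_orange_centroid(img))  # → (74.5, 54.5)
```

If several targets can appear at once, you'd add a connected-components pass (e.g. `cv2.connectedComponents`) and take one centroid per blob.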
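The SVD transform estimation mentioned for the extrinsic step is the Kabsch method; here is a minimal numpy version (the helper name is mine), assuming you already have matched 3D target positions expressed in two cameras' frames:

```python
import numpy as np

def rigid_transform(P, Q):
    """Estimate R, t with Q ≈ R @ P + t from matched 3D points P, Q (3xN),
    via the Kabsch/SVD method -- the same idea as PCL's SVD-based
    transformation estimation."""
    cP, cQ = P.mean(axis=1, keepdims=True), Q.mean(axis=1, keepdims=True)
    H = (P - cP) @ (Q - cQ).T                 # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflections
    R = Vt.T @ D @ U.T
    t = cQ - R @ cP
    return R, t

# Check against a known transform: rotate 0.3 rad about z, shift by (1, 2, 3)
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
t_true = np.array([[1.0], [2.0], [3.0]])
P = np.random.default_rng(0).normal(size=(3, 8))
Q = R_true @ P + t_true
R, t = rigid_transform(P, Q)
print(np.allclose(R, R_true), np.allclose(t, t_true))  # → True True
```

A 3x3 board gives you 9 correspondences per view, which is plenty for this; accumulating points from many board poses makes the estimate much more stable.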
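For the final optimizer, the N-ray "intersection" actually has a closed-form least-squares solution: the point minimizing the summed squared perpendicular distance to all rays. A sketch with made-up camera origins and a small amount of ray noise:

```python
import numpy as np

def intersect_rays(origins, directions):
    """Least-squares 3D point closest to N rays. Each ray contributes the
    projector (I - d d^T) onto the plane normal to its direction; stacking
    the normal equations gives a single 3x3 linear solve."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        d = d / np.linalg.norm(d)
        M = np.eye(3) - np.outer(d, d)
        A += M
        b += M @ o
    return np.linalg.solve(A, b)  # singular only if all rays are parallel

# Three cameras looking at a target near (1, 2, 5), with slightly noisy rays
target = np.array([1.0, 2.0, 5.0])
origins = [np.array([0.0, 0, 0]), np.array([2.0, 0, 0]), np.array([0.0, 3, 0])]
rng = np.random.default_rng(1)
directions = [target - o + rng.normal(0, 1e-3, 3) for o in origins]
print(np.round(intersect_rays(origins, directions), 2))
```

The residual distances from the solved point to each ray also make a good sanity check: if one camera's residual is much larger than the others, its detection or calibration is suspect.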