I'm interested in using RGBDSLAM and OctoMaps with the Kinect camera to create 3D maps. By combining the generated 3D maps with object recognition code I've written using OpenCV, I would like to create semantic maps of the environment. For clarity, when I say semantic maps, I just mean maps that are able to highlight or pinpoint recognized objects in the environment.
I hope to do this in 'real time': it would be nice if the final product were able to update at a rate of at least 5 Hz on my dual-core laptop.
I'm new to ROS/RGBDSLAM/OctoMap, so I was wondering whether anyone can provide feedback or suggestions. Is there anything I should watch out for? Are there any easier ways of going about this? Is this in fact possible?
As far as I can tell from my research, this is the best way to go about this.
I'm also open to suggestions for 2D maps, although I would prefer 3D at this stage.
Regarding the integration of semantic information, here are my thoughts:
The easiest (though maybe not the most efficient) way to do this would be to have your recognition software subscribe to the RGB image and point cloud, do the recognition, and recolor the point cloud, e.g. red points for class 1, green for class 2, black for unclassified. Then send the RGB image and the recolored cloud to the topics rgbdslam listens to (you can modify them via parameters). The point cloud colors will then be integrated into the map and can be seen as labels. To make things efficient, you would integrate your software into rgbdslam and call it from the callback methods in openni_listener.cpp (e.g. kinect_callback, noCloudCallback).
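The recoloring step above can be sketched as follows. This is a minimal stdlib-only illustration: the `Point` struct stands in for a PCL point type like pcl::PointXYZRGB, and the class ids are hypothetical outputs of your recognition code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for pcl::PointXYZRGB: position plus a packed 0x00RRGGBB color.
struct Point {
    float x, y, z;
    std::uint32_t rgb;
};

// Hypothetical class ids produced by the recognition step.
enum ObjectClass { UNCLASSIFIED = 0, CLASS_1 = 1, CLASS_2 = 2 };

// Recolor each point by its class label: red for class 1,
// green for class 2, black for unclassified.
void recolorByClass(std::vector<Point>& cloud,
                    const std::vector<ObjectClass>& labels) {
    for (std::size_t i = 0; i < cloud.size(); ++i) {
        switch (labels[i]) {
            case CLASS_1: cloud[i].rgb = 0xFF0000u; break;  // red
            case CLASS_2: cloud[i].rgb = 0x00FF00u; break;  // green
            default:      cloud[i].rgb = 0x000000u; break;  // black
        }
    }
}
```

With PCL you would apply the same loop to a pcl::PointCloud&lt;pcl::PointXYZRGB&gt; before republishing it on the topic rgbdslam subscribes to.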
A cleaner way is to use a point cloud type that contains semantic information. For that you would need to redefine the point cloud type in rgbdslam (src/parameter_server.cpp) to one that contains your semantic information. Note that if you omit the color (i.e. the point.rgb field), there will be compile errors in glviewer.cpp; these should be easy to solve though.
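Such a point type could look like the sketch below. This is not rgbdslam's actual code: it's an illustration of a struct that carries a semantic class id next to position and color. A real PCL custom point type would additionally be registered with the POINT_CLOUD_REGISTER_POINT_STRUCT macro so that PCL's I/O and filters accept it.

```cpp
#include <cstdint>

// Hypothetical point type with a semantic label. Keeping the rgb field
// avoids the glviewer.cpp compile errors mentioned above.
struct PointXYZRGBLabel {
    float x, y, z;
    std::uint32_t rgb;    // packed 0x00RRGGBB, as in pcl::PointXYZRGB
    std::uint32_t label;  // semantic class id (0 = unclassified)
};

// Pack separate color channels into the rgb field.
inline std::uint32_t packRGB(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return (static_cast<std::uint32_t>(r) << 16) |
           (static_cast<std::uint32_t>(g) << 8)  |
            static_cast<std::uint32_t>(b);
}
```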
Then you need to adapt the octomap server to use a voxel leaf that stores semantic information. This has been done, but I don't know how.
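One plausible way to store semantics in a voxel leaf is to fuse the per-scan labels by majority vote, as sketched below. This is an assumption about the design, not existing octomap_server code; in OctoMap itself this would subclass octomap::OcTreeNode (compare the ColorOcTreeNode example that ships with the library), which additionally stores occupancy.

```cpp
#include <cstdint>
#include <map>

// Sketch of a voxel leaf that accumulates semantic label observations
// and reports the most frequently seen label.
class SemanticLeaf {
public:
    void addObservation(std::uint32_t label) { counts_[label]++; }

    // Most frequently observed label, or 0 if the voxel has no observations.
    std::uint32_t majorityLabel() const {
        std::uint32_t best = 0;
        std::size_t bestCount = 0;
        for (const auto& kv : counts_) {
            if (kv.second > bestCount) {
                best = kv.first;
                bestCount = kv.second;
            }
        }
        return best;
    }

private:
    std::map<std::uint32_t, std::size_t> counts_;  // label -> observation count
};
```

A histogram like this also lets you propagate label confidence up the tree if you need coarser semantic queries.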
5 Hz on a dual-core laptop might be hard to get. You'll need a GPU and SIFTGPU features; otherwise detection and extraction of SURF features will slow you down too much (~2 Hz). ORB might be an alternative, but I found it to be less accurate. With two cores, you will definitely need to reduce the loop-closure search (the ..._candidates settings; see parameter_server.cpp).
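In a launch file, these settings might look roughly like the fragment below. The parameter names follow rgbdslam's parameter_server.cpp, but the values are only starting points you'd tune for your hardware; treat the whole fragment as a hedged example rather than a recommended configuration.

```xml
<launch>
  <node pkg="rgbdslam" type="rgbdslam" name="rgbdslam" output="screen">
    <!-- GPU features instead of SURF -->
    <param name="config/feature_detector_type"  value="SIFTGPU"/>
    <param name="config/feature_extractor_type" value="SIFTGPU"/>
    <!-- reduce loop-closure search: fewer comparison candidates per frame -->
    <param name="config/predecessor_candidates"   value="2"/>
    <param name="config/neighbor_candidates"      value="2"/>
    <param name="config/min_sampled_candidates"   value="2"/>
  </node>
</launch>
```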
Setting the openni camera driver to QVGA and a lower FPS (use dynamic reconfigure) will reduce the CPU load of the driver.
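For example, with the dynamic_reconfigure command-line tool this could look like the lines below. The node name and the mode numbering are assumptions that vary across driver versions, so check the available values first with `rosrun dynamic_reconfigure dynparam get /camera/driver` (or use the rqt_reconfigure GUI).

```shell
# Hypothetical example: switch the OpenNI driver to a QVGA mode.
rosrun dynamic_reconfigure dynparam set /camera/driver image_mode 5
rosrun dynamic_reconfigure dynparam set /camera/driver depth_mode 5
```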
Producing OctoMaps from online mapping consumes a lot of RAM, because you need to store the clouds. Reducing the resolution will also help here.
Using Multi-Resolution Surfel Maps for RGB-D SLAM (see code.google.com/p/mrsmap) and random forests for object-class segmentation of individual views, Nenad Biresev fused this semantic information to create 3D semantic maps.
See: Jörg Stückler, Nenad Biresev, and Sven Behnke: "Semantic Mapping Using Object-Class Segmentation of RGB-D Images". In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura, Portugal, October 2012. www.ais.uni-bonn.de/papers/IROS_2012_Semantic_Mapping.pdf
Asked: 2012-07-01 07:52:17 -0500