There are some downsides to creating your own derived message type, mostly that tools such as rviz or image_view, which expect a plain sensor_msgs/Image, will not be able to display the image from it.
One possibility is to abuse the seq field in the image header for user data such as the robot id. The frame_id could also contain the name of the robot, e.g. /robot1/kinect_optical_frame.
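If you go the frame_id route, getting the robot name back out on the subscriber side is just string splitting. A minimal sketch, assuming the robot name is always the first component of the frame_id (the function name `robot_from_frame_id` is made up for illustration):

```python
def robot_from_frame_id(frame_id):
    """Return the leading namespace of a frame_id, or None if there is none.

    Assumes the convention /<robot>/<frame>, e.g. /robot1/kinect_optical_frame.
    """
    # Drop empty parts caused by leading or doubled slashes.
    parts = [p for p in frame_id.split("/") if p]
    # A frame_id with a single component carries no robot prefix.
    return parts[0] if len(parts) > 1 else None


robot = robot_from_frame_id("/robot1/kinect_optical_frame")
```

This only works as long as every publisher sticks to the same naming convention, so it is worth documenting that somewhere.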
Whether you send the Pose on another topic or via tf, you can always make sure it carries exactly the same timestamp as the stamp in the image message's header. Synchronizing the two later should then not be hard.
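To illustrate why identical stamps make this easy, here is a ROS-free sketch of exact-stamp matching. The `Image` and `Pose` tuples are simplified stand-ins I made up for the real message types, and stamps are plain (secs, nsecs) tuples:

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins for sensor_msgs/Image and a stamped pose.
Image = namedtuple("Image", ["stamp", "data"])
Pose = namedtuple("Pose", ["stamp", "position"])


def pair_by_stamp(images, poses):
    """Pair each image with the pose that has exactly the same stamp."""
    poses_by_stamp = {p.stamp: p for p in poses}
    # Images without a matching pose are simply skipped.
    return [(img, poses_by_stamp[img.stamp])
            for img in images if img.stamp in poses_by_stamp]


images = [Image((10, 0), "img_a"), Image((10, 500), "img_b")]
poses = [Pose((10, 500), (1.0, 2.0, 0.0)), Pose((11, 0), (3.0, 4.0, 0.0))]
pairs = pair_by_stamp(images, poses)
```

In an actual node you would not write this yourself: message_filters.TimeSynchronizer does this kind of exact-stamp matching on live topics and hands both messages to a single callback.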
If you don't care about viewing the images, whether on the fly or from bag files, I would probably go with a new derived message type, as proposed in Bence's answer.
It's actually a pity that composed messages don't seem to be transparent when subscribing in ROS. It would be great if you could have a topic "/detections" publishing re_kinect_object_detector/DetectionResult and then a subscriber on "/detections/Image" that would look for a sensor_msgs/Image inside it. That way, an image display in rviz could still show the nested image. rostopic echo does a bit of this magic: you can say rostopic echo /a_topic_publishing_posestampeds/pose/position/x and get the x values of all messages displayed.
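The kind of field lookup I mean can be sketched in a few lines. This is not how rostopic is actually implemented, just an illustration of walking a slash-separated path into a nested message; the dataclasses are made-up stand-ins for the real message classes:

```python
from dataclasses import dataclass


# Hypothetical stand-ins for geometry_msgs Point / Pose / PoseStamped.
@dataclass
class Point:
    x: float
    y: float
    z: float


@dataclass
class Pose:
    position: Point


@dataclass
class PoseStamped:
    pose: Pose


def resolve_field(msg, path):
    """Follow a slash-separated field path (e.g. 'pose/position/x') into msg."""
    value = msg
    for name in path.strip("/").split("/"):
        value = getattr(value, name)  # raises AttributeError on a bad path
    return value


msg = PoseStamped(Pose(Point(1.5, 0.0, 0.0)))
x = resolve_field(msg, "pose/position/x")
```

A transparent subscriber would do the same lookup on every incoming message and hand the nested field to the callback instead of the full message.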