Real-time reinforcement learning stack

asked 2019-04-08 07:35:05 -0500

ilinojkovic

updated 2019-04-08 08:18:50 -0500

Hi all. I'm searching for a reinforcement learning stack that would enable control of both a real-world and a simulated robot. Until now, I was thinking of using ROS to describe the robot (xacro/URDF), MuJoCo for physics simulation, and OpenAI Gym as the RL library. It has now come to my attention that this stack is not well suited for real-time robot control, so I'm looking for suggestions from the community. Concretely, I would like a (Python) library where I could implement and benchmark RL algorithms independently of the underlying control mechanism, meaning I should be able to swap between the simulated and the real robot without the algorithms ever knowing the difference. Thanks!

EDIT: A bit of clarification

To my mind, the first reason this stack is not suitable is that Python is not really real-time. Two major concerns are memory management and garbage collection: because of these, the intervals between executed control actions wouldn't be predictable. The setup described above uses ROS to create and compile the xacro. Once I obtain the catkin-compiled URDF, I compile it with MuJoCo to an mjb file. This mjb file is then used in Gym + mujoco-py, and the problem is that control is then fully in Python. Additionally, even if it were real-time, I couldn't find documentation on how to perform real-robot control from these libraries. I presume ros_control comes into play here, but I'm not sure how it can be integrated into the MuJoCo/Gym setup.
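The jitter concern above can be illustrated with a minimal sketch (function and parameter names are my own, for illustration only): a fixed-period "control" loop that allocates memory each step, recording how far each wakeup drifts past its deadline.

```python
import time


def measure_jitter(period_s=0.01, iterations=200):
    """Run a fixed-period loop and record how far each wakeup
    deviates from its requested deadline (seconds)."""
    deviations = []
    garbage = []
    next_deadline = time.perf_counter() + period_s
    for _ in range(iterations):
        # Simulated per-step work that allocates, feeding the
        # garbage collector between control actions.
        garbage.append([0] * 1000)
        if len(garbage) > 50:
            garbage.clear()
        time.sleep(max(0.0, next_deadline - time.perf_counter()))
        now = time.perf_counter()
        deviations.append(now - next_deadline)
        next_deadline += period_s
    return deviations


if __name__ == "__main__":
    devs = measure_jitter()
    print(f"max overshoot: {max(devs) * 1e3:.3f} ms")
```

On a stock desktop OS the overshoot varies from run to run, which is exactly the non-determinism at issue; a hard real-time loop would need bounded worst-case latency, not just a good average.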

Furthermore, I like Python for its expressiveness, easy algorithm experimentation, and great ML/RL support, so I wouldn't run away from it just yet. Building everything myself is also an option, but I truly hope there's a standardized, or at least an existing, setup where reinforcement learning can be done on both simulations and real robots. All in all, it seems there are a couple of options, and I'll list them in order of preference:

  1. There exists a fully Python setup that is real-time and can perform control on real robots, but I'm not aware of it (I'm quite new to the field).
  2. The fast loop is implemented in C++, with an abstract class for control and concrete classes for both the real robot and the MuJoCo simulator. Python code could then communicate with this fast loop.
  3. A fully C++ solution, which would probably mean a lot of custom code.
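The swap-without-knowing requirement could be sketched as an interface like this (all names here are hypothetical, for illustration; the same shape would apply to option 2's C++ abstract class, with Python only talking to the fast loop):

```python
from abc import ABC, abstractmethod


class RobotInterface(ABC):
    """Hypothetical control abstraction: the RL code only ever talks
    to this interface, never to MuJoCo or ros_control directly."""

    @abstractmethod
    def apply_action(self, action):
        """Send one control command (e.g. joint torques)."""

    @abstractmethod
    def read_state(self):
        """Return the current observation (e.g. joint positions)."""


class SimulatedRobot(RobotInterface):
    """Stand-in for a MuJoCo-backed implementation; a RealRobot
    class backed by ros_control would implement the same methods."""

    def __init__(self):
        self._state = [0.0, 0.0]

    def apply_action(self, action):
        # A real implementation would step the simulator here.
        self._state = [s + a for s, a in zip(self._state, action)]

    def read_state(self):
        return list(self._state)


def run_episode(robot: RobotInterface, policy, steps=10):
    """The RL loop is written against the interface only, so the
    simulated and real backends are interchangeable."""
    trajectory = []
    for _ in range(steps):
        obs = robot.read_state()
        robot.apply_action(policy(obs))
        trajectory.append(obs)
    return trajectory
```

This mirrors how Gym's `Env.step`/`Env.reset` API decouples algorithms from environments; the open question in my setup is which library provides such a backend for a physical robot.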

If I missed anything, misunderstood, or misrepresented something, please feel free to correct me and suggest your ideas.


Comments

Now, it came to my attention that this stack is not well suited for real-time robot control

it would probably help future readers and posters if you could explain where you read this, or how you came to this conclusion.

Also: what does "real-time" mean to you? Deterministic, or fast enough?

gvdhoorn  ( 2019-04-08 07:40:09 -0500 )

Thanks for the comment. I updated the question.

ilinojkovic  ( 2019-04-08 08:19:15 -0500 )

Because of these two, intervals of executed control actions wouldn't be predictable.

So "real-time" would be deterministic in your case.

gvdhoorn  ( 2019-04-08 08:20:47 -0500 )

Yes. Performance shouldn't be a problem at this stage.

ilinojkovic  ( 2019-04-08 08:25:06 -0500 )