Hi,

Sorry for the slow reply. I do have methods to model continuous domains in the package, such as M5 regression trees or linear regression models. However, all of the planning methods that I have implemented for the model-based methods eventually store a discrete value function, so for planning you would have to discretize the state.

For UCT planning, the rollouts happen in continuous space, using the continuous model, and the values are then backed up to a discretized state. This makes the method less dependent on a good discretization: you won't have state aliasing problems if the discretization is too coarse, because each next state in the rollout still comes from the continuous model and state. The discretization only matters for UCT(lambda) with lambda < 1, where the update for a state bootstraps from the value of the next state. If you run it with lambda = 1, the agent's current state is updated only with the full return of the planning rollout, and the discretization isn't really necessary.
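
To make the lambda = 1 point concrete, here is a minimal Python sketch of a lambda-return computed backwards over one planning rollout. It is not code from the package: the function name, its arguments, and the choice to treat the return beyond the end of the rollout as zero are all assumptions made for illustration.

```python
def lambda_returns(rewards, next_values, gamma, lam):
    """Lambda-returns for one rollout, computed backwards.

    rewards[t]     : reward observed at step t of the rollout
    next_values[t] : stored (discretized) value estimate of the state reached at step t
    Recursion: G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1})
    """
    returns = [0.0] * len(rewards)
    g = 0.0  # assumption: the return beyond the end of the rollout is zero
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g)
        returns[t] = g
    return returns


rewards = [1.0, 0.0, 2.0]
next_values = [10.0, -5.0, 3.0]  # made-up stored values

# lam = 1: pure rollout return, the stored values are never used.
mc = lambda_returns(rewards, next_values, gamma=0.9, lam=1.0)
# lam = 0: one-step bootstrapping, fully dependent on the stored values.
td = lambda_returns(rewards, next_values, gamma=0.9, lam=0.0)
```

With lam = 1 the `(1 - lam)` factor removes every stored value from the target, which is why the quality of the discretization stops mattering for the update.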

So if you want to do planning in a continuous and infinite domain, I would recommend planning with UCT(lambda) and setting lambda = 1. The model-based methods take as input the minimum and maximum values for each state feature, which are used to bound the values that are updated, as well as the number of discrete values to discretize each feature into. If you can put some bounds on the state features (even ones that are wildly too big) and then pass 1 as the nstates parameter (the number of discrete values per feature), I think it should work.
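
As a rough illustration of what bounds plus a bin count do, here is a small Python sketch of clipping a continuous state to per-feature bounds and mapping it to a discrete cell. This is a hypothetical helper, not the package's actual discretization code; `discretize`, `feat_min`, `feat_max`, and `n_bins` are names invented for the example.

```python
def discretize(state, feat_min, feat_max, n_bins):
    """Map a continuous state to a tuple of bin indices, one per feature."""
    cell = []
    for x, lo, hi in zip(state, feat_min, feat_max):
        x = min(max(x, lo), hi)            # clip the feature to its bounds
        width = (hi - lo) / n_bins         # width of one bin for this feature
        idx = min(int((x - lo) / width), n_bins - 1)  # upper bound falls in the last bin
        cell.append(idx)
    return tuple(cell)


# Ten bins per feature: nearby states can share a cell, distant ones do not.
cell_a = discretize([3.7, -250.0], [-100.0, -100.0], [100.0, 100.0], 10)

# One bin per feature with very loose bounds: every state maps to the same cell.
cell_b = discretize([3.7, -250.0], [-1e6, -1e6], [1e6, 1e6], 1)
```

In this sketch a bin count of 1 collapses the whole space into a single cell, which is harmless only when the update never bootstraps from stored values, i.e. the lambda = 1 case.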

Another alternative is to adapt one of the model-free methods with function approximation (e.g. Q-learning with tile coding) to serve as the planner for the model-based method, using the continuous models.
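
A very rough Python sketch of that idea follows: Q-learning over tile-coded features, updated from transitions sampled out of a model rather than from the real environment. Nothing here comes from the package. `active_tiles`, `TileQ`, `plan`, and `toy_model` are hypothetical names, the tile coder is a simplified one (uniformly offset tilings, dictionary weights), and the exploration during planning is uniformly random for brevity.

```python
import math
import random


def active_tiles(state, feat_min, feat_max, n_tilings, tiles_per_dim):
    """Return one hashable tile id per tiling for a continuous state."""
    tiles = []
    for k in range(n_tilings):
        offset = k / n_tilings  # each tiling is shifted by a fraction of a tile
        coords = tuple(
            math.floor((x - lo) / (hi - lo) * tiles_per_dim + offset)
            for x, lo, hi in zip(state, feat_min, feat_max)
        )
        tiles.append((k, coords))
    return tiles


class TileQ:
    """Linear Q-function over tile-coded features with a Q-learning update."""

    def __init__(self, feat_min, feat_max, n_actions, n_tilings=8,
                 tiles_per_dim=10, alpha=0.5, gamma=0.9):
        self.feat_min, self.feat_max = feat_min, feat_max
        self.n_actions = n_actions
        self.n_tilings, self.tiles_per_dim = n_tilings, tiles_per_dim
        self.alpha, self.gamma = alpha, gamma
        self.w = {}  # weight per (tile, action); missing entries count as 0

    def _tiles(self, state):
        return active_tiles(state, self.feat_min, self.feat_max,
                            self.n_tilings, self.tiles_per_dim)

    def value(self, state, action):
        return sum(self.w.get((t, action), 0.0) for t in self._tiles(state))

    def update(self, s, a, r, s_next, done):
        target = r
        if not done:
            target += self.gamma * max(self.value(s_next, b)
                                       for b in range(self.n_actions))
        # Split the step across tilings so alpha = 1 moves Q(s, a) onto the target.
        step = self.alpha / self.n_tilings * (target - self.value(s, a))
        for t in self._tiles(s):
            self.w[(t, a)] = self.w.get((t, a), 0.0) + step


def plan(q, model, start_state, n_rollouts, depth, rng):
    """Update q from simulated rollouts; model(s, a) returns (s_next, r, done)."""
    for _ in range(n_rollouts):
        s = start_state
        for _ in range(depth):
            a = rng.randrange(q.n_actions)  # random exploration, for simplicity
            s_next, r, done = model(s, a)
            q.update(s, a, r, s_next, done)
            if done:
                break
            s = s_next


def toy_model(s, a):
    """Stand-in for a learned continuous model: 1-D walk with a goal at x >= 9."""
    x = min(max(s[0] + (0.5 if a == 1 else -0.5), 0.0), 10.0)
    done = x >= 9.0
    return [x], (1.0 if done else 0.0), done


q = TileQ(feat_min=[0.0], feat_max=[10.0], n_actions=2)
plan(q, toy_model, start_state=[8.8], n_rollouts=30, depth=1, rng=random.Random(0))
```

In a real setup `toy_model` would be replaced by queries to the learned continuous model, and the agent would act greedily with respect to `q.value` at its current state after planning.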

Thanks,
Todd