
toddhester's profile - activity

2013-03-13 06:13:25 -0500 answered a question Node Agent texplore "killed", using reinforcement learning pkg

Hi,

It would help to see what command(s) you used to start the experiment.

My guess is that it's because you need to provide a discretization for the value function in these continuous domains. You can do that with the --nstates x option, which discretizes each dimension into x discrete states for the value function. On mountain car, I believe --nstates 10 or --nstates 20 works well. You can try that, or you can run an experiment on a discrete task such as taxi or tworooms and see how those go.
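
For example, something along the lines of "rosrun rl_experiment experiment --agent texplore --env mountaincar --nstates 10". Apart from --nstates itself, the executable, flag, and environment names there are my best guess at the rl_experiment interface, so keep whatever command you are already using and just add the --nstates option.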

-Todd

2012-11-23 18:59:28 -0500 received badge Teacher
2012-08-09 09:53:10 -0500 answered a question ROS rl (texplore) package

Hi,

Sorry for the slow reply. I do have methods in the package to model continuous domains, such as M5 regression trees or linear regression models. However, all of the planning methods that I have implemented for the model-based methods eventually store a discrete value function, so for the planning you would have to discretize.

For the UCT planning, the planning rollouts happen in continuous space, using the continuous model, and the values are then written back to a discretized state. So this planning method is less dependent on a good discretization: you won't get state aliasing problems if the discretization is too coarse, because each next state in the rollout still comes from the continuous model and state. The discretization is only necessary for UCT(lambda) with lambda < 1, where the update for a state bootstraps on the value of the (discretized) next state. If you run it with lambda = 1, the agent's current state is updated only with the full return of the planning rollout, and the discretization isn't really necessary.
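
To illustrate why lambda = 1 removes the dependence on the discretization, here is a minimal sketch of the recursive lambda-return update. This is my own toy illustration, not code from the package: the model, discretizer, and action selection below are made-up stand-ins for texplore's real components.

    // Recursive lambda-return: G = r + gamma * (lambda * G' + (1 - lambda) * V(disc(s'))).
    // With lambda = 1 the V(disc(s')) term vanishes, so the target is the full
    // Monte Carlo return and the discretization only decides where G is stored.
    #include <map>
    #include <random>
    #include <utility>
    #include <vector>

    using State = std::vector<float>;   // continuous state features
    using Key   = std::vector<int>;     // discretized feature indices

    std::mt19937 rng{0};
    std::map<Key, float> V;             // discrete value function
    const float kGamma = 0.99f;

    Key discretize(const State& s) {    // toy discretizer: 0.1-wide bins
      Key k;
      for (float f : s) k.push_back(static_cast<int>(f * 10.0f));
      return k;
    }

    std::pair<State, float> sampleModel(const State& s, int a) {
      // Toy continuous model: move left/right with noise, -1 step cost.
      std::normal_distribution<float> noise(0.0f, 0.05f);
      return {{s[0] + (a == 0 ? -0.1f : 0.1f) + noise(rng)}, -1.0f};
    }

    int chooseAction(const State&) {    // placeholder for UCB action selection
      return std::uniform_int_distribution<int>(0, 1)(rng);
    }

    float lambdaReturn(const State& s, int depth, float lambda) {
      if (depth == 0) return 0.0f;      // horizon reached; no bootstrap needed
      int a = chooseAction(s);
      auto [next, reward] = sampleModel(s, a);             // continuous step
      float tail = lambdaReturn(next, depth - 1, lambda);  // rest of rollout
      float g = reward +
                kGamma * (lambda * tail + (1.0f - lambda) * V[discretize(next)]);
      V[discretize(s)] = g;             // simple overwrite; a real planner averages
      return g;
    }

    int main() {
      for (int i = 0; i < 100; ++i) lambdaReturn({0.0f}, 20, 1.0f);
    }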

So if you want to do planning in a continuous and infinite domain, I would recommend planning with UCT(lambda) and setting lambda = 1. The model-based methods take as input the minimum and maximum values for each state feature, which are used to bound the values that are updated, as well as the number of discrete values to discretize each feature into. If you can put some bounds on the state features (even ones that are wildly too big) and then pass 1 as the nstates parameter, I think it should work.
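
To make the bounds-plus-nstates mechanics concrete, here is how a feature bound and an nstates count map a continuous value to a bin index. The function name and interface are my own invention for illustration, not texplore's actual API:

    #include <algorithm>

    // Map a continuous feature value into one of nstates bins over
    // [featMin, featMax]. With nstates = 1 every value lands in bin 0, so
    // the discrete value function collapses to a single cell per feature,
    // which is harmless when planning with lambda = 1 (no bootstrapping).
    int binIndex(float value, float featMin, float featMax, int nstates) {
      value = std::min(std::max(value, featMin), featMax);  // clamp to bounds;
                                                            // loose bounds are fine
      float width = (featMax - featMin) / nstates;
      int idx = static_cast<int>((value - featMin) / width);
      return std::min(idx, nstates - 1);  // keep the top edge in range
    }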

Another alternative is to adapt one of the model-free methods with function approximation (e.g. Q-learning with tile coding) to serve as the planner for the model-based method, using the continuous models.
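
If you go that route, the planner would repeatedly sample transitions from the continuous model and feed them to the function-approximation learner. As a rough sketch of the learner side (my own illustration, not code from the package), Q-learning over a tile-coded value function for a single state feature might look like:

    #include <algorithm>
    #include <vector>

    struct TiledQ {
      int tilings, tiles, actions;
      float lo, hi;                 // bounds of the 1-D state feature
      std::vector<float> w;         // weights: tilings x tiles x actions
      TiledQ(int nt, int nx, int na, float l, float h)
          : tilings(nt), tiles(nx), actions(na), lo(l), hi(h),
            w(nt * nx * na, 0.0f) {}

      // Active tile in tiling t for state x and action a; each tiling is
      // offset by a fraction of a tile width, the usual tile-coding trick.
      int index(int t, float x, int a) const {
        float tileWidth = (hi - lo) / tiles;
        float u = (x - lo + tileWidth * t / tilings) / (hi - lo);
        int tile = std::min(std::max(static_cast<int>(u * tiles), 0), tiles - 1);
        return (t * tiles + tile) * actions + a;
      }

      float q(float x, int a) const {
        float sum = 0.0f;
        for (int t = 0; t < tilings; ++t) sum += w[index(t, x, a)];
        return sum;
      }

      // One Q-learning update toward r + gamma * max_a' Q(x', a'), using a
      // transition (x, a, r, xn) sampled from the continuous model.
      void update(float x, int a, float r, float xn, float gamma, float alpha) {
        float best = q(xn, 0);
        for (int an = 1; an < actions; ++an) best = std::max(best, q(xn, an));
        float delta = r + gamma * best - q(x, a);
        for (int t = 0; t < tilings; ++t)
          w[index(t, x, a)] += alpha / tilings * delta;
      }
    };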

Thanks, Todd
