I am currently working on the famous Mountain Car problem from reinforcement learning. The problem is continuous: I have two state variables, position (ranging from -1.2 to 0.5) and velocity (ranging from -0.07 to 0.07), and three possible actions: reverse acceleration, forward acceleration and neutral. Each action changes the position in the corresponding direction. Because of how acceleration is calculated, the position variable is continuous, so I can't use a lookup table directly. I therefore tried to divide the position-velocity plane into rectangular sectors, splitting position into buckets of width 0.05 and velocity into buckets of width 0.005, and assigning each sector an index. I did it like this:
public int discretiseObservation(Observation observation) {
    double position = observation.getDouble(0);
    double velocity = observation.getDouble(1);
    boolean positionNegativeFlag = position < 0;
    boolean velocityNegativeFlag = velocity < 0;
    double absolutePosition = Math.abs(position);
    double absoluteVelocity = Math.abs(velocity);
    double discretePosition = Math.floor(absolutePosition / 0.05);
    double discreteVelocity = Math.floor(absoluteVelocity / 0.005);
    if (velocityNegativeFlag) {
        discreteVelocity += 14;
    }
    if (positionNegativeFlag) {
        discretePosition += 10;
    }
    return (int) discretePosition * 28 + (int) discreteVelocity;
}
But this scheme results in some sectors having the same index number. Do you have any idea how I can discretize these two continuous variables?
Update: Sorry, I forgot to mention that when position or velocity exceeds its maximum or minimum value, I set it back to that maximum or minimum.
You are overcomplicating things a bit with all those sign checks. Also, you should avoid magic constants; give them meaningful names. The discretization code should look like this:
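A minimal sketch of such a discretization, assuming 34 position buckets and 28 velocity buckets (these correspond to the 0.05 and 0.005 widths from the question). The class and method names are hypothetical, and plain doubles stand in for the `Observation` object:

```java
public final class MountainCarDiscretizer {
    // Bounds taken from the question.
    static final double MIN_POSITION = -1.2;
    static final double MAX_POSITION = 0.5;
    static final double MIN_VELOCITY = -0.07;
    static final double MAX_VELOCITY = 0.07;

    // Assumed bucket counts: 1.7 / 34 = 0.05 and 0.14 / 28 = 0.005.
    static final int POSITION_BUCKETS = 34;
    static final int VELOCITY_BUCKETS = 28;

    private MountainCarDiscretizer() {
    }

    // Maps a value in [min, max] to a bucket index in [0, buckets - 1].
    static int bucket(double value, double min, double max, int buckets) {
        // Clamp first, mirroring the clipping described in the question.
        double clamped = Math.max(min, Math.min(max, value));
        // Shifting by min makes the value non-negative, so the cast acts as floor.
        int index = (int) ((clamped - min) / (max - min) * buckets);
        // The upper bound itself would land in bucket `buckets`; fold it into the last one.
        return Math.min(index, buckets - 1);
    }

    // Combines the two bucket indices into one state index (row-major layout).
    public static int discretise(double position, double velocity) {
        int positionIndex = bucket(position, MIN_POSITION, MAX_POSITION, POSITION_BUCKETS);
        int velocityIndex = bucket(velocity, MIN_VELOCITY, MAX_VELOCITY, VELOCITY_BUCKETS);
        return positionIndex * VELOCITY_BUCKETS + velocityIndex;
    }

    // Total number of discrete states, e.g. for sizing a Q-table.
    public static int stateCount() {
        return POSITION_BUCKETS * VELOCITY_BUCKETS;
    }
}
```

With the question's interface this would be called as `MountainCarDiscretizer.discretise(observation.getDouble(0), observation.getDouble(1))`. Shifting each value by its minimum removes the need for any sign handling. In the question's scheme, adding 14 (or 10) for negative values does not separate the two halves: a velocity of 0.07 gives floor(0.07 / 0.005) = 14, the same bucket as a small negative velocity such as -0.001 (0 + 14), which is one source of the duplicate indices.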