I am currently working on famous Mountain Car problem from reinforcement learning. This problem

Question

Asked: June 4, 2026 (2026-06-04T16:25:22+00:00)

I am currently working on the famous Mountain Car problem from reinforcement learning. The problem is continuous: there are two state variables, position, ranging from -1.2 to 0.5, and velocity, ranging from -0.07 to 0.07. There are 3 possible actions: reverse acceleration, forward acceleration, and neutral; each action changes the position in the corresponding direction. Because of how acceleration is calculated, my position variable is continuous, which means I can't use a lookup table directly. So I tried to divide the position-velocity plane into rectangular sectors, splitting position into buckets of width 0.05 and velocity into buckets of width 0.005, and assigning each sector an index. I did it like this:

public int discretiseObservation(Observation observation) {
    double position = observation.getDouble(0);
    double velocity = observation.getDouble(1);

    boolean positionNegativeFlag = position < 0;
    boolean velocityNegativeFlag = velocity < 0;

    double absolutePosition = Math.abs(position);
    double absoluteVelocity = Math.abs(velocity);

    double discretePosition = Math.floor(absolutePosition / 0.05);
    double discreteVelocity = Math.floor(absoluteVelocity / 0.005);

    if(velocityNegativeFlag) {
        discreteVelocity += 14;
    }

    if(positionNegativeFlag) {
        discretePosition += 10;
    }

    return (int)discretePosition * 28 + (int)discreteVelocity;
}

But this scheme results in some sectors having the same index number. Do you have any idea how I can discretize these two continuous variables?

Update: Sorry, I forgot to mention that when position or velocity exceeds its maximum or minimum value, I set it back to that maximum or minimum value.


1 Answer

Editorial Team · Answer 1 · 2026-06-04T16:25:24+00:00

You are overcomplicating things a bit with all those sign checks. Also, avoid magic constants; give them meaningful names. The discretization code should look like this:

double normalize(double value, double min, double max) {
    return (value - min) / (max - min);
}

int clamp(int value, int min, int max) {
    if (value < min) value = min;
    if (value > max) value = max;
    return value;
}

int discretize(double value, double min, double max, int binCount) {
    int discreteValue = (int) (binCount * normalize(value, min, max));
    return clamp(discreteValue, 0, binCount - 1);
}

public int discretizeObservation(Observation observation) {
    int position = discretize(observation.getDouble(0), minPosition, maxPosition, positionBinCount);
    int velocity = discretize(observation.getDouble(1), minVelocity, maxVelocity, velocityBinCount);
    return position * velocityBinCount + velocity;
}
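
To make the snippet above concrete, here is a minimal self-contained sketch. The bounds come from the question; the bin counts 34 and 28 are an assumption derived from the bucket widths 0.05 and 0.005 (1.7 / 0.05 and 0.14 / 0.005). The class name, the constant names, and the `countDistinctCellIndices` helper are hypothetical, and position and velocity are passed as plain doubles because the `Observation` type is not shown here.

```java
import java.util.HashSet;
import java.util.Set;

final class MountainCarDiscretizer {
    // Bounds taken from the question.
    static final double MIN_POSITION = -1.2;
    static final double MAX_POSITION = 0.5;
    static final double MIN_VELOCITY = -0.07;
    static final double MAX_VELOCITY = 0.07;

    // Assumed bin counts: 1.7 / 0.05 = 34 and 0.14 / 0.005 = 28.
    static final int POSITION_BIN_COUNT = 34;
    static final int VELOCITY_BIN_COUNT = 28;

    // Maps value from [min, max] onto [0, 1].
    static double normalize(double value, double min, double max) {
        return (value - min) / (max - min);
    }

    static int clamp(int value, int min, int max) {
        return Math.max(min, Math.min(max, value));
    }

    // Returns a bin number in 0..binCount-1; out-of-range values land in the edge bins.
    static int discretize(double value, double min, double max, int binCount) {
        int discreteValue = (int) (binCount * normalize(value, min, max));
        return clamp(discreteValue, 0, binCount - 1);
    }

    // Row-major index: the velocity bin is always below VELOCITY_BIN_COUNT,
    // so two different (position bin, velocity bin) pairs never share an index.
    static int discretizeObservation(double position, double velocity) {
        int positionBin = discretize(position, MIN_POSITION, MAX_POSITION, POSITION_BIN_COUNT);
        int velocityBin = discretize(velocity, MIN_VELOCITY, MAX_VELOCITY, VELOCITY_BIN_COUNT);
        return positionBin * VELOCITY_BIN_COUNT + velocityBin;
    }

    // Hypothetical check: samples the centre of every cell and counts the distinct indices.
    static int countDistinctCellIndices() {
        double positionStep = (MAX_POSITION - MIN_POSITION) / POSITION_BIN_COUNT;
        double velocityStep = (MAX_VELOCITY - MIN_VELOCITY) / VELOCITY_BIN_COUNT;
        Set<Integer> seen = new HashSet<>();
        for (int p = 0; p < POSITION_BIN_COUNT; p++) {
            for (int v = 0; v < VELOCITY_BIN_COUNT; v++) {
                double position = MIN_POSITION + (p + 0.5) * positionStep;
                double velocity = MIN_VELOCITY + (v + 0.5) * velocityStep;
                seen.add(discretizeObservation(position, velocity));
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int cellCount = POSITION_BIN_COUNT * VELOCITY_BIN_COUNT;
        System.out.println(countDistinctCellIndices() + " distinct indices for " + cellCount + " cells");
    }
}
```

With these assumed bin counts, every cell gets its own index between 0 and 951, so a lookup table with 952 rows (one per cell) and 3 columns (one per action) would be enough. Values outside the bounds are clamped to the edge bins, which matches the clamping described in the question's update.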
