Mountain Car 3D

The 3D Mountain Car task extends the standard 2D Mountain Car task and was originally proposed in [1]. The state is described by four continuous variables: the positions x and y, each with range [-1.2, 0.6], and the velocities vx and vy, each with range [-0.007, 0.007]. The available actions are {Neutral, West, East, South, North}. The Neutral action has no effect on the velocity of the car. West and East add -0.001 and 0.001 to vx respectively, while South and North add -0.001 and 0.001 to vy respectively. Additionally, to simulate the effect of gravity, the terms -0.0025*cos(3x) and -0.0025*cos(3y) are added to vx and vy respectively at each time step. In the standard task each episode starts with the car at the bottom of the hill, and the goal state is reached when x >= 0.5 and y >= 0.5. At each time step the agent receives a reward of -1.
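
The dynamics described above can be sketched as a single transition function. This is a minimal illustration, not the reference implementation from [1]. It takes North as the action that increases vy. Three details are assumptions carried over from the usual 2D formulation rather than stated in the text: positions advance by the updated velocity, positions and velocities are clipped to their ranges, and the bottom of the hill is placed at x = y = -pi/6 with zero velocity. The names step, initial_state and clip are hypothetical.

```python
import math

# Constants taken from the task description.
POS_MIN, POS_MAX = -1.2, 0.6
VEL_MIN, VEL_MAX = -0.007, 0.007
ACTION_FORCE = 0.001
GRAVITY = -0.0025
GOAL = 0.5

# Velocity increment (dvx, dvy) contributed by each action.
ACTIONS = {
    "Neutral": (0.0, 0.0),
    "West": (-ACTION_FORCE, 0.0),
    "East": (ACTION_FORCE, 0.0),
    "South": (0.0, -ACTION_FORCE),
    "North": (0.0, ACTION_FORCE),
}


def clip(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def initial_state():
    """Start state (x, y, vx, vy).

    Assumption: the bottom of the hill is at x = y = -pi/6,
    where cos(3x) is zero, and the car starts at rest.
    """
    bottom = -math.pi / 6
    return (bottom, bottom, 0.0, 0.0)


def step(state, action):
    """Apply one time step and return (next_state, reward, done)."""
    x, y, vx, vy = state
    dvx, dvy = ACTIONS[action]

    # Action effect plus the gravity term on each axis.
    # Assumption: velocities are clipped to their stated range.
    vx = clip(vx + dvx + GRAVITY * math.cos(3 * x), VEL_MIN, VEL_MAX)
    vy = clip(vy + dvy + GRAVITY * math.cos(3 * y), VEL_MIN, VEL_MAX)

    # Assumption: position advances by the updated velocity and is clipped.
    x = clip(x + vx, POS_MIN, POS_MAX)
    y = clip(y + vy, POS_MIN, POS_MAX)

    done = x >= GOAL and y >= GOAL
    return (x, y, vx, vy), -1, done
```

An episode would then be run by calling initial_state() once and feeding each returned state back into step until done is true.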