Statements (65)
Predicate | Object |
---|---|
gptkbp:instance_of | gptkb:gymnasium |
gptkbp:action | Discrete |
gptkbp:action | Move left or right |
gptkbp:action_space_size | 2 |
gptkbp:agent | Iterative process |
gptkbp:agent | RL agent |
gptkbp:analyzes | Graphical representation of the cart and pole |
gptkbp:available_in | OpenAI Gym library |
gptkbp:common_algorithms | PPO, A3C |
gptkbp:community_support | Strong community support |
gptkbp:created_by | gptkb:Open_AI |
gptkbp:description | A reinforcement learning environment where a pole is balanced on a cart. |
gptkbp:difficulty | Easy to moderate |
gptkbp:difficulty_levels | Beginner-friendly |
gptkbp:discount_factor | Gamma value used in RL |
gptkbp:environment | Classic control |
gptkbp:environment_features | Simple dynamics |
gptkbp:environment_id | gptkb:Cart_Pole-v2 |
gptkbp:environment_reset | Randomized initial conditions |
gptkbp:environment_type | gptkb:Control |
gptkbp:environment_updates | Regularly maintained |
gptkbp:evaluates | After every few episodes |
gptkbp:evaluates | Average reward per episode |
gptkbp:first_released | gptkb:2016 |
gptkbp:goal | Keep the pole balanced for as long as possible |
gptkbp:gym_version | 0.21.0 |
gptkbp:has_function | Resets the environment to an initial state |
https://www.w3.org/2000/01/rdf-schema#label | CartPole-v2 |
gptkbp:hyperparameter_tuning | Common practice |
gptkbp:initial_state | Randomly generated |
gptkbp:input_output | Continuous and discrete |
gptkbp:input_output | Discrete action selection |
gptkbp:is_explored_in | Epsilon-greedy |
gptkbp:is_implemented_in | gptkb:Python |
gptkbp:is_popular_in | Academic research |
gptkbp:is_taught_in | Varies by algorithm |
gptkbp:latest_version | v2 |
gptkbp:learning_algorithms_used | gptkb:DQN |
gptkbp:library | TensorFlow, PyTorch |
gptkbp:max_episode_steps | gptkb:500 |
gptkbp:observation_space | Continuous |
gptkbp:performance | Cumulative reward |
gptkbp:policy_type | Stochastic or deterministic |
gptkbp:prize_pool | 1 for every timestep the pole remains upright |
gptkbp:real_world_applications | Robotics, control systems |
gptkbp:related_to | Control theory |
gptkbp:render_method | Visualizes the environment |
gptkbp:requires | Gym library |
gptkbp:reward_shaping | Not typically used |
gptkbp:seed_method | Sets the random seed for reproducibility |
gptkbp:simulation_speed | Real-time simulation |
gptkbp:state | Cart position, cart velocity, pole angle, pole velocity at tip |
gptkbp:state_normalization | Not required |
gptkbp:state_space | 4-dimensional vector |
gptkbp:state_space_size | gptkb:4 |
gptkbp:step_method | Takes an action and returns the next state, reward, done, and info |
gptkbp:success_rate | Varies by algorithm |
gptkbp:termination_condition | Pole angle exceeds 15 degrees or cart position exceeds 2.4 |
gptkbp:training | Varies by algorithm |
gptkbp:tutorials | Many online tutorials |
gptkbp:used_for | Benchmarking RL algorithms |
gptkbp:used_in | Reinforcement Learning research |
gptkbp:user_base | Researchers, students, hobbyists |
gptkbp:bfsParent | gptkb:Cart_Pole-v1 |
gptkbp:bfsLayer | 5 |
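The reset, step, seed, and render statements above describe the standard Gym interaction loop. The following is a minimal sketch of that loop, assuming Gym 0.21.0 (the listed gym_version), whose `step` returns `(observation, reward, done, info)` and whose `reset` returns only the observation; the ID string `"CartPole-v1"` is an assumption, used because stock Gym releases register the 500-step cart-pole under that name rather than the table's `Cart_Pole-v2`.

```python
import gym  # Gym 0.21.0, matching gptkbp:gym_version

# Environment ID is an assumption: the table lists gptkb:Cart_Pole-v2, but stock
# Gym releases register the 500-step variant as "CartPole-v1".
env = gym.make("CartPole-v1")

env.seed(42)       # seed_method: sets the random seed for reproducibility
obs = env.reset()  # environment_reset: randomized initial conditions
total_reward = 0.0
done = False

while not done:
    action = env.action_space.sample()          # Discrete(2): move left (0) or right (1)
    obs, reward, done, info = env.step(action)  # step_method: next state, reward, done, info
    total_reward += reward                      # +1 for every timestep the pole remains upright
    # obs is the 4-dimensional state vector:
    # [cart position, cart velocity, pole angle, pole velocity at tip]

print(f"Episode return: {total_reward}")
env.close()
```

With random actions an episode usually terminates after a few dozen steps; an agent trained with the listed algorithms (DQN, PPO, A3C) should approach the 500-step episode cap.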