gptkbp:instance_of
|
gptkb:gymnasium
|
gptkbp:accessibility
|
Open-source.
|
gptkbp:action
|
Discrete.
Chosen according to the agent's policy.
|
gptkbp:action_space_size
|
2 (push the cart left or push the cart right).
|
gptkbp:agent
|
Varies.
Periodic.
Iterative.
Policy Gradient.
|
gptkbp:agent_adaptability
|
High.
|
gptkbp:agent_deployment
|
Easy.
|
gptkbp:agent_exploration
|
Critical.
|
gptkbp:agent_learning_curve
|
Steep.
|
gptkbp:agent_optimization
|
Necessary.
|
gptkbp:agent_performance_metrics
|
Reward.
|
gptkbp:agent_testing
|
Essential.
|
gptkbp:agent_training_duration
|
Varies.
|
gptkbp:agent_types_used
|
Various.
|
gptkbp:analyzes
|
Available.
|
gptkbp:application
|
Yes.
|
gptkbp:available_in
|
gptkb:gymnasium
|
gptkbp:can
|
Available.
|
gptkbp:cart_position
|
Continuous.
|
gptkbp:cart_velocity
|
Continuous.
|
gptkbp:code
|
Available on GitHub.
|
gptkbp:community_support
|
Strong.
|
gptkbp:contribution
|
Active.
|
gptkbp:created_by
|
gptkb:Open_AI
|
gptkbp:dependency
|
NumPy.
|
gptkbp:description
|
A classic control problem where a pole is balanced on a cart.
|
gptkbp:difficulty
|
Easy.
|
gptkbp:difficulty_levels
|
Beginner.
|
gptkbp:discount_factor
|
Varies.
|
gptkbp:educational_use
|
Yes.
|
gptkbp:environment
|
Simple.
|
gptkbp:environment_challenges
|
Balancing.
|
gptkbp:environment_complexity
|
Low.
|
gptkbp:environment_documentation
|
Comprehensive.
|
gptkbp:environment_features
|
Simple dynamics.
|
gptkbp:environment_feedback_loop
|
Present.
|
gptkbp:environment_id
|
CartPole-v0.
|
gptkbp:environment_limitations
|
Simple dynamics.
|
gptkbp:environment_maintenance
|
Active.
|
gptkbp:environment_popularity
|
High.
|
gptkbp:environment_reset
|
Random.
|
gptkbp:environment_scalability
|
Limited.
|
gptkbp:environment_simplicity
|
High.
|
gptkbp:environment_type
|
gptkb:Control
|
gptkbp:environment_updates
|
Regular.
|
gptkbp:environment_updates_frequency
|
Regular.
|
gptkbp:environment_use_cases
|
Reinforcement Learning experiments.
|
gptkbp:environment_variability
|
Low.
|
gptkbp:evaluates
|
Average reward.
|
gptkbp:feedback_mechanism
|
Reward system.
|
gptkbp:goal
|
Keep the pole balanced for as long as possible.
|
gptkbp:has_function
|
Reset to a random state.
|
https://www.w3.org/2000/01/rdf-schema#label
|
CartPole-v0
|
gptkbp:initial_state
|
Randomly generated.
|
gptkbp:is_analyzed_in
|
gptkb:machine_learning
|
gptkbp:is_explored_in
|
Epsilon-greedy.
|
gptkbp:is_implemented_in
|
Python.
Numerous.
|
gptkbp:is_recommended_for
|
Newcomers to RL.
|
gptkbp:is_taught_in
|
Varies.
|
gptkbp:latest_version
|
0.1.0
|
gptkbp:learning_algorithms
|
Q-learning.
|
gptkbp:library
|
Gym.
|
gptkbp:max_episode_steps
|
gptkb:200
|
gptkbp:observation_space
|
Continuous.
|
gptkbp:observation_space_size
|
4.
|
gptkbp:performance
|
Varies by agent.
|
gptkbp:pole_angular_velocity
|
Continuous.
|
gptkbp:pole_length
|
0.5 meters (half the pole's length, as used in the dynamics).
|
gptkbp:pole_position
|
Continuous.
|
gptkbp:policy_type
|
Stochastic.
|
gptkbp:prize_pool
|
A reward of +1 for every timestep the pole remains upright.
|
gptkbp:release_date
|
gptkb:2016
|
gptkbp:reward_function
|
Simple.
|
gptkbp:reward_shaping
|
Not used.
|
gptkbp:similar_environments
|
MountainCar-v0.
|
gptkbp:simulation_speed
|
Real-time.
|
gptkbp:state
|
Vector.
|
gptkbp:state_normalization
|
Yes.
|
gptkbp:state_space
|
4-dimensional vector.
|
gptkbp:state_transition
|
Deterministic.
|
gptkbp:termination_condition
|
Pole angle exceeds 12 degrees from vertical, or the cart moves more than 2.4 units from the center.
|
gptkbp:training
|
Varies by algorithm.
|
gptkbp:tutorials
|
Yes.
|
gptkbp:user_base
|
Large.
|
gptkbp:user_feedback
|
Positive.
|
gptkbp:user_interface
|
Command line.
|
gptkbp:bfsParent
|
gptkb:Cart_Pole-v1
gptkb:Mountain_Car-v0
|
gptkbp:bfsLayer
|
5
|
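The episode rules recorded above (two discrete actions, a 4-dimensional state, +1 reward per timestep, a 200-step cap, and epsilon-greedy exploration) can be sketched in plain Python without installing Gym. This is a minimal illustration, not Gym's implementation: the helper names `is_terminated`, `episode_return`, and `epsilon_greedy` are hypothetical, and the 12-degree angle limit and 2.4-unit cart-position limit are assumed from Gym's CartPole behavior rather than taken from a call into the library.

```python
import math
import random

# Values taken from the record above.
N_ACTIONS = 2               # action_space_size
MAX_EPISODE_STEPS = 200     # max_episode_steps

# Assumptions: thresholds as used by Gym's CartPole environment.
ANGLE_LIMIT_RAD = math.radians(12)
POSITION_LIMIT = 2.4


def is_terminated(state):
    """Return True if the 4-dimensional state violates a failure bound.

    state = (cart position, cart velocity, pole angle in radians,
             pole angular velocity)
    """
    x, _x_dot, theta, _theta_dot = state
    return abs(x) > POSITION_LIMIT or abs(theta) > ANGLE_LIMIT_RAD


def episode_return(states):
    """Sum +1 per step until a state terminates or the step cap is hit.

    `states` is the sequence of states observed after each step; the
    terminating step itself still earns its +1 reward.
    """
    total = 0.0
    for step, state in enumerate(states):
        if step >= MAX_EPISODE_STEPS:
            break
        total += 1.0
        if is_terminated(state):
            break
    return total


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With these helpers, an episode in which the pole never falls returns at most 200, which is why average reward per episode is a natural evaluation metric for this environment.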