Cube Robot Archives - The Construct

[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step #Part5

In this new ROS Project you are going to learn Step-by-Step how to create a robot cube that moves and that it learns to move using OpenAI environment.

In this fifth and last video of the cube series, we create the Robot environment for OpenAI Gym for the moving cube. You will learn all the details on why we do it like this and we finally execute the script to make the robot cube learn how to walk in one direction.

For any suggestion on the next AI project that we could do, please leave us a comment, we will be happy to hear your ideas :).

OpenAi Course

Moving Cube Git:
https://bitbucket.org/theconstructcore/moving_cube/src/master/
Moving Cube Training AI Git:
https://bitbucket.org/theconstructcore/moving_cube_ai/src/master/

Step 0. If you didn’t follow up

This is a series posts using ROS Development Studio(ROSDS). In case you haven’t had an account yet, you can create one for free here.

You can find the last post down below:

https://www.theconstruct.ai/ros-projects-openai-moving-cube-robot-gazebo-step-step-part4/

Step 1. Create gym environment file

We’ll start by creating a file called old_way_moving_cube_env.py under the my_moving_cube_training_pkg/script directory with the following content

import gym
import rospy
import time
import numpy as np
import math
import copy
from gym import utils, spaces
import numpy
from std_msgs.msg import Float64
from sensor_msgs.msg import JointState
from rosgraph_msgs.msg import Clock
from nav_msgs.msg import Odometry
from gazebo_connection import GazeboConnection
from controllers_connection import ControllersConnection

from gym.utils import seeding
from gym.envs.registration import register

from geometry_msgs.msg import Point
from tf.transformations import euler_from_quaternion

reg = register(
    id='MyOldMovingCube-v0',
    entry_point='old_way_moving_cube_env:MyOldMovingCubeEnv',
    timestep_limit=1000,
    )

class MyOldMovingCubeEnv(gym.Env):

    def __init__(self):
        
        

        number_actions = rospy.get_param('/moving_cube/n_actions')
        self.action_space = spaces.Discrete(number_actions)
        
        self._seed()
        
        #get configuration parameters
        self.init_roll_vel = rospy.get_param('/moving_cube/init_roll_vel')
        
        # Actions
        self.roll_speed_fixed_value = rospy.get_param('/moving_cube/roll_speed_fixed_value')
        self.roll_speed_increment_value = rospy.get_param('/moving_cube/roll_speed_increment_value')
        
        self.start_point = Point()
        self.start_point.x = rospy.get_param("/moving_cube/init_cube_pose/x")
        self.start_point.y = rospy.get_param("/moving_cube/init_cube_pose/y")
        self.start_point.z = rospy.get_param("/moving_cube/init_cube_pose/z")
        
        # Done
        self.max_pitch_angle = rospy.get_param('/moving_cube/max_pitch_angle')
        
        # Rewards
        self.move_distance_reward_weight = rospy.get_param("/moving_cube/move_distance_reward_weight")
        self.y_linear_speed_reward_weight = rospy.get_param("/moving_cube/y_linear_speed_reward_weight")
        self.y_axis_angle_reward_weight = rospy.get_param("/moving_cube/y_axis_angle_reward_weight")
        self.end_episode_points = rospy.get_param("/moving_cube/end_episode_points")

        # stablishes connection with simulator
        self.gazebo = GazeboConnection()
        self.controllers_list = ['joint_state_controller',
                         'inertia_wheel_roll_joint_velocity_controller'
                         ]
        self.controllers_object = ControllersConnection(namespace="moving_cube",
                                                        controllers_list=self.controllers_list)


        self.gazebo.unpauseSim()
        self.controllers_object.reset_controllers()
        self.check_all_sensors_ready()

        rospy.Subscriber("/moving_cube/joint_states", JointState, self.joints_callback)
        rospy.Subscriber("/moving_cube/odom", Odometry, self.odom_callback)
        
        
        self._roll_vel_pub = rospy.Publisher('/moving_cube/inertia_wheel_roll_joint_velocity_controller/command', Float64, queue_size=1)

        self.check_publishers_connection()
        
        self.gazebo.pauseSim()

    
    def _seed(self, seed=None): #overriden function
        self.np_random, seed = seeding.np_random(seed)
        return [seed]

    def _step(self, action):#overriden function

        self.gazebo.unpauseSim()
        self.set_action(action)
        self.gazebo.pauseSim()
        obs = self._get_obs()
        done = self._is_done(obs)
        info = {}
        reward = self.compute_reward(obs, done)
        simplified_obs = self.convert_obs_to_state(obs)

        return simplified_obs, reward, done, info
        
    
    def _reset(self):

        self.gazebo.unpauseSim()
        self.controllers_object.reset_controllers()
        self.check_all_sensors_ready()
        self.set_init_pose()
        self.gazebo.pauseSim()
        self.gazebo.resetSim()
        self.gazebo.unpauseSim()
        self.controllers_object.reset_controllers()
        self.check_all_sensors_ready()
        self.gazebo.pauseSim()
        self.init_env_variables()
        obs = self._get_obs()
        simplified_obs = self.convert_obs_to_state(obs)

        return simplified_obs
        
        
    def init_env_variables(self):
        """
        Inits variables needed to be initialised each time we reset at the start
        of an episode.
        :return:
        """
        self.total_distance_moved = 0.0
        self.current_y_distance = self.get_y_dir_distance_from_start_point(self.start_point)
        self.roll_turn_speed = rospy.get_param('/moving_cube/init_roll_vel')
        
    def _is_done(self, observations):

        pitch_angle = observations[3]

        if abs(pitch_angle) > self.max_pitch_angle:
            rospy.logerr("WRONG Cube Pitch Orientation==>" + str(pitch_angle))
            done = True
        else:
            rospy.logdebug("Cube Pitch Orientation Ok==>" + str(pitch_angle))
            done = False

        return done

    def set_action(self, action):

        # We convert the actions to speed movements to send to the parent class CubeSingleDiskEnv
        if action == 0:# Move Speed Wheel Forwards
            self.roll_turn_speed = self.roll_speed_fixed_value
        elif action == 1:# Move Speed Wheel Backwards
            self.roll_turn_speed = self.roll_speed_fixed_value
        elif action == 2:# Stop Speed Wheel
            self.roll_turn_speed = 0.0
        elif action == 3:# Increment Speed
            self.roll_turn_speed += self.roll_speed_increment_value
        elif action == 4:# Decrement Speed
            self.roll_turn_speed -= self.roll_speed_increment_value

        # We clamp Values to maximum
        rospy.logdebug("roll_turn_speed before clamp=="+str(self.roll_turn_speed))
        self.roll_turn_speed = numpy.clip(self.roll_turn_speed,
                                          -1*self.roll_speed_fixed_value,
                                          self.roll_speed_fixed_value)
        rospy.logdebug("roll_turn_speed after clamp==" + str(self.roll_turn_speed))

        # We tell the OneDiskCube to spin the RollDisk at the selected speed
        self.move_joints(self.roll_turn_speed)
        
        
    def _get_obs(self):
        """
        Here we define what sensor data defines our robots observations
        To know which Variables we have acces to, we need to read the
        MyCubeSingleDiskEnv API DOCS
        :return:
        """

        # We get the orientation of the cube in RPY
        roll, pitch, yaw = self.get_orientation_euler()

        # We get the distance from the origin
        y_distance = self.get_y_dir_distance_from_start_point(self.start_point)

        # We get the current speed of the Roll Disk
        current_disk_roll_vel = self.get_roll_velocity()

        # We get the linear speed in the y axis
        y_linear_speed = self.get_y_linear_speed()

        cube_observations = [
            round(current_disk_roll_vel, 0),
            round(y_distance, 1),
            round(roll, 1),
            round(pitch, 1),
            round(y_linear_speed,1),
            round(yaw, 1),
        ]

        return cube_observations
        
        
    def get_orientation_euler(self):
        # We convert from quaternions to euler
        orientation_list = [self.odom.pose.pose.orientation.x,
                            self.odom.pose.pose.orientation.y,
                            self.odom.pose.pose.orientation.z,
                            self.odom.pose.pose.orientation.w]
    
        roll, pitch, yaw = euler_from_quaternion(orientation_list)
        return roll, pitch, yaw
    
    def get_roll_velocity(self):
        # We get the current joint roll velocity
        roll_vel = self.joints.velocity[0]
        return roll_vel
    
    def get_y_linear_speed(self):
        # We get the current joint roll velocity
        y_linear_speed = self.odom.twist.twist.linear.y
        return y_linear_speed
        
        
    def get_y_dir_distance_from_start_point(self, start_point):
        """
        Calculates the distance from the given point and the current position
        given by odometry. In this case the increase or decrease in y.
        :param start_point:
        :return:
        """
        y_dist_dir = self.odom.pose.pose.position.y - start_point.y
    
        return y_dist_dir

    def compute_reward(self, observations, done):

        if not done:

            y_distance_now = observations[1]
            delta_distance = y_distance_now - self.current_y_distance
            rospy.logdebug("y_distance_now=" + str(y_distance_now)+", current_y_distance=" + str(self.current_y_distance))
            rospy.logdebug("delta_distance=" + str(delta_distance))
            reward_distance = delta_distance * self.move_distance_reward_weight
            self.current_y_distance = y_distance_now

            y_linear_speed = observations[4]
            rospy.logdebug("y_linear_speed=" + str(y_linear_speed))
            reward_y_axis_speed = y_linear_speed * self.y_linear_speed_reward_weight

            # Negative Reward for yaw different from zero.
            yaw_angle = observations[5]
            rospy.logdebug("yaw_angle=" + str(yaw_angle))
            # Worst yaw is 90 and 270 degrees, best 0 and 180. We use sin function for giving reward.
            sin_yaw_angle = math.sin(yaw_angle)
            rospy.logdebug("sin_yaw_angle=" + str(sin_yaw_angle))
            reward_y_axis_angle = -1 * abs(sin_yaw_angle) * self.y_axis_angle_reward_weight


            # We are not intereseted in decimals of the reward, doesnt give any advatage.
            reward = round(reward_distance, 0) + round(reward_y_axis_speed, 0) + round(reward_y_axis_angle, 0)
            rospy.logdebug("reward_distance=" + str(reward_distance))
            rospy.logdebug("reward_y_axis_speed=" + str(reward_y_axis_speed))
            rospy.logdebug("reward_y_axis_angle=" + str(reward_y_axis_angle))
            rospy.logdebug("reward=" + str(reward))
        else:
            reward = -1*self.end_episode_points

        return reward


    def joints_callback(self, data):
        self.joints = data

    def odom_callback(self, data):
        self.odom = data
        
        
    def check_all_sensors_ready(self):
        self.check_joint_states_ready()
        self.check_odom_ready()
        rospy.logdebug("ALL SENSORS READY")

    def check_joint_states_ready(self):
        self.joints = None
        while self.joints is None and not rospy.is_shutdown():
            try:
                self.joints = rospy.wait_for_message("/moving_cube/joint_states", JointState, timeout=1.0)
                rospy.logdebug("Current moving_cube/joint_states READY=>" + str(self.joints))

            except:
                rospy.logerr("Current moving_cube/joint_states not ready yet, retrying for getting joint_states")
        return self.joints

    def check_odom_ready(self):
        self.odom = None
        while self.odom is None and not rospy.is_shutdown():
            try:
                self.odom = rospy.wait_for_message("/moving_cube/odom", Odometry, timeout=1.0)
                rospy.logdebug("Current /moving_cube/odom READY=>" + str(self.odom))

            except:
                rospy.logerr("Current /moving_cube/odom not ready yet, retrying for getting odom")

        return self.odom

    def check_publishers_connection(self):
        """
        Checks that all the publishers are working
        :return:
        """
        rate = rospy.Rate(10)  # 10hz
        while (self._roll_vel_pub.get_num_connections() == 0 and not rospy.is_shutdown()):
            rospy.logdebug("No susbribers to _roll_vel_pub yet so we wait and try again")
            try:
                rate.sleep()
            except rospy.ROSInterruptException:
                # This is to avoid error when world is rested, time when backwards.
                pass
        rospy.logdebug("_base_pub Publisher Connected")

        rospy.logdebug("All Publishers READY")
        
    def move_joints(self, roll_speed):
        joint_speed_value = Float64()
        joint_speed_value.data = roll_speed
        rospy.logdebug("Single Disk Roll Velocity>>" + str(joint_speed_value))
        self._roll_vel_pub.publish(joint_speed_value)
        self.wait_until_roll_is_in_vel(joint_speed_value.data)
    
    def wait_until_roll_is_in_vel(self, velocity):
    
        rate = rospy.Rate(10)
        start_wait_time = rospy.get_rostime().to_sec()
        end_wait_time = 0.0
        epsilon = 0.1
        v_plus = velocity + epsilon
        v_minus = velocity - epsilon
        while not rospy.is_shutdown():
            joint_data = self.check_joint_states_ready()
            roll_vel = joint_data.velocity[0]
            rospy.logdebug("VEL=" + str(roll_vel) + ", ?RANGE=[" + str(v_minus) + ","+str(v_plus)+"]")
            are_close = (roll_vel <= v_plus) and (roll_vel > v_minus)
            if are_close:
                rospy.logdebug("Reached Velocity!")
                end_wait_time = rospy.get_rostime().to_sec()
                break
            rospy.logdebug("Not there yet, keep waiting...")
            rate.sleep()
        delta_time = end_wait_time- start_wait_time
        rospy.logdebug("[Wait Time=" + str(delta_time)+"]")
        return delta_time


    def set_init_pose(self):
        """Sets the Robot in its init pose
        """
        self.move_joints(self.init_roll_vel)

        return True
        
        
    def convert_obs_to_state(self,observations):
        """
        Converts the observations used for reward and so on to the essentials for the robot state
        In this case we only need the orientation of the cube and the speed of the disc.
        The distance doesnt condition at all the actions
        """
        disk_roll_vel = observations[0]
        y_linear_speed = observations[4]
        yaw_angle = observations[5]
    
        state_converted = [disk_roll_vel, y_linear_speed, yaw_angle]
    
        return state_converted

From line 22 to line 26 is the most important part of the file. You have to register your environment to OpenAI gym.

In the _get_obs() function in line 169 to line 198, the observation is measured from the simulation and will be used to generate a state.

The set_action() function in line 144 to line 166 executes the action decided by the q learning algorithm.

From line 88 to 188, the code defines what should the environment do when the algorithm runs into step or reset state.

The reward and checking if the training session is done or not is done in the __step() function.

In _reset() function, the environment will at first pause the gazebo simulation and reset the controller. In order to connect to gazebo and controller, we create a file called gazebo_connection.py under the same folder with the following content

#!/usr/bin/env python

import rospy
from std_srvs.srv import Empty
from gazebo_msgs.msg import ODEPhysics
from gazebo_msgs.srv import SetPhysicsProperties, SetPhysicsPropertiesRequest
from std_msgs.msg import Float64
from geometry_msgs.msg import Vector3

class GazeboConnection():
    
    def __init__(self):
        
        self.unpause = rospy.ServiceProxy('/gazebo/unpause_physics', Empty)
        self.pause = rospy.ServiceProxy('/gazebo/pause_physics', Empty)
        self.reset_proxy = rospy.ServiceProxy('/gazebo/reset_simulation', Empty)

        # Setup the Gravity Controle system
        service_name = '/gazebo/set_physics_properties'
        rospy.logdebug("Waiting for service " + str(service_name))
        rospy.wait_for_service(service_name)
        rospy.logdebug("Service Found " + str(service_name))

        self.set_physics = rospy.ServiceProxy(service_name, SetPhysicsProperties)
        self.init_values()
        # We always pause the simulation, important for legged robots learning
        self.pauseSim()

    def pauseSim(self):
        rospy.wait_for_service('/gazebo/pause_physics')
        try:
            self.pause()
        except rospy.ServiceException as e:
            print ("/gazebo/pause_physics service call failed")
        
    def unpauseSim(self):
        rospy.wait_for_service('/gazebo/unpause_physics')
        try:
            self.unpause()
        except rospy.ServiceException as e:
            print ("/gazebo/unpause_physics service call failed")
        
    def resetSim(self):
        rospy.wait_for_service('/gazebo/reset_simulation')
        try:
            self.reset_proxy()
        except rospy.ServiceException as e:
            print ("/gazebo/reset_simulation service call failed")

    def resetWorld(self):
        rospy.wait_for_service('/gazebo/reset_world')
        try:
            self.reset_proxy()
        except rospy.ServiceException as e:
            print ("/gazebo/reset_world service call failed")

    def init_values(self):

        rospy.wait_for_service('/gazebo/reset_simulation')
        try:
            # reset_proxy.call()
            self.reset_proxy()
        except rospy.ServiceException as e:
            print ("/gazebo/reset_simulation service call failed")

        self._time_step = Float64(0.001)
        self._max_update_rate = Float64(1000.0)

        self._gravity = Vector3()
        self._gravity.x = 0.0
        self._gravity.y = 0.0
        self._gravity.z = -9.81

        self._ode_config = ODEPhysics()
        self._ode_config.auto_disable_bodies = False
        self._ode_config.sor_pgs_precon_iters = 0
        self._ode_config.sor_pgs_iters = 50
        self._ode_config.sor_pgs_w = 1.3
        self._ode_config.sor_pgs_rms_error_tol = 0.0
        self._ode_config.contact_surface_layer = 0.001
        self._ode_config.contact_max_correcting_vel = 0.0
        self._ode_config.cfm = 0.0
        self._ode_config.erp = 0.2
        self._ode_config.max_contacts = 20

        self.update_gravity_call()

    def update_gravity_call(self):

        self.pauseSim()

        set_physics_request = SetPhysicsPropertiesRequest()
        set_physics_request.time_step = self._time_step.data
        set_physics_request.max_update_rate = self._max_update_rate.data
        set_physics_request.gravity = self._gravity
        set_physics_request.ode_config = self._ode_config

        rospy.logdebug(str(set_physics_request.gravity))

        result = self.set_physics(set_physics_request)
        rospy.logdebug("Gravity Update Result==" + str(result.success) + ",message==" + str(result.status_message))

        self.unpauseSim()

    def change_gravity(self, x, y, z):
        self._gravity.x = x
        self._gravity.y = y
        self._gravity.z = z

        self.update_gravity_call()

Then we create another file called controllers_connection.py with the following content

#!/usr/bin/env python

import rospy
import time
from controller_manager_msgs.srv import SwitchController, SwitchControllerRequest, SwitchControllerResponse

class ControllersConnection():
    
    def __init__(self, namespace, controllers_list):

        self.controllers_list = controllers_list
        self.switch_service_name = '/'+namespace+'/controller_manager/switch_controller'
        self.switch_service = rospy.ServiceProxy(self.switch_service_name, SwitchController)

    def switch_controllers(self, controllers_on, controllers_off, strictness=1):
        """
        Give the controllers you want to switch on or off.
        :param controllers_on: ["name_controler_1", "name_controller2",...,"name_controller_n"]
        :param controllers_off: ["name_controler_1", "name_controller2",...,"name_controller_n"]
        :return:
        """
        rospy.wait_for_service(self.switch_service_name)

        try:
            switch_request_object = SwitchControllerRequest()
            switch_request_object.start_controllers = controllers_on
            switch_request_object.start_controllers = controllers_off
            switch_request_object.strictness = strictness

            switch_result = self.switch_service(switch_request_object)
            """
            [controller_manager_msgs/SwitchController]
            int32 BEST_EFFORT=1
            int32 STRICT=2
            string[] start_controllers
            string[] stop_controllers
            int32 strictness
            ---
            bool ok
            """
            rospy.logdebug("Switch Result==>"+str(switch_result.ok))

            return switch_result.ok

        except rospy.ServiceException as e:
            print (self.switch_service_name+" service call failed")

            return None

    def reset_controllers(self):
        """
        We turn on and off the given controllers
        :param controllers_reset: ["name_controler_1", "name_controller2",...,"name_controller_n"]
        :return:
        """
        reset_result = False

        result_off_ok = self.switch_controllers(controllers_on = [],
                                controllers_off = self.controllers_list)

        rospy.logdebug("Deactivated Controlers")

        if result_off_ok:
            rospy.logdebug("Activating Controlers")
            result_on_ok = self.switch_controllers(controllers_on=self.controllers_list,
                                                    controllers_off=[])
            if result_on_ok:
                rospy.logdebug("Controllers Reseted==>"+str(self.controllers_list))
                reset_result = True
            else:
                rospy.logdebug("result_on_ok==>" + str(result_on_ok))
        else:
            rospy.logdebug("result_off_ok==>" + str(result_off_ok))

        return reset_result

    def update_controllers_list(self, new_controllers_list):

        self.controllers_list = new_controllers_list

Step 2. Start training

Now you have all the script you need for training, let’s create a launch file to launch the training under my_moving_cube_training_pkg/launch with the name start_training.launch

<launch>
    <rosparam command="load" file="$(find my_moving_cube_training_pkg)/config/one_disk_walk_openai_params.yaml" />
    <!-- Launch the training system -->
    <node pkg="my_moving_cube_training_pkg" name="movingcube_gym" type="oldway_start_training.py" output="screen"/>
</launch>

If you already closed the simulation, please start it again from Simulations->select launch file->main.launch

Then run the following command to launch the training.

cd ~/catkin_ws
source devel/setup.bash
roslaunch my_moving_cube_training_pkg start_training.launch

You should see the cube robot now moving around to find the best way to move, you can play with different parameters to optimize the training.

If you are interested in this topic, please do not forget to check our OpenAI course at Robot Ignite Academy where you can learn how to create the gym environment for different robots!

Edit by: Tony Huang

[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step #Part4

In this new ROS Project you are going to learn Step-by-Step how to create a robot cube that moves and that it learns to move using OpenAI environment.

In this fourth video we continue we talk about the first script we need to do reinforcement learning with OpenAI. Specifically the main script where we import the Robot Environment that we will define in the next video and we will use Qlearn.

OpenAi Course

Moving Cube Git:
https://bitbucket.org/theconstructcore/moving_cube/src/master/
Moving Cube Training AI Git: https://bitbucket.org/theconstructcore/moving_cube_ai/src/master/

If you didn’t follow up, please check the link below for the last post

https://www.theconstruct.ai/ros-projects-openai-moving-cube-robot-gazebo-step-step-part3/

#!/usr/bin/env python

'''
    Training code made by Ricardo Tellez <rtellez@theconstructsim.com>
    Based on many other examples around Internet
    Visit our website at www.theconstruct.ai
'''
import gym
import numpy
import time
import qlearn
from gym import wrappers

# ROS packages required
import rospy
import rospkg

# import our training environment
import old_way_moving_cube_env


if __name__ == '__main__':
    
    rospy.init_node('movingcube_gym', anonymous=True, log_level=rospy.WARN)

    # Create the Gym environment
    env = gym.make('OldMovingCube-v0')
    rospy.loginfo ( "Gym environment done")
        
    # Set the logging system
    rospack = rospkg.RosPack()
    pkg_path = rospack.get_path('moving_cube_training_pkg')
    outdir = pkg_path + '/training_results'
    env = wrappers.Monitor(env, outdir, force=True) 
    rospy.loginfo ( "Monitor Wrapper started")

    last_time_steps = numpy.ndarray(0)

    # Loads parameters from the ROS param server
    # Parameters are stored in a yaml file inside the config directory
    # They are loaded at runtime by the launch file
    Alpha = rospy.get_param("/moving_cube/alpha")
    Epsilon = rospy.get_param("/moving_cube/epsilon")
    Gamma = rospy.get_param("/moving_cube/gamma")
    epsilon_discount = rospy.get_param("/moving_cube/epsilon_discount")
    nepisodes = rospy.get_param("/moving_cube/nepisodes")
    nsteps = rospy.get_param("/moving_cube/nsteps")

    running_step = rospy.get_param("/moving_cube/running_step")

    # Initialises the algorithm that we are going to use for learning
    qlearn = qlearn.QLearn(actions=range(env.action_space.n),
                    alpha=Alpha, gamma=Gamma, epsilon=Epsilon)
    initial_epsilon = qlearn.epsilon

    start_time = time.time()
    highest_reward = 0

    # Starts the main training loop: the one about the episodes to do
    for x in range(nepisodes):
        rospy.logdebug("############### START EPISODE=>" + str(x))
        
        cumulated_reward = 0  
        done = False
        if qlearn.epsilon > 0.05:
            qlearn.epsilon *= epsilon_discount
        
        # Initialize the environment and get first state of the robot
        observation = env.reset()
        state = ''.join(map(str, observation))
        
        episode_time = rospy.get_rostime().to_sec()
        # for each episode, we test the robot for nsteps
        for i in range(nsteps):
            rospy.loginfo("############### Start Step=>"+str(i))
            # Pick an action based on the current state
            action = qlearn.chooseAction(state)
            rospy.loginfo ("Next action is:%d", action)
            # Execute the action in the environment and get feedback
            observation, reward, done, info = env.step(action)

            rospy.loginfo(str(observation) + " " + str(reward))
            cumulated_reward += reward
            if highest_reward < cumulated_reward:
                highest_reward = cumulated_reward

            nextState = ''.join(map(str, observation))

            # Make the algorithm learn based on the results
            rospy.logwarn("############### state we were=>" + str(state))
            rospy.logwarn("############### action that we took=>" + str(action))
            rospy.logwarn("############### reward that action gave=>" + str(reward))
            rospy.logwarn("############### State in which we will start nect step=>" + str(nextState))
            qlearn.learn(state, action, reward, nextState)

            if not(done):
                state = nextState
            else:
                rospy.loginfo ("DONE")
                last_time_steps = numpy.append(last_time_steps, [int(i + 1)])
                break
            rospy.loginfo("############### END Step=>" + str(i))
            #raw_input("Next Step...PRESS KEY")
            #rospy.sleep(2.0)
        m, s = divmod(int(time.time() - start_time), 60)
        h, m = divmod(m, 60)
        rospy.logwarn ( ("EP: "+str(x+1)+" - [alpha: "+str(round(qlearn.alpha,2))+" - gamma: "+str(round(qlearn.gamma,2))+" - epsilon: "+str(round(qlearn.epsilon,2))+"] - Reward: "+str(cumulated_reward)+"     Time: %d:%02d:%02d" % (h, m, s)))

    
    rospy.loginfo ( ("\n|"+str(nepisodes)+"|"+str(qlearn.alpha)+"|"+str(qlearn.gamma)+"|"+str(initial_epsilon)+"*"+str(epsilon_discount)+"|"+str(highest_reward)+"| PICTURE |"))

    l = last_time_steps.tolist()
    l.sort()

    #print("Parameters: a="+str)
    rospy.loginfo("Overall score: {:0.2f}".format(last_time_steps.mean()))
    rospy.loginfo("Best 100 score: {:0.2f}".format(reduce(lambda x, y: x + y, l[-100:]) / len(l[-100:])))

    env.close()

This the script for training.

From line 42 to line 49, we read the parameters for the q-learn algorithm from the parameter server which we’ll discuss later.

The training process is done from line 59 to line 108. For each step, the algorithm decides which action to take based on the current state, then it measures the observation, decides if the simulation is done and calculates rewards. It will keep learning to optimize the reward it gets.

The implementation of the q-learn algorithm was done by Victor Mayoral Vilches.

'''
Q-learning approach for different RL problems
as part of the basic series on reinforcement learning @
            							An introductory series to Reinforcement Learning (RL) with comprehensive step-by-step tutorials.							
        https://github.com/vmayoral/basic_reinforcement_learning
        362 forks.
        1,138 stars.
        3 open issues.
        
            Recent commits:
                                                                        Update tutorial1 to work with Python3Signed-off-by: Víctor Mayoral Vilches <v.mayoralv@gmail.com>, Víctor Mayoral Vilches                    
                                                            Minor changes to fix headlines, GitHub                    
                                                            Rendered TeX expressions in 50b3013c682b957e0ae851bad21f76d399c5059d, texify[bot]                    
                                                            Add a few additional bits to 11th's README, Victor Mayoral Vilches                    
                                                            Rendered TeX expressions in 2f6245c5a9825f37f06f3c84bb6784217648f4e6, texify[bot]                    
                            
        
    
 
Inspired by https://gym.openai.com/evaluations/eval_kWknKOkPQ7izrixdhriurA
 
        @author: Victor Mayoral Vilches <victor@erlerobotics.com>
'''
import random

class QLearn:
    def __init__(self, actions, epsilon, alpha, gamma):
        self.q = {}
        self.epsilon = epsilon  # exploration constant
        self.alpha = alpha      # discount constant
        self.gamma = gamma      # discount factor
        self.actions = actions

    def getQ(self, state, action):
        return self.q.get((state, action), 0.0)

    def learnQ(self, state, action, reward, value):
        '''
        Q-learning:
            Q(s, a) += alpha * (reward(s,a) + max(Q(s') - Q(s,a))            
        '''
        oldv = self.q.get((state, action), None)
        if oldv is None:
            self.q[(state, action)] = reward
        else:
            self.q[(state, action)] = oldv + self.alpha * (value - oldv)

    def chooseAction(self, state, return_q=False):
        q = [self.getQ(state, a) for a in self.actions]
        maxQ = max(q)

        if random.random() < self.epsilon:
            minQ = min(q); mag = max(abs(minQ), abs(maxQ))
            # add random values to all the actions, recalculate maxQ
            q = [q[i] + random.random() * mag - .5 * mag for i in range(len(self.actions))] 
            maxQ = max(q)

        count = q.count(maxQ)
        # In case there're several state-action max values 
        # we select a random one among them
        if count > 1:
            best = [i for i in range(len(self.actions)) if q[i] == maxQ]
            i = random.choice(best)
        else:
            i = q.index(maxQ)

        action = self.actions[i]        
        if return_q: # if they want it, give it!
            return action, q
        return action

    def learn(self, state1, action1, reward, state2):
        maxqnew = max([self.getQ(state2, a) for a in self.actions])
        self.learnQ(state1, action1, reward, reward + self.gamma*maxqnew)

We put the parameter for the q-learn separately in a file called one_disk_walk_openai_params.yaml under the config folder, so you can tweak the parameters much easier.

moving_cube: #namespace

    running_step: 0.04 # amount of time the control will be executed
    pos_step: 0.016     # increment in position for each command
    
    #qlearn parameters
    alpha: 0.1
    gamma: 0.7
    epsilon: 0.9
    epsilon_discount: 0.999
    nepisodes: 500
    nsteps: 1000
    number_splits: 10 #set to change the number of state splits for the continuous problem and also the number of env_variable splits

    running_step: 0.06 # Time for each step
    wait_time: 0.1 # Time to wait in the reset phases

    n_actions: 5 # We have 3 actions

    speed_step: 1.0 # Time to wait in the reset phases

    init_roll_vel: 0.0 # Initial speed of the Roll Disk

    roll_speed_fixed_value: 100.0 # Speed at which it will move forwards or backwards
    roll_speed_increment_value: 10.0 # Increment that could be done in each step

    max_distance: 2.0 # Maximum distance allowed for the RobotCube
    max_pitch_angle: 0.2 # Maximum Angle radians in Pitch that we allow before terminating episode
    max_yaw_angle: 0.1 # Maximum yaw angle deviation, after that it starts getting negative rewards

    init_cube_pose:
      x: 0.0
      y: 0.0
      z: 0.0

    end_episode_points: 1000 # Points given when ending an episode

    move_distance_reward_weight: 1000.0 # Multiplier for the moved distance reward, Ex: inc_d = 0.1 --> 100points
    y_linear_speed_reward_weight: 1000.0 # Multiplier for moving fast in the y Axis
    y_axis_angle_reward_weight: 1000.0 # Multiplier of angle of yaw, to keep it straight

It might take a lot of time to tune the parameters. In ROSDS, we offer the gym computer feature to help you run training with different parameters parallelly. If you are interested, please check our paid program.

If you want to learn more applications with OpenAI in ROS, please check our OpenAI course in the robot ignite academy.

Edit by Tony Huang.

[irp posts=”10198″ name=”ROS Projects OpenAI with Moving Cube Robot in Gazebo Step-by-Step #Part5″]

[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step Part3

In this new ROS Project you are going to learn Step-by-Step how to create a moving cube and that it learns to move using OpenAI environment.

In this third video we continue with previous video setting up all the basics needed for moving the cube and getting sensor data for the reward and done functions

ROS Development Studio
OpenAi Course

Moving Cube Git:https://bitbucket.org/theconstructcore/moving_cube/src/master/
Moving Cube Training AI Git: https://bitbucket.org/theconstructcore/moving_cube_training/src/master/

Video On Quaternions to Euler:
https://www.theconstruct.ai/ros-qa-how-to-convert-quaternions-to-euler-angles/

This is the third post of this series. If you didn’t follow up, please check the link below.

https://www.theconstruct.ai/ros-projects-openai-moving-cube-robot-gazebo-step-step-part2/

Step 1. cube_rl_utils.py

In the last post, we’ve briefly walked through the last part of cube_rl_utils.py which tests the robot. Today, let’s dive deeper into the code.

#!/usr/bin/env python

import time
import rospy
import math
import copy
import numpy
from std_msgs.msg import Float64
from sensor_msgs.msg import JointState
from nav_msgs.msg import Odometry
from geometry_msgs.msg import Point
from tf.transformations import euler_from_quaternion

class CubeRLUtils(object):

    def __init__(self):

        self.check_all_sensors_ready()
        
        rospy.Subscriber("/moving_cube/joint_states", JointState, self.joints_callback)
        rospy.Subscriber("/moving_cube/odom", Odometry, self.odom_callback)

        self._roll_vel_pub = rospy.Publisher('/moving_cube/inertia_wheel_roll_joint_velocity_controller/command', Float64, queue_size=1)
        
        self.check_publishers_connection()
        
    
    def check_all_sensors_ready(self):
        self.disk_joints_data = None
        while self.disk_joints_data is None and not rospy.is_shutdown():
            try:
                self.disk_joints_data = rospy.wait_for_message("/moving_cube/joint_states", JointState, timeout=1.0)
                rospy.loginfo("Current moving_cube/joint_states READY=>"+str(self.disk_joints_data))

            except:
                rospy.logerr("Current moving_cube/joint_states not ready yet, retrying for getting joint_states")

        self.cube_odom_data = None
        while self.disk_joints_data is None and not rospy.is_shutdown():
            try:
                self.cube_odom_data = rospy.wait_for_message("/moving_cube/odom", Odometry, timeout=1.0)
                rospy.loginfo("Current /moving_cube/odom READY=>" + str(self.cube_odom_data))

            except:
                rospy.logerr("Current /moving_cube/odom not ready yet, retrying for getting odom")
        rospy.loginfo("ALL SENSORS READY")
        
    def check_publishers_connection(self):
        """
        Checks that all the publishers are working
        :return:
        """
        rate = rospy.Rate(10)  # 10hz
        while (self._roll_vel_pub.get_num_connections() == 0 and not rospy.is_shutdown()):
            rospy.loginfo("No susbribers to _roll_vel_pub yet so we wait and try again")
            try:
                rate.sleep()
            except rospy.ROSInterruptException:
                # This is to avoid error when world is rested, time when backwards.
                pass
        rospy.loginfo("_base_pub Publisher Connected")

        rospy.loginfo("All Publishers READY")
        
    def joints_callback(self, data):
        self.joints = data

    def odom_callback(self, data):
        self.odom = data
...

The first part of the script is nothing special but creating publisher, subscriber and checking all the sensor.

    # Reinforcement Learning Utility Code
    def move_joints(self, roll_speed):
        
        joint_speed_value = Float64()
        joint_speed_value.data = roll_speed
        rospy.loginfo("Single Disk Roll Velocity>>"+str(joint_speed_value))
        self._roll_vel_pub.publish(joint_speed_value)
        
    def get_cube_state(self):
        
        # We convert from quaternions to euler
        orientation_list = [self.odom.pose.pose.orientation.x,
                            self.odom.pose.pose.orientation.y,
                            self.odom.pose.pose.orientation.z,
                            self.odom.pose.pose.orientation.w]
                            
        roll, pitch, yaw = euler_from_quaternion(orientation_list)
        
        # We get the distance from the origin
        start_position = Point()
        start_position.x = 0.0
        start_position.y = 0.0
        start_position.z = 0.0
        
        distance = self.get_distance_from_point(start_position,
                                                self.odom.pose.pose.position)
        
        cube_state = [
                        round(self.joints.velocity[0],1),
                        round(distance,1),
                        round(roll,1),
                        round(pitch,1),
                        round(yaw,1)
                    ]
        
        return cube_state

    
    def observation_checks(self, cube_state):
        
        
        # MAximum distance to travel permited in meters from origin
        max_distance=2.0
        
        if (cube_state[1] > max_distance): 
            rospy.logerr("Cube Too Far==>"+str(cube_state[1]))
            done = True
        else:
            rospy.loginfo("Cube NOT Too Far==>"+str(cube_state[1]))
            done = False
        
        return done

    def get_distance_from_point(self, pstart, p_end):
        """
        Given a Vector3 Object, get distance from current position
        :param p_end:
        :return:
        """
        a = numpy.array((pstart.x, pstart.y, pstart.z))
        b = numpy.array((p_end.x, p_end.y, p_end.z))

        distance = numpy.linalg.norm(a - b)

        return distance

    def get_reward_for_observations(self, state):

        # We reward it for lower speeds and distance traveled
        
        speed = state[0]
        distance = state[1]

        # Positive Reinforcement
        reward_distance = distance * 10.0
        # Negative Reinforcement for magnitude of speed
        reward_for_efective_movement = -1 * abs(speed) 
        
        reward = reward_distance + reward_for_efective_movement

        rospy.loginfo("Reward_distance="+str(reward_distance))
        rospy.loginfo("Reward_for_efective_movement= "+str(reward_for_efective_movement))
        
        return reward

The second part is much more important. It prepares the elements for the reinforcement learning algorithm.

In the get_cube_state() function. We converted the sensor reading to cube state. We chose the joint velocity, distance, roll, pitch, yaw as the state. To get the roll, pitch, and yaw, we have to convert the odom from the quaternion to Euler angle.

We check if the simulation is done in the observation_checks() function and calculate reward in the get_reward_for_observations() function based on the distance the robot moved.

You can play with different parameters in the script to achieve a better reward. In the future, we will automate the learning process with the reinforcement learning algorithm.

Edit by: Tony Huang

[irp posts=”10079″ name=”ROS Projects OpenAI with Moving Cube Robot in Gazebo Step-by-Step #Part4″]

[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step Part2

In this new ROS Project you are going to learn Step-by-Step how to create a moving cube and that it learns to move using OpenAI environment.

This second video is for learning the creation the basics of Reinforcement learning and how to connect to the various systems of the robot to get the state, perform actions and calculate rewards.

ROS Development Studio
OpenAi Course

Moving Cube Git:https://bitbucket.org/theconstructcore/moving_cube/src/master/
Moving Cube Training AI Git: https://bitbucket.org/theconstructcore/moving_cube_training/src/master/

If you didn’t follow up, please check the link below for the last post.

[irp posts=”9744″ name=”[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step Part1″]

Step 1. Clone the simulation

In order to make sure we have the same project. Please run the following command

cd ~/simulation_ws/src
git clone https://bitbucket.org/theconstructcore/moving_cube.git

NOTICE: please delete the previous code if you have problems to compile the code

In the /moving_cube/moving_cube_description/urdf/moving_cube.urdf file, please uncomment the following part. We’ll need this part to publish the odom topic.

    <gazebo>
        <plugin name="p3d_base_controller" filename="libgazebo_ros_p3d.so">
            <robotNamespace>/moving_cube</robotNamespace>
            <alwaysOn>true</alwaysOn>
            <updateRate>50.0</updateRate>
            <bodyName>cube_body</bodyName>
            <topicName>odom</topicName>
            <gaussianNoise>0.01</gaussianNoise>
            <frameName>world</frameName>
            <xyzOffsets>0 0 0</xyzOffsets>
            <rpyOffsets>0 0 0</rpyOffsets>
        </plugin>
    </gazebo>

Step 2. Create a training package

To train the robot, let’s create a package for training under the catkin_ws/src

cd ~/catkin_ws/src
catkin_create_pkg my_moving_cube_traning_pkg rospy

Then we’ll create a script folder inside the package and put a file called cube_rl_utils.py inside it with the following content

#!/usr/bin/env python

import time
import rospy
import math
import copy
import numpy
from std_msgs.msg import Float64
from sensor_msgs.msg import JointState
from nav_msgs.msg import Odometry
from geometry_msgs.msg import Point
from tf.transformations import euler_from_quaternion

class CubeRLUtils(object):

    def __init__(self):

        self.check_all_sensors_ready()
        
        rospy.Subscriber("/moving_cube/joint_states", JointState, self.joints_callback)
        rospy.Subscriber("/moving_cube/odom", Odometry, self.odom_callback)

        self._roll_vel_pub = rospy.Publisher('/moving_cube/inertia_wheel_roll_joint_velocity_controller/command', Float64, queue_size=1)
        
        self.check_publishers_connection()
        
    
    def check_all_sensors_ready(self):
        self.disk_joints_data = None
        while self.disk_joints_data is None and not rospy.is_shutdown():
            try:
                self.disk_joints_data = rospy.wait_for_message("/moving_cube/joint_states", JointState, timeout=1.0)
                rospy.loginfo("Current moving_cube/joint_states READY=>"+str(self.disk_joints_data))

            except:
                rospy.logerr("Current moving_cube/joint_states not ready yet, retrying for getting joint_states")

        self.cube_odom_data = None
        while self.disk_joints_data is None and not rospy.is_shutdown():
            try:
                self.cube_odom_data = rospy.wait_for_message("/moving_cube/odom", Odometry, timeout=1.0)
                rospy.loginfo("Current /moving_cube/odom READY=>" + str(self.cube_odom_data))

            except:
                rospy.logerr("Current /moving_cube/odom not ready yet, retrying for getting odom")
        rospy.loginfo("ALL SENSORS READY")
        
    def check_publishers_connection(self):
        """
        Checks that all the publishers are working
        :return:
        """
        rate = rospy.Rate(10)  # 10hz
        while (self._roll_vel_pub.get_num_connections() == 0 and not rospy.is_shutdown()):
            rospy.loginfo("No susbribers to _roll_vel_pub yet so we wait and try again")
            try:
                rate.sleep()
            except rospy.ROSInterruptException:
                # This is to avoid error when world is rested, time when backwards.
                pass
        rospy.loginfo("_base_pub Publisher Connected")

        rospy.loginfo("All Publishers READY")
        
    def joints_callback(self, data):
        self.joints = data

    def odom_callback(self, data):
        self.odom = data

    
    # Reinforcement Learning Utility Code
    def move_joints(self, roll_speed):
        
        joint_speed_value = Float64()
        joint_speed_value.data = roll_speed
        rospy.loginfo("Single Disk Roll Velocity>>"+str(joint_speed_value))
        self._roll_vel_pub.publish(joint_speed_value)
        
    def get_cube_state(self):
        
        # We convert from quaternions to euler
        orientation_list = [self.odom.pose.pose.orientation.x,
                            self.odom.pose.pose.orientation.y,
                            self.odom.pose.pose.orientation.z,
                            self.odom.pose.pose.orientation.w]
                            
        roll, pitch, yaw = euler_from_quaternion(orientation_list)
        
        # We get the distance from the origin
        start_position = Point()
        start_position.x = 0.0
        start_position.y = 0.0
        start_position.z = 0.0
        
        distance = self.get_distance_from_point(start_position,
                                                self.odom.pose.pose.position)
        
        cube_state = [
                        round(self.joints.velocity[0],1),
                        round(distance,1),
                        round(roll,1),
                        round(pitch,1),
                        round(yaw,1)
                    ]
        
        return cube_state

    
    def observation_checks(self, cube_state):
        
        
        # MAximum distance to travel permited in meters from origin
        max_distance=2.0
        
        if (cube_state[1] > max_distance): 
            rospy.logerr("Cube Too Far==>"+str(cube_state[1]))
            done = True
        else:
            rospy.loginfo("Cube NOT Too Far==>"+str(cube_state[1]))
            done = False
        
        return done

    def get_distance_from_point(self, pstart, p_end):
        """
        Given a Vector3 Object, get distance from current position
        :param p_end:
        :return:
        """
        a = numpy.array((pstart.x, pstart.y, pstart.z))
        b = numpy.array((p_end.x, p_end.y, p_end.z))

        distance = numpy.linalg.norm(a - b)

        return distance

    def get_reward_for_observations(self, state):

        # We reward it for lower speeds and distance traveled
        
        speed = state[0]
        distance = state[1]

        # Positive Reinforcement
        reward_distance = distance * 10.0
        # Negative Reinforcement for magnitude of speed
        reward_for_efective_movement = -1 * abs(speed) 
        
        reward = reward_distance + reward_for_efective_movement

        rospy.loginfo("Reward_distance="+str(reward_distance))
        rospy.loginfo("Reward_for_efective_movement= "+str(reward_for_efective_movement))
        
        return reward



def cube_rl_systems_test():
    
    rospy.init_node('cube_rl_systems_test_node', anonymous=True, log_level=rospy.INFO)
    cube_rl_utils_object = CubeRLUtils()
    
    rospy.loginfo("Moving to Speed==>80")
    cube_rl_utils_object.move_joints(roll_speed=80.0)
    time.sleep(2)
    rospy.loginfo("Moving to Speed==>-80")
    cube_rl_utils_object.move_joints(roll_speed=-80.0)
    time.sleep(2)
    rospy.loginfo("Moving to Speed==>0.0")
    cube_rl_utils_object.move_joints(roll_speed=0.0)
    time.sleep(2)
    
    cube_state = cube_rl_utils_object.get_cube_state()
    done = cube_rl_utils_object.observation_checks(cube_state)
    reward = cube_rl_utils_object.get_reward_for_observations(cube_state)
    
    rospy.loginfo("Done==>"+str(done))
    rospy.loginfo("Reward==>"+str(reward))
    

if __name__ == "__main__":
    cube_rl_systems_test()

In this post, we’ll focus on the cube_rl_systems_test() function. The function uses the class to move the cube, get the observation, calculate reward and check if it’s done. To run it, you have to run the simulation first. Please go to Simulations->Select launch file-> main.launch

NOTICE: You have to unpause the simulation by clicking the arrow key in the simulation window

Then you can run the following command to run the script

cd ~/catkin_ws/src/my_moving_cube_training_pkg/script
chmod +x cube_rl_utils.py
cd ~/catkin_ws
source devel_setup.bash
rosrun my_moving_cube_training_pkg cube_rl_utils.py

You should see the cube moving around and the reward and the done state is calculated.

Edit by: Tony Huang

[irp posts=”9976″ name=”ROS Projects OpenAI with Moving Cube Robot in Gazebo Step-by-Step Part3″]

[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step Part1

In this new ROS Project you are going to learn Step-by-Step how to create a moving cube and that it learns to move using OpenAI environment.

Part 1: Creation of the URDF and Control Systems
Part 2: Basics of Reinforcement learning and Connect to the various systems of the robot
Part 3: Setting up all the basics needed for moving the cube and getting sensor data
Part 4: The first script we need to do reinforcement learning with OpenAI
Part 5: Create the Robot environment for OpenAI Gym for the moving cube

Part 1

This first video is for learning the creation of the URDF and control systems.

Moving Cube Git: https://bitbucket.org/theconstructcore/moving_cube/src/master/

Step 1. Create a project in ROS Development Studio(ROSDS)

We’ll use ROSDS through this project in order to avoid setting up the environment, manage packages and etc. You can create a free account here if you haven’t had an account yet.

Step 2. Create package

Since this is a simulation, let’s create a package called my_moving_cube_description under the simulation_ws.

cd ~/simulation_ws/src
catkin_create_pkg my_moving_cube_description rospy

We’ll start by building the URDF description of the robot. To do that, we’ll create a new folder called urdf under the my_moving_cube_description directory and create a file called my_moving_cube.urdf inside it with the following initial content. The robot tag indicates the name of the robot – my_moving_cube.

<robot name="my_moving_cube">
...
</robot>

Then let’s create the first link inside the robot. This includes 3 parts :

inertial: It defines the physical property of the link. You can calculate the inertia of an object by using this tool: rosrun spawn_robot_tools_pkg inertial_calculator.py
collision: It defines the collision property when the object interacts with other objects in the simulation.
visual: It defines the visual property, how the object will visually show in the simulation.

    <link name="cube_body">
        <inertial>
            <origin xyz="0 0 0" rpy="0 0 0"/>
            <mass value="0.5" />
            <inertia ixx="0.00333333333333" ixy="0.0" ixz="0.0" iyy="0.00333333333333" iyz="0.0" izz="0.00333333333333"/>
        </inertial>
        <collision>
            <origin xyz="0 0 0" rpy="0 0 0"/>
            <geometry>
                <box size="0.2 0.2 0.2"/>
            </geometry>
        </collision>
	    <visual>
	      <geometry>
	        <box size="0.2 0.2 0.2"/>
	      </geometry>
	    </visual>
    </link>

You also need to define the material property after the link if you want to use it in gazebo. (NOTICE: the reference of the material property should have the same name as the link)

    <gazebo reference="cube_body">
        <kp>1000000.0</kp>
        <kd>1000000.0</kd>
        <mu1>1000000.0</mu1>
        <mu2>1000000.0</mu2>
        <material>Gazebo/Black</material>
    </gazebo>

Then we can create a spawn_moving_cube.launch file under the my_moving_cube_description/launch directory with the following content to spawn the cube.

<?xml version="1.0" encoding="UTF-8"?>
<launch>
    <include file="$(find spawn_robot_tools_pkg)/launch/spawn_robot_urdf.launch">
        <arg name="x" default="0.0" />
        <arg name="y" default="0.0" />
        <arg name="z" default="0.11" />
        <arg name="roll" default="0"/>
        <arg name="pitch" default="0"/>
        <arg name="yaw" default="0.0" />
        <arg name="urdf_robot_file" default="$(find my_moving_cube_description)/urdf/my_moving_cube.urdf" />
        <arg name="robot_name" default="my_moving_cube" />
    </include>
</launch>

Now, go to simulations->–Empty– to launch an empty world. Then go to Tools->shell to run the command

roslaunch my_moving_cube_description spawn_moving_cube.launch

You should see the cube now appears in the empty world like this

Similarly, we can add another link called inertia_whell_roll

    <link name="inertia_wheel_roll">
        <inertial>
            <origin xyz="0 0 0" rpy="0 0 0"/>
            <mass value="0.5" />
            <inertia ixx="0.000804166666667" ixy="0.0" ixz="0.0" iyy="0.000804166666667" iyz="0.0" izz="0.0016"/>
        </inertial>
        <collision>
            <origin xyz="0 0 0" rpy="0 0 0"/>
            <geometry>
                <cylinder radius="0.08" length="0.01"/>
            </geometry>
        </collision>
	    <visual>
	      <geometry>
	        <cylinder radius="0.08" length="0.01"/>
	      </geometry>
	    </visual>
    </link>
    
    <gazebo reference="inertia_wheel_roll">
        <kp>1000.0</kp>
        <kd>1000.0</kd>
        <mu1>0.5</mu1>
        <mu2>0.5</mu2>
        <material>Gazebo/Red</material>
    </gazebo>

Then we also need to define the joint type which connects these two links

    <joint name="inertia_wheel_roll_joint" type="continuous">
        <origin xyz="0.1 0.0 0.0" rpy="0 1.57 0"/>
        <parent link="cube_body"/>
        <child  link="inertia_wheel_roll"/>
        <limit effort="200" velocity="1000.0"/>
        <axis xyz="0 0 1"/>
    </joint>

If you launch it again, you should see the red cylinder appear.

Step 3. Make the robot move

We need to include the controller package first in the urdf

    <gazebo>
        <plugin name="gazebo_ros_control" filename="libgazebo_ros_control.so">
            <robotNamespace>/my_moving_cube</robotNamespace>
        </plugin>
    </gazebo>

We’ll add the transmission part to actuate the robot.(NOTICE: the joint_name should be the same as the joint)

    <transmission name="inertia_wheel_roll_joint_trans">
      <type>transmission_interface/SimpleTransmission</type>
      <joint name="inertia_wheel_roll_joint">
        <hardwareInterface>hardware_interface/EffortJointInterface</hardwareInterface>
      </joint>
      <actuator name="inertia_wheel_roll_jointMotor">
        <hardwareInterface>hardware_interface/EffortJointInterface</hardwareInterface>
        <mechanicalReduction>1</mechanicalReduction>
      </actuator>
    </transmission>

Then we create a moving_cube.yaml file under the my_moving_cube_description/config to define parameters for the controller

# .yaml config file
#
# The PID gains and controller settings must be saved in a yaml file that gets loaded
# to the param server via the roslaunch file (moving_cube_control.launch).

my_moving_cube:
  # Publish all joint states -----------------------------------
  # Creates the /joint_states topic necessary in ROS
  joint_state_controller:
    type: joint_state_controller/JointStateController
    publish_rate: 30

  # Effort Controllers ---------------------------------------
  inertia_wheel_roll_joint_velocity_controller:
    type: effort_controllers/JointVelocityController
    joint: inertia_wheel_roll_joint
    pid: {p: 1.0, i: 0.0, d: 0.0}

In the end, you should create one new launch file called moving_cube control.launch under the launch folder to launch the controller

<?xml version="1.0" encoding="UTF-8"?>
<launch>

  <rosparam file="$(find my_moving_cube_description)/config/moving_cube.yaml"
            command="load"/>

  <node name="robot_state_publisher_moving_cube" pkg="robot_state_publisher" type="robot_state_publisher"
        respawn="false" output="screen">
            <param name="publish_frequency" type="double" value="30.0" />
            <param name="ignore_timestamp" type="bool" value="true" />
            <param name="tf_prefix" type="string" value="moving_cube" />
            <remap from="/joint_states" to="/moving_cube/joint_states" />
        </node>


  <node name="controller_spawner" pkg="controller_manager" type="spawner" respawn="false"
        output="screen" args="--namespace=/my_moving_cube
                              joint_state_controller
                              inertia_wheel_roll_joint_velocity_controller">
  </node>

</launch>

Let’s spawn the cube and launch the controller

roslaunch my_moving_cube_descriptiospawn_moving_cube.launch
roslaunch my_moving_cube_description moving_cube_control.launch

Then we can kick the robot and make it move by publishing one topic like this

rostopic pub /my_moving_cube/inertia_wheel_roll_joint_velocity_controller/command std_msgs/Float64 "data: 80.0"

Congratulations! Your cube robot moved a bit!

Edit by: Tony Huang

[irp posts=”9860″ name=”ROS Projects OpenAI with Moving Cube Robot in Gazebo Step-by-Step Part2″]

[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step #Part5

Step 0. If you didn’t follow up

Step 1. Create gym environment file

Step 2. Start training

[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step #Part4

[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step Part3

Step 1. cube_rl_utils.py

[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step Part2

Step 1. Clone the simulation

Step 2. Create a training package

[ROS Projects] OpenAI with Moving Cube Robot in Gazebo Step-by-Step Part1

Part 1

Step 1. Create a project in ROS Development Studio(ROSDS)

Step 2. Create package

Step 3. Make the robot move

Search

RECENT POSTS

Categories

Pin It on Pinterest