WP5: Sensory-motor control and learning

Objectives

This work package has three main objectives:

To implement adaptive sensory-motor control and learning strategies that take advantage of the robots' natural physical dynamics in order to achieve robust motor skills, in particular efficient and adaptive locomotion and simple discrete movements in unknown environments.
To develop strategies for dealing with voluntary and involuntary morphosis.
To develop a theoretical understanding of how morphology can ease control and learning.

We address these objectives in three main project phases, which are described in the following sections.

Phase 1 - Design and development of motor control architecture (M01-M12)

1.1. Locomorph control and learning framework

As the first step we worked towards a general Locomorph control and learning framework. The architecture of our control and learning framework is shown in Figure 1.

control_and_learning_framework

Figure 1: General architecture of the sensory-motor control and learning framework.

The goal of our framework is to find the optimal control given the morphology of the robot and the environment within which it has to perform locomotion or any other task or, even more challenging, to find the optimal control and morphology with only the environment given. For the optimizer we are evaluating a variety of algorithms, including Particle Swarm Optimization (PSO) (Kennedy and Eberhart 1995), Simultaneous Perturbation Stochastic Approximation (SPSA) (Spall 1992) and Evolutionary Algorithms. The optimization framework also needs a component that measures and monitors performance so that results can be evaluated and optimized; this is the job of the supervisor. A human in the loop provides higher-level supervision and tunes the optimization methods.

1.2. Central pattern generator control

The underlying neural controller used in most of our studies is the central pattern generator (CPG) (Pouya et al. 2010). We give a short reminder here because the CPG concept is a central component of our control and learning framework.

Central pattern generators are a specific type of neural network found in the spinal cords of vertebrates (Grillner and Wallen 1985). They produce rhythmic patterned signals used to generate periodic movements such as locomotion gaits. A CPG network can intrinsically produce oscillatory outputs without any periodic input signal, meaning that it needs neither sensory feedback nor periodic commands from higher parts of the brain to generate coordinated rhythmic activity. However, feedback and descending commands are essential for shaping and modulating that rhythmic activity.

In robotics, CPGs have been modeled as coupled oscillators that produce rhythmic gaits (Ijspeert 2001, Righetti and Ijspeert 2006, Righetti and Ijspeert 2008, Ijspeert 2008). CPGs are well suited for coevolution studies. Because a CPG is a network of distributed neurons organized as coupled oscillatory sub-networks (oscillators), it can easily be mapped to the morphology of the robot. When parts of the robot are added or removed, the CPG is adapted by adding or removing the oscillators corresponding to those parts. This preserves the other oscillators and thus conserves most of the properties of the initial network. In addition, the coordinated rhythmic patterns generated by a CPG produce periodic gaits. For these reasons CPGs are a very good candidate for the control of our robots.

In our studies an oscillator is typically implemented for each active degree of freedom (DOF), so that each servo is associated with one oscillator. To allow synchronization, the oscillators are coupled together. This controller architecture was designed at BioRob, e.g. for the Roombots (Sproewitz et al. 2010, Pouya et al. 2010). The equations governing an oscillator are given below.

cpg_equation

Three state variables are defined for each oscillator: φ_i encodes the phase, r_i the amplitude and x_i the offset of the oscillator. ψ_ij is the phase bias of the coupling between the i-th and j-th oscillators. Compared with the original controller, an additional state for the offset, x_i, is added to the dynamical system.

The frequency of the system, ν, is typically a fixed parameter that is not optimized, but we also performed studies where the frequency is a function of the phase variable. The weights of the couplings (ω_ij) are fixed as well. a and b are positive constants that only affect the rise time of the state variables. The amplitude R_i, the phase bias ψ_ij and the offset X_i are typically open variables that are subject to optimization. The servo driving functions, given in Equations (4)-(6), support two types of basic movement, namely oscillation and rotation.
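Because the equation image is not reproduced here, the following is only a minimal sketch of how such a coupled-oscillator system could be integrated numerically. It uses the symbols defined above (ν, ω_ij, ψ_ij, R_i, X_i, a, b), but the exact form of the differential equations and of the oscillation drive is an assumption (sinusoidal phase coupling plus critically damped second-order filters for amplitude and offset) and may differ from Equations (1)-(6). The function names, default gains and step size are likewise hypothetical.

```python
import math

def simulate_cpg(nu, w, psi, R, X, a=20.0, b=20.0, dt=0.001, steps=5000):
    """Integrate n coupled oscillators with small fixed time steps.

    Assumed form (not taken from the source equations):
      dphi_i/dt = 2*pi*nu + sum_j w_ij * r_j * sin(phi_j - phi_i - psi_ij)
      d2r_i/dt2 = a * (a/4 * (R_i - r_i) - dr_i/dt)
      d2x_i/dt2 = b * (b/4 * (X_i - x_i) - dx_i/dt)
    Returns the final phases, amplitudes and offsets.
    """
    n = len(R)
    phi = [0.1 * i for i in range(n)]  # small initial phase spread
    r, dr = [0.0] * n, [0.0] * n
    x, dx = [0.0] * n, [0.0] * n
    for _ in range(steps):
        # Phase rates use the state at the start of the step.
        dphi = [
            2.0 * math.pi * nu
            + sum(w[i][j] * r[j] * math.sin(phi[j] - phi[i] - psi[i][j])
                  for j in range(n))
            for i in range(n)
        ]
        for i in range(n):
            dr[i] += dt * a * (a / 4.0 * (R[i] - r[i]) - dr[i])
            dx[i] += dt * b * (b / 4.0 * (X[i] - x[i]) - dx[i])
            phi[i] += dt * dphi[i]
            r[i] += dt * dr[i]
            x[i] += dt * dx[i]
    return phi, r, x

def servo_oscillation(phi_i, r_i, x_i):
    """Assumed oscillation drive: offset plus amplitude-scaled sine of the phase."""
    return x_i + r_i * math.sin(phi_i)

# Two oscillators coupled with a desired phase lag of a quarter period.
quarter = math.pi / 2.0
phi, r, x = simulate_cpg(
    nu=1.0,
    w=[[0.0, 1.0], [1.0, 0.0]],
    psi=[[0.0, quarter], [-quarter, 0.0]],
    R=[1.0, 1.0],
    X=[0.2, -0.2],
)
phase_lag = phi[1] - phi[0]
```

In this sketch the phase difference settles at the prescribed bias, and r_i and x_i converge smoothly to R_i and X_i. This matches the description of a and b as constants that only affect rise time.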

Phase 2 - Control and learning strategies for different morphologies (M04-M36)

With our Locomorph control and learning framework we performed several studies on the coevolution of morphology and control, on the optimization of control architectures towards safer, faster and more robust locomotion, and on life-long learning.

2.1. Coevolution of morphology and control

Not every robot morphology is optimal for control, and some cannot be controlled at all. Our coevolution studies address one of the main objectives of the Locomorph project: a theoretical understanding of how morphology can ease control and learning. For the optimization of morphology and control we use Evolutionary Algorithms (EAs). EAs are methods inspired by the evolution of the genome in living organisms, where each generation produces new organisms with varying properties because they inherit a mix of their parents' DNA, which can be further modified by mutations. The two studies described below were carried out in simulation using the Webots simulator (Michel 2004).

In these studies we used our central pattern generator approach. For each active degree of freedom we implemented a phase oscillator. The coupling of these individual oscillators was subject to optimization by the EA.

The first study (Aydin 2010) addressed the question of how robot morphology and control can be optimized to obtain robots capable of fast, energy-efficient and safe locomotion, comparing compliant and non-compliant modular robots. This study was carried out using our Roombots modular robot setup (Sproewitz et al. 2010). Figure 2 shows a single Roombots module with three compliant legs. To encode the genome of the robots we studied L-systems, which were originally developed to model the growth processes of organisms (Floreano and Mattiussi 2008). We were able to evolve robots that perform efficient locomotion and could show that adding compliant structures is beneficial for robot locomotion.

roombots_coevolution

Figure 2: A Roombots module with three passive legs. Passive legs are rendered with a red connection face and a sphere. Picture taken from (Aydin 2010).
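To illustrate the L-system encoding mentioned for the first study, here is a minimal sketch of parallel string rewriting. The alphabet and the rule are hypothetical and chosen only for illustration; the actual genome encoding used in (Aydin 2010) is not described here and may differ.

```python
def expand_lsystem(axiom, rules, iterations):
    """Apply the rewriting rules to every symbol in parallel, `iterations` times."""
    s = axiom
    for _ in range(iterations):
        s = "".join(rules.get(symbol, symbol) for symbol in s)
    return s

# Hypothetical alphabet: "M" adds a module, "[" and "]" open and close a side branch.
rules = {"M": "M[M]M"}
body_plan = expand_lsystem("M", rules, 2)
print(body_plan)  # M[M]M[M[M]M]M[M]M
```

With such an encoding, an evolutionary algorithm can mutate the compact rule set instead of the fully expanded body plan, so a small change in the genome can produce a structured, repeated change in the morphology.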

The second study (Lapin 2011) made extensive use of the tools developed in the first. However, to better address the ideas of the Locomorph project, we shifted our focus towards legged robots; see Figure 3 for several example robot configurations. Furthermore, we generalized the underlying selection methods for finding the optimal robot morphologies and control networks. As in previous studies, the fitness criteria included the speed and the energy efficiency of locomotion. Previous studies, however, typically concentrated on finding morphologies and control parameters that let the robot move as fast as possible along a straight line, whereas here we also considered the steering capacity of the robot. To make a robot useful, manoeuvrability is as important as energy efficiency and maximal speed.

coevolution_legged_robots

Figure 3: Example robots from the coevolution study. Picture taken from (Lapin 2011).

2.2. Application of our control and learning framework on Locomorph robots

An important part of our work in work package 5 is to support the development of the Locomorph robots presented in work packages 2 and 3 and to find efficient controllers for them. The goal is to facilitate robot development by pre-evaluating possible robot designs in simulation with our control and learning framework, and to optimize control architectures that can then easily be transferred to the real hardware. Figure 4 shows two models of the UZH1 design: one created in SolidWorks (left) and one simulated in Webots (right).

We successfully applied our control and learning framework both to the Locokit and to several iterations of quadruped robot designs.

UZH1_webots_vs_solid_works

Figure 4: Example of the transfer of a robot model from design software to the simulation environment. (Left) UZH1 robot design in SolidWorks. (Right) UZH1 robot model for simulation in Webots.

2.3. Studies on the spring-loaded inverted pendulum

One of the main objectives of the Locomorph project is to gain a deeper understanding of how control and learning strategies can take advantage of the robots’ natural physical dynamics. To address this objective we decided to study the spring-loaded inverted pendulum (SLIP) in more detail.

The SLIP model is a simple physical model used to analyze data from human and animal running. It relates different experimentally observed quantities to each other and reveals great similarity in the running of different species. Furthermore, the SLIP model is used to synthesize stable running patterns in forward dynamic simulation. The occurrence of passively stable running patterns is remarkable; it arises from the intermittent ground contact (non-holonomic stability; Ruina 1998). From the SLIP model we can thus learn how to build robots that perform passively stable running and how to use the robots’ compliance for more robust, stable and energy-efficient locomotion.
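As an illustration of what a forward dynamic simulation of the SLIP model involves, the sketch below simulates a single step from one apex to the next: a ballistic flight phase, touchdown with a fixed leg angle, a stance phase on a massless linear spring, and take-off when the spring returns to its rest length. All parameter values (mass, stiffness, leg length, angle of attack), the integration scheme and the function name are assumptions made for this example, not values taken from our studies.

```python
import math

def slip_step(y0, vx0, m=80.0, k=20000.0, l0=1.0, alpha_deg=68.0,
              g=9.81, dt=1e-4):
    """Simulate one SLIP step from apex to the next apex.

    Returns (apex_height, forward_speed) at the next apex, or None if the
    point mass hits the ground or does not take off again.
    """
    alpha = math.radians(alpha_deg)
    y_td = l0 * math.sin(alpha)  # touchdown height for a fixed leg angle
    if y0 <= y_td:
        return None              # leg cannot be placed: start is too low
    x, y, vx, vy = 0.0, y0, vx0, 0.0

    # Flight phase: ballistic motion until the foot touches the ground.
    while y > y_td:
        vy -= g * dt
        x += vx * dt
        y += vy * dt
    foot_x = x + l0 * math.cos(alpha)  # foot lands ahead of the mass

    # Stance phase: the spring force acts along the leg (foot -> mass).
    for _ in range(int(5.0 / dt)):
        lx, ly = x - foot_x, y
        leg = math.hypot(lx, ly)
        if leg >= l0:
            break                      # spring back at rest length: take-off
        force = k * (l0 - leg)
        vx += force * lx / (leg * m) * dt
        vy += (force * ly / (leg * m) - g) * dt
        x += vx * dt
        y += vy * dt
        if y <= 0.0:
            return None                # the mass hit the ground
    else:
        return None                    # no take-off within the time limit

    if vy <= 0.0:
        return None                    # leaves the ground moving downwards
    # Second flight phase: apex height follows from energy conservation.
    return y + vy * vy / (2.0 * g), vx

result = slip_step(y0=1.0, vx0=5.0)
```

Feeding the returned apex state back into `slip_step` gives an apex-to-apex return map. Iterating that map is one common way to examine whether a running pattern is passively stable for a given leg stiffness and angle of attack.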

Phase 3 - Motor control strategies for dealing with morphosis (M13-M42)

3.1. Effects of tail-loss in long-tailed lizard's locomotion

One of our major goals in the scope of Locomorph is to study the strategies biological systems use to deal with changes in morphology and to explore how these strategies can be used in robotics. For this study, we used an amputated model of the long-tailed lizard. The tail was removed at a position close to the body (see the amputation plane in Figure 5). Our goal is to use optimization algorithms in a continuous parameter space to gain better insight into the maximal performance of the model. In particular, we are interested in comparing the simulation results with the data on real animals provided by WP4. We made extensive use of standard particle swarm optimization (PSO), the same method that is used in the proposed control and learning framework. As fitness we measured only the speed of forward locomotion. In total, 68 different PSO optimizations were performed. Initial exploration of the number of iterations needed showed that 150 iterations for the intact model and 200 iterations for the amputated model were enough to ensure convergence. In both cases the number of particles was 50. This means that 7500 and 10000 individuals, respectively, were explored in each optimization run.

Figure 5: Illustration of the amputation plane of the long-tailed lizard model.
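The evaluation budget quoted for the tail-loss study (50 particles over 150 or 200 iterations) can be made concrete with a minimal sketch of a basic global-best PSO. This is a generic textbook variant, not necessarily the exact "standard PSO" implementation used in the study. The inertia and acceleration coefficients, the search bounds and the toy fitness function, which stands in for the simulated forward speed, are all assumptions.

```python
import random

def pso_maximize(fitness, dim, n_particles=50, n_iterations=150,
                 bounds=(-1.0, 1.0), inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; returns (best position, best fitness, evaluations)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [float("-inf")] * n_particles
    swarm_pos, swarm_fit = pos[0][:], float("-inf")
    evaluations = 0
    for _ in range(n_iterations):
        # Evaluate every particle once per iteration.
        for i in range(n_particles):
            f = fitness(pos[i])
            evaluations += 1
            if f > best_fit[i]:
                best_fit[i], best_pos[i] = f, pos[i][:]
            if f > swarm_fit:
                swarm_fit, swarm_pos = f, pos[i][:]
        # Move particles towards their own best and the swarm's best.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (swarm_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
    return swarm_pos, swarm_fit, evaluations

def toy_speed(params):
    """Hypothetical stand-in for the simulated forward speed (maximum at 0.3)."""
    return -sum((p - 0.3) ** 2 for p in params)

_, best_intact, evals_intact = pso_maximize(toy_speed, dim=3, n_iterations=150)
_, _, evals_amputated = pso_maximize(toy_speed, dim=3, n_iterations=200)
print(evals_intact, evals_amputated)  # 7500 10000
```

Each iteration evaluates every particle exactly once, so the number of explored individuals is the number of particles times the number of iterations: 50 × 150 = 7500 for the intact model and 50 × 200 = 10000 for the amputated one.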

3.2. Dealing with Inertial Changes via Robust and Compliant Control Design

In this study (Pouya et al. 2013) we focus on situations where the robot experiences involuntary changes in its body, in particular in the inertia of its limbs. Inspired by its biological counterparts, we are interested in enabling the robot to adapt its motor control to the new system dynamics. This is in contrast to recovery via life-long learning (explained in Section 3.3). To reach this goal, we propose two different control strategies and compare their performance when handling these modifications. Our results show substantial improvements in adaptivity to body changes when the robot is aware of its new dynamics and can exploit this knowledge in synthesising new motor control.

Figure 6: The step-wise process of developing theoretical models, control and hardware in parallel (from very simplified to very detailed models).

3.3. Dealing with Involuntary Morphosis via Online Learning

Another approach to dealing with a change in robot morphology is to use life-long learning to reoptimize the joint position commands for the new system. We have used this idea for different robots in simulation and in hardware, among them the YaMoR and Roombots modular robots. In particular, in (Christensen et al. 2010) this approach was validated on a simulated quadruped-to-triped morphosis using Roombots modules. The learning is based on a stochastic approximation method, SPSA, which optimizes the parameters of the coupled oscillators used to generate periodic actuation patterns. The strategy is implemented in a distributed fashion, based on a globally shared reward signal but otherwise using local communication only. In a physics-based simulation of modular Roombots robots we experiment with online learning of gaits and study the effects of module failures in the legs or the spine of the robot.
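The SPSA update mentioned above can be sketched as follows. This is a simplified, centralized illustration with constant gains a and c (Spall's formulation uses decaying gain sequences). The reward function is a hypothetical stand-in for the measured locomotion reward, and all names are chosen for this example only.

```python
import random

def spsa_maximize(reward, theta, iterations=200, a=0.1, c=0.1, seed=0):
    """Maximize `reward` with simultaneous-perturbation gradient estimates."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(iterations):
        # Perturb all parameters at once with random +/-1 signs.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c * d for t, d in zip(theta, delta)]
        minus = [t - c * d for t, d in zip(theta, delta)]
        # Only two reward measurements per update, whatever the dimension.
        diff = reward(plus) - reward(minus)
        # Each parameter combines the shared scalar `diff` with its own sign.
        theta = [t + a * diff / (2.0 * c * d) for t, d in zip(theta, delta)]
    return theta

target = [0.5, -0.2, 0.1]  # hypothetical optimum of the toy reward

def toy_reward(params):
    """Hypothetical reward: highest when the parameters match `target`."""
    return -sum((p - q) ** 2 for p, q in zip(params, target))

learned = spsa_maximize(toy_reward, [0.0, 0.0, 0.0])
```

Each parameter update needs only the shared scalar difference in reward and that parameter's own perturbation sign. This is what makes a distributed implementation with a globally shared reward signal possible.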

3.4. Online Voluntary Morphosis; Spine Actuation and Compliance

On a faster time scale for voluntary morphosis, we investigate (Pouya et al. 2012) the challenge of online morphosis, where the morphology of the animal or robot can be adjusted at run time according to the objective (task or environment dependent). We systematically explore how morphometric and biomechanical features of the spinal segment influence the kinematics and dynamics of the robot's natural behavior, quantified in particular by speed, cost of transport and stability. To this end, we introduce a novel dynamics model, based on existing dynamics modeling approaches (Remy 2011), that allows detailed studies of the role of a compliant spine in quadruped locomotion. We use this model to address our key questions on locomotion control and the effect of robot morphology by performing extensive sets of experiments. Our goal in this work is to systematically study the effect of a robot's morphological and control parameters on its behavior, based on the detailed physical model and the corresponding analytical tools.

Figure 7: Snapshots of a sample gait pattern (from left to right) of the developed quadruped model with actuated spine DOF. This sample shows one of the solutions extracted as a bounding gait pattern by the optimization methods. Different sequences of motion can be extracted through the optimization.

Bibliography

Aydin, E. (2010). Co-evolution of Morphology and Control. Master thesis, Biorobotics Laboratory, Ecole Polytechnique Fédérale de Lausanne.

Floreano, D., and Mattiussi, C. (2008). Bio-inspired artificial intelligence. New York: McGraw-Hill.

Grillner, S. and Wallen, P. (1985). Central pattern generators for locomotion, with special reference to vertebrates. Annual Review of Neuroscience, vol. 8, no. 1, pp. 233-261.

Ijspeert, A. J. (2001). A connectionist central pattern generator for the aquatic and terrestrial gaits of a simulated salamander. Biological Cybernetics, vol. 84, no. 5, pp. 331-348.

Ijspeert, A. J. (2008). Central pattern generators for locomotion control in animals and robots: a review, Neural Networks, pp. 642-653.

Kennedy, J. and Eberhart, R. (1995). Particle swarm optimization. Proceedings of IEEE International Conference on Neural Networks. vol. 4, pp. 1942–1948.

Lapin, K. (2011). Co-evolution of Morphology, Control and Behavior. Master thesis, Biorobotics Laboratory, Ecole Polytechnique Fédérale de Lausanne.

Michel, O. (2004). Webots: Professional mobile robot simulation. International Journal of Advanced Robotic Systems, vol. 1, no. 1, pp. 39-42. Available from http://www.ars-journal.com/International-Journal-of-Advanced-Robotic-Systems/Volume-1/39-42.pdf

Pouya, S., van den Kieboom, J., Sproewitz, A. and Ijspeert, A. J. (2010). Automatic Gait Generation in Modular Robots: "to Oscillate or to Rotate? that is the question". In Proceedings of IROS 2010.

Righetti, L., and Ijspeert, A. J. (2006). Programmable Central Pattern Generators: an application to biped locomotion control. In Proceedings of the 2006 IEEE International Conference on Robotics and Automation.

Righetti, L. and Ijspeert, A. J. (2008) Pattern generators with sensory feedback for the control of quadruped locomotion. Proceedings of the 2008 IEEE International Conference on Robotics and Automation (ICRA 2008), Pasadena, May 19-23.

Ruina, A. (1998). Nonholonomic stability aspects of piecewise holonomic systems. Reports on mathematical physics, vol. 42, no. 1-2, pp. 91-100.

Sproewitz, A., Pouya, S., Bonardi, S., van den Kieboom, J., Moeckel, R., Billard, A., et al. (2010). Roombots: Reconfigurable Robots for Adaptive Furniture. IEEE Computational Intelligence Magazine, special issue on "Evolutionary and developmental approaches to robotics".

Spall, J C. (1992). Multivariate stochastic approximation using a simultaneous perturbation gradient approximation, IEEE Transactions on Automatic Control, pp. 332-341.

Pouya, S., Eckert, P., Sproewitz, A., Moeckel, R. and Ijspeert, A. J. (2013). Motor Control Adaptation to Changes in Robot Body Dynamics for a Compliant Quadruped Robot. The 2nd International Conference on Biomimetic and Biohybrid Systems (Living Machines 2013), London, UK, July 2013 (submitted).

Christensen, D. J., Sproewitz, A. and Ijspeert, A. J. (2010). Distributed Online Learning of Central Pattern Generators in Modular Robots. In Proceedings of the 11th International Conference on Simulation of Adaptive Behavior (SAB2010), pp. 402-412, Paris, France, August 2010.

Remy, D. (2011). Optimal Exploitation of Natural Dynamics in Legged Locomotion. PhD thesis, ETHZ.

Pouya, S., Khodabakhsh, M., Moeckel, R. and Ijspeert, A. J. (2012). Role of Spine Compliance and Actuation in the Bounding Performance of Quadruped Robot. 7th Dynamic Walking Conference, USA, May 2012.
