A Simplified Event-based Impulsive Control Approach for Stable, Efficient, and Robust Locomotion Using Deep Reinforcement Learning

Document Type: Research Article

Authors

Department of Electrical and Computer Engineering, Faculty of Engineering, University of Tehran, Tehran, Iran.

Abstract

Biological evidence indicates that actuation in humans and legged animals is impulsive rather than continuous: control actions are concentrated in a specific phase of the motion cycle (the stance phase), while the rest of the cycle is passive. Based on this observation, we propose a simple event-based impulsive controller that generates walking cycles for legged robots. To speed up optimization, we parametrize the forces applied by the controller as a Gaussian function of time and optimize the controller parameters with a deep reinforcement learning method. An autoencoder reduces the high dimensionality of the state space, which further accelerates learning. We also employ a three-phase reward-shaping approach to improve both learning speed and the final results. In phase one, the reward function focuses on stability and forward motion to learn stable locomotion. In phase two, the reward function is modified to achieve stable locomotion with lower control effort at a desired forward velocity. In phase three, the reward function keeps the same form as in phase two but places more emphasis on regulating forward velocity. The proposed controller, state encoder, and learning process can be applied to a class of legged robots that are actuated at the point where the leg contacts the ground. In this paper, the approach is tested on a simulated single-legged robot, and the robustness of the controller is analyzed under different types of external disturbances. The simulation results indicate the efficacy of the proposed method as a bio-inspired control approach for legged locomotion.
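As a rough illustration of the two ideas above, the following sketch shows a Gaussian-in-time force pulse that is active only during stance, together with a phase-dependent reward. It is a minimal sketch under stated assumptions, not the authors' implementation: the class and function names, the three pulse parameters (amplitude, mean, sigma), the reward terms, and all numeric weights are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class GaussianImpulse:
    """Force pulse F(t) = A * exp(-(t - mu)^2 / (2 * sigma^2)), timed from an event.

    All three parameters are hypothetical stand-ins for the quantities a
    learning method would tune.
    """
    amplitude: float  # peak force A
    mean: float       # time of the peak after the triggering event, mu
    sigma: float      # width of the pulse

    def force(self, t_since_event: float, in_stance: bool) -> float:
        # Outside the stance phase the leg is passive: no control force.
        if not in_stance:
            return 0.0
        z = (t_since_event - self.mean) / self.sigma
        return self.amplitude * math.exp(-0.5 * z * z)


# Hypothetical weights for the three reward-shaping phases (assumed values).
# Phase 1: stability and forward motion only.
# Phase 2: adds an effort penalty and tracking of a desired velocity.
# Phase 3: same terms as phase 2, with a larger velocity-tracking weight.
PHASE_WEIGHTS = {
    1: {"alive": 1.0, "forward": 1.0, "effort": 0.0, "vel_track": 0.0},
    2: {"alive": 1.0, "forward": 0.0, "effort": 0.1, "vel_track": 1.0},
    3: {"alive": 1.0, "forward": 0.0, "effort": 0.1, "vel_track": 3.0},
}


def shaped_reward(phase, upright, forward_velocity, force, target_velocity):
    """Weighted sum of illustrative reward terms for the given phase."""
    w = PHASE_WEIGHTS[phase]
    return (
        w["alive"] * (1.0 if upright else 0.0)
        + w["forward"] * forward_velocity
        - w["effort"] * force ** 2
        - w["vel_track"] * (forward_velocity - target_velocity) ** 2
    )
```

With this parametrization, an entire stance-phase force profile is described by three numbers, which is one plausible reading of why a Gaussian form shrinks the search space compared with learning a force value at every time step.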


