
Details

A numerical simulation of target-directed swimming for a three-link bionic fish with deep reinforcement learning  (indexed in SCI-EXPANDED and EI)   Citations: 5

Document type: Journal article

English title: A numerical simulation of target-directed swimming for a three-link bionic fish with deep reinforcement learning

Authors: Zhu, Yi[1]; Pang, Jian-Hua[2]

Affiliations: [1] Guangdong Ocean Univ, Shenzhen Inst, Shenzhen, Peoples R China; [2] Guangdong Ocean Univ, Zhanjiang, Peoples R China

Year: 2023

Volume: 237

Issue: 11

Pages: 2450

Journal: PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART C-JOURNAL OF MECHANICAL ENGINEERING SCIENCE

Indexed in: SCI-EXPANDED (accession no. WOS:000774072100001), EI (accession no. 20221411886179), Scopus (accession no. 2-s2.0-85127449569), WOS

Funding: Y. Zhu acknowledges Shenzhen Institute of Guangdong Ocean University and Dalian Maritime University during the pursuit of this study. This work was partially supported by the Australian Research Council.

Language: English

Keywords: immersed boundary-lattice Boltzmann method; deep recurrent Q-network; reinforcement learning; target-directed swimming; three-link fish; bionic fish

Abstract: An accurate and robust feedback control system is of great importance for bionic robotic fish to perform complex tasks. This work presents a numerical study of target-directed swimming for a three-link bionic fish with a feedback control system based on deep reinforcement learning (DRL). The simulation couples the DRL method with the immersed boundary-lattice Boltzmann method (IB-LBM). This framework exploits the high computational efficiency of the IB-LBM to generate the large volume of motion data needed for DRL training. The fish is first trained to swim towards a static target from a random orientation and distance. The only information available to the fish is its orientation and distance from the target. It learns an accurate and subtle control policy after the training. The control policy is then applied to a moving target. Even though the fish encounters some situations that never occurred during training, it can choose appropriate actions to follow the target in close proximity and automatically adapt its velocity to that of the target. These simulations demonstrate the accuracy and robustness of the control method and its ability to adapt to new situations.

