
National Natural Science Foundation of China (61105100)

Publications: 2 | Citations: 33 | H-index: 2
Related authors: Li Caihong, Song Yong, Li Yibin
Related institutions: Shandong University at Weihai, Shandong University of Technology, Shandong University
Funding source: National Natural Science Foundation of China
Related fields: Automation & Computer Technology

Document Types

  • Chinese journal articles (2)

Fields

  • Automation & Computer Technology (2)

Topics

  • Mobile robots (1)
  • Mobile robot path planning (1)
  • Artificial potential field (1)
  • Path planning (1)
  • Robots (1)
  • Robot paths (1)
  • Robot path planning (1)
  • PATH_PLANNING (1)
  • Q-values (1)
  • REINFORCEMENT LEARNING (1)
  • STAT (1)
  • HAI (1)
  • Initialization (1)
  • Q-LEARNING (1)
  • SEQUENTIAL (1)

Institutions

  • Shandong University (1)
  • Shandong University of Technology (1)
  • Shandong University at Weihai (1)

Authors

  • Li Yibin (1)
  • Song Yong (1)
  • Li Caihong (1)

Journals

  • Control Theory & Applications (1)
  • Journal... (1)

Years

  • 2013 (1)
  • 2012 (1)
State-chain sequential feedback reinforcement learning for path planning of autonomous mobile robots (Citations: 5)
2013
This paper presents a new approach based on Q-learning for solving the mobile robot path planning problem in complex unknown static environments. As a computational approach to learning through interaction with the environment, reinforcement learning algorithms have been widely used for intelligent robot control, especially in the field of autonomous mobile robots. However, the learning process is slow and cumbersome, and practical applications require rapid convergence. To address the slow convergence and long learning time of Q-learning-based mobile robot path planning, a state-chain sequential feedback Q-learning algorithm is proposed for quickly finding the optimal path of a mobile robot in complex unknown static environments. The state chain is built during the search process. After an action is chosen and the reward is received, the Q-values of the state-action pairs on the previously built state chain are sequentially updated with one-step Q-learning. As the number of Q-values updated after each action grows, the number of actual steps required for convergence decreases, where a step is a state transition, and the learning time therefore decreases. Extensive simulations validate the efficiency of the proposed approach for mobile robot path planning in complex environments. The results show that the new approach converges quickly and that the robot finds the collision-free optimal path in complex unknown static environments in much less time than with the one-step Q-learning algorithm or the Q(λ)-learning algorithm.
Xin MA, Ya XU, Guo-qiang SUN, Li-xia DENG, Yi-bin LI
Keywords: Q-LEARNING
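
The abstract's core mechanism is that, after every new transition, the one-step Q-learning update is re-applied to all state-action pairs stored on the state chain, so a single reward propagates back along the visited path. Below is a minimal sketch of that update loop on a toy 1-D corridor; the environment, the epsilon-greedy exploration scheme, and all parameter values are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Toy 1-D corridor: states 0..N_STATES-1, actions 0 = left, 1 = right,
# goal at the right end. All names and parameter values are illustrative.
N_STATES, N_ACTIONS = 8, 2
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    """Deterministic dynamics: reward 1.0 only on reaching the goal."""
    s_next = min(max(s + (1 if a == 1 else -1), 0), N_STATES - 1)
    done = s_next == N_STATES - 1
    return s_next, (1.0 if done else 0.0), done

def choose_action(Q, s):
    """Epsilon-greedy selection with random tie-breaking (a common choice)."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    best = np.flatnonzero(Q[s] == Q[s].max())
    return int(rng.choice(best))

Q = np.zeros((N_STATES, N_ACTIONS))
for episode in range(50):
    s, done, chain = 0, False, []
    while not done:
        a = choose_action(Q, s)
        s_next, r, done = step(s, a)
        chain.append((s, a, r, s_next))
        # Sequential feedback: after each new reward, re-apply the one-step
        # Q-learning update to every transition on the state chain, newest
        # first, so the reward propagates back along the visited path.
        for cs, ca, cr, cn in reversed(chain):
            Q[cs, ca] += ALPHA * (cr + GAMMA * Q[cn].max() - Q[cs, ca])
        s = s_next

print(np.round(Q, 2))  # Q-values should rise toward the goal state
```

Because each new reward is swept back along the whole chain, a goal reached once immediately shapes the Q-values of every state on the path, which is the mechanism the abstract credits for the faster convergence compared with one-step Q-learning.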
Initialization of reinforcement learning for mobile robot path planning (Citations: 28)
2012
To address the slow convergence of existing reinforcement learning algorithms for robot path planning, an initialization method for mobile robot reinforcement learning based on an artificial potential field is proposed. The robot's working environment is virtualized as an artificial potential field, and prior knowledge is used to determine the potential value of each point in the field, which represents the maximum cumulative return obtainable under the optimal policy: for example, the potential of obstacle regions is zero, while the potential of the goal point is the global maximum. The initial Q-value is then defined as the immediate reward at the current point plus the maximum discounted cumulative return of the successor point. Through this Q-value initialization, the improved algorithm converges faster and more stably. Finally, the improved algorithm is validated with robot paths in a grid map; the results show that the method improves learning efficiency in the initial stage and improves overall algorithm performance.
Song Yong, Li Yibin, Li Caihong
Keywords: mobile robot, artificial potential field, path planning
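
As a companion illustration, here is a minimal sketch of the potential-field Q-value initialization the abstract describes, on a small grid map. The concrete potential function (γ raised to the Manhattan distance to the goal, zero inside obstacles), the grid size, and the reward values are assumptions chosen to match the abstract's description of obstacles having zero potential and the goal the global maximum; the paper's prior-knowledge construction may differ.

```python
import numpy as np

# Small grid map; the goal cell, obstacle set, rewards, and the concrete
# potential function are illustrative assumptions for this sketch.
GAMMA = 0.9
H, W = 5, 5
GOAL = (4, 4)
OBSTACLES = {(2, 2), (2, 3)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def potential(cell):
    """Potential = estimated maximum discounted return from the cell:
    global maximum (1.0) at the goal, zero inside obstacles, decaying
    with Manhattan distance elsewhere (one plausible prior)."""
    if cell in OBSTACLES:
        return 0.0
    d = abs(cell[0] - GOAL[0]) + abs(cell[1] - GOAL[1])
    return GAMMA ** d

def reward(cell):
    """Immediate reward associated with a cell (illustrative values)."""
    if cell == GOAL:
        return 1.0
    if cell in OBSTACLES:
        return -1.0
    return 0.0

# Q0(s, a) = immediate reward at the current point plus the discounted
# potential of the successor point, one reading of the abstract's
# Q0 = r(s) + gamma * V(s'), instead of the usual all-zeros table.
Q0 = np.zeros((H, W, len(ACTIONS)))
for i in range(H):
    for j in range(W):
        for k, (di, dj) in enumerate(ACTIONS):
            ni = min(max(i + di, 0), H - 1)
            nj = min(max(j + dj, 0), W - 1)
            Q0[i, j, k] = reward((i, j)) + GAMMA * potential((ni, nj))

print(np.round(Q0.max(axis=2), 2))  # greedy values already slope toward the goal
```

Seeding the table with r(s) + γ·V(s') instead of zeros gives the early greedy policy a gradient toward the goal, which is what the abstract credits for the faster and more stable initial learning.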