Towards High Level Skill Learning: Learn to Return Table Tennis Ball Using Monte-Carlo Based Policy Gradient Method


Abstract

Deep learning has achieved great success in both visual and acoustic recognition and classification tasks. The accuracy of many state-of-the-art methods has surpassed that of human beings. However, in the field of robotics, it remains a big challenge for a real robot to master a high-level skill using deep learning methods, even though humans can easily learn such a task from demonstration, imitation, and practice. Compared to Go and Atari games, tasks of this kind are usually continuous in both state space and action space, which makes value-based reinforcement learning methods inapplicable. Making a robot learn to return a ball to a desired point in table tennis is a typical task of this kind. It would be a promising step if a robot could learn to play table tennis without exact knowledge of the models in this sport, just as human players do. In this paper, we consider this kind of motion decision skill learning, a one-step decision-making process, and give a Monte-Carlo based reinforcement learning method in the framework of Deep Deterministic Policy Gradient. We then apply this method to robotic table tennis and test it on two tasks. The first is to return balls to a desired point, and the second is to return balls to randomly selected landing points. The experimental results demonstrate that the trained policy can return balls with random motion states to both a designated point and randomly selected landing points with high accuracy.
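The abstract does not give the details of the method, so the following is only a minimal sketch of one way a Monte-Carlo critic can be combined with a deterministic policy gradient in a one-step decision problem. Everything in it is an assumption made for illustration: the toy dynamics (landing point = `s + a`), the `TARGET` value, the linear actor, the quadratic critic features, and all function names are hypothetical and are not taken from the paper. Because each episode lasts one step, the Monte-Carlo return is simply the observed reward, so the critic is fitted by plain regression with no bootstrapping.

```python
import numpy as np

TARGET = 0.5  # hypothetical desired landing point (assumption, not from the paper)


def reward(s, a):
    # Toy stand-in for the unknown ball/racket physics: landing point = s + a.
    # The learner never uses this formula directly, only the sampled rewards.
    return -(s + a - TARGET) ** 2


def critic_features(s, a):
    # Quadratic features so a linear critic can represent Q(s, a).
    return np.stack([np.ones_like(s), s, a, s * s, s * a, a * a], axis=1)


def train(iterations=300, batch=64, actor_lr=0.1, noise=0.3, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)  # deterministic actor: a = theta[0] + theta[1] * s
    states, actions, returns = [], [], []
    for _ in range(iterations):
        # One-step episodes: sample a state, act with exploration noise,
        # and observe the reward, which is the full Monte-Carlo return.
        s = rng.uniform(-1.0, 1.0, batch)
        a = theta[0] + theta[1] * s + noise * rng.standard_normal(batch)
        states.append(s)
        actions.append(a)
        returns.append(reward(s, a))

        # Critic: least-squares regression of Q(s, a) onto the stored returns.
        s_all = np.concatenate(states)
        a_all = np.concatenate(actions)
        r_all = np.concatenate(returns)
        w, *_ = np.linalg.lstsq(critic_features(s_all, a_all), r_all, rcond=None)

        # Actor: deterministic policy gradient, dQ/da at a = mu(s) times dmu/dtheta.
        mu = theta[0] + theta[1] * s
        dq_da = w[2] + w[4] * s + 2.0 * w[5] * mu
        grad = np.array([np.mean(dq_da), np.mean(dq_da * s)])
        theta += actor_lr * grad  # gradient ascent on the critic's value
    return theta, w


if __name__ == "__main__":
    theta, _ = train()
    print("learned actor parameters:", theta)
```

In this toy setting the best policy is `a = TARGET - s`, so the actor parameters should approach `(TARGET, -1)`. A real system would replace the linear actor and quadratic critic with neural networks and the toy reward with measured landing errors.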


Bibliographic details

Source
IEEE International Conference on Real-time Computing and Robotics | 2018 | pp. 34-41 | 8 pages
Conference location: Kandima (MV)
Authors
Yifeng Zhu; Yongsheng Zhao; Lisen Jin; Jun Wu; Rong Xiong
Author affiliations

Zhejiang University State Key Laboratory of Industrial Control and Technology Hangzhou 310027 P. R. China;

Binhai Industrial Technology Research Institute of Zhejiang University Tian 300457 P. R. China;

Zhejiang University State Key Laboratory of In;

Original format: PDF
Keywords
Robots; Reinforcement learning; Sports; Task analysis; Games; Monte Carlo methods; Decision making;


