AlphaSeq: Sequence Discovery With Deep Reinforcement Learning

Shao Yulin; Liew Soung Chang; Wang Taotao

Abstract

Sequences play an important role in many applications and systems. Discovering sequences with desired properties has long been an interesting intellectual pursuit. This article puts forth a new paradigm, AlphaSeq, to discover desired sequences algorithmically using deep reinforcement learning (DRL) techniques. AlphaSeq treats the sequence discovery problem as an episodic symbol-filling game, in which a player fills symbols into the vacant positions of a sequence set, one after another, during an episode of the game. Each episode ends with a completely filled sequence set, upon which a reward is given based on the desirability of the sequence set. AlphaSeq models the game as a Markov decision process (MDP) and adapts the DRL framework of AlphaGo to solve the MDP. The sequences discovered improve progressively as AlphaSeq, starting as a novice, learns to become an expert game player through many episodes of game playing. Compared with traditional sequence construction by mathematical tools, AlphaSeq is particularly suitable for problems with complex objectives that are intractable to mathematical analysis. We demonstrate the searching capabilities of AlphaSeq in two applications: 1) AlphaSeq successfully rediscovers a set of ideal complementary codes that can zero-force all potential interference in multi-carrier code-division multiple access (MC-CDMA) systems and 2) AlphaSeq discovers new sequences that triple the signal-to-interference ratio, benchmarked against the well-known Legendre sequence, of a mismatched filter (MMF) estimator in pulse compression radar systems.
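The symbol-filling game described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the binary alphabet {-1, +1}, the uniform-random policy (a stand-in for the AlphaGo-style DRL player), and the sidelobe-based terminal reward are all hypothetical choices made for the example.

```python
import random

def fill_episode(num_seqs, length, policy, symbols=(-1, 1)):
    """Play one episode: fill every vacant position, one symbol per step."""
    state = [[None] * length for _ in range(num_seqs)]  # None marks a vacancy
    for k in range(num_seqs):
        for n in range(length):
            # Action = the symbol chosen for the next vacant position.
            state[k][n] = policy(state, symbols)
    return state  # terminal state: a completely filled sequence set

def random_policy(state, symbols):
    """Novice player (assumption): ignores the state, picks uniformly at random."""
    return random.choice(symbols)

def peak_sidelobe(seq):
    """Largest off-peak magnitude of the periodic autocorrelation of seq."""
    n = len(seq)
    return max(
        abs(sum(seq[i] * seq[(i + shift) % n] for i in range(n)))
        for shift in range(1, n)
    )

def reward(seq_set):
    """Hypothetical terminal reward: lower worst-case sidelobe scores higher."""
    return -max(peak_sidelobe(s) for s in seq_set)

random.seed(0)
episode = fill_episode(num_seqs=2, length=7, policy=random_policy)
print(episode, reward(episode))
```

In this sketch the reward is only available once the set is completely filled, mirroring the episodic structure in the abstract; a learned policy would replace `random_policy` and use the partially filled state to choose each symbol.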

Bibliographic record

Source
IEEE Transactions on Neural Networks and Learning Systems | 2020, Issue 9 | pp. 3319-3333 | 15 pages
Authors
Shao Yulin; Liew Soung Chang; Wang Taotao
Affiliations

Chinese University of Hong Kong, Department of Information Engineering, Hong Kong, People's Republic of China;

Chinese University of Hong Kong, Department of Information Engineering, Hong Kong, People's Republic of China;

Chinese University of Hong Kong, Department of Information Engineering, Hong Kong, People's Republic of China | Shenzhen University, College of Information Engineering, Shenzhen 518061, People's Republic of China;

Original format: PDF
Language: English
Keywords
Games; Radar; Tools; Multiaccess communication; Machine learning algorithms; Approximation algorithms; Learning systems; AlphaGo; deep reinforcement learning (DRL); Monte Carlo tree search (MCTS); multi-carrier code-division multiple access (MC-CDMA); pulse compression radar;
