Journal: Neural Networks and Learning Systems, IEEE Transactions on
AlphaSeq: Sequence Discovery With Deep Reinforcement Learning

Abstract

Sequences play an important role in many applications and systems. Discovering sequences with desired properties has long been an interesting intellectual pursuit. This article puts forth a new paradigm, AlphaSeq, to discover desired sequences algorithmically using deep reinforcement learning (DRL) techniques. AlphaSeq treats the sequence discovery problem as an episodic symbol-filling game, in which a player fills symbols into the vacant positions of a sequence set, one after another, during an episode of the game. Each episode ends with a completely filled sequence set, upon which a reward is given based on the desirability of the sequence set. AlphaSeq models the game as a Markov decision process (MDP) and adapts the DRL framework of AlphaGo to solve the MDP. The sequences discovered improve progressively as AlphaSeq, starting as a novice, learns to become an expert game player through many episodes of game playing. Compared with traditional sequence construction by mathematical tools, AlphaSeq is particularly suitable for problems with complex objectives that are intractable to mathematical analysis. We demonstrate the searching capabilities of AlphaSeq in two applications: 1) AlphaSeq successfully rediscovers a set of ideal complementary codes that can zero-force all potential interference in multi-carrier code-division multiple access (CDMA) systems and 2) AlphaSeq discovers new sequences that triple the signal-to-interference ratio (benchmarked against the well-known Legendre sequence) of a mismatched filter (MMF) estimator in pulse compression radar systems.
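
The symbol-filling game described in the abstract can be illustrated with a small sketch. This is not the authors' implementation, and several things are assumptions made only for illustration: the game fills a single binary (+1/-1) sequence rather than a sequence set, the reward is a hypothetical metric (the negative peak aperiodic autocorrelation sidelobe), and a uniformly random player stands in for the AlphaGo-style DRL agent. The names `SymbolFillingGame`, `reward`, and `random_policy_search` are hypothetical.

```python
import random


def reward(seq):
    """Hypothetical desirability metric (an assumption, not the paper's):
    negative peak magnitude of the aperiodic autocorrelation sidelobes."""
    n = len(seq)
    peak = max(
        (abs(sum(seq[i] * seq[i + k] for i in range(n - k))) for k in range(1, n)),
        default=0,
    )
    return -peak


class SymbolFillingGame:
    """Episodic game: the state is the partially filled sequence, and an
    action places one symbol (+1 or -1) in the next vacant position."""

    def __init__(self, length):
        self.length = length
        self.state = []

    def reset(self):
        self.state = []
        return tuple(self.state)

    def step(self, symbol):
        if symbol not in (-1, 1):
            raise ValueError("symbol must be +1 or -1")
        if len(self.state) >= self.length:
            raise RuntimeError("episode already finished; call reset()")
        self.state.append(symbol)
        done = len(self.state) == self.length
        # The reward is given only once the sequence is completely filled.
        r = reward(self.state) if done else 0
        return tuple(self.state), r, done


def random_policy_search(length, episodes, seed=0):
    """Play episodes with a uniformly random player (a stand-in for the
    learned policy) and keep the best completed sequence."""
    rng = random.Random(seed)
    game = SymbolFillingGame(length)
    best_seq, best_reward = None, float("-inf")
    for _ in range(episodes):
        game.reset()
        done = False
        while not done:
            state, r, done = game.step(rng.choice((-1, 1)))
        if r > best_reward:
            best_seq, best_reward = state, r
    return best_seq, best_reward


if __name__ == "__main__":
    seq, r = random_policy_search(length=13, episodes=2000)
    print(seq, r)
```

In this sketch the state, the action, and the terminal reward map directly onto the MDP elements named in the abstract. A learning agent would replace the `rng.choice` call with a policy that improves over episodes.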