首页> 外文学位 >Learning Regular Languages and Automaton Graphs.
【24h】

Learning Regular Languages and Automaton Graphs.

机译:学习常规语言和自动机图。

获取原文
获取原文并翻译 | 示例

摘要

Learning regular languages has long been a fundamental topic in computational learning theory. In this thesis, we present our contributions to exploring the learnability of regular languages and their representation class, deterministic finite automata (DFAs).;To study the learnability of regular languages in the context of machine learning, we first need to understand how humans learn and acquire a language. We consider a society which consists of n people (or agents), where pairs of individuals are drawn uniformly at random to interact. Each individual has a confidence level for a grammar and a more confident person supports the grammar with higher probability. A person increases her confidence level when interacting with another person supporting the grammar, and decreases her confidence level otherwise. We prove that with high probability the three-state binary signaling process reaches consensus after &THgr;(n log n) interactions in the worst case, regardless of the initial configuration. In the general case, the continuous-time binary signaling process in the limit will converge within O(r log nr) time (corresponding to O( nr log nr) interactions in expectation) if the initial configuration is monotone, where r is the number of confidence levels. In the other direction, we also show a convergence lower bound O (nr + n log n) on the number of interactions when r is large.;The class of shuffle ideals is an important sub-family of regular languages. The shuffle ideal generated by a string set U is the collection of all strings containing some string u epsilon U as a (not necessarily contiguous) subsequence. We study the PAC learnability of shuffle ideals and present positive results on this learning problem under element-wise independent and identical distributions and Markovian distributions in the statistical query model. A constrained generalization to learning shuffle ideals under product distributions is also provided. In the empirical direction, we propose a heuristic algorithm for learning shuffle ideals from given labeled strings under general unrestricted distributions.;As a representation class of regular languages, DFAs are one of the most elementary computational models in the study of computer science. We study the learnability of a random DFA and propose a computationally efficient algorithm for learning and recovering a random DFA from uniform input strings and state information in the statistical query model. A random DFA is uniformly generated: for each state-symbol pair (q epsilon Q, sigma epsilon Sigma), we choose a state q' epsilon Q with replacement uniformly and independently at random and let ϕ(q, sigma) = q', where Q is the state space, Sigma is the alphabet and ϕ is the transition function. The given data are string-state pairs (&khgr;, q') where &khgr; is a string drawn uniformly at random and q is the state of the DFA reached on input &khgr; starting from the start state q0. A theoretical guarantee on the absolute error of the algorithm in the statistical query model is presented.;Given that automaton graphs are out-regular, we generalize our DFA learning algorithm to learning random regular graphs in the statistical query model from random paths. In 'a standard label-guided graph exploration setting, the edges incident from a node in the graph have distinct local labels. The input data to the statistical query oracle are path-vertex pairs (&khgr;, upsilon) where &khgr; is a random uniform path (a random sequence of edge labels) and upsilon is the vertex of the graph reached on the path x starting from a particular start vertex upsilon&ogr;.
机译:长期以来,学习常规语言一直是计算学习理论的基本主题。在这篇论文中,我们提出了对探索常规语言及其表示形式的类,确定性有限自动机(DFA)的可学习性的贡献。;要在机器学习的背景下研究常规语言的可学习性,我们首先需要了解人类如何学习并获得一种语言。我们考虑一个由n个人(或代理人)组成的社会,其中成对的个体随机地均匀地画画以进行交互。每个人都有一个语法的置信度,一个更有信心的人会以较高的概率支持该语法。一个人在与另一个支持语法的人互动时会增加其置信度,否则会降低其置信度。我们证明,在最坏的情况下,无论初始配置如何,三态二进制信令过程在最差情况下经过(n log n)交互作用后都达到共识。在一般情况下,如果初始配置是单调的,则极限中的连续时间二进制信令过程将在O(r log nr)时间内收敛(与预期的O(nr log nr)交互作用相对应),其中r是数字信心水平。在另一个方向上,当r较大时,我们还显示了交互次数上的收敛下界O(nr + n log n)。混洗理想的一类是常规语言的重要子族。字符串集U产生的随机混合理想是所有包含某个字符串u epsilon U(不一定是连续的)子序列的所有字符串的集合。我们研究了随机理想的PAC可学习性,并在统计查询模型中的元素独立且相同的分布和Markovian分布下,对该学习问题给出了积极的结果。还提供了一种约束泛化,用于学习产品分布下的混洗理想。在经验方向上,我们提出了一种启发式算法,用于在一般无限制分布下从给定的标记字符串中学习混洗理想。;作为常规语言的表示类,DFA是计算机科学研究中最基本的计算模型之一。我们研究了随机DFA的可学习性,并提出了一种计算有效的算法,用于从统计查询模型中的统一输入字符串和状态信息中学习和恢复随机DFA。均匀生成一个随机DFA:对于每个状态符号对(q epsilon Q,sigma epsilon Sigma),我们选择一个状态q'epsilon Q,该值均匀且独立地随机地替换,并令(q,sigma)= q' ,其中Q是状态空间,Sigma是字母,ϕ是过渡功能。给定的数据是字符串状态对(&khgr ;, q'),其中&khgr;是随机均匀绘制的字符串,q是在输入&khgr;上达到的DFA的状态。从起始状态q0开始。给出了统计查询模型中算法绝对误差的理论保证。鉴于自动机图是不规则的,我们将DFA学习算法推广到从随机路径中学习统计查询模型中的随机正则图。在标准的标签引导图探索设置中,从图中的节点入射的边具有不同的局部标签。统计查询oracle的输入数据是路径-顶点对(&khgr ;, upsilon),其中&khgr;是一个随机的均匀路径(边缘标签的一个随机序列),upsilon是从特定起始顶点upsilon&ogr;开始在路径x上到达的图形的顶点。

著录项

  • 作者

    Chen, Dongqu.;

  • 作者单位

    Yale University.;

  • 授予单位 Yale University.;
  • 学科 Computer science.
  • 学位 Ph.D.
  • 年度 2016
  • 页码 226 p.
  • 总页数 226
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号