首页> 外文学位 >Efficient non-detministic search in structured prediction: A case study on syntactic parsing.

【24h】

Efficient non-detministic search in structured prediction: A case study on syntactic parsing.

机译：结构化预测中的高效非确定性搜索：以句法分析为例。

获取原文

获取原文并翻译 | 示例

页面导航

摘要
著录项
相似文献
相关主题

摘要

Non-determinism occurs naturally in many search-based machine learning and natural language processing (NLP) problems. For example, the goal of parsing is to construct the syntactic tree structure of a sentence given a grammar. Agenda-based parsing is a dynamic programming approach to find the most likely syntactic tree of a sentence according to a probabilistic grammar. A chart is used to maintain all the possible subtrees for different spans in the sentence and an agenda is used to rank all the constituents. The parser chooses only one constituent from the agenda per step. Non-determinism occurs naturally in agenda-based parsing since the new constituent is often built by combining items from a few steps earlier.;Unfortunately, like most other problems in NLP, the size of the search space is huge and exhaustive search is impossible. However, users expect a fast and accurate system. In this dissertation, I focus on the question of ``Why, when, and how shall we take advantage of non-determinism?'' and show its efficacy to improve the parser in terms of speed and/or accuracy. Existing approaches like search-based imitation learning or reinforcement learning methods have different limitations when it comes to a large NLP system. The solution proposed in this dissertation is ``We should train the system non-deterministically and test it deterministically if possible.'' and I also show that ``it is better to learn with oracles than simple heuristics''.;We start by solving a generic Markov Decision Process with a non-deterministic agent. We show its theoretical convergence guarantees and verify its efficiency on maze solving problems. Then we focus on agenda-based parsing. To re-prioritize the parser, we model a decoding problem as a Markov Decision Process with a large state/action space. We discuss the advantages/disadvantages of existing techniques and propose a hybrid reinforcement/apprenticeship learning algorithm to trade off speed and accuracy. We also propose to use a dynamic pruner with features that depend on the run-time status of the chart and agenda and analyze the importance of those features in the pruning classification. Our models show comparable results with respect to start-of-the-art strategies.

机译：在许多基于搜索的机器学习和自然语言处理（NLP）问题中，非确定性自然发生。例如，解析的目的是构造给定语法的句子的句法树结构。基于议程的解析是一种动态编程方法，可以根据概率语法找到句子中最有可能的句法树。图表用于维护句子中不同跨度的所有可能的子树，而议程则用于排列所有成分。解析器每步仅从议程中选择一个构成要素。非确定性在基于议程的分析中很自然地发生，因为新的组成部分通常是通过合并前几个步骤中的项来构建的。不幸的是，与NLP中的大多数其他问题一样，搜索空间的大小很大，不可能进行详尽的搜索。但是，用户希望有一个快速而准确的系统。在本文中，我重点关注``为什么，何时以及如何利用非确定性的优势？''这一问题，并从速度和/或准确性方面展示了其提高解析器的功效。对于大型NLP系统，现有的基于搜索的模仿学习或强化学习方法等方法具有不同的局限性。本文提出的解决方案是：``我们应该对系统进行非确定性的训练，并在可能的情况下进行确定性的测试。''我还表明，``与oracle一起学习比简单的启发式更好''。使用非确定性主体解决通用的马尔可夫决策过程。我们显示了其理论收敛性保证，并验证了其在解决迷宫问题上的效率。然后，我们专注于基于议程的解析。为了重新确定解析器的优先级，我们将解码问题建模为具有较大状态/动作空间的马尔可夫决策过程。我们讨论了现有技术的优点/缺点，并提出了一种混合的强化/学徒学习算法来权衡速度和准确性。我们还建议使用动态修剪器，其功能取决于图表和议程的运行时状态，并分析这些功能在修剪分类中的重要性。我们的模型在最新战略方面显示出可比的结果。

著录项

作者
Jiang, Jiarong.;
展开▼
作者单位

University of Maryland, College Park.;

展开▼
授予单位 University of Maryland, College Park.;
学科 Computer Science.;Artificial Intelligence.
学位 Ph.D.
年度 2014
页码 180 p.
总页数 180
原文格式 PDF
正文语种 eng
中图分类
关键词

相似文献

外文文献
中文文献
专利

1. The role of the IFG and pSTS in syntactic prediction: Evidence from a parametric study of hierarchical structure in fMRI [J] . Matchin William, Hammerly Christopher, Lau Ellen Cortex: A Journal Devoted to the Study of the Nervous System and Behavior . 2017,第期

机译：IFG和PST在句法预测中的作用：来自FMRI中分层结构的参数研究的证据
2. Sequential search leads to faster, more efficient fragment-based de novo protein structure prediction [J] . de Oliveira Saulo H. P., Law Eleanor C., Shi Jiye, Bioinformatics . 2018,第7期

机译：顺序搜索导致更快，更高效的基于碎片的de novo蛋白质结构预测
3. Efficient HPLC method development using structure-based database search, physico-chemical prediction and chromatographic simulation [J] . Wang Lin, Zheng Jinjian, Gong Xiaoyi, Journal of Pharmaceutical and Biomedical Analysis: An International Journal on All Drug-Related Topics in Pharmaceutical, Biomedical and Clinical Analysis . 2015,第Null期

机译：使用基于结构的数据库搜索，理化预测和色谱模拟进行高效HPLC方法开发
4. A prediction study on e-commerce sales based on structure time series model and web search data [C] . Wei Dai, Geng Peng, Ying Liu, Chinese Control and Decision Conference . 2014

机译：基于结构时间序列模型和网络搜索数据的电子商务销售预测研究
5. Theoretical studies of proteins and protein-ligand complexes: Part I. Toward the development of an efficient all-atom implicit-solvent model for use in protein structure prediction. Part II. Molecular dynamics of studies of trypsin-ligand complexes. [D] . Guvench, Olgun. 2005

机译：蛋白质和蛋白质-配体配合物的理论研究：第一部分。建立用于蛋白质结构预测的有效全原子隐式溶剂模型。第二部分胰蛋白酶-配体复合物的分子动力学研究。
6. Sequential search leads to faster more efficient fragment-based de novo protein structure prediction [O] . Saulo H P de Oliveira, Eleanor C Law, Jiye Shi, -1

机译：顺序搜索导致更快更有效的基于片段的从头蛋白质结构预测
7. Multi-objectivization, fitness landscape transformation and search performance: A case of study on the hp model for protein structure prediction [O] . Mario Garza-Fabre, Gregorio Toscano-Pulido, Eduardo Rodriguez-Tello 2015

机译：多象化，健身景观转换和搜索性能：蛋白质结构预测HP模型研究
8. In Search of a Time Efficient Approach to Crack and Delamination Growth Predictions in Composites. [R] . Krueger, R., Carvalho, N. 2016

机译：寻找一种时间有效的方法来解决复合材料中的裂纹和分层增长预测问题。

Efficient non-detministic search in structured prediction: A case study on syntactic parsing.

摘要

著录项

相似文献

相关主题

期刊订阅