Models for improved tractability and accuracy in dependency parsing.

Abstract

Automatic syntactic analysis of natural language is one of the fundamental problems in natural language processing. Dependency parses (directed trees in which edges represent the syntactic relationships between the words in a sentence) have been found to be particularly useful for machine translation, question answering, and other practical applications.

For English dependency parsing, we show that models and features compatible with how conjunctions are represented in treebanks yield a parser with state-of-the-art overall accuracy and substantial improvements in the accuracy of conjunctions.

For languages other than English, dependency parsing has often been formulated as either searching over trees without any crossing dependencies (projective trees) or searching over all directed spanning trees. The former sacrifices the ability to produce many natural language structures; the latter is NP-hard in the presence of features with scopes over siblings or grandparents in the tree.

This thesis explores alternative ways to simultaneously produce crossing dependencies in the output and use models that parametrize over multiple edges.

This thesis introduces gap inheritance, a property that quantifies the nesting of subtrees over intervals. The thesis provides O(n^6) and O(n^5) edge-factored parsing algorithms for two new classes of trees based on this property, and extends the latter to include grandparent factors.

This thesis then defines 1-Endpoint-Crossing trees, in which for any edge that is crossed, all other edges that cross that edge share an endpoint. This property covers 95.8% or more of dependency parses across a variety of languages. A crossing-sensitive factorization introduced in this thesis generalizes a commonly used third-order factorization (capable of scoring triples of edges simultaneously).

This thesis provides exact dynamic programming algorithms that find the optimal 1-Endpoint-Crossing tree under either an edge-factored model or this crossing-sensitive third-order model in O(n^4) time, orders of magnitude faster than other mildly non-projective parsing algorithms and identical to the parsing time for projective trees under the third-order model. The implemented parser is significantly more accurate than the third-order projective parser under many experimental settings and significantly less accurate under none.
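The two structural properties named in the abstract, projectivity and 1-Endpoint-Crossing, can be illustrated with a small brute-force checker. This is only a sketch under stated assumptions, not the thesis's O(n^4) parsing algorithm: it tests whether a given tree has the property rather than searching for an optimal tree. The head-array encoding (heads[i] is the head of word i+1, with 0 as an artificial root), the decision to count root edges when testing for crossings, and all function names are assumptions made for illustration.

```python
from itertools import combinations


def edges_from_heads(heads):
    """Return undirected edges (left, right) for a head array.

    Assumed encoding: heads[i] is the head of word i+1; 0 is an artificial root.
    """
    return [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]


def crosses(e, f):
    """Two edges cross if exactly one endpoint of f lies strictly inside e."""
    (a, b), (c, d) = e, f
    return a < c < b < d or c < a < d < b


def is_projective(heads):
    """A tree is projective if no pair of its edges crosses."""
    edges = edges_from_heads(heads)
    return not any(crosses(e, f) for e, f in combinations(edges, 2))


def is_one_endpoint_crossing(heads):
    """For every crossed edge, all edges crossing it must share one endpoint."""
    edges = edges_from_heads(heads)
    for e in edges:
        crossing = [f for f in edges if crosses(e, f)]
        if not crossing:
            continue
        # Intersect the endpoint sets of every edge that crosses e.
        common = set(crossing[0])
        for f in crossing[1:]:
            common &= set(f)
        if not common:
            return False
    return True


if __name__ == "__main__":
    for heads in ([2, 0, 2], [3, 4, 0, 3], [3, 5, 0, 6, 3, 5]):
        print(heads, is_projective(heads), is_one_endpoint_crossing(heads))
```

In the second example tree, the edge between words 2 and 4 is crossed by two edges that both end at word 3, so the tree is non-projective yet still 1-Endpoint-Crossing. In the third, the edge between words 2 and 5 is crossed by edges 1-3 and 4-6, which share no endpoint, so the property fails.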

Bibliographic details

Author: Pitler, Emily
Author affiliation: University of Pennsylvania
Degree-granting institution: University of Pennsylvania
Subject: Computer science
Degree: Ph.D.
Year: 2013
Pages: 166
Original format: PDF
Language: English (eng)
