Pronominal anaphora resolution in Chinese.

Abstract

Resolving pronominal anaphors in English has been a focus of research in natural language processing for decades. Methods ranging from linguistics-oriented, rule-based approaches to data-oriented, machine-learning approaches have been applied to the problem of finding the antecedents of pronouns.

In contrast to the abundance of research on English, there is almost no work on the problem in Chinese. This thesis addresses that gap.

Both a rule-based and a machine-learning approach to anaphora resolution are presented in this work. An important difference between the two languages is that Chinese, unlike English, is a pro-drop language and has null (zero) pronouns. The rule-based approach is applied to resolving these null pronouns as well as the overt, third-person pronouns.

The Hobbs algorithm is used for the rule-based method of anaphora resolution. Three versions of the algorithm are presented. The first uses only syntactic structure to select an antecedent. The second uses limited number and gender agreement, while the third incorporates semantic constraints on the proposed antecedents.

For the machine-learning method, supervised maximum entropy models are used. Different models were trained using sets of features that paralleled the information sources used by the different versions of the Hobbs algorithm.

Two sets of data were used. The Penn Chinese Treebank (CTB) provided the test data for resolution of both overt, third-person pronouns and zero pronouns. The CTB parses were annotated for coreference using guidelines that were drawn up for the work presented here. Data annotated for the 2004 Chinese ACE program were used for training and testing the maximum entropy models to find the antecedents of overt, third-person pronouns.

The results from experiments with the two basic methods, using the different levels of linguistic information, are presented and discussed.
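The difference between the three levels of information described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation and not the Hobbs tree walk itself: it assumes a candidate list that has already been ordered by some syntactic search, and the `Mention` class, the `select_antecedent` function, the `level` names, and the use of animacy as a stand-in semantic constraint are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention record; None means the value is unknown."""
    text: str
    number: Optional[str] = None    # "sg" or "pl"
    gender: Optional[str] = None    # "m" or "f"
    animate: Optional[bool] = None  # stand-in for a semantic constraint


def agrees(a, b) -> bool:
    """Unknown values are treated as compatible with anything."""
    return a is None or b is None or a == b


def select_antecedent(pronoun: Mention,
                      candidates: List[Mention],
                      level: str = "syntax") -> Optional[Mention]:
    """Return the first candidate that passes the checks for `level`.

    `candidates` is assumed to be ordered by a syntactic search
    (that ordering step is not implemented here).
    """
    if level not in ("syntax", "agreement", "semantic"):
        raise ValueError(f"unknown level: {level}")
    for cand in candidates:
        # "agreement" and "semantic" both apply number/gender agreement.
        if level != "syntax":
            if not (agrees(pronoun.number, cand.number)
                    and agrees(pronoun.gender, cand.gender)):
                continue
        # "semantic" adds the (illustrative) animacy constraint.
        if level == "semantic" and not agrees(pronoun.animate, cand.animate):
            continue
        return cand
    return None


# Toy example: the pronoun 她 ("she") with three ordered candidates.
pronoun = Mention("她", number="sg", gender="f", animate=True)
candidates = [
    Mention("李先生", number="sg", gender="m", animate=True),   # "Mr. Li"
    Mention("报告", number="sg", animate=False),                # "report"
    Mention("王女士", number="sg", gender="f", animate=True),   # "Ms. Wang"
]

for level in ("syntax", "agreement", "semantic"):
    chosen = select_antecedent(pronoun, candidates, level)
    print(level, chosen.text if chosen else None)
```

In this toy example each level selects a different candidate, which shows why adding information sources can change the proposed antecedent. Real agreement and semantic checks for Chinese would need to handle far more unknown values than this sketch does.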

Bibliographic details

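Author: Converse, Susan P.
Author affiliation: University of Pennsylvania
Degree-granting institution: University of Pennsylvania
Discipline: Computer Science
Degree: Ph.D.
Year: 2006
Pages: 140 p.
Total pages: 140
Original format: PDF
Language of text: English
Chinese Library Classification: Automation technology, computer technology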

