Dropped personal pronoun recovery in Chinese SMS

CHRIS GIANNELLA; RANSOM WINDER; STACY PETERSEN


Abstract

In written Chinese, personal pronouns are commonly dropped when they can be inferred from context. This practice is particularly common in informal genres like Short Message Service messages sent via cell phones. Restoring dropped personal pronouns can be a useful preprocessing step for information extraction. Dropped personal pronoun recovery can be divided into two subtasks: (1) detecting dropped personal pronoun slots and (2) determining the identity of the pronoun for each slot. We address a simpler version of restoring dropped personal pronouns wherein only the person numbers are identified. After applying a word segmenter, we used a linear-chain conditional random field to predict which words were at the start of an independent clause. Then, using the independent clause start information, as well as lexical and syntactic information, we applied a conditional random field or a maximum-entropy classifier to predict whether a dropped personal pronoun immediately preceded each word and, if so, the person number of the dropped pronoun. We conducted a series of experiments using a manually annotated corpus of Chinese Short Message Service messages. Our approaches substantially outperformed a rule-based approach based partially on rules developed by Chung and Gildea (2010, Effects of Empty Categories on Machine Translation. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics. pp. 636-45). Our approaches also outperformed (though by a considerably smaller margin) a machine-learning approach based closely on work by Yang, Liu, and Xue (2015, Recovering Dropped Pronouns from Chinese Text Messages. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL). Association for Computational Linguistics).
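The task framing in the abstract can be illustrated with a small sketch: each segmented word receives a label saying whether a personal pronoun was dropped immediately before it and, if so, its person number. This is not the authors' implementation. The label names ("NONE", "2SG"), the feature names, the helper functions, and the example message are all hypothetical; the clause-start flags are supplied by hand where the paper uses a linear-chain conditional random field; and the syntactic features and the classifier itself are omitted.

```python
from typing import Dict, List

NO_DROP = "NONE"  # hypothetical label meaning "no pronoun dropped before this word"


def encode_labels(words: List[str], drops: Dict[int, str]) -> List[str]:
    """Give each word the person number of the pronoun dropped right before it.

    `drops` maps a word index to a person-number tag such as "2SG" (assumed tag set).
    """
    for index in drops:
        if not 0 <= index < len(words):
            raise ValueError(f"drop index {index} is outside the message")
    return [drops.get(i, NO_DROP) for i in range(len(words))]


def word_features(words: List[str], clause_starts: List[bool], i: int) -> Dict[str, object]:
    """Build a feature dict for word i from lexical context and the clause-start flag."""
    return {
        "word": words[i],
        "prev_word": words[i - 1] if i > 0 else "<BOS>",
        "next_word": words[i + 1] if i + 1 < len(words) else "<EOS>",
        "is_clause_start": clause_starts[i],
    }


def restore(words: List[str], labels: List[str], pronouns: Dict[str, str]) -> List[str]:
    """Insert a surface pronoun before every word whose label marks a drop."""
    restored: List[str] = []
    for word, label in zip(words, labels):
        if label != NO_DROP:
            restored.append(pronouns[label])
        restored.append(word)
    return restored


words = ["吃饭", "了", "吗"]           # segmented message, roughly "(you) eaten yet?"
clause_starts = [True, False, False]  # hand-written stand-in for the clause-start predictions
labels = encode_labels(words, {0: "2SG"})
features = [word_features(words, clause_starts, i) for i in range(len(words))]

print(labels)
print(restore(words, labels, {"2SG": "你"}))
```

Feature dicts of this shape are what a conditional random field or maximum-entropy classifier would typically consume, one per word. The final `restore` step, which maps a person-number tag to a concrete pronoun, goes beyond the simplified task described above and is included only to make the labels easier to read.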


Bibliographic record

Source
Natural Language Engineering | 2017, Issue 6 | pp. 905-927 | 23 pages
Authors
CHRIS GIANNELLA; RANSOM WINDER; STACY PETERSEN
Affiliation
Department of Human Language Technology, The MITRE Corporation, 7515 Colshire Drive, McLean, VA, 22102, USA (listed for each of the three authors)

Original format: PDF
Language: English

