Automatically Extracting Variant-Normalization Pairs for Japanese Text Normalization

Abstract

Social media texts, such as tweets from Twitter, contain many types of nonstandard tokens, and the number of normalization approaches for handling such noisy text has been increasing. We present a method for automatically extracting pairs of a variant word and its normal form from unsegmented text on the basis of a pair-wise similarity approach. We incorporated the acquired variant-normalization pairs into Japanese morphological analysis. The experimental results show that our method can extract widely covered variants from large Twitter data and improve the recall of normalization without degrading the overall accuracy of Japanese morphological analysis.
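The abstract does not say which similarity measure or candidate-generation step the authors use, so the following is only a rough sketch of the general idea of pair-wise similarity matching, not the paper's method. It assumes candidate strings have already been pulled out of the text (the paper works on unsegmented text, which this sketch does not attempt), uses Python's `difflib.SequenceMatcher` as a stand-in character-level similarity, and relies on a hypothetical toy lexicon, function names, and threshold chosen purely for illustration.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1].

    Stand-in measure (difflib ratio); the paper's actual similarity
    function is not given in the abstract.
    """
    return SequenceMatcher(None, a, b).ratio()


def extract_pairs(candidates, lexicon, threshold=0.7):
    """Pair each out-of-lexicon candidate with its most similar normal form.

    `candidates`, `lexicon`, and `threshold` are illustrative assumptions.
    Returns a list of (variant, normal_form, score) tuples.
    """
    pairs = []
    for cand in candidates:
        if cand in lexicon:
            continue  # already a normal form, nothing to normalize
        best = max(lexicon, key=lambda w: char_similarity(cand, w))
        score = char_similarity(cand, best)
        if score >= threshold:
            pairs.append((cand, best, score))
    return pairs


# Hypothetical toy data: standard forms and noisy social-media spellings.
lexicon = ["すごい", "かわいい", "ありがとう"]
candidates = ["すごーい", "かわいぃ", "ありがとう", "ねこ"]

pairs = extract_pairs(candidates, lexicon)
for variant, normal, score in pairs:
    print(f"{variant} -> {normal} ({score:.2f})")
```

In this toy setup, a lengthened spelling such as すごーい is paired with すごい, while an unrelated word such as ねこ falls below the threshold and is left alone. A real system would need a much larger lexicon and a similarity measure tuned to Japanese orthographic variation.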

Bibliographic details

Source: International Joint Conference on Natural Language Processing | 2017 | xxxv p. 516-1035 | 10 pages
Authors: Itsumi Saito; Kyosuke Nishida; Kugatsu Sadamitsu; Kuniko Saito; Junji Tomita
Original format: PDF
Chinese Library Classification: Programming, software engineering

