A Novel Schema-Oriented Approach for Chinese New Word Identification

Abstract

With the popularity of network applications, new words are becoming more common and degrade the performance of natural language processing applications, including web search. Identifying new words automatically from text remains a very challenging problem, especially for Chinese. In this paper, we propose a novel schema-oriented approach for Chinese new word identification (named "ChNWI"). The approach has three main steps: (1) we suggest three composition schemas that cover nearly all Chinese word surfaces of two to four characters; (2) we employ a support vector machine (SVM) to classify Chinese new words of the three schemas using their unique linguistic characteristics; and (3) we design various rules to filter the identified Chinese new words of the three schemas. Our extensive evaluations on two corpora (Chinese news titles and CIPS-SIGHAN 2012 CSMB) show ChNWI's efficiency on Chinese new word identification.
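
The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the three schemas, the linguistic features, or the filter rules, so everything named below is a hypothetical placeholder. Candidate generation simply enumerates all two- to four-character strings, the SVM step is represented only by a linear decision function with hand-picked weights (a real system would learn them from labeled data), and the rules are a toy lexicon check plus a function-character boundary check.

```python
from collections import Counter

# Hypothetical placeholders: the abstract does not give ChNWI's actual
# schemas, features, weights, or rules.
KNOWN_WORDS = {"电影", "音乐"}        # stand-in for an existing lexicon
FUNCTION_CHARS = set("的了是在和")    # stand-in boundary rule
WEIGHTS, BIAS = (1.0, 0.0), -1.5      # stand-in for a trained linear SVM


def is_cjk(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def extract_candidates(text, min_len=2, max_len=4):
    """Step 1 (simplified): count every 2- to 4-character Chinese substring."""
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(text) - n + 1):
            gram = text[i:i + n]
            if all(is_cjk(ch) for ch in gram):
                counts[gram] += 1
    return counts


def svm_decision(features):
    """Step 2 (simplified): linear decision value w.x + b."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


def passes_rules(word):
    """Step 3 (simplified): drop known words and function-character edges."""
    return (
        word not in KNOWN_WORDS
        and word[0] not in FUNCTION_CHARS
        and word[-1] not in FUNCTION_CHARS
    )


def identify_new_words(text):
    """Run the three simplified steps and return accepted candidates, sorted."""
    accepted = []
    for word, freq in extract_candidates(text).items():
        features = (float(freq), float(len(word)))  # placeholder features
        if svm_decision(features) > 0 and passes_rules(word):
            accepted.append(word)
    return sorted(accepted)


if __name__ == "__main__":
    print(identify_new_words("给力的电影，给力的音乐"))
```

With these placeholder weights, only candidates that occur at least twice get a positive decision value, and the boundary rule then removes strings that start or end with a function character such as 的.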

Bibliographic details

Source
Pacific Asia Conference on Language, Information and Computation | 2013 | pp. 108-117 | 10 pages
Authors
Zhao Lu; Zhixian Yan; Junzhong Gu
Original format: PDF

Similar literature

1. Writer identification approach by holistic graphometric features using off-line handwritten words [J]. Vasquez Jose L., Ravelo-Garcia Antonio G., Alonso Jesus B. Neural Computing & Applications. 2020, No. 20.
2. Writer identification approach based on bag of words with OBI features [J]. Durou Amal, Aref Ibrahim, Al-Maadeed Somaya. Information Processing & Management. 2019, No. 2.
3. DCWord: A Novel Deep Learning Approach to Deceptive Review Identification by Word Vectors [J]. Wen Zhang, Qiang Wang, Xiangjun Li. Journal of Systems Science and Systems Engineering (English edition). 2019, No. 6.
4. A Novel Schema-Oriented Approach for Chinese New Word Identification [C]. Zhao Lu, Zhixian Yan, Junzhong Gu. Pacific Asia Conference on Language, Information and Computation. 2013.
5. Word identification and eye movements in reading Chinese: A modeling approach [D]. Tsai, Chih-Hao. 2001.
6. Does "a picture is worth 1000 words" apply to iconic Chinese words? Relationship of Chinese words and pictures [O]. Shih-Yu Lo, Su-Ling Yeh.
7. A Novel Schema-Oriented Approach for Chinese New Word Identification [O]. Lu Zhao, Yan Zhixian, Gu Junzhong. 2013.
