Automatic Chinese unknown word extraction using small-corpus-based method

Abstract

Chinese unknown word extraction is an important problem in Chinese language processing, and it poses real difficulties. First, almost any Chinese character can either stand as a word on its own or form part of other words. Second, Chinese text has no spaces between words to mark their boundaries. Several approaches have been proposed, but they have drawbacks. Here, we present a method that extracts Chinese unknown words more efficiently and precisely. It retains its efficiency and accuracy even when the document set available for training is small, and it can also extract unknown words that occur rarely. These advantages make it practical for real applications.

Bibliographic details

Source
2003 | pp. 459-464 | 6 pages
Authors
Tao-Hsing Chang; Chia-Hoang Lee
Original format
PDF
CLC classification
Radio electronics; telecommunications technology
Keywords
natural languages; word processing; character recognition; linguistics; Chinese unknown word extraction; corpus-based method; Chinese language processing

Similar literature

1. Automatic Extraction of New Words Based on Google News Corpora for Supporting Lexicon-based Chinese Word Segmentation Systems [J]. Chin-Ming Hong, Chih-Ming Chen, Chao-Yang Chiu. Expert Systems with Applications. 2009, No. 2p2
2. An Iterative Method for Extracting Chinese Unknown Words [J]. HE Shan, ZHU Jie. 中国电子杂志（英文版）. 2001, No. 004
3. Automatic Microblog-Oriented Unknown Word Recognition with Unsupervised Method [J]. HUANG Degen, ZHANG Jing, HUANG Kaiyu. 电子学报（英文版）. 2018, No. 001
4. Automatic Chinese Unknown Word Extraction Using Small-Corpus-Based Method [C]. Tao-Hsing Chang, Chia-Hoang Lee. International Conference on Natural Language Processing and Knowledge Engineering; 26-29 October 2003; Beijing, CN. 2003
5. Defining and Automatically Identifying Words in Chinese [D]. Xue, Nianwen. 2002
6. The Fractal Patterns of Words in a Text: A Method for Automatic Keyword Extraction [O]. Elham Najafi, Amir H. Darooneh
7. Automatic Chinese Unknown Word Extraction Using Small-Corpus-Based Method [O]. Tao-Hsing Chang, Chia-Hoang Lee. 2003
