New Word Detection in Ancient Chinese Literature

Abstract

Mining ancient Chinese corpora is harder than mining modern Chinese text, because no complete dictionary of ancient Chinese words exists, which leads to poor tokenizer performance. Finding new words in ancient Chinese texts is therefore significant. In this paper, the Apriori algorithm is improved and used to produce candidate character sequences, and a long short-term memory (LSTM) neural network is used to identify word boundaries. Furthermore, we design a word confidence feature to measure the confidence score of new words. The experimental results demonstrate that the improved Apriori-like algorithm greatly improves the recall of valid candidate character sequences, and that the average accuracy of our method on new word detection rises to 89.7%.
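The abstract does not spell out how the Apriori algorithm is improved, so the following is only a minimal sketch of the plain Apriori idea applied to character sequences, not the authors' method. It assumes that a candidate of length n is kept only when both of its length n-1 substrings are already frequent. The function name `frequent_sequences` and the parameters `min_support` and `max_len` are hypothetical, chosen for illustration.

```python
from collections import Counter


def frequent_sequences(sentences, min_support=2, max_len=4):
    """Return {sequence: count} for character n-grams seen at least min_support times.

    Plain Apriori-style pruning (an assumption, not the paper's improved variant):
    an n-gram is counted only if its prefix and suffix of length n-1 are frequent.
    """
    frequent = {}
    prev_level = None  # frequent sequences of length n-1, None for n == 1
    for n in range(1, max_len + 1):
        counts = Counter()
        for sentence in sentences:
            for i in range(len(sentence) - n + 1):
                gram = sentence[i:i + n]
                if prev_level is not None and (
                    gram[:-1] not in prev_level or gram[1:] not in prev_level
                ):
                    continue  # pruned: a shorter substring is already infrequent
                counts[gram] += 1
        level = {g: c for g, c in counts.items() if c >= min_support}
        if not level:
            break  # no frequent sequence of this length, so none can be longer
        frequent.update(level)
        prev_level = level
    return frequent


# Toy corpus of three short classical sentences (illustrative data only).
candidates = frequent_sequences(["学而时习之", "学而不思则罔", "思而不学则殆"])
print(sorted(candidates.items(), key=lambda kv: (-len(kv[0]), kv[0])))
```

In a pipeline like the one the abstract describes, sequences produced this way would only be candidates; deciding which of them are real words is left to the boundary model and the confidence score.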

Bibliographic record

Source: Asia Pacific Web and Web-Age Information Management | 2017 | 362p | 16 pages in total
Authors: Tao Xie; Bin Wu; Bai Wang
Original format: PDF
Chinese Library Classification: TP393-53
Keywords: New word detection; Ancient Chinese literature; Apriori-like; Neural network; Word confidence
