A Sentence Segmentation Method for Ancient Chinese Texts Based on NNLM


Abstract

Most ancient Chinese texts have no punctuation or sentence segmentation. Recent research on automatic sentence segmentation of ancient Chinese has usually relied on sequence labelling models and used small data sets. In this paper, we propose a sentence segmentation method for ancient Chinese texts based on neural network language models. Experiments on large-scale corpora indicate that our method is effective and achieves results comparable to those of the traditional CRF model. Applying a sentence length penalty, using larger Simplified Chinese corpora, or dividing corpora by age can further improve the performance of our model.


Bibliographic information

Source
Chinese Lexical Semantics Workshop | 2016 | 772p | 10 pages
Authors
Boli Wang; Xiaodong Shi; Zhixing Tan; Yidong Chen; Weili Wang;
Original format: PDF
Chinese Library Classification: TP18-53
Keywords
Ancient Chinese; Sentence segmentation; Neural network language model;

Date added to database: 2022-08-20 23:09:50

Similar literature


1. Ancient Chinese Sentence Segmentation Based on Bidirectional LSTM+CRF Model [J] . Hongbin Wang, Haibing Wei, Jianyi Guo, Journal of Advanced Computational Intelligence and Intelligent Informatics . 2019, No. 4a138

2. HTML Text Segmentation for Web Page Summarization by a Key Sentence Extraction Method [J] . Wataru Sunayama, Akihiro Iyama, Masahiko Yachida, Systems and Computers in Japan . 2006, No. 7

3. AN ALGORITHM FOR CLASSIFYING EMOTION OF SENTENCES AND A METHOD TO DIVIDE A TEXT INTO SOME SCENES BASED ON THE EMOTION OF SENTENCES [J] . Hirotaka FUKOSHI, Futoshi SUGIMOTO, Masahide YONEYAMA, 電子情報通信学会技術研究報告 . 2009, No. 373

4. A Sentence Segmentation Method for Ancient Chinese Texts Based on NNLM [C] . Boli Wang, Xiaodong Shi, Zhixing Tan, Chinese lexical semantics workshop . 2016

5. Development of an indigenous Chinese personality inventory based on the principle of yin-yang and the five elements and on the ancient Chinese text "Jen Wu Chih". [D] . Hsu, Chung-Jen. 2006

6. The Origin and Dispersal of the Domesticated Chinese Oak Silkworm Antheraea pernyi in China: A Reconstruction Based on Ancient Texts [O] . Yanqun Liu, Yuping Li, Xisheng Li, 2010

7. A Character-net Based Chinese Text Segmentation Method [O] . Lixin Zhou, Qun Liu, 2008
