Conference: Chinese Lexical Semantics Workshop

A Chinese Sentence Segmentation Approach Based on Comma


Abstract

Chinese sentence segmentation is a fundamental step in natural language processing. Reliable sentence boundary detection is a prerequisite for subsequent NLP tasks such as parsing and machine translation. In this paper, we treat the comma as a potential sentence-boundary marker and divide commas into two major types: true boundaries (EOS) and pseudo boundaries (Non-EOS). We then present and implement a system framework for Chinese sentence segmentation based on two-layer classifiers. Experimental results on Chinese Treebank 6.0 show that our model achieves an overall F-measure of 90.7%, an improvement of 1.5%.
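The abstract gives no detail on the features or classifiers used, so the following is only a minimal sketch of what a two-layer comma classifier could look like, not the authors' implementation. The function names (`layer_one`, `layer_two`, `classify_comma`, `segment`), the clause-length threshold, and the pronoun list are all hypothetical; in particular, `layer_two` is a toy rule standing in for a trained classifier.

```python
import re
from typing import Callable, List, Optional, Sequence

# A layer sees the clauses to the left and right of a comma and returns
# True (EOS), False (Non-EOS), or None (undecided, defer to the next layer).
Layer = Callable[[str, str], Optional[bool]]

# Matches the ASCII comma and the full-width Chinese comma (U+FF0C).
COMMA_PATTERN = re.compile("([,\uFF0C])")


def layer_one(left: str, right: str) -> Optional[bool]:
    """Hypothetical first layer: cheap rules that settle only the clear cases."""
    if len(left) < 4 or len(right) < 4:
        return False  # assumption: a very short clause rarely stands alone
    return None  # undecided, pass the comma on to the second layer


def layer_two(left: str, right: str) -> Optional[bool]:
    """Hypothetical second layer: a toy rule standing in for a trained model."""
    subject_starts = ("我", "他", "她", "它", "这", "那")
    return right.startswith(subject_starts)


def classify_comma(left: str, right: str, layers: Sequence[Layer]) -> bool:
    """Run the layers in order and return the first definite decision."""
    for layer in layers:
        decision = layer(left, right)
        if decision is not None:
            return decision
    return False  # default to Non-EOS when no layer decides


def segment(text: str, layers: Sequence[Layer] = (layer_one, layer_two)) -> List[str]:
    """Split text at commas classified as EOS; other commas stay in place."""
    parts = COMMA_PATTERN.split(text)
    clauses, commas = parts[0::2], parts[1::2]
    sentences: List[str] = []
    current = clauses[0]
    for prev, comma, nxt in zip(clauses, commas, clauses[1:]):
        if classify_comma(prev, nxt, layers):
            sentences.append(current + comma)  # true boundary: close the segment
            current = nxt
        else:
            current += comma + nxt  # pseudo boundary: keep accumulating
    if current:
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    sample = "今天天气很好\uFF0C我们去公园散步\uFF0C但是\uFF0C他没有来。"
    for sentence in segment(sample):
        print(sentence)
```

The cascade lets the first layer settle easy commas cheaply and leaves only the undecided ones to the second layer. Any function matching the `Layer` signature can be substituted for either layer.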
