International Conference on Data and Software Engineering
Rule based approach for text segmentation on Indonesian news article using named entity distribution

Abstract

Finding a good paragraph structure, or text segmentation, is important in computational linguistics research, mainly in areas such as information retrieval, question answering, and summarization. We propose text segmentation by detecting subtopic shifts based on lexical and named entity distribution. The main contributions of this research are the use of named entities (with reference resolution) and a voting method for measuring text segment similarity, and redefined rules for text boundary identification in Indonesian news articles. The experiments were run on 52 articles from Indonesian online news. The proposed method achieved 76.8%-79.55% accuracy, compared with 60.37%-60.83% for the baseline from other research.
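
The abstract does not give the actual similarity measures, voting scheme, or boundary rules, so the following is only a minimal sketch of the general idea under stated assumptions: adjacent text units are compared on two signals (words and named entities), each signal casts a vote, and a boundary is placed where both signals say the units are dissimilar. The function names, the cosine measure, the thresholds, and the toy data are all hypothetical and are not taken from the paper. Named entities are assumed to be already extracted and reference-resolved.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bags of items (0.0 if either is empty)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[k] * cb[k] for k in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def is_boundary(left, right, word_threshold=0.2, entity_threshold=0.2):
    """Vote on whether a subtopic boundary lies between two adjacent units.

    Assumption: each unit is a dict with 'words' and 'entities' lists, and a
    boundary is declared only when every signal votes "dissimilar".
    """
    votes_dissimilar = 0
    if cosine(left["words"], right["words"]) < word_threshold:
        votes_dissimilar += 1
    if cosine(left["entities"], right["entities"]) < entity_threshold:
        votes_dissimilar += 1
    return votes_dissimilar == 2


def segment(units, **thresholds):
    """Return every index i where a boundary falls between units[i] and units[i + 1]."""
    return [
        i
        for i in range(len(units) - 1)
        if is_boundary(units[i], units[i + 1], **thresholds)
    ]


# Hypothetical toy input: three units with placeholder entity labels.
units = [
    {"words": ["presiden", "kunjungi", "pasar"], "entities": ["PERSON_A", "CITY_A"]},
    {"words": ["presiden", "bicara", "harga"], "entities": ["PERSON_A"]},
    {"words": ["tim", "menang", "laga"], "entities": ["TEAM_B", "CITY_B"]},
]
boundaries = segment(units)
print(boundaries)
```

In this toy input the first two units share a word and an entity, so no boundary is placed between them, while the third unit shares nothing with the second and is split off. A real system would need tuned thresholds and the paper's own boundary rules, which are not stated in the abstract.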
