Conference: Brazilian Symposium in Information and Human Language Technology

Subtopic Annotation in a Corpus of News Texts: Steps Towards Automatic Subtopic Segmentation

Abstract

Subtopic segmentation aims at finding the boundaries among text passages that represent different subtopics, which usually develop a main topic in a text. Being capable of automatically detecting subtopics is very useful for several Natural Language Processing applications. This paper describes subtopic annotation in a corpus of news texts written in Brazilian Portuguese. In particular, we focus on answering the main scientific questions regarding corpus annotation, aiming at both discussing and dealing with important annotation decisions and making available a reference corpus for research on subtopic structuring and segmentation.