Journal: Computational Linguistics

Stable Classification of Text Genres

Abstract

Every text has at least one topic and at least one genre. Evidence for a text's topic and genre comes, in part, from its lexical and syntactic features, which are used in both Automatic Topic Classification and Automatic Genre Classification (AGC). Because an ideal AGC system should be stable in the face of changes in topic distribution, we assess five previously published AGC methods with respect to both performance on the same topic–genre distribution on which they were trained and stability of that performance across changes in topic–genre distribution. Our experiments lead us to conclude that (1) stability in the face of changing topical distributions should be added to the evaluation criteria for new approaches to AGC, and (2) Part-of-Speech features should be considered individually when developing a high-performing, stable AGC system for a particular, possibly changing corpus.
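
The two evaluation axes described above (performance on the training topic–genre distribution, and stability when that distribution changes) can be illustrated with a small sketch. This is not one of the five methods assessed in the article. The token-voting classifier, the toy documents, and the names `train_token_votes`, `accuracy`, and `stability_gap` are all assumptions made up for illustration. The example shows how a classifier that leans on topical words can score perfectly on the training distribution and fail once topics and genres are paired differently.

```python
from collections import Counter, defaultdict

# Hypothetical toy documents: (tokens, topic, genre). Not data from the article.
train = [
    (["goal", "match"], "sports", "news"),
    (["goal", "team"], "sports", "news"),
    (["vote", "senate"], "politics", "editorial"),
    (["vote", "law"], "politics", "editorial"),
]
# Same topic-genre pairing as in training.
same_dist = [
    (["goal", "team"], "sports", "news"),
    (["vote", "law"], "politics", "editorial"),
]
# Shifted pairing: sports editorials and political news.
shifted_dist = [
    (["goal", "match"], "sports", "editorial"),
    (["vote", "senate"], "politics", "news"),
]

def train_token_votes(docs):
    """Map each token to the genre it most often co-occurs with."""
    counts = defaultdict(Counter)
    for tokens, _topic, genre in docs:
        for tok in tokens:
            counts[tok][genre] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}

def predict(model, tokens, default="news"):
    """Predict the genre that most known tokens vote for."""
    votes = Counter(model[t] for t in tokens if t in model)
    return votes.most_common(1)[0][0] if votes else default

def accuracy(model, docs):
    """Fraction of documents whose genre is predicted correctly."""
    correct = sum(predict(model, toks) == genre for toks, _topic, genre in docs)
    return correct / len(docs)

def stability_gap(model, in_dist, shifted):
    """In-distribution accuracy minus accuracy under a shifted distribution."""
    return accuracy(model, in_dist) - accuracy(model, shifted)

model = train_token_votes(train)
in_acc = accuracy(model, same_dist)
shift_acc = accuracy(model, shifted_dist)
gap = stability_gap(model, same_dist, shifted_dist)
print(f"in-distribution: {in_acc:.2f}, shifted: {shift_acc:.2f}, gap: {gap:.2f}")
```

Under these assumptions, a smaller gap indicates a more stable classifier. A single accuracy figure on the training distribution would hide the difference between a stable method and one that has learned topic as a proxy for genre.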