International Conference on Text, Speech and Dialogue

Temporal Feature Space for Text Classification

Abstract

Supervised learning algorithms for text classification usually represent text content by the frequencies of the words it contains, ignoring the words' semantics and the relationships between them. Words within temporal expressions such as "today" or "last February" are particularly affected by this simplification: the same expression can have a different meaning in documents with different timestamps, while different expressions can refer to the same time. After extracting temporal expressions from documents, we model a set of temporal features derived from the times mentioned in each document and show the relation between these features and the category the document belongs to. We test our temporal approach on a subset of the New York Times corpus and show a significant improvement over the text-only baseline.
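The timestamp dependence described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's extraction method or feature model, which this page does not describe; the function name `resolve_expression`, the handful of supported expressions, and the reading of "last <month>" as the most recent such month strictly before the document's month are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical lookup table: month name -> month number (1..12).
MONTHS = {name: i for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}


def resolve_expression(expr, doc_date):
    """Map a relative temporal expression to an inclusive (start, end)
    date interval, using the document timestamp as the reference point.

    Returns None for expressions this toy resolver does not cover.
    """
    text = expr.strip().lower()
    if text == "today":
        return doc_date, doc_date
    if text == "yesterday":
        day = doc_date - timedelta(days=1)
        return day, day
    if text.startswith("last "):
        month = MONTHS.get(text[len("last "):])
        if month is not None:
            # Assumption: "last <month>" means the most recent such month
            # strictly before the month of the document timestamp.
            year = doc_date.year if month < doc_date.month else doc_date.year - 1
            start = date(year, month, 1)
            # Last day of the month = first day of the next month minus one day.
            if month == 12:
                next_first = date(year + 1, 1, 1)
            else:
                next_first = date(year, month + 1, 1)
            return start, next_first - timedelta(days=1)
    return None
```

With this sketch, "today" in a document dated 2005-02-14 and "yesterday" in a document dated 2005-02-15 resolve to the same day, while "last February" resolves to February 2005 or February 2004 depending on whether the document is dated June 2005 or January 2005. A bag-of-words representation would treat these tokens identically regardless of the timestamp.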
