Journal: Knowledge and Information Systems

Composition pattern oriented tag extraction from short documents using a structural learning method

Abstract

With the rapid growth of the web, automatic tagging, which detects informative terms in a document, has become an important problem for information aggregation and sharing services. Automatic tagging of short documents is of particular interest because users increasingly publish information through social media services that encourage short documents. In this paper, we propose a novel automatic tagging model for short text documents from social media services, following the supervised learning framework. We redefine traditional frequency-based term features so that they account for the properties of documents created under a length limit, and we model sequential dependencies between successive terms in a document with a structural support vector machine. In addition, our approach incorporates composition patterns, the ways in which users place informative terms in their documents. Extensive experiments show that the proposed term features are effective for extracting tags, and that the tag extractor trained with sequential dependencies and composition patterns outperforms existing alternative methods.
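To make the idea of sequential dependencies concrete, the sketch below shows Viterbi decoding over a linear-chain model, the kind of inference step a sequence-labeling structural SVM typically relies on. It is an illustration only, not the paper's implementation: the abstract gives no feature definitions or weights, so the two-label scheme (`O`, `TAG`), the example terms, and every score are hypothetical values chosen by hand.

```python
# Hypothetical sketch: Viterbi decoding for a linear-chain tag extractor.
# The labels, terms and all scores are invented for illustration; they are
# NOT taken from the paper.

def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions[t][y]   : score for giving label y to the term at position t
    transitions[p][y] : score for label y directly following label p
    """
    n_labels = len(transitions)
    # best[t][y] = best score of any labeling of terms 0..t that ends in y
    best = [list(emissions[0])]
    back = []  # back[t-1][y] = best previous label when position t has label y
    for t in range(1, len(emissions)):
        row, ptr = [], []
        for y in range(n_labels):
            prev = max(range(n_labels),
                       key=lambda p: best[-1][p] + transitions[p][y])
            row.append(best[-1][prev] + transitions[prev][y] + emissions[t][y])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    # Follow the back-pointers from the best final label.
    y = max(range(n_labels), key=lambda k: best[-1][k])
    path = [y]
    for ptr in reversed(back):
        y = ptr[y]
        path.append(y)
    return path[::-1]


O, TAG = 0, 1
terms = ["watching", "world", "cup", "tonight"]

# Per-term scores, standing in for weighted term features (assumed values).
emissions = [
    [1.0, 0.0],  # watching
    [0.3, 0.5],  # world
    [0.6, 0.5],  # cup: on its own, slightly prefers O
    [1.5, 0.0],  # tonight
]
# Scores between successive labels (assumed values).
transitions = [
    [0.0, 0.0],   # after O
    [-0.5, 0.8],  # after TAG: continuing a tag span is rewarded
]

labels = viterbi(emissions, transitions)
tags = [term for term, y in zip(terms, labels) if y == TAG]

# Baseline that ignores neighbours: pick the best label for each term alone.
independent = [max((O, TAG), key=lambda y: e[y]) for e in emissions]
independent_tags = [term for term, y in zip(terms, independent) if y == TAG]
```

With these made-up scores, labeling each term independently selects only "world", while joint decoding also selects "cup" because the transition score rewards continuing a tag span. That is the kind of effect modeling successive terms is meant to capture.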