Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining

Unsupervised Feature Weighting Based on Local Feature Relatedness

Abstract

Feature weighting plays an important role in text clustering. Traditional feature weighting is determined by the syntactic relationship between feature and document (e.g. TF-IDF). In this paper, a semantically enriched feature weighting approach is proposed by introducing the semantic relationship between feature and document. It is implemented by taking into account the local feature relatedness, that is, the relatedness between a feature and its contextual features within each individual document. Feature relatedness is measured by two methods: a document collection-based implicit relatedness measure and a Wikipedia link-based explicit relatedness measure. Experimental results on benchmark data sets show that the new feature weighting approach surpasses traditional syntactic feature weighting. Moreover, clustering quality can be further improved by linearly combining the syntactic and semantic factors. The new feature weighting approach is also compared with two existing feature relatedness-based approaches, which consider the global feature relatedness (feature relatedness in the entire feature space) and the inter-document feature relatedness (feature relatedness between different documents), respectively. In the experiments, the new feature weighting approach outperforms both of these related approaches in clustering quality and incurs much lower computational cost.
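
The abstract states the idea but gives no formulas, so the following Python sketch is only one possible reading of it, not the authors' method. It assumes that the syntactic factor is plain TF-IDF, that the implicit relatedness of two terms is the cosine similarity of their document-occurrence sets, that a term's local semantic weight is its mean relatedness to the other distinct terms of the same document, and that the two factors are mixed with a hypothetical parameter `lam`. All function names are illustrative.

```python
import math
from collections import Counter


def tfidf(docs):
    """Syntactic factor (assumed): raw term count times log(N / document frequency)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return [{t: c * math.log(n / df[t]) for t, c in Counter(d).items()} for d in docs]


def cooccurrence_relatedness(docs):
    """Implicit relatedness (assumed): cosine similarity of the sets of
    documents in which two terms occur."""
    postings = {}
    for i, d in enumerate(docs):
        for t in set(d):
            postings.setdefault(t, set()).add(i)

    def rel(a, b):
        pa, pb = postings.get(a, set()), postings.get(b, set())
        if not pa or not pb:
            return 0.0
        return len(pa & pb) / math.sqrt(len(pa) * len(pb))

    return rel


def local_semantic_weights(doc, rel):
    """Semantic factor (assumed): mean relatedness of each term to the
    other distinct terms of the same document (its local context)."""
    terms = sorted(set(doc))
    weights = {}
    for t in terms:
        others = [u for u in terms if u != t]
        weights[t] = sum(rel(t, u) for u in others) / len(others) if others else 0.0
    return weights


def combined_weights(docs, lam=0.5):
    """Linear combination: lam * syntactic + (1 - lam) * semantic.
    The syntactic part is divided by its per-document maximum so that
    both factors lie in [0, 1]; this scaling is also an assumption."""
    rel = cooccurrence_relatedness(docs)
    result = []
    for doc, syn in zip(docs, tfidf(docs)):
        sem = local_semantic_weights(doc, rel)
        top = max(syn.values(), default=0.0) or 1.0
        result.append({t: lam * syn[t] / top + (1 - lam) * sem[t] for t in syn})
    return result


docs = [
    ["apple", "banana"],
    ["apple", "banana", "cherry"],
    ["cherry", "date"],
]
weights = combined_weights(docs, lam=0.5)
```

Because each term is compared only with the terms of its own document, the cost per document grows with the square of that document's vocabulary size rather than with the size of the whole feature space. A Wikipedia link-based measure could be substituted by passing a different `rel` function to `local_semantic_weights`.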
