Journal: IEEE Transactions on Knowledge and Data Engineering

Finding Related Forum Posts through Content Similarity over Intention-Based Segmentation

Abstract

We study the problem of finding forum posts related to a post at hand. Traditional approaches to finding related documents compare the content of the posts as a whole. In contrast, we consider each post as a set of segments, each written with a different goal in mind. We advocate that the relatedness between two posts should be based on the similarity of their respective segments that are intended for the same goal, i.e., that convey the same intention. This means that the same terms may weigh differently in the relatedness score depending on the intention of the segment in which they are found. We have developed a segmentation method that monitors a number of text features and identifies the points in a post where significant jumps occur, each indicating where a segment border should be placed. The generated segments of all the posts are clustered to form intention clusters, and similarities across posts are then calculated through similarities across segments with the same intention. We experimentally illustrate the effectiveness and efficiency of our segmentation method and of our overall approach to finding related forum posts.
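
The pipeline the abstract outlines (segment on feature jumps, group segments by intention, compare only same-intention segments) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two features in `toy_features`, the `jump` threshold, the bag-of-words cosine similarity, and the averaging in `relatedness` are all assumptions chosen for illustration, and the clustering step is skipped by assuming each segment already carries an intention label.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for a piece of text (lowercased)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def toy_features(sentence):
    """Hypothetical text features: (is a question, uses first person)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return (
        1.0 if sentence.rstrip().endswith("?") else 0.0,
        1.0 if any(w in {"i", "my", "we"} for w in words) else 0.0,
    )


def segment(sentences, features=toy_features, jump=0.5):
    """Place a border between adjacent sentences whose features jump.

    A border is assumed wherever any single feature changes by more
    than `jump`; the real method may combine features differently.
    """
    if not sentences:
        return []
    groups = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        delta = max(abs(x - y) for x, y in zip(features(prev), features(cur)))
        if delta > jump:
            groups.append([cur])      # significant jump: start a new segment
        else:
            groups[-1].append(cur)    # same segment continues
    return [" ".join(g) for g in groups]


def relatedness(post_a, post_b):
    """Average cosine similarity over the intentions both posts share.

    Each post is a dict mapping an intention label (assumed to come from
    a prior clustering step) to the text of its segments for that label.
    """
    shared = post_a.keys() & post_b.keys()
    if not shared:
        return 0.0
    total = sum(cosine(term_vector(post_a[i]), term_vector(post_b[i])) for i in shared)
    return total / len(shared)


if __name__ == "__main__":
    post = [
        "I installed the new driver yesterday.",
        "My laptop has an older graphics card.",
        "Why does the screen flicker?",
        "How can this be fixed?",
    ]
    print(segment(post))
```

Under these assumptions, two posts that contain the same terms but in segments with different intentions score lower than a whole-post comparison would suggest, which is the effect the abstract describes when it says the same terms can weigh differently depending on the segment's intention.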
