
Enhancing MEDLINE document clustering by incorporating MeSH semantic similarity


Abstract

Motivation: MEDLINE documents are usually clustered with the vector space model, which computes the content similarity between two documents essentially as the inner product of their word vectors. Recently, the semantic information in the MeSH (Medical Subject Headings) thesaurus has been applied to clustering MEDLINE documents by mapping each document to a MeSH concept vector, which is then clustered. However, current approaches that use the MeSH thesaurus have two serious limitations: first, important semantic information may be lost when the MeSH concept vectors are generated, and second, the content information of the original text is discarded.
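
As a rough illustration of the vector space model mentioned in the abstract, the following minimal sketch builds term-frequency word vectors for two texts and computes their inner product, along with the length-normalized (cosine) variant. It is not the authors' implementation: the function names, the whitespace tokenization, and the sample sentences are all assumptions made for this example, and real systems typically add term weighting such as TF-IDF.

```python
from collections import Counter
import math


def word_vector(text):
    """Hypothetical helper: bag-of-words term-frequency vector.

    Assumes simple lowercasing and whitespace tokenization.
    """
    return Counter(text.lower().split())


def inner_product(u, v):
    """Sum of products of term counts over the terms shared by u and v."""
    return sum(count * v[term] for term, count in u.items() if term in v)


def cosine_similarity(u, v):
    """Inner product divided by the product of the vector lengths."""
    norm_u = math.sqrt(inner_product(u, u))
    norm_v = math.sqrt(inner_product(v, v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return inner_product(u, v) / (norm_u * norm_v)


# Made-up example texts, not taken from MEDLINE.
doc_a = word_vector("gene expression in cancer cells")
doc_b = word_vector("cancer gene therapy")

print(inner_product(doc_a, doc_b))      # shared terms: "gene" and "cancer"
print(cosine_similarity(doc_a, doc_b))
```

Because this similarity depends only on shared surface words, two documents about the same concept that use different vocabulary score zero, which is the gap that MeSH-based semantic information is meant to address.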

Bibliographic record

  • Source
    Bioinformatics | 2009, No. 15 | pp. 1944-1951 | 8 pages
  • Author affiliations

    1 Shanghai Key Lab of Intelligent Information Processing, Fudan University, 2 School of Computer Science, Fudan University, Shanghai 200433, China, 3 Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong and 4 Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan

  • Indexed in: Science Citation Index (SCI, USA); Chemical Abstracts (CA, USA)
  • Original format: PDF
  • Language of text: English
