Method and apparatus for finding related collections of linked documents using co-citation analysis

Abstract

A method and apparatus for identifying related collections of linked documents. In the method, the links from a set of related documents are analyzed to identify a plurality of document collections. Because only the link structure is analyzed, a processing-intensive content analysis of the documents is avoided. A citation analysis technique, such as co-citation analysis, is performed on the set of documents to extract link information indicating the links, and the link frequency, between document collections. For co-citation analysis, that information includes how often two document collections are both linked to by another document collection. Using the link information, related document collections may then be identified with a suitable analysis technique, such as clustering or spreading activation.
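
The co-citation step described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the function names `cocitation_counts` and `related_groups`, the input layout (a dict from a linking collection to the collections it links to), the exclusion of self-links, and the thresholded single-link grouping used as a stand-in for "clustering" are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(links):
    """Count how often each pair of collections is linked to by the same source.

    `links` maps a linking collection to the collections it links to
    (hypothetical input layout, not taken from the patent).
    """
    counts = Counter()
    for source, targets in links.items():
        # Assumption: duplicate links and self-links are ignored.
        distinct = sorted(set(targets) - {source})
        for a, b in combinations(distinct, 2):
            counts[(a, b)] += 1
    return counts


def related_groups(counts, threshold=2):
    """Group collections whose co-citation count reaches `threshold`.

    Uses a simple union-find (single-link grouping) as an illustrative
    stand-in for the clustering step; the threshold value is arbitrary.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), n in counts.items():
        root_a, root_b = find(a), find(b)
        if n >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    example_links = {
        "P1": ["A", "B", "C"],
        "P2": ["A", "B"],
        "P3": ["B", "C"],
        "P4": ["C", "D"],
    }
    counts = cocitation_counts(example_links)
    print(counts[("A", "B")])
    print(related_groups(counts))
```

In the example data, A and B are linked together by P1 and P2, and B and C by P1 and P3, so those three collections fall into one group at a threshold of 2, while D stays on its own. Only link structure is inspected; no document content is read, which is the saving the abstract points to.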

Bibliographic data

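  • Publication number: US6457028B1

  • Publication date: 2002-09-24

  • Original format: PDF

  • Applicant/assignee: XEROX CORPORATION

  • Application number: US19990407787

  • Inventors: PETER L. PIROLLI; JAMES E. PITKOW

  • Filing date: 1999-09-29

  • Classification: G06F172/10

  • Country: US

  • Record added: 2022-08-22 00:47:59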
