Published in: International Conference on Intelligent Information Hiding and Multimedia Signal Processing
A Multilingual Patent Text-Mining Approach for Computing Relatedness Evaluation of Patent Documents

Abstract

This paper describes our work on a language-independent technique for discovering implicit knowledge about patents from multilingual patent information sources. Traditional techniques for multi- and cross-language patent retrieval are mostly based on translation. A major problem with these techniques is that it is difficult to find related patents from other countries within a stand-alone patent information system. We present a novel system platform that supports locating similar and relevant multilingual patent documents. The platform builds a multilingual vector space based on the latent semantic indexing (LSI) model and trains the model on collected professional Chinese-English parallel corpora. Multilingual patent documents can then be mapped into this semantic vector space, where their similarity is evaluated by means of text clustering techniques. Preliminary results show that the platform has potential for retrieval and relatedness evaluation of multilingual patent documents.
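
The abstract gives no implementation details, so the following is only a minimal sketch of the general cross-language LSI idea it names, not the authors' system. It assumes a tiny hypothetical parallel corpus in which each training document concatenates English tokens with the tokens of its Chinese translation, uses raw term counts, and compares documents by cosine similarity in place of the text clustering step mentioned in the abstract. All function names and the toy data are illustrative.

```python
import numpy as np

# Hypothetical toy parallel corpus (an assumption, not data from the paper):
# each training document holds English tokens plus the already-segmented
# tokens of its Chinese translation.
PARALLEL_DOCS = [
    ["battery", "electrode", "电池", "电极"],
    ["battery", "charging", "电池", "充电"],
    ["engine", "piston", "发动机", "活塞"],
    ["engine", "fuel", "发动机", "燃料"],
]


def build_vocab(docs):
    """Map every distinct term (either language) to a row index."""
    terms = sorted({t for d in docs for t in d})
    return {t: i for i, t in enumerate(terms)}


def bag_of_words(tokens, vocab):
    """Raw term-count vector; out-of-vocabulary tokens are ignored."""
    vec = np.zeros(len(vocab))
    for t in tokens:
        if t in vocab:
            vec[vocab[t]] += 1.0
    return vec


def train_lsi(docs, k):
    """Build a terms-by-documents matrix and keep the top-k SVD factors."""
    vocab = build_vocab(docs)
    A = np.column_stack([bag_of_words(d, vocab) for d in docs])
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k], s[:k]


def fold_in(tokens, vocab, U_k, s_k):
    """Project a monolingual document into the shared k-dimensional space."""
    return bag_of_words(tokens, vocab) @ U_k / s_k


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


if __name__ == "__main__":
    vocab, U_k, s_k = train_lsi(PARALLEL_DOCS, k=2)
    english_query = fold_in(["battery", "electrode"], vocab, U_k, s_k)
    chinese_battery = fold_in(["电池", "充电"], vocab, U_k, s_k)
    chinese_engine = fold_in(["发动机", "活塞"], vocab, U_k, s_k)
    print("battery-related:", cosine(english_query, chinese_battery))
    print("engine-related: ", cosine(english_query, chinese_engine))
```

Because English and Chinese terms co-occur in the training documents, an English-only query and a Chinese-only document on the same topic land close together in the reduced space without any translation step. A realistic setup would add term weighting (for example TF-IDF), a much larger corpus, and a clustering stage over the projected vectors.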
