Conference: International conference on computer science and it applications

IPC Multi-label Classification Applying the Characteristics of Patent Documents

Abstract

Most research on automatic IPC (International Patent Classification) systems has focused on applying existing machine learning methods to patent documents, without considering the characteristics of the data or the structure of the documents. This paper therefore proposes using two structural fields, the technical field and the background field, which are selected based on the characteristics of patent documents and the role each structural field plays. A multi-label classification model is also constructed to reflect the fact that a patent document can have multiple IPCs, and to classify patent documents at the IPC subclass level, which comprises 630 categories. The effects of the structural fields are examined using 564,793 patents registered in Korea. A precision of 87.2% is obtained when mainly these two fields are used. This result verifies that the technical field and the background field play an important role in improving the precision of IPC multi-label classification at the IPC subclass level.
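The idea in the abstract, restricting the classifier input to two structural fields and allowing several IPC subclasses per patent, can be illustrated with a minimal sketch. This is not the paper's model: the dictionary keys (`technical_field`, `background`, `ipcs`), the token-overlap scoring rule, the 0.5 threshold, the toy patents, and the choice of micro-averaged precision are all assumptions made for illustration, since the abstract does not specify the classifier or how precision was computed.

```python
from collections import Counter, defaultdict


def field_text(patent):
    """Join only the two structural fields (hypothetical key names)."""
    return f"{patent['technical_field']} {patent['background']}"


def tokenize(text):
    """Naive whitespace tokenizer; a real system would need proper text processing."""
    return text.lower().split()


def train(patents):
    """Build one term-count profile per IPC subclass (one profile per label)."""
    profiles = defaultdict(Counter)
    for patent in patents:
        tokens = tokenize(field_text(patent))
        for label in patent["ipcs"]:  # a patent may carry several IPCs
            profiles[label].update(tokens)
    return profiles


def predict(profiles, patent, threshold=0.5):
    """Return every label whose profile covers at least `threshold` of the
    patent's distinct tokens. Labels are decided independently, so zero,
    one, or many labels can be returned (assumed scoring rule)."""
    tokens = set(tokenize(field_text(patent)))
    if not tokens:
        return set()
    predicted = set()
    for label, profile in profiles.items():
        overlap = sum(1 for t in tokens if t in profile) / len(tokens)
        if overlap >= threshold:
            predicted.add(label)
    return predicted


def micro_precision(predicted_sets, true_sets):
    """Correctly predicted labels divided by all predicted labels (assumed metric)."""
    correct = sum(len(p & t) for p, t in zip(predicted_sets, true_sets))
    total = sum(len(p) for p in predicted_sets)
    return correct / total if total else 0.0


# Toy data, invented for illustration only.
training_set = [
    {"technical_field": "battery electrode material",
     "background": "lithium cell capacity",
     "ipcs": {"H01M"}},
    {"technical_field": "electric vehicle charging",
     "background": "battery charging station",
     "ipcs": {"B60L", "H02J"}},
]
query = {"technical_field": "battery charging", "background": "vehicle station"}

profiles = train(training_set)
predicted = predict(profiles, query)
```

Because each label is scored independently, the query patent can receive more than one IPC subclass, which is the property the multi-label formulation is meant to capture. Scaling this to 630 subclasses and hundreds of thousands of patents would call for a trained classifier per label rather than the overlap heuristic used here.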