Venue: International Conference on Data Mining

Semi-automatic Metadata Extraction from Scientific Journal Article for Full-text XML Conversion



Abstract

With the continuing growth of academic research, the volume of scientific articles has risen dramatically, to levels that were hard to predict. To facilitate archiving and publication, many scientific journals in Korea are actively adopting Open Access (OA) policies. OA is also more attractive than commercial print publishing, because the full text of articles published in scholarly journals is provided freely to users over the web. Because it is difficult to convert an unstructured format such as a PDF document into structured full text automatically and accurately, most full-text conversion work at scholarly journal publishers has been carried out with human involvement. To address the low reliability of automatic metadata extraction while keeping human interaction to a minimum, we propose a semi-automatic metadata extraction method that combines a rule-based method with a machine learning method. In our experiment, we evaluated its performance on 26 different journals in Open Access Korea Central (OAK Central). We cover only two parts (the front and back elements) as part of an effort to convert articles into full-text XML based on JATS v1.0. Our proposed method reached F1 = 94.1% on the front elements and F1 = 92.5% on the back elements.
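As an illustration of what the "front" and "back" parts mean in a JATS-style document, the following is a minimal sketch, not the authors' pipeline. It assumes the metadata (title, author names, reference strings) has already been extracted by some means, and it only shows how such values could be placed under `<front>` and `<back>`. The function names `build_jats_skeleton` and `f1_score` and the sample values are hypothetical, and the output is a skeleton that is not guaranteed to validate against the JATS v1.0 schema. The helper `f1_score` is the standard harmonic mean of precision and recall, which is the usual definition of the F1 measure reported above.

```python
import xml.etree.ElementTree as ET


def build_jats_skeleton(title, authors, references):
    """Build a minimal JATS-style <article> with only <front> and <back> filled in.

    Hypothetical helper: the inputs are assumed to be already-extracted metadata.
    """
    article = ET.Element("article")

    # Front matter: article title and author names.
    front = ET.SubElement(article, "front")
    article_meta = ET.SubElement(front, "article-meta")
    title_group = ET.SubElement(article_meta, "title-group")
    ET.SubElement(title_group, "article-title").text = title
    contrib_group = ET.SubElement(article_meta, "contrib-group")
    for name in authors:
        contrib = ET.SubElement(contrib_group, "contrib", {"contrib-type": "author"})
        ET.SubElement(contrib, "string-name").text = name

    # Back matter: the reference list, one <ref> per citation string.
    back = ET.SubElement(article, "back")
    ref_list = ET.SubElement(back, "ref-list")
    for i, citation in enumerate(references, start=1):
        ref = ET.SubElement(ref_list, "ref", {"id": f"ref{i}"})
        ET.SubElement(ref, "mixed-citation").text = citation

    return article


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up sample values, used only to show the resulting structure.
    doc = build_jats_skeleton(
        "A Sample Title",
        ["First Author", "Second Author"],
        ["Sample reference one.", "Sample reference two."],
    )
    print(ET.tostring(doc, encoding="unicode"))
```

In a semi-automatic workflow of the kind the abstract describes, a human reviewer would presumably correct the extracted values before a structure like this is serialized; that review step is outside this sketch.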

