
A method for improving the accuracy of automatic indexing of Chinese-English mixed documents


Abstract

Purpose: This paper presents a method for improving the accuracy of automatic indexing of Chinese-English mixed documents.

Design/methodology/approach: Based on the inherent characteristics of Chinese-English mixed texts and on cybernetics theory, we propose an integrated control method for indexing documents. It consists of "feed-forward control", "in-progress control" and "feed-back control", and aims to improve the accuracy of automatic indexing of Chinese-English mixed documents. An experiment was conducted to investigate the effect of the proposed method.

Findings: The method distinguishes between Chinese and English documents in terms of grammatical structures and word formation rules. Applying it in the three phases of automatic indexing of Chinese-English mixed documents gave encouraging results: precision increased from 88.54% to 97.10% and recall improved from 97.37% to 99.47%.

Research limitations: The indexing method is relatively complicated, and the whole indexing process requires substantial human intervention. Because pattern matching is based on a brute-force (BF) approach, indexing efficiency is reduced to some extent.

Practical implications: The research has both theoretical significance and practical value for improving the accuracy of automatic indexing of multilingual documents (not only Chinese-English mixed documents). The proposed method will benefit the indexing of life science documents as well as documents in other subject areas.

Originality/value: So far, few studies have been published on methods for increasing the accuracy of multilingual automatic indexing. This study provides insights into the automatic indexing of multilingual documents, especially Chinese-English mixed documents.
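The abstract attributes the efficiency loss to brute-force (BF) pattern matching but does not describe the matching procedure itself. As a rough illustration only, the sketch below shows what a naive BF scan of index terms over a mixed Chinese-English string could look like. The function names `bf_find_all` and `index_terms`, the sample text, and the vocabulary are all assumptions made for this example, not the authors' implementation.

```python
def bf_find_all(text, pattern):
    """Return every start offset where pattern occurs in text.

    Naive (brute-force) scan: try each start position and compare
    character by character, which costs O(len(text) * len(pattern))
    in the worst case.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    hits = []
    for i in range(n - m + 1):
        j = 0
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:  # all m characters matched at offset i
            hits.append(i)
    return hits


def index_terms(text, vocabulary):
    """Map each vocabulary term found in text to its match offsets (hypothetical helper)."""
    index = {}
    for term in vocabulary:
        hits = bf_find_all(text, term)
        if hits:
            index[term] = hits
    return index


# Hypothetical mixed-language sample; offsets are counted in Unicode code points.
sample = "基因表达与DNA甲基化 DNA methylation"
print(index_terms(sample, ["DNA", "甲基化", "RNA"]))
# → {'DNA': [5, 12], '甲基化': [8]}
```

Every term is scanned against the whole text from scratch, so the cost grows with both vocabulary size and document length, which is consistent with the efficiency limitation the abstract reports.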

Bibliographic record

  • Source
    《中国文献情报(英文刊)》 | 2012, Issue 4 | pp. 77-92 | 16 pages
  • Authors

    Yan ZHAO; Hui SHI;

  • Author affiliations

    College of International Business, Shanghai International Studies University, Shanghai 200083, China;

    Center for E-government Internationalization Research, Shanghai International Studies University, Shanghai 200083, China;

    College of English Language and Literature, Shanghai International Studies University, Shanghai 200083, China;

    Department of Foreign Languages, Taiyuan Normal University, Shanxi 030012, China;

  • Original format: PDF
  • Text language: English
  • Date added: 2022-08-18 04:36:00