Conference: IEEE International Conference on Web Services

A Novel Part of Speech Tagging Framework for NLP Based Business Process Management


Abstract

Natural Language Processing (NLP) is a key technique for automating Business Process Management (BPM) at different levels. The performance of existing NLP-based BPM methods suffers from the limited accuracy of Part-of-Speech (POS) tagging, which is a key step in NLP pipelines. POS tagging performance depends heavily on the domain of the annotated training data. However, most state-of-the-art POS taggers are trained on corpora from the newswire domain, whose syntactic features usually differ from those of business process descriptions (BPDs). Sentences in the BPD domain typically start with an imperative verb and contain numerous out-of-vocabulary (OOV) words. In this paper, we propose a novel POS tagging framework to tackle these problems. The main idea is that the syntactic pattern of starting with an imperative verb can be learned by increasing the proportion of correctly POS-annotated imperative sentences in the training data. The resulting POS tagger can reduce the overall POS tagging error by nearly 12% compared with a newswire-trained POS tagger. For verbs, which are key words in BPDs, tagging precision can be increased by 27%. The lexical ambiguity caused by OOV words is resolved by extracting local contextual knowledge from the images that are attached to help users understand the process better. Experimental results show that the overall POS tagging accuracy can be increased by nearly 10% with contextual OOV knowledge.
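The idea of raising the share of imperative sentences in the training data can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `is_imperative` and `boost_imperatives`, the heuristic that a sentence counts as imperative when its first token carries the Penn Treebank tag `VB`, and the use of simple oversampling are all assumptions made for illustration. The abstract does not say how the proportion is increased.

```python
import random

# A tagged sentence is a list of (token, tag) pairs using Penn Treebank tags.

def is_imperative(sentence):
    """Assumed heuristic: imperative if the first token is tagged VB (base-form verb)."""
    return bool(sentence) and sentence[0][1] == "VB"

def boost_imperatives(corpus, target_ratio, seed=0):
    """Oversample imperative sentences until they form at least target_ratio of the corpus.

    Hypothetical helper: duplicates randomly chosen imperative sentences;
    the original corpus is left unmodified.
    """
    if not 0 <= target_ratio < 1:
        raise ValueError("target_ratio must be in [0, 1)")
    imperatives = [s for s in corpus if is_imperative(s)]
    boosted = list(corpus)
    if not imperatives:
        return boosted  # nothing to oversample from
    rng = random.Random(seed)
    n_imperative = len(imperatives)
    while n_imperative / len(boosted) < target_ratio:
        boosted.append(rng.choice(imperatives))
        n_imperative += 1
    rng.shuffle(boosted)
    return boosted

# Toy corpus: one imperative BPD-style sentence, three newswire-style sentences.
corpus = [
    [("Click", "VB"), ("the", "DT"), ("button", "NN")],
    [("Stocks", "NNS"), ("fell", "VBD")],
    [("The", "DT"), ("bank", "NN"), ("closed", "VBD")],
    [("Prices", "NNS"), ("rose", "VBD")],
]
boosted = boost_imperatives(corpus, target_ratio=0.5)
```

The boosted corpus could then be passed to any trainable POS tagger. Whether the gains reported in the abstract come from this kind of resampling or from adding newly annotated imperative sentences is not stated in the source.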
