Journal: Procedia Computer Science

Training and Evaluating a Statistical Part of Speech Tagger for Natural Language Applications using Kepler Workflows


Abstract

A core technology of natural language processing (NLP) incorporated into many text processing applications is the part-of-speech (POS) tagger, a software component that labels words in text with syntactic tags such as noun, verb, and adjective. These tags may then be used within more complex tasks such as parsing, question answering, and machine translation (MT). In this paper we describe the phases of our work training and evaluating statistical POS taggers on Arabic texts and their English translations using Kepler workflows. While the original objectives for encapsulating our research code within Kepler workflows were driven by software engineering needs to document and verify the reusability of our software, our research benefitted as well: the ease of rapid retraining and testing enabled our researchers to detect reporting discrepancies, document their source, and independently validate the correct results.
