
Modular resource development and diagnostic evaluation framework for fast NLP system improvement


Abstract

Natural Language Processing systems are large-scale software systems whose development involves many man-years of work, in terms of both coding and resource development. Given a dictionary of 110k lemmas, a few hundred syntactic analysis rules, 20k n-gram matrices and other resources, what will be the impact on a syntactic analyzer of adding a new possible category to a given verb? What will be the consequences of adding a new syntactic rule? Any modification may cause, besides the expected effect, unforeseeable side effects, and the complexity of the system makes it difficult to guess the overall impact of even small changes. We present here a framework designed to improve the accuracy of our linguistic analyzer LIMA through iterative refinement of its linguistic resources. These improvements are continuously assessed by evaluating the analyzer's performance against a reference corpus. Our first results show that this framework is helpful towards this goal.
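
The idea of assessing each resource change against a reference corpus can be illustrated with a small sketch. This is not LIMA's API or the authors' framework: the function `diagnose`, the `Diagnostic` record, and the toy part-of-speech labels are all hypothetical, and the sketch assumes the analyzer output can be reduced to one label per token. It compares a run before a change and a run after it, and reports both the overall accuracy and the individual tokens that were fixed or broken.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    accuracy_before: float
    accuracy_after: float
    fixed: list   # token indices wrong before the change, right after it
    broken: list  # token indices right before the change, wrong after it


def diagnose(reference, before, after):
    """Compare two analyzer runs, token by token, against reference labels."""
    if not (len(reference) == len(before) == len(after)):
        raise ValueError("all three label sequences must have the same length")
    if not reference:
        raise ValueError("reference corpus is empty")
    fixed, broken = [], []
    correct_before = correct_after = 0
    for i, (gold, old, new) in enumerate(zip(reference, before, after)):
        ok_old, ok_new = old == gold, new == gold
        correct_before += ok_old
        correct_after += ok_new
        if ok_new and not ok_old:
            fixed.append(i)
        elif ok_old and not ok_new:
            broken.append(i)
    n = len(reference)
    return Diagnostic(correct_before / n, correct_after / n, fixed, broken)


# Hypothetical toy data: part-of-speech labels for five tokens.
reference = ["DET", "NOUN", "VERB", "DET", "NOUN"]
before = ["DET", "NOUN", "NOUN", "DET", "NOUN"]  # run with the old resources
after = ["DET", "VERB", "VERB", "DET", "NOUN"]   # run after a resource change

report = diagnose(reference, before, after)
print(report)
```

In this toy case the accuracy is the same before and after the change, yet one token was fixed and another was broken. A single aggregate score would hide that side effect, which is why a per-item comparison is useful when refining resources iteratively.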
