Journal: International Journal of Advanced Intelligence Paradigms

Statistical pair pruning towards target class in learning-based anaphora resolution for Tamil

Abstract

Anaphora resolution is an important task in many natural language understanding (NLU) applications, including machine translation. This paper proposes a learning-based system, built around various classification algorithms, to resolve pronouns in Tamil text. To improve learning accuracy, the system is built in two stages. The first is feature vector production, in which mentions are identified and characterised, and feature vectors of lexical, syntactic and semantic features are produced. The second is the pair pruning module, in which the number of non-target-class pairs is reduced by deep statistical analysis of the feature vectors. Incorporating the deeper pair pruning module dramatically increases the F-measure compared with training the same models without the pruning module. We trained the system with various classification algorithms on the TDIL tourism dataset and obtained encouraging results for Tamil, a challenging language. We discuss how F-measure, precision and recall differ between comparable models trained with and without the pruning module.
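The abstract does not say which statistical analysis the pruning module uses, so the following is only a minimal sketch of the general idea under stated assumptions: each candidate (pronoun, antecedent) pair carries a tuple of categorical features, coreferent pairs form the target class, and a non-target pair is dropped when any of its feature values is rare among target pairs. The `Pair` class, the `prune_pairs` function and the `min_support` threshold are hypothetical names introduced for illustration and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    """Hypothetical mention pair: a pronoun with one candidate antecedent."""
    features: tuple        # e.g. (gender_agreement, number_agreement, sentence_distance)
    is_coreferent: bool    # True marks the target class


def prune_pairs(pairs, min_support=0.05):
    """Keep all target pairs; drop non-target pairs that contain a feature
    value seen in fewer than `min_support` of the target pairs.

    Assumption: this frequency test stands in for the paper's unspecified
    statistical analysis.
    """
    positives = [p for p in pairs if p.is_coreferent]
    if not positives:
        return list(pairs)  # nothing to compare against, so prune nothing

    # Relative frequency of every value of every feature within the target class.
    n_features = len(positives[0].features)
    freq = []
    for i in range(n_features):
        counts = Counter(p.features[i] for p in positives)
        freq.append({value: c / len(positives) for value, c in counts.items()})

    kept = []
    for p in pairs:
        if p.is_coreferent:
            kept.append(p)
        elif all(freq[i].get(v, 0.0) >= min_support
                 for i, v in enumerate(p.features)):
            # Non-target pair that still resembles target pairs: keep it so the
            # classifier sees the hard negatives.
            kept.append(p)
    return kept
```

Pruning of this kind reduces the imbalance between non-target and target pairs before training, which is one plausible reason the F-measure would change between the pruned and unpruned settings.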
