Conference: ICT Innovations Conference

Using NLP Methods to Improve the Effectiveness of a Macedonian Question Answering System

Abstract

Retrieving particular information from a large amount of text depends significantly on language-specific features. The most prominent one for the Macedonian language is that a word can take various derivational and inflectional suffixes. In this research we investigate how particular NLP tools influence retrieval, with special emphasis on Part-of-Speech (PoS) tagging, word forms, and stemming. In the absence of a stemming algorithm for the Macedonian language, we used the Dice coefficient and single-link clustering to group words with a common base form. All these features were implemented in an existing Macedonian Question Answering System (QAS). We tested different strategies for weighting terms in the documents (the queries), as well as different approaches to query expansion with word forms and words with the same stem. The experimental results show that word variations strongly influence retrieval, and that taking them into account improves our system's accuracy.
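The grouping step named in the abstract (Dice coefficient plus single-link clustering) can be illustrated with a minimal sketch. The abstract does not say which units the Dice coefficient is computed over or which similarity cutoff is used, so the character bigrams, the `threshold` value, and the function names `bigrams`, `dice`, and `single_link_clusters` below are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations


def bigrams(word):
    """Return the set of adjacent character pairs in a word (assumed unit)."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def dice(a, b):
    """Dice coefficient 2|X & Y| / (|X| + |Y|) over character-bigram sets."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 0.0  # two words too short to have any bigram
    return 2 * len(x & y) / (len(x) + len(y))


def single_link_clusters(words, threshold=0.6):
    """Single-link clustering cut at a similarity threshold (assumed value).

    Two words share a cluster when a chain of pairs, each with Dice
    similarity >= threshold, connects them. Expects distinct words.
    """
    parent = {w: w for w in words}

    def find(w):
        # Follow parent links to the cluster representative.
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    for a, b in combinations(words, 2):
        if dice(a, b) >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for w in words:
        clusters.setdefault(find(w), set()).add(w)
    return list(clusters.values())


# Illustrative word forms chosen for this sketch, not taken from the paper.
forms = ["книга", "книги", "книгата", "град", "градови"]
groups = single_link_clusters(forms, threshold=0.6)
```

With these five forms and a cutoff of 0.6, the forms of "книга" fall into one group and the forms of "град" into another. A lower cutoff merges more aggressively, and because single-link clustering chains through intermediate words, unrelated words can end up together if the cutoff is too permissive.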
