Journal of Theoretical and Applied Information Technology

QUESTION ANSWERING SYSTEM SUPPORTING VECTOR MACHINE METHOD FOR HADITH DOMAIN



Abstract

Retrieving accurate answers to users' queries is the central problem of question answering systems. Analysing what a query asks for and extracting accurate answers from a large corpus both make it difficult to develop an effective question answering system. This work aims to improve the accuracy of a question answering system for hadiths. Pre-processing methods such as tokenization and stop-word removal are used to identify the main concepts of a user's query. Answer-processing methods and techniques such as N-gram, WordNet, CS, and LCS are used to update and enrich the extracted query concepts based on the formal representation of the hadith answers or documents. Support Vector Machine (SVM) and Named Entity Recognition (NER) methods are used to classify hadith documents by relevant subject and question type in order to reduce the search scope for answer documents. Documents in the hadith corpus are classified, according to the proposed question types and related subjects, into four main classes: when for prayer, where for prayer, when for fasting, and where for fasting. The SVM classification is supported by NER methods, which identify the place (where) and time (when) features contained in the documents. The proposed system is tested on 132 hadith documents about fasting and prayer selected from the Al-Bukhari source. The findings show that the average answer accuracy is 67% using the CS technique, 66% using the LCS technique, 70% using the combination of CS and LCS, and 80% using CS, LCS, and SVM. SVM improves the system's accuracy by up to 10% over the other methods, which have no classification step.
The main contribution of this research is the use of the SVM method to reduce the search scope of hadith documents based on subject and question type, alongside effective analysis of the query's need using NLP methods. Using SVM provides more accurate answers than extracting answers with similarity techniques such as CS and LCS alone.
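
The similarity-based part of the pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not expand "CS" or "LCS"; the code below assumes they denote cosine similarity and longest common subsequence, which may not match the authors' definitions. The stop-word list, the equal weighting of the two scores, the function names, and the sample sentences are all hypothetical and are not taken from the paper or from any hadith text.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only. Question words such as
# "when" and "where" are deliberately kept because they signal the question type.
STOP_WORDS = {"the", "a", "an", "of", "to", "is", "in", "and"}


def preprocess(text):
    """Tokenize on alphanumeric runs, lowercase, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def cosine_similarity(a, b):
    """Cosine similarity between bag-of-words count vectors (assumed meaning of CS)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """LCS length normalized by the longer sequence (a hypothetical normalization)."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 0.0


def combined_score(query, document):
    """Equal-weight average of the two scores; the weighting is an assumption."""
    q, d = preprocess(query), preprocess(document)
    return 0.5 * cosine_similarity(q, d) + 0.5 * lcs_similarity(q, d)


def best_answer(query, documents):
    """Return the candidate document with the highest combined score."""
    return max(documents, key=lambda doc: combined_score(query, doc))
```

In a system like the one described, a classifier such as SVM would first narrow `documents` to the class matching the question type and subject, and a ranking step of this kind would then run only over that subset.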
