Conference: Malaysian Software Engineering Conference

Support vector machine based approach for quranic words detection in online textual content

Abstract

The Quran is the holy book of Muslims around the world. Since it was revealed to the Prophet Muhammad (PBUH) about fourteen hundred years ago, the Quran has been preserved from distortion in every imaginable way. The rapid and massive growth of digital media and internet usage has caused a wide spread of Quranic knowledge in digital formats, including Quranic verses, scripts, translations, and many other Quranic sciences. Some online sources, websites, services, and social network users are introducing less authentic Quranic content, services, and applications. Ordinary users of such online resources cannot detect and authenticate the Quranic verses provided. In this paper, we propose a machine learning approach to detect Quranic words in text extracted from online sources. The proposed detection approach uses a Support Vector Machine to generate a learning model of Quranic words by training the learner on a Quranic words dataset. The generated classification model is then used to classify words from online content. Experiments based on different feature categories, such as diacritic and statistical features, are performed, and a prototype is developed. Results show that the accuracy and other evaluation measures achieved by the proposed approach are higher than those previously reported in the domain. Future work will focus on incorporating more machine learning and optimization techniques to achieve higher evaluation scores.
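The abstract names diacritic and statistical feature categories but does not list the individual features. As a minimal sketch only, the following shows how per-word features of that kind might be computed before being passed to a classifier. The function name `word_features`, the chosen features, and the diacritic range are illustrative assumptions, not the authors' feature set.

```python
import unicodedata

# Assumption: treat the Arabic marks U+064B..U+0652 (tanween, short vowels,
# shadda, sukun) as "diacritics". The paper may use a different set.
ARABIC_DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)}


def word_features(word: str) -> dict:
    """Return hypothetical diacritic and statistical features for one word."""
    diacritics = [ch for ch in word if ch in ARABIC_DIACRITICS]
    # Unicode general categories starting with "L" are letters.
    letters = [ch for ch in word if unicodedata.category(ch).startswith("L")]
    length = len(word)
    return {
        "n_chars": length,                         # statistical: code points
        "n_letters": len(letters),                 # statistical: base letters
        "n_diacritics": len(diacritics),           # diacritic: total marks
        "diacritic_ratio": len(diacritics) / length if length else 0.0,
        "n_distinct_diacritics": len(set(diacritics)),
    }


# Usage: a fully diacritized word versus the same word without marks.
diacritized = "\u0628\u0650\u0633\u0652\u0645\u0650"
plain = "\u0628\u0633\u0645"
print(word_features(diacritized))
print(word_features(plain))
```

Feature dictionaries like these could be converted to numeric vectors and used to train a support vector classifier (for example `sklearn.svm.SVC` in scikit-learn) on labeled Quranic and non-Quranic words. That training step is omitted here because the dataset and model settings are not given in the abstract.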
