Conference: International Conference on Information Retrieval and Knowledge Management

Effectiveness of Latent Dirichlet Allocation Model for Semantic Information Retrieval on Malay Document


Abstract

Current research on searching and retrieving information from Malay documents usually adopts the standard Vector Space Model (VSM) process. However, this technique is less effective for semantic information retrieval from a collection: the system retrieves only documents that contain the user's query terms and ignores the semantic relationships among those terms. As a result, documents with a similar context are missed, while documents from a different context that merely share a single term are retrieved. To address this problem, the Latent Dirichlet Allocation (LDA) model is applied to semantic information retrieval on Malay documents. An experiment was conducted using 6 text queries and 50 Hadith documents translated into Malay, taken from the Shahih Bukhari collection. The experimental results showed that the LDA model gives promising results in retrieving semantic information from Malay-translated Hadith documents compared with existing techniques. Some limitations of this study can be explored in future work to improve the effectiveness of the retrieval results. Overall, LDA is an effective method for semantic information retrieval on Malay documents and can therefore help people search and retrieve semantic information from Malay documents more easily.
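
The term-matching limitation of VSM described in the abstract can be illustrated with a minimal sketch. It assumes plain term-frequency vectors and cosine similarity, which may differ from the weighting the authors used. The three Malay snippets, the query, and the helper names (`tf_vector`, `cosine`, `rank`) are hypothetical and are not taken from the paper's Hadith collection.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (lowercased, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (doc_id, score) pairs sorted by descending similarity."""
    q = tf_vector(query)
    scores = {doc_id: cosine(q, tf_vector(text)) for doc_id, text in docs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data, invented for illustration only.
docs = {
    "d1": "solat lima waktu wajib",        # about prayer, shares both query terms
    "d2": "sembahyang fardu setiap hari",  # about prayer, shares no query term
    "d3": "waktu makan tengah hari",       # about lunchtime, shares only "waktu"
}
query = "waktu solat"

for doc_id, score in rank(query, docs):
    print(doc_id, round(score, 3))
```

In this toy setup the on-topic document `d2` scores zero because it uses different words for the same concept, while the off-topic `d3` still receives a positive score from a single shared term. A topic model such as LDA instead compares documents and queries through inferred topic distributions, which is the kind of gap the abstract says it is meant to close.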
