Journal: Procedia Computer Science

Ontology based Semantic Annotation of Urdu Language Web Documents

Abstract

The proliferation of multilingual text on the Internet has increased the demand for efficient, language-independent information retrieval. Urdu is one of the most widely spoken and written languages in South Asia. However, because this text is unstructured, accessing relevant information remains a major challenge. Semantic web technologies advance information retrieval systems by assigning semantics to information. This paper presents a semantic annotation framework for documents written in Urdu. The framework uses a domain-specific ontology and context keywords instead of NLP (natural language processing) techniques. An experiment was conducted to evaluate the framework, using a corpus of classified ads posted in online Urdu newspapers. The purpose of this research is to identify the challenges involved in the semantic annotation of Urdu-language web documents.
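
The abstract does not describe the framework's algorithm, so the following is only a minimal sketch of the general idea of annotating text from an ontology's context keywords rather than with an NLP pipeline. It assumes plain substring matching. The toy ontology, its concept names, the Urdu keywords, and the names `Annotation` and `annotate` are all hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical toy ontology for classified ads (NOT from the paper):
# each concept maps to Urdu context keywords that signal it.
ONTOLOGY = {
    "Property": ["مکان", "پلاٹ"],   # house, plot
    "Vehicle": ["گاڑی", "کار"],     # vehicle, car
    "ForSale": ["فروخت"],           # sale
    "ForRent": ["کرایہ"],           # rent
}


@dataclass(frozen=True)
class Annotation:
    concept: str   # ontology concept assigned to the match
    keyword: str   # context keyword that triggered the match
    start: int     # character offset of the keyword in the text


def annotate(text, ontology=ONTOLOGY):
    """Return one Annotation per keyword occurrence, ordered by position.

    Assumption: plain substring search, with no tokenization or
    morphological analysis, so a keyword embedded in a longer word
    would also match.
    """
    annotations = []
    for concept, keywords in ontology.items():
        for keyword in keywords:
            pos = text.find(keyword)
            while pos != -1:
                annotations.append(Annotation(concept, keyword, pos))
                pos = text.find(keyword, pos + len(keyword))
    return sorted(annotations, key=lambda a: a.start)


# Example ad text: "house for sale"
ad = "مکان برائے فروخت"
for a in annotate(ad):
    print(a.concept, a.keyword, a.start)
```

Keyword lookup of this kind avoids language-specific NLP tooling, but its accuracy depends entirely on how well the ontology's keywords cover the domain.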