Journal: IEEE Transactions on Knowledge and Data Engineering

An Overview on XML Semantic Disambiguation from Unstructured Text to Semi-Structured Data: Background, Applications, and Ongoing Challenges


Abstract

Over the last two decades, XML has gained momentum as the standard for web information management and complex data representation. Collaboratively built semi-structured information resources, such as Wikipedia, have also become prevalent on the Web and can be inherently encoded in XML. Yet most methods for processing XML and semi-structured information handle mainly the syntactic properties of the data, while ignoring the semantics involved. To devise more intelligent applications, one needs to augment syntactic features with machine-readable semantic meaning. This can be achieved through the computational identification of the meaning of data in context, also known as automated semantic analysis and disambiguation, which is now one of the main challenges at the core of the Semantic Web. This survey paper provides a concise yet comprehensive review of the methods related to XML-based semi-structured semantic analysis and disambiguation. It consists of four logical parts. First, we briefly cover traditional word sense disambiguation methods for processing flat textual data. Second, we describe and categorize disambiguation techniques developed and extended to handle semi-structured and XML data. Third, we describe current and potential application scenarios that can benefit from XML semantic analysis, including data clustering and semantic-aware indexing, data integration and selective dissemination, semantic-aware and temporal querying, web and mobile services matching and composition, blog and social semantic network analysis, and ontology learning. Fourth, we describe and discuss ongoing challenges and future directions, including the quantification of semantic ambiguity, expanding the XML disambiguation context, combining structure and content, using collaborative/social information sources, integrating explicit and implicit semantic analysis, emphasizing user involvement, and reducing computational complexity.
