
Semantics and the crowd


Abstract

One of the principal scientific challenges that drives my group is to understand the character of formal knowledge on the Web. By formal knowledge, I mean information that is represented on the Web in something other than natural language text, typically as machine-readable Web data with a formal syntax and a specific, intended semantics. The Web provides a major counterpoint to our traditional accounts of formal knowledge based on artificial intelligence (AI). Most symbolic AI systems are designed to address sophisticated logical inference over coherent conceptual knowledge, and thus the underlying research is focused on characterizing formal properties such as entailment relations, time/space complexity of inference, monotonicity, and expressiveness. In contrast, the Semantic Web allows us to explore formal knowledge in a very different context, where data representations exist in a constantly changing, large-scale, highly distributed network of loosely-connected publishers and consumers, and are governed by a Web-derived set of social practices for discovery, trust, reliability, and use. We are particularly interested in understanding how large-scale Semantic Web data behaves over longer time periods: how its producers and consumers shift their requirements over time; how uniform resource identifiers (URIs) are used to dynamically link knowledge together; and the overall lifecycle of Web data from publication, to use, integration with other knowledge, evolution, and eventual deprecation. We believe that understanding formal knowledge in this Web context is the key to bringing existing AI insights and knowledge bases to the level of scale and utility of the current hypertext Web.
Technically, the scalability of the Semantic Web is rooted in a large number of independently-motivated participants with a shared vision, each following a set of carefully-designed common protocols and representation languages (principally dialects of the Resource Description Framework (RDF), the Web Ontology Language (OWL), and the SPARQL Protocol and RDF Query Language (SPARQL)) that run on top of the standard Web server and browser infrastructure. This strategy builds on the familiar hypertext Web, and has been incredibly successful. The Semantic Web now encompasses more than 50 billion Semantic Web assertions (triples) shared across the world via large numbers of autonomous Web servers, processed by situation-specific combinations of local and remote logic engines, and consumed by a shifting collection of software and users. However, this kind of loosely-coupled scalability strategy comes at a technical price: the Semantic Web is by far the largest formal knowledge base on the planet, and certainly one of the broadest, but also one of the messiest. Semantic coherence can be guaranteed only locally, if at all; performance is spotty; data updates are unpredictable; and the raw data can be problematic in many ways. These problems impact the overall scalability of the Semantic Web: beyond simply exchanging large quantities of data, we also want the Semantic Web to support queries, integration, rules, and other data processing tools at scale. If we can solve these problems, though, the Semantic Web promises an exciting new kind of data Web, with practical scaling properties beyond what federated database technology can achieve. In the full Semantic Web vision, massive amounts of partially-integrated data form a dynamically shifting fabric of on-demand information, able to be published and consumed by clients around the world, with transformational impact.
Our current work is inspired by two properties of the Semantic Web: how existing Internet social (‘crowd’) phenomena can apply to data on the Semantic Web, and how we can use these social Web techniques to improve the dynamic scalability of the Semantic Web. Most data currently published on the Semantic Web is originally sourced from existing relational databases, either via front-end syst
