Retrieval of XML data - to support NLP applications

Abstract

Information retrieval (IR) deals with the organization, storage and representation of information, and with access to it. An XML document is a database only in the strictest sense of the term: it is a collection of data. As a database, XML has some advantages: it is self-describing and portable, and it can describe data in tree or graph structures. At the same time, XML is considered verbose, and access to data is slow because of parsing and text conversion. Although it is hard to provide efficient storage, indexes, security, transactions and data integrity, multi-user access, triggers, queries across multiple documents and so on with an XML data repository, the use of XML in present and next-generation applications and the huge amount of XML data available around the world cannot be ignored. Retrieving XML data efficiently is therefore always a challenge. Hybrid data servers, SQL, XPath and WQuery have opened a new era in handling both relational and XML data. XML has also become the standard framework for publishing on the net, as well as the standard e-commerce language for building B2B and B2C Web services. A major concern in this scenario is the "point of creation" bottleneck, at which creating useful, well-structured XML data can consume an undue amount of time and effort. XML helps NLP research, especially approaches based on annotated corpora, by providing knowledge representation frameworks for the morphological, syntactic, semantic and/or pragmatic information structure of NL resources. In many cases, XML can provide NLP with deeper semantic structure clues and thus enable much more robust, higher-precision NLP applications. This paper focuses on retrieval techniques for XML databases for present and next-generation NLP applications.
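To make the idea of retrieving annotated NLP data from XML concrete, the following is a minimal sketch rather than the paper's own method. It assumes a hypothetical corpus layout (`corpus`, `sentence` and `token` elements with `pos` and `lemma` attributes, none of which come from the paper) and uses the limited XPath subset supported by Python's standard `xml.etree.ElementTree` module to select tokens by part-of-speech tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated corpus; element and attribute names are
# illustrative assumptions, not a format defined in the paper.
CORPUS = """
<corpus>
  <sentence id="s1">
    <token pos="DET" lemma="the">The</token>
    <token pos="NOUN" lemma="parser">parser</token>
    <token pos="VERB" lemma="read">reads</token>
    <token pos="NOUN" lemma="document">documents</token>
  </sentence>
  <sentence id="s2">
    <token pos="NOUN" lemma="query">Queries</token>
    <token pos="VERB" lemma="run">run</token>
  </sentence>
</corpus>
"""


def find_tokens(root, pos):
    """Return (sentence id, surface form, lemma) for tokens with the given POS tag."""
    results = []
    for sentence in root.findall("sentence"):
        # XPath attribute predicate: keep only tokens whose pos attribute matches.
        for token in sentence.findall(f"token[@pos='{pos}']"):
            results.append((sentence.get("id"), token.text, token.get("lemma")))
    return results


root = ET.fromstring(CORPUS)
nouns = find_tokens(root, "NOUN")
print(nouns)
```

Because the annotations live in attributes of a self-describing tree, the same query pattern extends to other layers (for example syntactic or semantic labels) by changing the predicate. Note that every query here first parses the whole document, which is the access cost the abstract refers to.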