Retrieval of XML data - to support NLP applications



Abstract

Information retrieval (IR) deals with the organization, storage, and representation of information, and with access to it. An XML document is a database only in the strictest sense of the term: it is a collection of data. As a database, XML has some advantages: it is self-describing and portable, and it can describe data in tree or graph structures. At the same time, XML is considered verbose, and access to data is slow because of parsing and text conversion. Although it is difficult for an XML data repository to provide efficient storage, indexes, security, transactions and data integrity, multi-user access, triggers, queries across multiple documents, and so on, the use of XML in present and next-generation applications and the huge amount of XML data available around the world cannot be ignored. Retrieving XML data efficiently therefore remains a challenge. Hybrid data servers, SQL, XPath and XQuery have opened a new era in handling both relational and XML data. XML has also become the standard framework for publishing on the net, as well as the standard e-commerce language for building B2B and B2C Web services. A major concern in this scenario is the "point of creation" bottleneck, at which creating useful, well-structured XML data can consume an undue amount of time and effort. XML helps NLP research, especially annotated-corpus-based approaches, by providing knowledge representation frameworks for the morphological, syntactic, semantic and/or pragmatic information structure of NL resources. In many cases, XML can give NLP deeper semantic structure clues and thus enable more robust, higher-precision NLP applications. This paper focuses on retrieval techniques for XML databases in present and next-generation NLP applications.
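
As an illustration of the kind of retrieval the abstract describes, the following is a minimal sketch of querying an XML-annotated corpus with XPath-style expressions. It is not the authors' method: the corpus layout, the element and attribute names (`corpus`, `sentence`, `token`, `pos`, `lemma`), and the helper functions are all hypothetical. It uses Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated corpus; the element and attribute names are
# illustrative assumptions, not taken from the paper.
CORPUS = """
<corpus>
  <sentence id="s1">
    <token pos="DET" lemma="the">The</token>
    <token pos="NOUN" lemma="parser">parser</token>
    <token pos="VERB" lemma="read">reads</token>
    <token pos="NOUN" lemma="document">documents</token>
  </sentence>
  <sentence id="s2">
    <token pos="NOUN" lemma="query">Queries</token>
    <token pos="VERB" lemma="run">run</token>
  </sentence>
</corpus>
"""


def tokens_with_pos(root, pos):
    """Return the surface forms of all tokens annotated with the given POS tag."""
    # ".//token[@pos='X']" selects every token descendant whose pos attribute is X.
    return [tok.text for tok in root.findall(f".//token[@pos='{pos}']")]


def sentences_containing_lemma(root, lemma):
    """Return the ids of sentences that have at least one token with the given lemma."""
    hits = []
    for sent in root.findall("sentence"):
        # find() returns None when no direct child token matches the predicate.
        if sent.find(f"token[@lemma='{lemma}']") is not None:
            hits.append(sent.get("id"))
    return hits


if __name__ == "__main__":
    root = ET.fromstring(CORPUS)
    print(tokens_with_pos(root, "NOUN"))
    print(sentences_containing_lemma(root, "run"))
```

The morphological annotation lives in attributes here, so a single path expression can select tokens by linguistic property without any custom parsing. A full XPath or XQuery engine would allow richer queries than this subset.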


Bibliographic details

Source
Proceedings of the 2007 International Conference on Artificial Intelligence (ICAI'2007) | 2007 | P.609614618 | 3 pages
Authors
Siddhartha Ghosh; Sameen S Fatima
Original format: PDF
Chinese Library Classification: Artificial intelligence theory
Keywords
XML; NLP; XPath; XQuery; hybrid data servers


Similar documents


1. Hybrid XML Retrieval: Combining Information Retrieval and a Native XML Database [J]. Jovan Pehcevski, James A. Thom, Anne-Marie Vercoustre. Information Retrieval, 2005, No. 4.

2. A Comparison and Benchmarking on Data Storage and Query Retrieval for Native XML Databases [J]. Su-Cheng Haw, Wai-Lin Chong. Indian Journal of Science and Technology, 2016, No. 47.

3. Report on the first Twente Data Management Workshop on XML Databases and Information Retrieval [J]. Djoerd Hiemstra, Vojkan Mihajlovic. SIGMOD Record, 2004, No. 4.

4. Retrieval of XML data - to support NLP applications [C]. Siddhartha Ghosh, Sameen S Fatima. International Conference on Artificial Intelligence, 2007.

5. Efficient implementation of update and retrieval query sequences over large data sets in a native XML database [D]. Mikhaylov, Alexander. 2006.

6. XML Data and Knowledge-Encoding Structure for a Web-Based and Mobile Antenatal Clinical Decision Support System: Development Study [O]. Ever Augusto Torres Silva, Sebastian Uribe, Jack Smith. 2020.

7. A Hybrid Approach incorporating XML and NLP Techniques for Focused Information Retrieval in the Biomedical Domain [O]. Thilky Perera. 2007.
