Structuring Domain-Specific Text Archives by Deriving a Probabilistic XML DTD


Abstract

Domain-specific documents often share an inherent, though undocumented, structure. This structure should be made explicit to facilitate efficient, structure-based search in archives as well as information integration. Inferring a semantically structured XML DTD for an archive and subsequently transforming its texts into XML documents is a promising method to reach these objectives. Based on the KDD-driven DIAsDEM framework, we propose a new method to derive an archive-specific structured XML document type definition (DTD). Our approach utilizes association rule discovery and sequence mining techniques to structure a previously derived flat, i.e., unstructured, DTD. We introduce the notion of a probabilistic DTD that is derived by discovering associations among XML tags and frequent sequences of XML tags.
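
To make the idea of deriving statistics over XML tags concrete, the following is a minimal sketch, not the authors' algorithm. It assumes each document has already been reduced to an ordered list of tags from a flat DTD, and it computes two simple quantities that a probabilistic DTD might record: how often one tag co-occurs with another (a confidence value in the association-rule sense) and how often one tag precedes another. The tag names, the toy archive, and the function names `tag_associations` and `order_probabilities` are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy archive: each document is the ordered list of XML tags
# assigned to its text units (tag names are invented for illustration).
docs = [
    ["BusinessPurpose", "ShareCapital", "Manager"],
    ["BusinessPurpose", "Manager"],
    ["ShareCapital", "BusinessPurpose", "Manager"],
    ["BusinessPurpose", "ShareCapital"],
]

def tag_associations(docs):
    """Return {(x, y): confidence}, where confidence estimates
    P(y occurs in a document | x occurs in it)."""
    tag_count = Counter()
    pair_count = Counter()
    for doc in docs:
        tags = set(doc)
        tag_count.update(tags)
        for x, y in combinations(sorted(tags), 2):
            pair_count[(x, y)] += 1
            pair_count[(y, x)] += 1
    return {(x, y): n / tag_count[x] for (x, y), n in pair_count.items()}

def order_probabilities(docs):
    """Return {(x, y): p}, where p estimates the probability that the first
    occurrence of x precedes the first occurrence of y, given both occur."""
    both = Counter()
    before = Counter()
    for doc in docs:
        first = {}
        for pos, tag in enumerate(doc):
            first.setdefault(tag, pos)  # keep only the first position
        for x in first:
            for y in first:
                if x != y:
                    both[(x, y)] += 1
                    if first[x] < first[y]:
                        before[(x, y)] += 1
    return {pair: before[pair] / n for pair, n in both.items()}

assoc = tag_associations(docs)
order = order_probabilities(docs)
```

On this toy archive, `assoc[("Manager", "BusinessPurpose")]` is 1.0 because every document containing `Manager` also contains `BusinessPurpose`, while `order[("BusinessPurpose", "ShareCapital")]` is 2/3 because the first tag precedes the second in two of the three documents that contain both. A real derivation would apply minimum-support thresholds and proper sequence mining rather than pairwise counts.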


Bibliographic details

Source
6th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2002), Aug 19-23, 2002, Helsinki, Finland | 2002 | pp. 461-474 | 14 pages
Conference location: Helsinki (FI)
Authors
Karsten Winkler; Myra Spiliopoulou
Author affiliation
Leipzig Graduate School of Management (HHL), Department of E-Business, Jahnallee 59, D-04109 Leipzig, Germany
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology

