Text association mining with cross-sentence inference, structure-based document model and multi-relational text mining.

Abstract

With the exponential growth of published documents, text mining has become a vital tool for automatically extracting information and discovering hidden knowledge. We begin this dissertation with an overview of text mining covering key definitions, pre-processing, feature selection, text representation, and types of text mining. We then describe a fundamental text mining approach that we used to develop a chromosome-21 database. Next, we present three novel text mining techniques: (i) text association mining with cross-sentence inference, (ii) a structure-based document model, and (iii) multi-relational text mining. These techniques emphasize novel hypothesis generation, document representation, and multi-relational discovery, respectively. In text association mining with cross-sentence inference, statistical co-occurrences of terms and syntactic analysis of sentence structure are first used to find associations among key terms in documents; potential novel hypotheses are then derived from the discovered associations. The structure-based document model, by contrast, introduces two novel representations for text documents that take into account not only term frequencies and patterns of term occurrences but also the document's structural information. In our experiments, the structure-based document models outperform existing non-structure-based ones. Finally, multi-relational text mining enhances a literature-based discovery method with multi-relational data mining and Inductive Logic Programming. It aims to discover relational knowledge, in the form of frequent relational patterns and relational association rules, from disjoint sets of literature. These relational patterns and rules complement the indirect connections found by existing literature-based discovery and can be used for exploratory research.
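
The abstract does not give the dissertation's actual algorithm, so the following is only a minimal sketch of the general idea behind co-occurrence-based association mining followed by hypothesis generation. It assumes whitespace tokenization, a hand-supplied set of key terms, and a simple "A links to B, B links to C, so A may relate to C" rule in place of the cross-sentence inference and syntactic analysis described above. The function names `cooccurrence_counts` and `candidate_hypotheses`, the `min_count` parameter, and the toy sentences are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences, key_terms):
    """Count how often each pair of key terms appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())  # assumption: whitespace tokens
        present = sorted(t for t in key_terms if t in tokens)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


def candidate_hypotheses(counts, min_count=1):
    """Propose (a, b, c) where a and c are each associated with b
    (co-occurrence >= min_count) but not directly with each other."""
    neighbors = {}
    for (a, b), n in counts.items():
        if n >= min_count:
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    hypotheses = set()
    for b, linked in neighbors.items():
        for a, c in combinations(sorted(linked), 2):
            if c not in neighbors.get(a, set()):
                hypotheses.add((a, b, c))
    return sorted(hypotheses)


# Toy input, made up for illustration only.
sentences = ["term_a interacts with term_b", "term_b affects term_c"]
key_terms = {"term_a", "term_b", "term_c"}

counts = cooccurrence_counts(sentences, key_terms)
print(candidate_hypotheses(counts))
# prints [('term_a', 'term_b', 'term_c')]
```

Here `term_a` and `term_c` never share a sentence, yet both co-occur with `term_b`, so the pair is surfaced as a candidate hypothesis for a human to examine. A realistic system would replace the raw counts with a statistical association measure and filter candidates using sentence structure.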

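Bibliographic record

Author: Thaicharoen, Supphachai
Author affiliation: University of Colorado at Denver
Degree-granting institution: University of Colorado at Denver
Subject: Computer Science
Degree: Ph.D.
Year: 2009
Pages: 131
Original format: PDF
Language: English
Chinese Library Classification: Petroleum and natural gas industry
Date added: 2022-08-17 11:37:57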

Similar literature

1. A Novel Model for Text Document Representation: Application on Opinion Mining Datasets [J]. Asmaa Mountassir, Houda Benbrahim, Ilham Berrada. Journal of Computer Science Engineering and Information Technology Research. 2014, No. 2
2. A Semi-Structured Document Model for Text Mining [J]. Yang Jianwu, Chen Xiaoou. Journal of Computer Science & Technology. 2002, No. 5
3. Analysis of Medical Documents with Text Mining and Association Rule Mining [C]. Ruth Reátegui, Sylvie Ratté. International Conference on Information Technology and Systems. 2019
4. A Semantic Graph Model for Text Representation and Matching in Document Mining [D]. Shaban, Khaled. 2006
5. Measuring the Drafting Alignment of Patent Documents Using Text Mining [O]. Davit Khachatryan, Brigitte Muehlmann. 2020
6. Auto-indexing Arabic Texts Based on Association Rule Data Mining (c2015) [O]. Rouba G. Nasrallah. 2015
