Information extraction in statistics indicator tables using rule generalizations and ontology

Abstract

The main problem with rule-based information extraction is that extraction rules tend to be designed for a specific kind of information or document structure; hence they cannot be reused for another without modification. Semi-structured documents such as tables present a further challenge: because there is no standard for how to design them, table structures vary. Statistics indicators are a source of information that uses tables to present data. Statistics indicators also have relationships with one another that must be carefully identified and extracted. Generalized rules attempt to reduce the effort of modifying extraction rules by expressing those rules in general terms. Combined with an ontology, the rules can also extract the relationships between indicators. The output of this information extraction system is a database that stores not only the data itself but also the relationships between indicators.
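As a rough illustration of the idea described in the abstract, the sketch below applies one general rule to two differently laid-out tables and stores both the values and the indicator relationships in a database. It is not the authors' implementation: the `ONTOLOGY` mapping, the `extract_cells` and `load` functions, the table layouts, the database schema, and the sample figures are all hypothetical, chosen only to make the example self-contained.

```python
import sqlite3

# Hypothetical toy ontology: child indicator -> parent indicator ("part of").
ONTOLOGY = {
    "Urban population": "Total population",
    "Rural population": "Total population",
}


def extract_cells(table, indicators_in_rows):
    """Yield (indicator, period, value) triples from either table layout.

    `table` is a list of rows; the top-left cell is a corner label and is
    ignored. If `indicators_in_rows` is False, the table is transposed so
    that the same rule handles both layouts.
    """
    if not indicators_in_rows:
        table = [list(column) for column in zip(*table)]
    periods = table[0][1:]
    for row in table[1:]:
        indicator = row[0]
        for period, value in zip(periods, row[1:]):
            yield indicator, period, float(value)


def load(tables):
    """Store extracted values and ontology relations in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE indicator_value (indicator TEXT, period TEXT, amount REAL)"
    )
    db.execute("CREATE TABLE indicator_relation (child TEXT, parent TEXT)")
    seen = set()
    for table, indicators_in_rows in tables:
        for indicator, period, amount in extract_cells(table, indicators_in_rows):
            db.execute(
                "INSERT INTO indicator_value VALUES (?, ?, ?)",
                (indicator, period, amount),
            )
            # Record each ontology relation once, the first time it is seen.
            if indicator in ONTOLOGY and indicator not in seen:
                seen.add(indicator)
                db.execute(
                    "INSERT INTO indicator_relation VALUES (?, ?)",
                    (indicator, ONTOLOGY[indicator]),
                )
    return db


# Made-up sample tables: one with indicators as rows, one with them as columns.
ROW_LAYOUT = [
    ["Indicator", "2014", "2015"],
    ["Total population", "100", "110"],
    ["Urban population", "60", "70"],
]
COLUMN_LAYOUT = [
    ["Year", "Rural population"],
    ["2014", "40"],
    ["2015", "40"],
]

db = load([(ROW_LAYOUT, True), (COLUMN_LAYOUT, False)])
```

In this sketch the only layout-specific input is a flag saying whether indicators label the rows or the columns; a real system would presumably need to detect the layout and handle merged or nested headers, which the abstract does not describe.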

Bibliographic details

Source
International Conference on Information Technology Systems and Innovation | 2016 | pp. 1-6 | 6 pages
Authors
Muhammad Rio Bastian; Ayu Purwarianti
Original format
PDF
Keywords
Data mining; Companies; Ontologies; Information retrieval; Databases; Layout; Context

