Automatic Information Extraction in Semi-structured Official Journals

Abstract

Information extraction systems are used to extract only the relevant text information from digital repositories. This work proposes an automatic system for extracting information from semi-structured official journals. In our approach, given an input document, a Machine Learning (ML) algorithm classifies the document's fragments into class labels that correspond to the data fields to be extracted. The implemented system employs different feature sets and algorithms to classify the fragments. The system was evaluated through experiments on a sample containing 22,770 lines of Pernambuco's Official Journal. In general, the experiments revealed good results in terms of precision, which ranged from 70.14% to 98.63% depending on the feature set and algorithm used to classify the fragments.
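
The fragment-classification idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the algorithms or feature sets, so the naive Bayes classifier, the four surface features in `line_features`, the labels `HEADER`, `ACT_ID`, and `BODY`, and the sample lines are all assumptions made up for this example.

```python
import math
import re
from collections import Counter, defaultdict


def line_features(line):
    """Hypothetical surface features of one journal line (not from the paper)."""
    text = line.strip()
    return (
        ("all_caps", text.isupper()),
        ("has_digit", any(ch.isdigit() for ch in text)),
        ("has_date", bool(re.search(r"\b\d{1,2}/\d{1,2}/\d{4}\b", text))),
        ("short", len(text.split()) <= 4),
    )


class NaiveBayes:
    """Categorical naive Bayes with add-one smoothing over boolean features."""

    def fit(self, lines, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)  # label -> counts of (name, value)
        for line, label in zip(lines, labels):
            self.feature_counts[label].update(line_features(line))
        return self

    def predict(self, line):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior of the class
            for feat in line_features(line):
                # Each feature is boolean, hence "+ 2" in the smoothed denominator.
                score += math.log((self.feature_counts[label][feat] + 1) / (count + 2))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def precision(predicted, actual, label):
    """Fraction of lines predicted as `label` that truly carry that label."""
    truths = [a for p, a in zip(predicted, actual) if p == label]
    return sum(a == label for a in truths) / len(truths) if truths else 0.0


# Invented training and test lines, for illustration only.
train_lines = [
    "SECRETARIA DE EDUCACAO",
    "SECRETARIA DE SAUDE",
    "Portaria n 123 de 02/03/2008",
    "Decreto n 456 de 15/04/2008",
    "O secretario resolve nomear o servidor para o cargo indicado",
    "Fica autorizada a contratacao temporaria de professores substitutos",
]
train_labels = ["HEADER", "HEADER", "ACT_ID", "ACT_ID", "BODY", "BODY"]

test_lines = ["SECRETARIA DE FAZENDA", "Portaria n 789 de 20/05/2008"]
test_labels = ["HEADER", "ACT_ID"]

model = NaiveBayes().fit(train_lines, train_labels)
predicted = [model.predict(line) for line in test_lines]

for label in sorted(set(test_labels)):
    print(label, precision(predicted, test_labels, label))
```

Each line is treated as one fragment, and per-class precision is computed the same way the abstract reports it: among the fragments assigned to a field, the share that truly belong to it. Swapping in a different feature function or classifier is how one would compare feature sets and algorithms in this setup.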

Bibliographic details

Source
Brazilian Symposium on Neural Networks | 2008 | 6 pages
Authors
Valmir Macário Filho; Ricardo B. C. Prudêncio; Francisco A. T. de Carvalho; Leandro R. Torres; Laerte Rodrigues Júnior; Marcos G. Lima
Original format: PDF
Chinese Library Classification: TP183-53
Keywords
Semi-Structured text; information extraction; official journals; text mining;

Similar literature

1. Automatic Extraction of Objects and their Attributes from Semi-Structured Web Tables for E-commerce Tasks [J]. Yerzhan Baiburin, Aliya Nugumanova. Indian Journal of Science and Technology. 2015, No. 30
2. Automatic information extraction from semi-structured Web pages by pattern discovery [J]. Chia-Hui Chang, Chun-Nan Hsu, Shao-Cheng Lui. Decision Support Systems. 2003, No. 1
3. Understanding quotation extraction and attribution: towards automatic extraction of public figure's statements for journalism in Indonesia [J]. Yohanes Sigit Purnomo W.P., Yogan Jaya Kumar, Nur Zareen Zulkarnain. Library Review. 2021, No. 6a7
4. Automatic Information Extraction in Semi-structured Official Journals [C]. Valmir Macário Filho, Ricardo B. C. Prudêncio. Brazilian Symposium on Neural Networks. 2008
5. Entity information extraction using structured and semi-structured resources [D]. Sil, Avirup. 2014
6. ExaCT: automatic extraction of clinical trial characteristics from journal publications [O]. Svetlana Kiritchenko, Berry de Bruijn, Simona Carini. 2010
7. Automatic Information Extraction in Semi-Structured Official Journals [O]. Valmir Macário Filho, Ricardo B. C. Prudêncio, Francisco A. T. De Carvalho. 2011