Detecting Information Structures in Texts

Abstract

The colossal growth of volatile online text data creates demand for automatic text analysis tools that identify worthwhile information. Documents, as well as text streams, can be structured beyond the concept of frequency distributions. Here we introduce a novel method that provides a relative measure of information value over a time series that is mapped by a dynamic trie structure. We adapt the concept of entropy to textual data and employ a compression-based estimation method. The algorithm can run in a real-time scenario because it has linear complexity and is based on a dynamic history of predefined size. We show the suitability of our method with an experimental dataset and compare our results to an existing approach. Our results reveal structural properties of the texts and permit deeper analysis of the presumed information peaks.
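
The idea of scoring incoming text against a bounded history with a compression-based estimate can be illustrated with a small Python sketch. It is not the authors' algorithm: it swaps in zlib (DEFLATE) as a stand-in compressor instead of a dynamic trie, and the names `NoveltyMeter`, `score`, and `history_size` are hypothetical. It also recompresses the whole history for every chunk, so it does not have the linear complexity the abstract describes.

```python
import zlib
from collections import deque


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of the zlib-compressed data."""
    return len(zlib.compress(data, 9))


class NoveltyMeter:
    """Scores each incoming chunk against a bounded history of earlier chunks.

    Illustrative assumption: zlib stands in for the trie-based
    estimator mentioned in the abstract.
    """

    def __init__(self, history_size: int = 50):
        # A deque with maxlen drops the oldest chunk once it is full,
        # which keeps the history at a predefined size.
        self.history = deque(maxlen=history_size)

    def score(self, chunk: str) -> float:
        """Return the extra compressed bits per input byte that the chunk
        costs when appended to the current history."""
        new = chunk.encode("utf-8")
        if not new:
            return 0.0
        past = b" ".join(self.history)
        # Extra bytes needed to encode the chunk given the history.
        extra = compressed_size(past + b" " + new) - compressed_size(past)
        self.history.append(new)
        # Clamp at zero in case the compressor output shrinks slightly.
        return 8.0 * max(0, extra) / len(new)


if __name__ == "__main__":
    meter = NoveltyMeter(history_size=10)
    stream = ["the cat sat on the mat and looked at the dog"] * 5
    stream.append("quantum flux capacitors destabilise seventeen purple xylophones")
    for text in stream:
        print(f"{meter.score(text):6.2f}  {text}")
```

In this sketch a chunk that repeats recent history scores close to zero, while a chunk with unseen vocabulary scores noticeably higher, which is the kind of relative peak the abstract refers to.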

Bibliographic details

Source: International Conference on Computer Aided Systems Theory, 2013, 8 pages
Authors: Thomas Bohne; Uwe M. Borghoff
Original format: PDF
Chinese Library Classification: machine-assisted technology
Keywords: document analysis; information retrieval; entropy estimation; data compression; trie data structure

Similar literature

1. Using Word Association to Detect Multitopic Structures in Text Documents [J] . Klahold Andre, Uhr Patrick, Ansari Fazel, Intelligent Systems, IEEE . 2014, no. 5

2. Combining Structured and Flat Features by a Composite Kernel to Detect Hedges Scope in Biological Texts [J] . ZHOU Huiwei, HUANG Degen, LI Xiaoyan, Chinese Journal of Electronics (电子学报：英文版) . 2011, no. 3

3. Detecting data records in semi-structured web sites based on text token clustering [J] . Xiaoying Gao, Le Phong Bao Vuong, Mengjie Zhang, Integrated Computer-Aided Engineering . 2008, no. 4

4. Detecting Information Structures in Texts [C] . Thomas Bohne, Uwe M. Borghoff, International conference on computer aided systems theory . 2013

5. The Relevance of Text Structure Strategy Instruction for Talmud Study: The Effects of Reading a Talmudic Passage with a Road-Map of its Text Structure. [D] . Jaffe, Yael. 2016

6. Detecting Social and Behavioral Determinants of Health with Structured and Free-Text Clinical Data [O] . Daniel J. Feller, Oliver J. Bear Dont Walk IV, Jason Zucker, 2020

7. Novel Method to Detect Cardiac Device Infections by Integrating Electronic Medical Record Text with Structured Data in the Veterans Affairs Health System [O] . Hillary Mull, Kelly Stolzmann, Marlena Shin, 2020

8. Goal-Oriented Treatment of Text Structures in Text Planning [R] . Maier, E., Brown, M. 1990
