Conference: Annual IST/SPIE Conference on Visualization and Analysis

Relating Interesting Quantitative Time Series Patterns with Text Events and Text Features

Abstract

In many application areas, the key to successful data analysis is the integrated analysis of heterogeneous data. One example is the financial domain, where time-dependent, high-frequency quantitative data (e.g., trading volume and price information) and textual data (e.g., economic and political news reports) need to be considered jointly. Data analysis tools need to support an integrated analysis that allows studying the relationships between textual news documents and quantitative properties of the stock market price series. In this paper, we describe a workflow and tool that allow a flexible formation of hypotheses about text features and their combinations, which reflect quantitative phenomena observed in stock data. To support such an analysis, we combine the analysis steps of frequent quantitative and text-oriented data using an existing a-priori method. First, based on heuristics, we extract interesting intervals and patterns in large time series data. The visual analysis supports the analyst in exploring parameter combinations and their results. The identified time series patterns are then the input for the second analysis step, in which all identified intervals of interest are analyzed for frequent patterns co-occurring with financial news. An a-priori method supports the discovery of such sequential temporal patterns. Then, various text features, such as the degree of sentence nesting, noun phrase complexity, and vocabulary richness, are extracted from the news to obtain meta patterns. A meta pattern is defined by a specific combination of text features that differs significantly from the text features of the remaining news data. Our approach combines a portfolio of visualization and analysis techniques, including time, cluster, and sequence visualization and analysis functionality. We provide two case studies showing the effectiveness of our combined quantitative and textual analysis workflow. The workflow can also be generalized to other application domains, such as data analysis of smart grids, cyber-physical systems, or the security of critical infrastructure, where the data consists of a combination of quantitative and textual time series data.
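The three analysis steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the heuristics, the exact a-priori variant, or the feature definitions, so everything below is an assumption made for illustration: `interesting_intervals` uses a simple relative-change threshold, `frequent_itemsets` is a generic Apriori-style level-wise search over sets of news tags, and `type_token_ratio` is one common proxy for vocabulary richness. All function names and parameters are hypothetical and are not taken from the paper's tool.

```python
from itertools import combinations


def interesting_intervals(prices, window, threshold):
    """Assumed heuristic: flag (start, end) index pairs whose relative price
    change over `window` steps is at least `threshold` in absolute value."""
    intervals = []
    for start in range(len(prices) - window):
        end = start + window
        change = (prices[end] - prices[start]) / prices[start]
        if abs(change) >= threshold:
            intervals.append((start, end))
    return intervals


def frequent_itemsets(transactions, min_support):
    """Generic Apriori-style search: return {itemset: support} for every
    itemset contained in at least `min_support` of the transactions."""
    transactions = [frozenset(t) for t in transactions]
    if not transactions:
        return {}
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items}
    level = {s for s in level if support(s) >= min_support}
    result = {}
    k = 1
    while level:
        for itemset in level:
            result[itemset] = support(itemset)
        k += 1
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune candidates with an infrequent (k-1)-subset, then check support.
        level = {
            c for c in candidates
            if all(frozenset(sub) in result for sub in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return result


def type_token_ratio(text):
    """Assumed vocabulary-richness proxy: distinct tokens / total tokens."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0


if __name__ == "__main__":
    # Hypothetical toy data, not from the paper's case studies.
    prices = [100, 101, 100, 110, 111, 100]
    print(interesting_intervals(prices, window=1, threshold=0.05))

    # One set of news tags per interesting interval.
    news_tags = [
        {"earnings", "merger"},
        {"earnings", "merger", "lawsuit"},
        {"earnings"},
        {"lawsuit"},
    ]
    print(frequent_itemsets(news_tags, min_support=0.5))
    print(type_token_ratio("the market fell and the market rose"))
```

In this sketch each "transaction" is the set of news tags observed within one interesting interval, so a frequent itemset is a combination of tags that repeatedly co-occurs with the flagged price movements. A sequential variant would additionally need to respect the temporal order of events, which this sketch does not attempt.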
