PeerJ Computer Science

Event detection in finance using hierarchical clustering algorithms on news and tweets



Abstract

In the current age of overwhelming information and massive production of textual data on the Web, Event Detection has become an increasingly important task in various application domains. Several research branches have been developed to tackle the problem from different perspectives, including Natural Language Processing and Big Data analysis, with the goal of providing valuable resources to support decision-making in a wide variety of fields. In this paper, we propose a real-time, domain-specific, clustering-based event-detection approach that integrates textual information coming, on the one hand, from traditional newswires and, on the other hand, from microblogging platforms. The goal of the implemented pipeline is twofold: (i) providing insights to the user about the relevant events that are reported in the press on a daily basis; (ii) alerting the user about potentially important and impactful events, referred to as hot events, for some specific tasks or domains of interest. The algorithm identifies clusters of related news stories published by globally renowned press sources, which guarantee authoritative, noise-free information about current affairs; subsequently, the content extracted from microblogs is associated with the clusters in order to assess the relevance of each event in public opinion. To identify the events of a day d, we build the lexicon from news articles and stock data of the previous days, up to d−1. Although the approach can be extended to a variety of domains (e.g. politics, economy, sports), we hereby present a specific implementation in the financial sector. We validated our solution through a qualitative and quantitative evaluation, performed on the Dow Jones’ Data, News and Analytics dataset, on a stream of messages extracted from the microblogging platform Stocktwits, and on the Standard & Poor’s 500 index time-series.
The experiments demonstrate the effectiveness of our proposal in extracting meaningful information from real-world events and in spotting hot events in the financial sphere. The evaluation is complemented by a visual inspection of a selected number of significant real-world events, ranging from the Brexit Referendum to the outbreak of the Covid-19 pandemic in early 2020.
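
As a rough illustration of the two steps described in the abstract (clustering related news stories, then associating microblog messages with the resulting clusters), the following Python sketch uses bag-of-words vectors, cosine similarity and a simple centroid-linkage agglomerative merge. The function names, the thresholds and the toy texts are all assumptions made for illustration; the abstract does not state the representation, linkage criterion or parameters actually used in the paper.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Sum of member vectors (cosine similarity ignores the scale)."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def cluster_news(news, threshold=0.3):
    """Bottom-up clustering: repeatedly merge the two most similar
    clusters until no pair is more similar than `threshold`
    (an illustrative value, not taken from the paper)."""
    vecs = [vectorize(n) for n in news]
    clusters = [[i] for i in range(len(news))]
    while len(clusters) > 1:
        best, pair = threshold, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine(
                    centroid([vecs[k] for k in clusters[i]]),
                    centroid([vecs[k] for k in clusters[j]]),
                )
                if sim > best:
                    best, pair = sim, (i, j)
        if pair is None:
            break
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def assign_tweets(news, clusters, tweets, min_sim=0.2):
    """Count, per cluster, the tweets whose closest centroid is that
    cluster; tweets below `min_sim` (hypothetical cutoff) are ignored."""
    vecs = [vectorize(n) for n in news]
    cents = [centroid([vecs[k] for k in c]) for c in clusters]
    counts = [0] * len(clusters)
    for tweet in tweets:
        tv = vectorize(tweet)
        sims = [cosine(tv, c) for c in cents]
        best = max(range(len(sims)), key=sims.__getitem__)
        if sims[best] >= min_sim:
            counts[best] += 1
    return counts


# Toy data, invented for this example only.
news = [
    "brexit referendum vote shakes uk markets",
    "uk markets fall after brexit referendum vote",
    "oil prices rise as opec cuts output",
    "opec output cuts push oil prices higher",
]
tweets = [
    "brexit vote is crushing uk markets",
    "brexit referendum chaos",
    "oil prices up on opec cuts",
    "good morning everyone",
]
clusters = cluster_news(news)
counts = assign_tweets(news, clusters, tweets)
print(clusters, counts)
```

In this sketch, the per-cluster tweet count stands in for the "relevance in public opinion" mentioned above; how the paper turns such a signal into a hot-event alert is not specified in the abstract.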
