Journal: Expert Systems with Applications

A methodology for the resolution of cashtag collisions on Twitter - A natural language processing & data fusion approach


Abstract

Investors use social media such as Twitter to share news about financial stocks listed on international stock exchanges. Company ticker symbols are used to uniquely identify companies listed on a stock exchange and can be embedded within tweets to create clickable hyperlinks referred to as cashtags, allowing investors to associate their tweets with specific companies. The main limitation is that identical ticker symbols are present on exchanges all over the world, so a search for such a cashtag on Twitter returns a stream of tweets matching any company to which the cashtag refers. We refer to this as a cashtag collision. Colliding cashtags can confuse investors seeking news about a specific company. Resolving this issue would benefit investors who rely on the speed of tweets for financial information, saving them precious time. We propose a methodology that combines Natural Language Processing and Data Fusion to construct company-specific corpora that aid in the detection and resolution of colliding cashtags, so that tweets can be classified as being related to a specific stock exchange or not. Supervised machine learning classifiers are trained twice on each tweet: once on a count vectorisation of the tweet text, and again with the assistance of features contained in the company-specific corpora. We validate the cashtag collision methodology by carrying out an experiment involving companies listed on the London Stock Exchange. Results show that several machine learning classifiers benefit from the use of the custom corpora, yielding higher classification accuracy in the prediction and resolution of colliding cashtags. (C) 2019 The Authors. Published by Elsevier Ltd.
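The two feature sets mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which corpus features are used, so a simple count of overlapping terms stands in for them here, and the classifier itself is omitted. The ticker `ABC`, the vocabulary, the corpus terms, and all function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: a cashtag is "$" followed by 1-6 letters.
CASHTAG_RE = re.compile(r"\$([A-Za-z]{1,6})\b")
TOKEN_RE = re.compile(r"[a-z]+")


def extract_cashtags(tweet):
    """Return the ticker symbols written as cashtags, upper-cased."""
    return [symbol.upper() for symbol in CASHTAG_RE.findall(tweet)]


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def count_vector(tweet, vocabulary):
    """Count vectorisation: one term count per vocabulary entry."""
    counts = Counter(tokenize(tweet))
    return [counts[term] for term in vocabulary]


def corpus_overlap(tweet, company_corpus):
    """Hypothetical corpus feature: tweet tokens found in the company corpus."""
    return sum(1 for token in tokenize(tweet) if token in company_corpus)


def featurize(tweet, vocabulary, company_corpus=None):
    """Text-only features, or text plus corpus features when a corpus is given."""
    features = count_vector(tweet, vocabulary)
    if company_corpus is not None:
        features.append(corpus_overlap(tweet, company_corpus))
    return features


# Illustrative data only; none of it comes from the paper.
vocabulary = ["sales", "rise", "farm"]
company_corpus = {"supermarket", "london", "grocery"}
tweet = "$ABC supermarket sales rise in London"

text_only = featurize(tweet, vocabulary)
with_corpus = featurize(tweet, vocabulary, company_corpus)
```

A classifier trained on `text_only` vectors and one trained on `with_corpus` vectors could then be compared on accuracy, which is the kind of comparison the abstract reports.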
