Multi-representation approach to text regression of financial risks

Abstract

Different approaches to textual feature extraction have been proposed, starting with simple word count features and continuing with deeper representations that capture distributional semantics. In recent publications, word embedding methods have been used successfully as a representation basis for a large number of NLP tasks, such as text classification and part-of-speech tagging. In this article we explore the use of multiple text representations simultaneously within one regression task, in order to combine the conventional bag-of-words approach with the more semantically rich embeddings. We investigate the performance of this multi-representation approach on the financial risk prediction problem. Publicly available 10-K reports filed by publicly traded US companies are used as the basis for predicting the next year's change in stock price volatility. Our study shows that models based on a single representation achieve performance comparable to previously published results on risk prediction, and that models with multiple representations benefit from complementary information and outperform both the baseline and the single-representation models.
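To make the idea of using several representations in one regression task concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state how the representations are fused or which regression model is used, so the sketch assumes simple concatenation of log-scaled bag-of-words counts with an averaged word embedding, followed by ridge regression fitted by gradient descent. The embedding table, documents, target values, and all function names are hypothetical and serve only as illustration.

```python
import math
from collections import Counter

# Hypothetical toy embedding table; a real setup would load pretrained vectors.
TOY_EMBEDDINGS = {
    "loss": [0.9, 0.1],
    "risk": [0.8, 0.2],
    "growth": [0.1, 0.9],
    "profit": [0.2, 0.8],
}
EMB_DIM = 2


def bow_features(tokens, vocab):
    """Log-scaled term counts, one entry per vocabulary word."""
    counts = Counter(tokens)
    return [math.log1p(counts[w]) for w in vocab]


def mean_embedding(tokens, table, dim):
    """Average the vectors of the tokens found in the table (zeros if none)."""
    vecs = [table[t] for t in tokens if t in table]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def multi_representation(tokens, vocab):
    """Assumed fusion scheme: concatenate both representations into one vector."""
    return bow_features(tokens, vocab) + mean_embedding(tokens, TOY_EMBEDDINGS, EMB_DIM)


def fit_ridge(X, y, l2=0.01, lr=0.05, epochs=2000):
    """Ridge regression fitted by batch gradient descent (bias not penalized)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [sum(wj * xj for wj, xj in zip(w, x)) + b - t for x, t in zip(X, y)]
        grad_w = [sum(e * x[j] for e, x in zip(errs, X)) / n + l2 * w[j] for j in range(d)]
        grad_b = sum(errs) / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(x, w, b):
    """Linear prediction for one feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


# Made-up documents and volatility-change targets, for illustration only.
docs = [
    "loss risk risk litigation".split(),
    "growth profit profit".split(),
    "risk loss uncertainty".split(),
    "profit growth expansion".split(),
]
targets = [0.8, -0.5, 0.6, -0.4]

vocab = sorted({t for d in docs for t in d})
X = [multi_representation(d, vocab) for d in docs]
w, b = fit_ridge(X, targets)
preds = [predict(x, w, b) for x in X]
```

Concatenation is only one possible way to combine representations; alternatives such as training one model per representation and combining their predictions would fit the same description in the abstract.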
