Journal: Data technologies and applications

Predicting corporate credit rating based on qualitative information of MD&A transformed using document vectorization techniques

Abstract

Purpose: This study investigates how effectively qualitative information extracted from firms' annual reports predicts corporate credit ratings. In practice, qualitative information such as published reports and management interviews is recognized as an important input to rating assignment, alongside quantitative information such as financial values. Nevertheless, prior studies have rarely used qualitative information when developing prediction models for corporate credit ratings, which leaves room for further research.

Design/methodology/approach: This study adopts three document vectorization methods, Bag-of-Words (BOW), Word to Vector (Word2Vec) and Document to Vector (Doc2Vec), to transform unstructured textual data into numeric vectors that machine learning (ML) algorithms can accept as input. The experiments use a corpus of Management's Discussion and Analysis (MD&A) sections from 10-K financial reports, together with financial variables and corporate credit rating data.

Findings: A series of multi-class classification experiments shows that predictive models trained on both financial variables and vectors extracted from MD&A data outperform benchmark models trained only on traditional financial variables.

Originality/value: This study proposes a new approach to corporate credit rating prediction that uses qualitative information extracted from MD&A documents as input to ML-based prediction models. It also compares three text vectorization methods in the domain of corporate credit rating prediction and shows that BOW mostly outperformed Word2Vec and Doc2Vec.
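
The Bag-of-Words step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the function names (`build_vocabulary`, `bow_vector`, `combine_features`), the two sample sentences and the financial values are all assumptions made up for illustration. It only shows how a text can become a count vector and be appended to financial variables to form one input row for an ML classifier.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: index for index, tok in enumerate(tokens)}


def bow_vector(document, vocabulary):
    """Count how often each vocabulary token occurs in one document."""
    counts = Counter(document.lower().split())
    # Dicts keep insertion order, so columns follow the sorted vocabulary.
    return [counts.get(tok, 0) for tok in vocabulary]


def combine_features(financial_row, text_vector):
    """Concatenate financial variables and the text vector into one row."""
    return list(financial_row) + list(text_vector)


# Hypothetical MD&A snippets and financial ratios (made-up values).
mdna_texts = [
    "revenue increased and liquidity improved",
    "liquidity risk increased and revenue declined",
]
financial_rows = [[0.45, 1.8], [0.72, 0.9]]

vocabulary = build_vocabulary(mdna_texts)
features = [
    combine_features(fin, bow_vector(text, vocabulary))
    for fin, text in zip(financial_rows, mdna_texts)
]

for row in features:
    print(row)
```

Each printed row holds the two financial values followed by one count per vocabulary word. A real pipeline would use a proper tokenizer and a much larger vocabulary, and would pass the rows to a multi-class classifier.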