Listed companies are the cornerstone of the securities market, and their financial status and risk information are a focus of many investors and researchers. Because it is authoritative and publicly available, the risk disclosure section of listed companies' annual reports has become the basis on which researchers assess listed-company risk. Existing studies of risk disclosure content go no further than risk analysis based on word segmentation and word-frequency statistics, yet single words cannot adequately reveal the specific manifestations and semantic content of different risk topics. This paper applies a risk phrase recognition technique based on multi-factor fitting to the "risk factors" text fields in the annual reports of 76 listed companies in the environmental protection industry on the Shanghai and Shenzhen stock markets, and obtains the topic phrases in the texts of the industry's different risk topics. Finally, the risk topic phrases are visualized as jQCloud word clouds.