CONTEXTUAL INTERESTINGNESS RANKING OF DOCUMENTS FOR DUE DILIGENCE IN THE BANKING INDUSTRY WITH TOPICALITY GROUPING

Abstract

Documents that need to be analyzed for various reasons, such as investigating financial crimes, are ranked by examining the topicality and sentiment present in each document for a given subject of interest. In one approach, a given document is classified to determine its category, and entity recognition is used to identify the subject of interest. Passages from the document that relate to the entity are grouped and analyzed for sentiment to generate a sentiment score. Documents are then ranked based on their sentiment scores. In another approach, a classification probability score is computed for each passage, representing the likelihood that the passage relates to a category of interest, and the document is ranked based on both the sentiment scores and the classification probability scores. The category classification uses an ensemble of natural language text classifiers. One of the classifiers is a naïve Bayes classifier with feature vectors generated using Word2Vec modeling.
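
The two ranking approaches described in the abstract can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the toy word lists stand in for a real sentiment model, a case-insensitive substring match stands in for entity recognition, a keyword count stands in for the classifier ensemble, and the rule that combines sentiment with classification probability (a mean of negated sentiment times probability) is only one plausible choice. The abstract also does not state a ranking direction; the sketch assumes that more negative documents are more interesting for due diligence.

```python
from dataclasses import dataclass

# Hypothetical toy lexicons; a real system would use a trained sentiment model.
NEGATIVE = {"fraud", "laundering", "indicted", "sanctioned", "bribery"}
POSITIVE = {"award", "growth", "cleared", "donated"}
# Hypothetical keywords standing in for the ensemble of text classifiers.
CATEGORY_TERMS = {"fraud", "laundering", "bribery", "sanctioned", "indicted"}


@dataclass
class Document:
    doc_id: str
    passages: list[str]


def tokens(text: str) -> list[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [w.strip(".,;:!?\"'()").lower() for w in text.split()]


def entity_passages(doc: Document, entity: str) -> list[str]:
    """Group passages mentioning the entity (substring match, not real NER)."""
    return [p for p in doc.passages if entity.lower() in p.lower()]


def sentiment(passage: str) -> float:
    """Lexicon score in [-1, 1]; 0.0 when no lexicon word occurs."""
    words = tokens(passage)
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def category_probability(passage: str) -> float:
    """Toy stand-in for a classification probability: hits / (hits + 1)."""
    hits = len(CATEGORY_TERMS & set(tokens(passage)))
    return hits / (hits + 1)


def sentiment_score(doc: Document, entity: str) -> float:
    """First approach: mean sentiment over entity-related passages."""
    related = entity_passages(doc, entity)
    if not related:
        return 0.0
    return sum(sentiment(p) for p in related) / len(related)


def combined_score(doc: Document, entity: str) -> float:
    """Second approach (assumed rule): mean of -sentiment * probability."""
    related = entity_passages(doc, entity)
    if not related:
        return 0.0
    total = sum(-sentiment(p) * category_probability(p) for p in related)
    return total / len(related)


def rank_by_sentiment(docs: list[Document], entity: str) -> list[Document]:
    """Most negative documents first (assumed direction)."""
    return sorted(docs, key=lambda d: sentiment_score(d, entity))


def rank_combined(docs: list[Document], entity: str) -> list[Document]:
    """Highest combined interest score first."""
    return sorted(docs, key=lambda d: combined_score(d, entity), reverse=True)


# Made-up example data, for illustration only.
docs = [
    Document("a", ["Acme Bank received an award for growth.", "Weather was mild."]),
    Document("b", ["Acme Bank was indicted for fraud.", "Acme Bank denied laundering."]),
    Document("c", ["Acme Bank opened a branch.", "Other Corp was sanctioned."]),
]

print([d.doc_id for d in rank_by_sentiment(docs, "Acme Bank")])
print([d.doc_id for d in rank_combined(docs, "Acme Bank")])
```

In this sketch, document "c" contains a negative passage about a different entity, which the grouping step excludes before scoring. A fuller implementation would presumably replace each stand-in with the component the abstract names: an entity recognizer, a sentiment analyzer, and an ensemble that includes a naïve Bayes classifier over Word2Vec-derived features.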