Journal: Information Technology & People

ACRank: a multi-evidence text-mining model for alliance discovery from news articles


Abstract

Purpose: Strategic alliances among organizations are among the central drivers of innovation and economic growth. However, the discovery of alliances has relied on purely manual search and has been limited in scope. This paper proposes a text-mining framework, ACRank, that automatically extracts alliances from news articles. ACRank aims to give human analysts higher coverage of strategic alliances than existing databases while maintaining reasonable extraction precision. It has the potential to discover alliances involving less well-known companies, which commercial databases often neglect.

Design/methodology/approach: The proposed framework is a systematic process of alliance extraction and validation that uses natural language processing techniques and alliance domain knowledge. The process integrates news article search, entity extraction, and syntactic and semantic linguistic parsing. In particular, the Alliance Discovery Template (ADT) component identifies a number of linguistic templates expanded from expert domain knowledge and extracts potential alliances at the sentence level. Alliance Confidence Ranking (ACRank) then validates each unique alliance based on multiple features at the document level. The framework is designed to deal with extremely skewed, noisy data from news articles.

Findings: An evaluation of ACRank on a gold standard data set of IBM alliances (2006-2008) showed that sentence-level ADT-based extraction achieved 78.1% recall and 44.7% precision and eliminated over 99% of the noise in news articles. ACRank further improved precision to 97% for the top 20% of extracted alliance instances. A further comparison with the Thomson Reuters SDC database showed that SDC covered less than 20% of total alliances, while ACRank covered 67%. When applied to news articles on Dow 30 companies, ACRank is estimated to achieve a recall between 0.48 and 0.95, and only 15% of the alliances appeared in SDC.

Originality/value: The research framework proposed in this paper indicates a promising direction for building a comprehensive alliance database using automatic approaches. It adds value to academic studies and business analyses that require in-depth knowledge of strategic alliances. It also encourages other innovative studies that use text mining and data analytics to study business relations.
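
The two-stage idea described in the abstract (sentence-level template matching followed by document-level ranking of each unique alliance) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the trigger phrases in `ALLIANCE_TRIGGERS`, the scoring weights in `rank_alliances`, and the sample documents and gold set are all hypothetical, since the abstract does not list the actual ADT templates or ACRank features. Company names are assumed to be supplied already, standing in for the entity extraction step.

```python
import re
from collections import defaultdict

# Hypothetical trigger phrases; the real ADT templates are not given in the abstract.
ALLIANCE_TRIGGERS = (
    r"(?:partner(?:s|ed|ing)? with"
    r"|team(?:s|ed)? up with"
    r"|form(?:s|ed)? an alliance with"
    r"|sign(?:s|ed)? an agreement with)"
)


def extract_candidates(sentence, companies):
    """Sentence-level step: find 'A ... <trigger> B' patterns for known company names."""
    pairs = set()
    for a in companies:
        for b in companies:
            if a == b:
                continue
            pattern = (
                rf"\b{re.escape(a)}\b[^.]{{0,40}}?"
                rf"{ALLIANCE_TRIGGERS}\s+{re.escape(b)}\b"
            )
            if re.search(pattern, sentence, flags=re.IGNORECASE):
                pairs.add(tuple(sorted((a, b))))  # unordered pair
    return pairs


def rank_alliances(documents, companies):
    """Document-level step: score each unique pair from several pieces of evidence.

    The score (matching sentences + 2 * distinct documents) is an assumed,
    illustrative weighting, not the ranking function from the paper.
    """
    sentence_hits = defaultdict(int)
    doc_hits = defaultdict(set)
    for doc_id, text in documents.items():
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            for pair in extract_candidates(sentence, companies):
                sentence_hits[pair] += 1
                doc_hits[pair].add(doc_id)
    scores = {p: sentence_hits[p] + 2.0 * len(doc_hits[p]) for p in sentence_hits}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def precision_recall(extracted, gold):
    """Precision and recall of an extracted set against a gold standard set."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Made-up example data, for illustration only.
documents = {
    "d1": "IBM partnered with Cisco on networking. IBM competes with Oracle.",
    "d2": "Cisco has teamed up with IBM again.",
    "d3": "IBM signed an agreement with Lenovo.",
}
companies = ["IBM", "Cisco", "Oracle", "Lenovo"]

ranking = rank_alliances(documents, companies)
gold = {("Cisco", "IBM"), ("IBM", "SAP")}  # hypothetical gold standard
precision, recall = precision_recall({pair for pair, _ in ranking}, gold)
```

Keeping only the highest-scoring entries of `ranking` trades recall for precision, which mirrors the reported effect of restricting attention to the top 20% of extracted instances.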
