IEEE International Conference on Machine Learning and Applications

Mining Strengths and Weaknesses of Cricket Players Using Short Text Commentary

Abstract

Knowledge of players' strengths and weaknesses is key to team selection and strategy planning in any team sport, such as cricket. Computationally, this problem is mostly unexplored. Existing methods focus only on aggregate, macroscopic statistics that ignore many details. The central idea of our paper is to mine strength and weakness rules from short text commentary data. This dataset is compact, semi-structured, and accurate, yet it has been ignored by the machine learning community until now. We collect fine-grained information about each player from the short text commentary dataset and represent it using domain-specific features that we identify. We employ correspondence analysis, a dimensionality reduction method suited to discrete random variables, to construct a semantic relation between bowler and batsman. This relation is plotted using biplots, from which human-readable strength and weakness rules are extracted. We have performed experiments on a large dataset that describes over one million deliveries. We validate the extracted rules using both intrinsic and extrinsic validation.
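To make the correspondence analysis step more concrete, the following is a minimal sketch of the standard textbook formulation (an SVD of standardized residuals of a contingency table). It is not the authors' implementation. The function name `correspondence_analysis`, the feature labels, and the counts are all hypothetical and chosen only for illustration; the paper's actual features and data are not reproduced here.

```python
import numpy as np

def correspondence_analysis(table):
    """Textbook correspondence analysis of a contingency table.

    Returns row principal coordinates, column principal coordinates,
    and the principal inertias. Assumes every row and column has a
    non-zero total.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                 # correspondence matrix
    r = P.sum(axis=1)               # row masses
    c = P.sum(axis=0)               # column masses
    expected = np.outer(r, c)       # independence model
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U * sigma) / np.sqrt(r)[:, None]     # row principal coordinates
    cols = (Vt.T * sigma) / np.sqrt(c)[:, None]  # column principal coordinates
    return rows, cols, sigma ** 2

# Hypothetical toy data: rows are delivery lengths, columns are how one
# batsman responded. The counts are invented, not taken from the paper.
row_labels = ["short", "good length", "full"]
col_labels = ["attacked", "defended", "beaten"]
counts = np.array([
    [40, 10, 25],
    [15, 50, 10],
    [45, 20,  5],
])

rows, cols, inertia = correspondence_analysis(counts)

# The first two dimensions are the ones usually drawn in a biplot.
for label, xy in zip(row_labels, rows[:, :2]):
    print(f"row {label:12s} {xy[0]: .3f} {xy[1]: .3f}")
for label, xy in zip(col_labels, cols[:, :2]):
    print(f"col {label:12s} {xy[0]: .3f} {xy[1]: .3f}")
```

In a biplot of these coordinates, a row point and a column point lying in the same direction from the origin suggest that the pair co-occurs more often than independence would predict, which is the kind of association a strength or weakness rule could be read from.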
