International Conference Information Visualisation

Using Otsu's Threshold Selection Method for Eliminating Terms in Vector Space Model Computation

Abstract

Visualization techniques have proved to be valuable tools for supporting textual data exploration, and dimensionality reduction techniques have been widely used to produce visual representations of document collections. For multidimensional projection techniques, the quality of the visual result depends on how well the terms chosen to compose the vector space model (VSM) discriminate between documents. Defining a good VSM requires applying filters during preprocessing to eliminate terms according to their frequency. To do so, the user must evaluate the term frequency histogram, drawing on his or her expertise in the subject of the text, and decide on the threshold value for the frequency cut. This is usually a trial-and-error process in which the user must verify the quality of the visual representation after each trial. In this paper, we propose an automatic approach that applies Otsu's Threshold Selection Method to compute a threshold from the term frequency histogram. Our experiments show that the approach generates visual representations as good as those obtained with a threshold chosen by trial and error. The contribution of our approach is that users without expertise are able to generate good visual representations, and the time needed to find a good threshold is reduced.
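As a rough illustration of the idea described in the abstract, the following minimal sketch applies Otsu's method to a list of term frequencies and then filters a vocabulary with the resulting threshold. It is not the authors' implementation: the function names `otsu_threshold` and `filter_terms`, the use of raw integer frequencies as histogram levels, and the choice to keep terms strictly above the threshold are all assumptions made for the example, since the abstract does not specify which side of the cut is eliminated.

```python
from collections import Counter


def otsu_threshold(frequencies):
    """Return the frequency t that maximises the between-class variance
    when the values are split into {f <= t} and {f > t} (Otsu's criterion)."""
    if not frequencies:
        raise ValueError("frequencies must not be empty")
    hist = Counter(frequencies)          # histogram: frequency value -> count
    levels = sorted(hist)
    total = len(frequencies)
    sum_all = sum(level * count for level, count in hist.items())

    best_t, best_var = levels[0], -1.0
    w0 = 0      # number of values in the lower class
    sum0 = 0    # sum of values in the lower class
    # The last level is skipped because it would leave the upper class empty.
    for t in levels[:-1]:
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = (w0 / total) * (w1 / total) * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def filter_terms(term_freq, threshold):
    """Keep terms whose frequency is strictly above the threshold.
    Assumption: the low-frequency side is the one being eliminated."""
    return {term: f for term, f in term_freq.items() if f > threshold}


# Hypothetical toy vocabulary with made-up frequencies.
term_freq = {"the": 12, "model": 11, "vector": 10, "otsu": 2,
             "cut": 2, "foo": 1, "bar": 1, "baz": 1, "qux": 1}
t = otsu_threshold(list(term_freq.values()))
kept = filter_terms(term_freq, t)
```

With these toy values the selected threshold separates the cluster of rare terms from the cluster of frequent ones without any manual inspection of the histogram, which is the step the abstract describes as otherwise requiring trial and error.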