Journal: Computational Statistics & Data Analysis

Dissimilarity measures and divisive clustering for symbolic multimodal-valued data


Abstract

Nowadays, most government agencies and local authorities routinely collect large amounts of data from censuses and surveys and officially publish them for public use. The most common form of publication is the statistical table, and privacy concerns usually make the raw data behind those tables inaccessible. In these situations, we have to analyze the data using only the aggregated tables. These tables are typically summarized by ordinal or nominal items. Tables for quantitative variables have histogram-valued formats, and those for qualitative variables are represented by multimodal-valued types. Both are classes of so-called symbolic data. In this study, we propose dissimilarity measures and a divisive clustering algorithm for symbolic multimodal-valued data. To split a partition efficiently at each stage, the algorithm extends the monothetic method for binary data. The proposed method is verified by simulation studies and applied to a work-related nonfatal injury and illness dataset.
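To make the data type concrete: a multimodal-valued observation can be read as a set of categories with relative frequencies that sum to 1. The sketch below is an illustration only. It is not one of the dissimilarity measures proposed in the paper, which the abstract does not specify. It uses the total variation distance as a stand-in, and the category names and frequencies are hypothetical, made up for the example.

```python
from typing import Dict

# A multimodal-valued observation: category -> relative frequency (sums to 1).
ModalValue = Dict[str, float]


def total_variation(a: ModalValue, b: ModalValue) -> float:
    """Half the L1 distance between two category-frequency distributions.

    Stand-in dissimilarity for illustration; NOT the measure from the paper.
    Categories missing from one observation are treated as frequency 0.
    """
    categories = set(a) | set(b)
    return 0.5 * sum(abs(a.get(c, 0.0) - b.get(c, 0.0)) for c in categories)


# Hypothetical rows of an aggregated table (invented numbers).
region_a = {"sprain": 0.5, "fracture": 0.3, "burn": 0.2}
region_b = {"sprain": 0.25, "fracture": 0.25, "burn": 0.5}

print(total_variation(region_a, region_b))
```

The value lies between 0 (identical distributions) and 1 (no shared categories), which makes it easy to compare across variables with different numbers of categories.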
