Journal of Computer and System Sciences

Optimal string clustering based on a Laplace-like mixture and EM algorithm on a set of strings


Abstract

In this study, we address the problem of clustering string data in an unsupervised manner by developing a theory of a mixture model and an EM algorithm for strings, based on the probability theory on a topological monoid of strings developed in our previous studies. We begin by introducing a parametric probability distribution on a set of strings whose location parameter is a string and whose dispersion parameter is a positive real number. We then develop an iterative algorithm for estimating the parameters of a mixture of these distributions. The EM algorithm cannot be written explicitly for this mixture model; we demonstrate that our algorithm converges to it with probability one and estimates the parameters strongly consistently as the numbers of observed strings and iterations increase. Finally, we derive a procedure for unsupervised string clustering that is asymptotically optimal in the sense that the posterior probability of making correct classifications is maximized.
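
The classification rule referred to in the abstract can be sketched in generic mixture-model notation. The abstract does not give the form of the Laplace-like distribution, so every symbol below is an assumption made for illustration: q is the component distribution, K the number of clusters, \pi_k the mixing weights, \lambda_k the string-valued location parameters, \rho_k the positive dispersion parameters, and A^* the set of strings over an alphabet A.

```latex
% Mixture of K Laplace-like components on the set of strings A^*
% (q is left unspecified; its form is not stated in the abstract)
p(s;\theta) = \sum_{k=1}^{K} \pi_k \, q(s;\lambda_k,\rho_k),
\qquad s \in A^{*}, \quad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1 .

% Posterior probability that an observed string s came from component k
\gamma_k(s) = \frac{\pi_k \, q(s;\lambda_k,\rho_k)}
                   {\sum_{j=1}^{K} \pi_j \, q(s;\lambda_j,\rho_j)} .

% Assign s to the cluster with the largest posterior probability
\hat{c}(s) = \operatorname*{arg\,max}_{1 \le k \le K} \gamma_k(s) .
```

Under the true parameters, assigning each string to the component with the largest posterior probability maximizes the probability of a correct classification. A procedure that substitutes strongly consistent parameter estimates into this rule would presumably inherit that property asymptotically, which is one reading of the optimality claim above.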
