Domain concept discovery and clustering using word embedding in dialogue design

Abstract

A system and method perform automated domain concept discovery and clustering using word embeddings by: receiving a set of documents for natural language processing for a domain; representing a plurality of entries in the set of documents as continuous vectors in a high-dimensional continuous space; applying a clustering algorithm based on a mutual information optimization criterion to form a set of clusters; associating each entry of the plurality of entries with each cluster in the set of clusters by formalizing an evidence-based model of each cluster given each entry; calculating a mutual information metric between each entry and each cluster using the evidence-based model; and identifying a nominal center of each cluster by maximizing the mutual information.
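
The abstract names the steps but gives no formulas, so the following is only a minimal sketch of one way they could fit together, not the patented method. Everything concrete here is an assumption: the toy two-dimensional vectors and terms are hypothetical stand-ins for real word embeddings, the "evidence-based model" is taken to be a softmax over cosine similarity, the "mutual information metric" is taken to be pointwise mutual information under a uniform prior over entries, and a soft spherical k-means loop is swapped in for the unspecified mutual-information-based clustering criterion. All function names are illustrative.

```python
import numpy as np

def unit_rows(X):
    """Scale each row to unit length so dot products are cosine similarities."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def cluster_given_entry(X, centroids, temperature=0.1):
    """Assumed evidence-based model p(cluster | entry): softmax over cosine similarity."""
    logits = (X @ centroids.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def pointwise_mi(p):
    """Assumed metric: PMI(entry, cluster) = log p(c|w) - log p(c), uniform prior over entries."""
    p_c = p.mean(axis=0)
    return np.log(p) - np.log(p_c)

def farthest_point_init(X, k):
    """Deterministic seeding: start at entry 0, then add the entry least similar to all seeds so far."""
    chosen = [0]
    while len(chosen) < k:
        sim_to_nearest_seed = (X @ X[chosen].T).max(axis=1)
        chosen.append(int(sim_to_nearest_seed.argmin()))
    return X[chosen].copy()

def discover_concepts(vectors, k, n_iter=20):
    """Cluster entry vectors and pick, per cluster, the entry with the highest PMI."""
    X = unit_rows(np.asarray(vectors, dtype=float))
    centroids = farthest_point_init(X, k)
    for _ in range(n_iter):
        p = cluster_given_entry(X, centroids)
        # Soft spherical k-means update (a stand-in for the MI-based criterion).
        centroids = unit_rows(p.T @ X)
    p = cluster_given_entry(X, centroids)
    pmi = pointwise_mi(p)
    labels = p.argmax(axis=1)     # most probable cluster for each entry
    centers = pmi.argmax(axis=0)  # entry index with maximal PMI for each cluster
    return labels, centers, pmi

# Hypothetical entries and toy "embeddings", for illustration only.
terms = ["refund", "reimbursement", "chargeback", "login", "password", "sign-in"]
vectors = [[1.0, 0.1], [0.9, 0.3], [1.0, 0.5], [0.1, 1.0], [-0.2, 0.9], [-0.4, 1.0]]

labels, centers, pmi = discover_concepts(vectors, k=2)
for c, i in enumerate(centers):
    members = [t for t, l in zip(terms, labels) if l == c]
    print(f"cluster {c}: nominal center={terms[i]!r}, members={members}")
```

Under these assumptions, the entry chosen as a cluster's nominal center is the one whose membership is most specific to that cluster, which need not be the entry closest to the cluster's geometric mean.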