
Selectively deleting clusters of conceptually related words from a generative model for text


Abstract

One embodiment of the present invention provides a system that selectively deletes clusters of conceptually related words from a probabilistic generative model for textual documents. During operation, the system receives a current model, which contains terminal nodes representing random variables for words and one or more cluster nodes representing clusters of conceptually related words. Nodes in the current model are coupled together by weighted links, so that an incoming link from a node that has fired causes a cluster node to fire with a probability proportional to the weight of that incoming link and, once the cluster node has fired, an outgoing link from the cluster node to another node causes the other node to fire with a probability proportional to the weight of the outgoing link. Next, the system processes a given cluster node in the current model for possible deletion. This involves determining the number of outgoing links from the given cluster node to terminal nodes or cluster nodes in the current model. If the determined number of outgoing links is less than a minimum value, or if the frequency with which the given cluster node fires is less than a minimum frequency, the system deletes the given cluster node from the current model.
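The deletion test described in the abstract can be illustrated with a minimal Python sketch. The names used here (ClusterNode, should_delete, prune_clusters, min_links, min_frequency) and the data layout are hypothetical and do not come from the patent. The sketch also assumes that links pointing at a deleted cluster are dropped from the surviving clusters and that pruning runs as a single pass; the abstract specifies neither.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ClusterNode:
    """Hypothetical cluster node; the field names are illustrative only."""
    name: str
    # Maps the name of a target node (terminal or cluster) to the link weight.
    outgoing: Dict[str, float] = field(default_factory=dict)
    # Assumed statistic: how often this cluster fires, as a fraction.
    firing_frequency: float = 0.0


def should_delete(node: ClusterNode, min_links: int, min_frequency: float) -> bool:
    """Apply the two criteria from the abstract: too few outgoing links,
    or a firing frequency below the minimum."""
    return len(node.outgoing) < min_links or node.firing_frequency < min_frequency


def prune_clusters(clusters: Dict[str, ClusterNode],
                   min_links: int,
                   min_frequency: float) -> Dict[str, ClusterNode]:
    """Return a new model without the clusters that fail the test.

    Assumption (not stated in the abstract): links that point at a deleted
    cluster are removed from the survivors, and survivors are not re-tested
    afterwards.
    """
    deleted = {name for name, node in clusters.items()
               if should_delete(node, min_links, min_frequency)}
    kept: Dict[str, ClusterNode] = {}
    for name, node in clusters.items():
        if name in deleted:
            continue
        remaining = {target: weight for target, weight in node.outgoing.items()
                     if target not in deleted}
        kept[name] = ClusterNode(name, remaining, node.firing_frequency)
    return kept
```

As a usage example, calling prune_clusters(model, min_links=2, min_frequency=0.001) on a dictionary of ClusterNode objects returns a new dictionary and leaves the input unchanged. The threshold values are placeholders; the abstract says only that a minimum value and a minimum frequency exist.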
