Method and system for off-line detection of textual topical changes and topic identification via likelihood based methods for improved language modeling

Abstract

A system (and method) for off-line detection of textual topical changes includes at least one central processing unit (CPU), at least one memory coupled to the at least one CPU, a network connectable to the at least one CPU, and a database, stored on the at least one memory, containing a plurality of textual data sets of topics. The CPU executes first and second processes in first and second directions, respectively. Each process extracts a segment of predetermined size from a text, computes likelihood scores of the text in the segment for each topic, computes likelihood ratios, compares them to a threshold, and determines whether there is a change point at the current last word in a window.
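The windowed likelihood-ratio test described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: it assumes add-one smoothed unigram topic models, a fixed window, a single (forward) direction, and a log-likelihood ratio between the best-scoring topic and the currently assumed topic. The names `train_unigram`, `log_likelihood`, and `detect_changes`, and the toy topics, are hypothetical.

```python
import math
from collections import Counter


def train_unigram(texts, vocab):
    """Add-one smoothed unigram log-probabilities for one topic (assumed model)."""
    counts = Counter(w for t in texts for w in t.split())
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: math.log((counts[w] + 1) / total) for w in vocab}


def log_likelihood(words, model, floor):
    """Log-likelihood score of a segment under one topic model."""
    return sum(model.get(w, floor) for w in words)


def detect_changes(words, models, window=4, threshold=2.0):
    """Slide a fixed-size window over the text and flag change points.

    Returns (index, topic) pairs, where index is the position of the last
    word in the window at which the log-likelihood ratio between the best
    topic and the current topic exceeded the threshold.
    """
    floor = math.log(1e-6)  # assumed score for out-of-vocabulary words
    changes = []
    current = None
    for end in range(window, len(words) + 1):
        segment = words[end - window:end]
        scores = {t: log_likelihood(segment, m, floor) for t, m in models.items()}
        best = max(scores, key=scores.get)
        if current is None:
            current = best  # the first window fixes the initial topic
            continue
        ratio = scores[best] - scores[current]  # log of the likelihood ratio
        if ratio > threshold:
            changes.append((end - 1, best))
            current = best
    return changes


# Toy topic database (hypothetical data).
topic_texts = {
    "sports": ["ball goal team score ball team"],
    "finance": ["bank stock market money bank stock"],
}
vocab = {w for texts in topic_texts.values() for t in texts for w in t.split()}
models = {name: train_unigram(texts, vocab) for name, texts in topic_texts.items()}

text = "team ball goal score bank stock money market".split()
print(detect_changes(text, models))  # [(6, 'finance')]
```

In this toy run the change is flagged a couple of words after the actual topic boundary, because the window must fill with enough new-topic words before the ratio clears the threshold. A second pass in the opposite direction could be approximated by running the same function over the reversed word list, which is one possible reading of the "first and second directions" in the abstract.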
