Text-based automatic content classification and grouping

Abstract

A closed caption [101] is passed to a natural language analysis tool [102]. Noun phrases and proper nouns in the closed captions are extracted and saved in a file [103]. The noun phrase file is passed to a word-code translation tool [104], and each distinct word is assigned a unique code from a dictionary. The output [105] of the word-code translation tool [104] provides the source data for story classification [110] and grouping [114]. For story classification [110], training [107] and testing [111] examples are generated by another tool [106]. A story classification knowledge network [109] is generated from the training examples [107] input to the training module and is modified thereafter for each new story. Class prediction [112] and knowledge base modification can be performed interactively on a news organizer platform. Relevant story grouping [114] takes a story location [105] and the corresponding story grouping files [113] and determines a group [115] for the new story.
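The word-code translation step [104] can be illustrated with a minimal sketch. The abstract states only that each distinct word receives a unique code from a dictionary; everything else here is an assumption made for illustration: the function name `translate_to_codes`, lowercasing, whitespace tokenization, integer codes, and the choice to extend the dictionary when an unseen word appears.

```python
from typing import Dict, List


def translate_to_codes(phrases: List[str], dictionary: Dict[str, int]) -> List[List[int]]:
    """Map every word of every noun phrase to an integer code.

    Assumption: words missing from the dictionary are added with the
    next unused code. The abstract does not say how unseen words are handled.
    """
    # Start numbering above the largest code already in the dictionary.
    next_code = max(dictionary.values(), default=0) + 1
    coded: List[List[int]] = []
    for phrase in phrases:
        codes: List[int] = []
        # Assumption: lowercase and split on whitespace to obtain words.
        for word in phrase.lower().split():
            if word not in dictionary:
                dictionary[word] = next_code
                next_code += 1
            codes.append(dictionary[word])
        coded.append(codes)
    return coded


# Hypothetical noun phrases, standing in for the contents of file [103].
dictionary: Dict[str, int] = {}
phrases = ["White House", "house committee", "budget vote"]
coded = translate_to_codes(phrases, dictionary)
print(coded)  # [[1, 2], [2, 3], [4, 5]]
```

Because "house" maps to the same code in both phrases, two stories that share vocabulary also share codes, which is the property that classification [110] and grouping [114] would rely on in the coded output [105].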
