APPARATUS AND METHOD FOR PARAGRAPH SEGMENTATION, AND SEARCH METHOD USING THE PARAGRAPH SEGMENTATION

Abstract

An apparatus and method for paragraph segmentation, and a search method using the paragraph segmentation, are disclosed. A topic setting unit sets one or more topics in training data representing a predetermined domain. A pattern extraction unit divides the training data into paragraphs through language analysis and extracts, from each divided paragraph, first patterns each consisting of a predetermined vocabulary pair. A pattern learning unit generates a topic pattern DB in which the extracted first patterns are classified according to the set topics. A paragraph segmentation unit divides an input document into paragraphs through language analysis, extracts from each divided paragraph second patterns each consisting of a predetermined vocabulary pair, finds in the topic pattern DB the pattern most similar to each second pattern, and assigns the corresponding topic to that paragraph. As a result, in application systems such as question answering and information retrieval, the topic of each paragraph in documents of a specific domain (e-mail, encyclopedia entries, newspaper articles) is easy to identify, and offering the user only the desired topic can improve search efficiency.
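
The pipeline described in the abstract (learn topic-labelled vocabulary-pair patterns from training paragraphs, then label each paragraph of a new document by its best-matching patterns) can be sketched roughly as follows. This is only an illustrative sketch, not the patented method: the abstract does not define the language analysis, the form of a "predetermined vocabulary pair", or the similarity measure, so the blank-line paragraph splitting, the unordered word pairs, the stopword list, the overlap-count score, and all function names here are assumptions.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Assumption: a tiny stopword list stands in for real language analysis.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "for", "on", "at", "by"}

def split_paragraphs(text):
    """Split on blank lines (assumed stand-in for the language-analysis step)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def extract_patterns(paragraph):
    """Return the set of unordered vocabulary pairs found in one paragraph."""
    words = sorted({w for w in re.findall(r"[a-z]+", paragraph.lower())
                    if w not in STOPWORDS})
    return set(combinations(words, 2))

def build_topic_pattern_db(labeled_paragraphs):
    """Map each topic to a Counter of patterns seen in its training paragraphs."""
    db = defaultdict(Counter)
    for topic, paragraph in labeled_paragraphs:
        db[topic].update(extract_patterns(paragraph))
    return db

def assign_topic(paragraph, db):
    """Pick the topic whose stored patterns overlap most (assumed similarity)."""
    patterns = extract_patterns(paragraph)
    scores = {topic: sum(counts[p] for p in patterns) for topic, counts in db.items()}
    best = max(scores, key=scores.get, default=None)
    # Return None when no stored pattern matches at all.
    return best if best is not None and scores[best] > 0 else None

def segment(document, db):
    """Return (topic, paragraph) for every paragraph of the input document."""
    return [(assign_topic(p, db), p) for p in split_paragraphs(document)]

def paragraphs_for_topic(document, db, topic):
    """Search-side filter: keep only paragraphs assigned to the wanted topic."""
    return [p for t, p in segment(document, db) if t == topic]

# Hypothetical toy data, for illustration only.
training = [
    ("history", "The company was founded in 1950 by two engineers."),
    ("products", "The company sells phones and laptops worldwide."),
]
db = build_topic_pattern_db(training)
doc = "Two engineers founded the firm.\n\nIt sells laptops and phones."

for topic, paragraph in segment(doc, db):
    print(topic, "|", paragraph)
```

A search front end could then call `paragraphs_for_topic(doc, db, "products")` to return only the paragraphs on the topic the user selected, which is the efficiency gain the abstract points to.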
