On Classification of Strings

Abstract

In document filtering and content-based routing the aim is to transmit to the user only those documents that match the user's interests or profile. As filtering systems are deployed on the Internet, the number of users can become large. In this paper we focus on the question of how a large set of user profiles can be quickly searched in order to find those that are relevant to the document. In the abstract setting we assume that each profile is given as a regular expression, and, given a set of regular languages (the set of profiles), we want to determine for a given input string (the document) all those languages the input string belongs to. We analyze this problem, called the classification problem for a set of regular languages, and we show that in various important cases the problem can be solved by a small single deterministic finite automaton extended by conditional transitions.
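The classification problem stated in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three profiles, the alphabet {a, b}, and the function names are hypothetical, and neither function is the extended automaton with conditional transitions that the paper proposes. The first function is the naive baseline that tests every profile separately; the second reads the input once while tracking one state per hand-written DFA, which amounts to running the plain product automaton.

```python
import re

# Hypothetical profiles over the alphabet {a, b}: name -> regular expression.
PROFILES = {
    "ends_in_ab": r"[ab]*ab",
    "even_length": r"(?:[ab][ab])*",
    "only_a": r"a*",
}

def classify_naive(document, profiles=PROFILES):
    """Baseline: test each profile on its own (one regex match per profile)."""
    return {name for name, pattern in profiles.items()
            if re.fullmatch(pattern, document)}

# Hand-written DFAs for the same three languages (an assumption, not derived
# automatically from the regexes). Each entry: (start, transitions, accepting).
DFAS = {
    "ends_in_ab": (0, {(0, "a"): 1, (0, "b"): 0,
                       (1, "a"): 1, (1, "b"): 2,
                       (2, "a"): 1, (2, "b"): 0}, {2}),
    "even_length": (0, {(0, "a"): 1, (0, "b"): 1,
                        (1, "a"): 0, (1, "b"): 0}, {0}),
    "only_a": (0, {(0, "a"): 0, (0, "b"): 1,
                   (1, "a"): 1, (1, "b"): 1}, {0}),
}

def classify_single_pass(document, dfas=DFAS):
    """Read the document once, advancing every DFA in lockstep.

    The tuple of current states is one state of the product automaton.
    Symbols outside {a, b} raise KeyError in this sketch.
    """
    names = list(dfas)
    states = [dfas[name][0] for name in names]
    for symbol in document:
        states = [dfas[name][1][(state, symbol)]
                  for name, state in zip(names, states)]
    return {name for name, state in zip(names, states)
            if state in dfas[name][2]}

if __name__ == "__main__":
    for doc in ["", "ab", "aa", "aab"]:
        print(repr(doc), sorted(classify_single_pass(doc)))
```

The product construction can have as many states as the product of the individual automata sizes, which is the kind of growth that motivates looking for a small single automaton in the first place.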
