Conference: 2010 Second Pacific-Asia Conference on Circuits, Communications and System

The study and design of the full-text search engine in electrical power industry based on Nutch



Abstract

General-purpose search engines cover only a small percentage of the content in any particular field or subject, and they cannot guarantee the security of the data they index. This paper therefore develops a specialized search engine tailored to the electrical power industry, built on the open-source search engine framework Nutch. The system includes a dictionary of electrical power industry terms and uses an improved VSM (vector space model) algorithm to calculate the relevance of content captured by the crawler and then filter for the relevant parts. The indexed data is ranked by the PageRank algorithm. The system also has an access control module, which verifies each user's authority and classifies the information. The system can improve the specialization of information retrieval in particular fields and enhance the security of the search engine.
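The relevance-filtering step can be illustrated with a minimal sketch of the standard vector space model. This is not the paper's improved VSM variant, whose details the abstract does not give. The dictionary `POWER_TERMS`, the threshold of 0.3, the example URLs, and all function names are assumptions made for illustration. Pages are reduced to term-frequency vectors over the domain dictionary, compared with a topic vector by cosine similarity, and kept only if the score reaches the threshold. PageRank ordering and access control are not shown.

```python
import math
import re
from collections import Counter

# Hypothetical domain dictionary; the paper's actual dictionary is not given.
POWER_TERMS = {"transformer", "substation", "grid", "voltage", "generator", "relay"}


def term_vector(text, vocabulary):
    """Count occurrences of dictionary terms in the text (raw term frequency)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t in vocabulary)


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a page with no dictionary terms has no measurable relevance
    return dot / (norm_a * norm_b)


def filter_relevant(pages, topic_vector, vocabulary, threshold=0.3):
    """Keep (url, score) pairs whose similarity to the topic meets the threshold."""
    kept = []
    for url, text in pages:
        score = cosine(term_vector(text, vocabulary), topic_vector)
        if score >= threshold:
            kept.append((url, score))
    # Sorted by similarity for readability only; this is not PageRank.
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Assumed topic vector: every dictionary term weighted equally.
topic = Counter({term: 1 for term in POWER_TERMS})
pages = [
    ("http://example.com/a", "The substation transformer steps down grid voltage."),
    ("http://example.com/b", "A recipe for noodles with garlic and chili."),
]
relevant = filter_relevant(pages, topic, POWER_TERMS)
```

In this toy input, only the first page contains dictionary terms, so it is the only one retained. A real crawler filter would likely weight terms (for example with TF-IDF) and tune the threshold on labeled pages.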
