As an important branch of modern information retrieval, full-text search is both a key tool for handling unstructured data and one of the mainstream technologies behind search engines. This paper begins with an in-depth study of the working principles and workflow of the full-text search engine model and, building on that knowledge, examines the architecture of the open-source retrieval library Lucene and how to develop applications with it. Finally, focusing on basic algorithms for Chinese word segmentation and relevance ranking, we apply these techniques to build a Lucene-based full-text document retrieval system.