The number of Web users and the amount of information available online have grown explosively since the advent of the World Wide Web (WWW). Most Web users rely on search engines to find specific information. A key factor in the success of a Web search engine is its ability to rapidly return good-quality results for queries based on specific terms. This paper aims to retrieve more relevant documents from a large corpus for a given information need. We propose a text mining framework that consists of four distinct stages: (1) text preprocessing; (2) dimensionality reduction using Latent Semantic Indexing (LSI); (3) clustering based on a hybrid combination of Particle Swarm Optimization (PSO) and the k-means algorithm; and (4) an information retrieval process using Simulated Annealing (SA). The framework returns more relevant documents to the user and reduces the number of irrelevant ones.
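To make the clustering stage concrete, the following is a minimal sketch of one common way to combine PSO with k-means: PSO performs a global search over candidate centroid sets, and k-means then refines the best candidate. The abstract does not specify how the two are combined, so this ordering is an assumption, as are all function names, the parameter values (`w`, `c1`, `c2`, swarm size), and the toy 2-D vectors standing in for LSI-reduced document vectors.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def sse(points, centroids):
    """Clustering cost: sum of squared distances to the nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def pso_centroids(points, k, n_particles=10, iters=30,
                  w=0.7, c1=1.5, c2=1.5, rng=None):
    """Global search (assumed stage): each particle is a set of k centroids."""
    rng = rng or random.Random(0)
    dim = len(points[0])
    # Seed every particle with k distinct data points.
    pos = [[list(p) for p in rng.sample(points, k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_cost = [sse(points, p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = [c[:] for c in pbest[g]], pbest_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard velocity update: inertia + cognitive + social terms.
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            cost = sse(points, pos[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = [c[:] for c in pos[i]], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = [c[:] for c in pos[i]], cost
    return gbest

def kmeans_refine(points, centroids, iters=20):
    """Local refinement: Lloyd's k-means iterations from the given centroids."""
    centroids = [c[:] for c in centroids]
    for _ in range(iters):
        labels = assign(points, centroids)
        new = []
        for j, c in enumerate(centroids):
            members = [p for p, l in zip(points, labels) if l == j]
            # Keep the old centroid if a cluster ends up empty.
            new.append([sum(col) / len(members) for col in zip(*members)]
                       if members else c)
        if new == centroids:
            break
        centroids = new
    return centroids, assign(points, centroids)

# Toy stand-ins for reduced document vectors: two well-separated groups.
docs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
        [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
seed = pso_centroids(docs, k=2)
centroids, labels = kmeans_refine(docs, seed)
print(labels)
```

The motivation usually given for this kind of hybrid is that k-means alone is sensitive to its initial centroids, while PSO explores the search space more broadly but converges slowly near an optimum; starting k-means from the swarm's best position is one way to use both.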