In today’s world, the internet and computer technology have enormously increased the amount of stored information, with an unprecedented expansion of unstructured data in textual formats. Such data cannot be processed directly to extract useful information, and the rapid growth and availability of digital data has changed the nature of information centers. Hence, knowledge discovery and text data mining have attracted considerable attention, driven by an imminent need to turn such data into useful information, patterns and knowledge. Text mining has become an area of interest in business intelligence applications, healthcare, media and research. Text mining can be defined as the process of analyzing text to extract interesting, meaningful, previously unknown and non-trivial information, patterns or knowledge from unstructured text documents or from other resources for particular purposes. It is an interdisciplinary research field that draws on techniques from computer science, computational linguistics, information retrieval, data mining and statistics. Existing toolkits for text mining have low extensibility, lack application programming interfaces and provide little support for interacting with computing environments. Hence, in this paper, we propose a text mining infrastructure in the R computing environment that provides intelligent methods for metadata management and operations on documents, such as preprocessing, data cloud formation, frequency graphs, text clustering and text classification. This paper presents how text mining techniques can be applied in the R infrastructure and how they make better use of infrastructure features than other text mining products such as dtSearch, SPSS, SAS Text Miner, RapidMiner, Weka, etc.