INPUT FORMAT FOR ANALYZING BINARY TYPE DATA IN HADOOP MAP REDUCE FOR THE DISTRIBUTED PROCESSING OF NUTCH AND BINARY DATA ANALYZING METHOD USING THE SAME
PURPOSE: An input format for analyzing binary data in Hadoop MapReduce, and a binary data analysis method using it, are provided so that fixed-length binary data can be processed in a Hadoop environment without first converting the data format. This reduces the storage space required and speeds up processing.
CONSTITUTION: The record length of the binary data is received. For each data block to be processed among the blocks stored in HDFS (Hadoop Distributed File System), an InputSplit is defined whose starting point is the offset closest to the beginning of the block among the offsets within that block that are multiples of the record length. This offset serves as the boundary between the previous InputSplit and the current one. A record reader then reads the whole InputSplit from that starting point, one record length at a time.
COPYRIGHT KIPO 2012
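The boundary rule in the abstract can be illustrated with a small sketch of the offset arithmetic. This is not the patented implementation and does not use the Hadoop API: the function names (`align_up`, `record_aligned_splits`, `read_records`) are hypothetical, "closest multiple within the block" is assumed to mean the smallest multiple of the record length at or after the block start, and any trailing bytes shorter than one record are assumed to be ignored.

```python
from typing import Iterator, List, Tuple


def align_up(offset: int, record_len: int) -> int:
    """Smallest multiple of record_len that is >= offset."""
    return -(-offset // record_len) * record_len


def record_aligned_splits(
    file_len: int, block_size: int, record_len: int
) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges, one per block, aligned to records.

    Assumption: each split starts at the first record boundary at or after
    its block start and ends where the next split begins.
    """
    if block_size <= 0 or record_len <= 0:
        raise ValueError("block_size and record_len must be positive")
    # Assumption: a trailing partial record is dropped.
    usable_len = (file_len // record_len) * record_len
    splits = []
    for block_start in range(0, file_len, block_size):
        start = min(align_up(block_start, record_len), usable_len)
        end = min(align_up(block_start + block_size, record_len), usable_len)
        if start < end:  # skip blocks in which no record begins
            splits.append((start, end))
    return splits


def read_records(
    data: bytes, start: int, end: int, record_len: int
) -> Iterator[bytes]:
    """Yield every fixed-length record in data[start:end]."""
    for pos in range(start, end, record_len):
        yield data[pos:pos + record_len]


if __name__ == "__main__":
    data = bytes(range(100))
    for start, end in record_aligned_splits(len(data), 32, 10):
        count = sum(1 for _ in read_records(data, start, end, 10))
        print(start, end, count)
```

Because every split begins and ends on a multiple of the record length, no record is cut in two and each record belongs to exactly one split, so no format conversion or delimiter scan is needed before reading. A record that straddles a block boundary is read by the split in which it starts.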