System and method for tracking flow of data during map-reduce job execution in Hadoop
Abstract
Disclosed are a method and a system for tracking the flow of data in a distributed file system. The system may first receive a MapReduce job application and identify the relevant locations in its code. It may then instrument the code by adding one or more program statements at those locations. The instrumented code may be executed at each node to process big data, and the system may receive processing details of the big data from each node. The system may aggregate these details to generate a hierarchical dataflow map, which is used to track the flow of data in the distributed file system.
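The aggregation step can be illustrated with a minimal Python sketch. It is not the patented implementation: the report fields (`node`, `stage`, `input`, `output`, `records`), the function names `aggregate_dataflow` and `trace_output`, and the job -> stage -> node layout of the map are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


def aggregate_dataflow(job_id, node_reports):
    """Merge per-node processing details into a nested job -> stage -> node map.

    Each report is a hypothetical record of the kind an instrumented
    map or reduce task might emit; the field names are assumed.
    """
    stages = defaultdict(lambda: defaultdict(list))
    for report in node_reports:
        stages[report["stage"]][report["node"]].append(
            {
                "input": report["input"],
                "output": report["output"],
                "records": report["records"],
            }
        )
    # Convert the nested defaultdicts to plain dicts for a stable result.
    return {
        "job": job_id,
        "stages": {stage: dict(nodes) for stage, nodes in stages.items()},
    }


def trace_output(dataflow_map, output_path):
    """List the (stage, node, input) triples that wrote to output_path."""
    sources = []
    for stage, nodes in dataflow_map["stages"].items():
        for node, entries in nodes.items():
            for entry in entries:
                if entry["output"] == output_path:
                    sources.append((stage, node, entry["input"]))
    return sources


# Hypothetical processing details collected from two nodes.
reports = [
    {"node": "node-1", "stage": "map", "input": "/data/part-0",
     "output": "/tmp/map-0", "records": 120},
    {"node": "node-2", "stage": "map", "input": "/data/part-1",
     "output": "/tmp/map-1", "records": 80},
    {"node": "node-1", "stage": "reduce", "input": "/tmp/map-0",
     "output": "/out/part-r-0", "records": 120},
    {"node": "node-1", "stage": "reduce", "input": "/tmp/map-1",
     "output": "/out/part-r-0", "records": 80},
]

dataflow = aggregate_dataflow("job-1", reports)
```

Calling `trace_output(dataflow, "/out/part-r-0")` walks the hierarchy to find which stage, node, and input produced a given output file, which is the kind of lookup a dataflow map would support.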