This paper describes a data preprocessing workflow for computational experiments that analyze protein interaction networks. The workflow is designed to meet the input requirements of our graph-based algorithm and to ensure that the protein interaction network, which consists of nodes and edges, is represented accurately. The raw data are obtained from a protein interaction database, the Database of Interacting Proteins (hereafter DIP). We briefly describe DIP and, because our algorithm requires a distance matrix as input, outline the main steps needed to process the data. This paper is limited to data preprocessing; the algorithm itself is not discussed.
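To illustrate the kind of transformation involved, the following is a minimal sketch only, not the procedure used in this paper. It assumes that interactions have already been extracted as pairs of protein identifiers, that the network is undirected and unweighted, and that "distance" means the shortest-path hop count between two proteins. The function names `build_adjacency` and `distance_matrix` are hypothetical, as is the choice to drop self-interactions and to mark unreachable pairs as infinite.

```python
from collections import deque


def build_adjacency(pairs):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adjacency = {}
    for a, b in pairs:
        if a == b:
            # Assumption: a self-interaction adds the node but no edge.
            adjacency.setdefault(a, set())
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def distance_matrix(adjacency):
    """Return (nodes, matrix) where matrix[i][j] is the hop count
    between nodes[i] and nodes[j], or infinity if they are not connected."""
    nodes = sorted(adjacency)
    index = {node: i for i, node in enumerate(nodes)}
    inf = float("inf")
    matrix = [[inf] * len(nodes) for _ in nodes]
    for source in nodes:
        row = matrix[index[source]]
        row[index[source]] = 0
        queue = deque([source])
        # Breadth-first search yields shortest paths in an unweighted graph.
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if row[index[neighbor]] == inf:
                    row[index[neighbor]] = row[index[current]] + 1
                    queue.append(neighbor)
    return nodes, matrix
```

Duplicate pairs collapse automatically because neighbors are stored in sets, which is one simple way to keep each edge represented once. Whether hop count is the appropriate distance depends on the algorithm that consumes the matrix.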