The visualization and analysis of biological systems and data as networks has become a hallmark of modern biology. Relationships between biological entities (individuals, proteins, genes, RNAs, etc.) can all be better understood at one level or another when modelled as networks. As the size of these data has grown, so has the need for better tools and algorithms to deal with the complex issue of network visualization and analysis. We describe the application and evaluation of a state-of-the-art graph layout method for use within biological workflows. BioLayout Express3D is a powerful tool specifically designed for the visualization, clustering, exploration, and analysis of very large networks in 2D and 3D space, derived primarily from biological data [1]. In particular, its development has been driven by the need to analyse gene expression data, which typically consist of tens of thousands of rows of quantitative gene expression measurements. The tool first calculates a correlation matrix and then builds relationship networks in which nodes represent genes and edges represent expression similarities above a given correlation (r) threshold. The resulting graphs can be very large (e.g. 20,000-30,000 nodes and 5 million edges) and possess a high degree of local structure, with modules of co-expressed genes forming distinct cliques of high connectivity within the networks. BioLayout has long used a modified, CPU/GPU-parallelised version of the Fruchterman-Reingold (FR) algorithm for graph layout, and visualization of the graphs in 3D offers distinct advantages when viewing such complex graph structures. MCL clustering is used to divide the graph into co-expression clusters for further analysis. Whilst the existing FR implementation is capable, and in many ways adequate, at laying out these types of graph, the results for other graphs derived from biological data are less satisfactory, in particular for DNA assembly graphs, which are inherently different in structure.
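The correlation-network construction described above can be illustrated with a minimal sketch. It assumes Pearson correlation as the similarity measure and uses a hypothetical function name (`correlation_network`), a hypothetical threshold of 0.85, and toy data; none of these are taken from BioLayout itself, whose implementation is not described here.

```python
import numpy as np

def correlation_network(expr, r_threshold=0.85):
    """Return (i, j, r) edges for gene pairs whose Pearson r exceeds r_threshold.

    expr: 2D array with one row per gene and one column per sample.
    The default threshold is an illustrative assumption, not a recommended value.
    """
    corr = np.corrcoef(expr)  # gene-by-gene Pearson correlation matrix
    # Upper triangle without the diagonal: each unordered gene pair once.
    i_idx, j_idx = np.triu_indices_from(corr, k=1)
    keep = corr[i_idx, j_idx] > r_threshold
    return [
        (int(i), int(j), float(corr[i, j]))
        for i, j in zip(i_idx[keep], j_idx[keep])
    ]

# Toy expression matrix: 4 genes (rows) measured in 4 samples (columns).
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.1, 6.2, 7.9],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 5.0],
])
edges = correlation_network(expr, r_threshold=0.85)
```

Each returned tuple is one edge of the graph; the node set is simply the set of row indices. For tens of thousands of genes the full correlation matrix becomes large, so a real tool would need a more memory-conscious approach than this dense version.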
Overlapping DNA fragments, when joined on the basis of read similarity, form 'chain graphs'. Layout using the FR algorithm places nodes efficiently on a local scale, but its lack of global awareness results in a knot-like graph structure (Figure 1A) that inhibits efficient visualization of the overall assembly.
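To make the local nature of FR concrete, the following is a minimal 2D sketch of the textbook Fruchterman-Reingold scheme (repulsion k²/d between all node pairs, attraction d²/k along edges, displacement capped by a cooling temperature), applied to a small chain graph. It is not BioLayout's modified CPU/GPU implementation; the function name, the parameter values, and the linear cooling schedule are assumptions made for illustration.

```python
import numpy as np

def fruchterman_reingold(n_nodes, edges, iterations=200, seed=0):
    """Plain O(n^2) Fruchterman-Reingold layout in the unit square (sketch)."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n_nodes, 2))        # random initial positions
    k = np.sqrt(1.0 / n_nodes)            # ideal edge length for unit area
    temp = 0.1                            # maximum displacement per iteration
    cooling = temp / (iterations + 1)     # linear cooling (an assumption)
    src = np.array([e[0] for e in edges])
    dst = np.array([e[1] for e in edges])

    for _ in range(iterations):
        # Repulsion between every pair: magnitude k^2 / d along (pos[i] - pos[j]).
        delta = pos[:, None, :] - pos[None, :, :]
        dist = np.maximum(np.linalg.norm(delta, axis=-1), 1e-9)
        disp = (delta * (k**2 / dist**2)[:, :, None]).sum(axis=1)

        # Attraction along each edge: magnitude d^2 / k pulling endpoints together.
        d_edge = pos[src] - pos[dst]
        len_edge = np.maximum(np.linalg.norm(d_edge, axis=1), 1e-9)
        pull = d_edge * (len_edge / k)[:, None]
        np.add.at(disp, src, -pull)
        np.add.at(disp, dst, pull)

        # Move each node along its net force, limited by the current temperature.
        length = np.maximum(np.linalg.norm(disp, axis=1), 1e-9)
        pos += disp / length[:, None] * np.minimum(length, temp)[:, None]
        temp -= cooling
    return pos

# A 10-node chain graph, loosely analogous to a short assembly path.
chain_edges = [(i, i + 1) for i in range(9)]
layout = fruchterman_reingold(10, chain_edges)
```

Every force in this loop depends only on pairwise distances and direct neighbours, so nothing steers a long chain towards a globally untangled shape; that is the limitation the paragraph above describes.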