Supervised learning methods have recently been used to learn gene regulatory networks from gene expression data. They build a binary classifier from feature vectors composed of the expression levels of genes involved in known regulatory connections, which are available in public databases (e.g., RegulonDB, TRRD, Transfac, IPA), and then use that classifier to predict new, unknown connections. A binary supervised classifier normally takes both positive and negative examples as input, but usually the only available information is a partial set of gene regulations, i.e. positive examples, together with unlabeled data that may include both positive and negative examples. A fundamental challenge is therefore how to choose negative examples from the unlabeled data so that the classifier can learn effectively. We exploit the known topology of a gene network to select such negative examples and assess whether this selection improves the performance of the classifier.
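To make the setting concrete, the following is a minimal sketch of how a training set might be assembled from known regulations and unlabeled gene pairs. It is not the topology-based selection proposed here: it uses plain random sampling of unlabeled pairs as negatives, which is the naive baseline that any selection heuristic would be compared against. The gene names, expression values, the concatenated-profile feature encoding, and the helper names `pair_features` and `build_training_set` are all illustrative assumptions.

```python
import random
from itertools import permutations


def pair_features(expr, regulator, target):
    """Feature vector for a candidate edge: both expression profiles concatenated.

    This encoding is an assumption made for illustration only.
    """
    return expr[regulator] + expr[target]


def build_training_set(expr, known_edges, n_negatives, seed=0):
    """Label known regulations 1 and a random sample of unlabeled pairs 0.

    Random sampling is a baseline, not the topology-based selection
    described in the text; sampled "negatives" may hide true regulations.
    """
    positives = sorted(set(known_edges))
    positive_set = set(positives)
    # Every ordered gene pair that is not a known regulation is unlabeled.
    unlabeled = [p for p in permutations(sorted(expr), 2) if p not in positive_set]
    rng = random.Random(seed)
    negatives = rng.sample(unlabeled, min(n_negatives, len(unlabeled)))
    X = [pair_features(expr, r, t) for r, t in positives + negatives]
    y = [1] * len(positives) + [0] * len(negatives)
    return X, y, negatives


# Toy expression matrix: gene -> expression level in three conditions (made up).
expr = {
    "geneA": [0.1, 0.9, 0.4],
    "geneB": [0.2, 0.8, 0.5],
    "geneC": [0.7, 0.1, 0.3],
    "geneD": [0.6, 0.2, 0.9],
}
known_edges = [("geneA", "geneB"), ("geneC", "geneD")]

X, y, negatives = build_training_set(expr, known_edges, n_negatives=2)
```

The resulting `X` and `y` could be passed to any binary classifier. A topology-aware method would replace only the `rng.sample` step with a rule that prefers unlabeled pairs less likely to be true regulations.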