Active Subnetwork GA: A Two Stage Genetic Algorithm Approach to Active Subnetwork Search
Abstract
Background: A group of interconnected genes in a protein-protein interaction network that contains most of the disease-associated genes is called an active subnetwork. Active subnetwork search is an NP-hard problem. In the last decade, methods based on simulated annealing, greedy search, color coding, genetic algorithms, and mathematical programming have been proposed for this problem.
Method: In this study, we employed a novel genetic algorithm method for the active subnetwork search problem. The method combines an active node list chromosome representation, a branch swapping crossover operator, multicombination of branches in crossover, mutation on duplicate individuals, pruning, and a two-stage genetic algorithm approach. The proposed method was tested on simulated datasets and on the Wellcome Trust Case Control Consortium rheumatoid arthritis genome-wide association study dataset. Our results were compared with those of a simple genetic algorithm implementation and with those of the simulated annealing method proposed by Ideker et al. in their seminal paper.
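To make the chromosome representation concrete, the following is a minimal sketch of how an active node list might be scored. It is not the authors' implementation. It assumes that a chromosome is simply a list of active node identifiers and that a subnetwork is scored by the aggregate z-score (the sum of node z-scores divided by the square root of the subnetwork size), in the style of Ideker et al. The abstract does not state the scoring function, so this choice, along with the function names and the rule of keeping the best-scoring connected component, is an illustrative assumption.

```python
import math
from collections import deque


def connected_component(adj, active, start):
    """Return the nodes reachable from `start` through active nodes only (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor in active and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def aggregate_z(nodes, z):
    """Aggregate z-score: sum of node z-scores divided by sqrt(number of nodes)."""
    return sum(z[n] for n in nodes) / math.sqrt(len(nodes))


def best_component_score(adj, z, chromosome):
    """Score a chromosome (hypothetical active node list) by its best component.

    Assumption: the active nodes may induce several connected components,
    and the highest-scoring component is taken as the candidate subnetwork.
    """
    active = set(chromosome)
    remaining = set(active)
    best = float("-inf")
    while remaining:
        component = connected_component(adj, active, next(iter(remaining)))
        best = max(best, aggregate_z(component, z))
        remaining -= component
    return best
```

With a toy path network A-B-C-D and node z-scores of 2.0, 2.0, -1.0, and 3.0, the chromosome `["A", "B", "D"]` induces two components, {A, B} and {D}. The sketch returns the score of {D} because 3.0 exceeds 4.0 / sqrt(2).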
Results and Conclusion: The comparative study demonstrates that our genetic algorithm approach outperforms the simple genetic algorithm implementation on all datasets and simulated annealing on all but one dataset in terms of obtained scores, although our method is slower. Functional enrichment results show that the presented approach can successfully extract high-scoring subnetworks in simulated datasets and identify significant rheumatoid arthritis-associated subnetworks in the real dataset. The method can readily be applied to datasets of other complex diseases to detect disease-related active subnetworks. Our implementation is freely available at https://www.ce.yildiz.edu.tr/personal/ozanoz/file/6611/ActSubGA