MACHINE LEARNING BASED INTEGRATION OF miRNA AND mRNA PROFILES COMBINED WITH FEATURE GROUPING AND RANKING
Abstract
Understanding how diseases develop and progress at the molecular level is very
important. Revealing the functional mechanisms that cause a disease contributes not
only to its molecular diagnosis but also to the development of new treatment methods.
Thanks to advances in technology, more molecular data can now be obtained at lower
cost than in the past. Integrating these available data is essential for understanding
the molecular mechanisms of diseases, especially those with complex formation and
progression processes, such as cancer.
In this thesis, two bioinformatics tools, miRcorrNet and miRMUTINet, have been
developed to correctly distinguish cancer patients from cancer-free patients. Both tools
integrate mRNA and microRNA (miRNA) data, two types of -omic data at the molecular level.
For 11 cancer types, the mRNA and miRNA expression profiles of the samples were
downloaded from The Cancer Genome Atlas. These two data types were integrated
using both the Pearson Correlation Coefficient and the Mutual Information metrics. In
our experiments using 100-fold Monte Carlo Cross Validation, both tools obtained an
Area Under the Curve score of 99%. The developed tools were also tested on an
independent dataset. For biological validation, functional enrichment analysis was
conducted for each cancer type on the identified lists of significant miRNAs and
genes. Additionally, for each cancer type, the identified mRNAs and miRNAs were
subjected to literature validation, and the findings were noteworthy.
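
As an illustration only, the two association measures and the resampling scheme named
above can be sketched in plain Python. This is not the miRcorrNet or miRMUTINet
implementation: the function names, the equal-width binning used for the mutual
information estimate, the number of bins, and the 10% test fraction are all assumptions
made for this example.

```python
import math
import random
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant profile; treated as 0 here (assumption)
    return cov / math.sqrt(var_x * var_y)


def discretize(values, bins=3):
    """Assign each value to one of `bins` equal-width bins (assumed binning scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=3):
    """Plug-in mutual information estimate, in bits, after equal-width binning."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    n = len(dx)
    px, py, pxy = Counter(dx), Counter(dy), Counter(zip(dx, dy))
    # Sum of p(a, b) * log2(p(a, b) / (p(a) * p(b))) over observed bin pairs.
    return sum(
        (c / n) * math.log2((c * n) / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def monte_carlo_splits(n_samples, n_iterations=100, test_fraction=0.1, seed=0):
    """Yield (train_indices, test_indices) for repeated random train/test splits."""
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    indices = list(range(n_samples))
    for _ in range(n_iterations):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]
```

Given a miRNA profile and an mRNA profile measured on the same samples, `pearson` and
`mutual_information` each return a single association score for that miRNA-mRNA pair,
and `monte_carlo_splits` draws a fresh random train/test partition in each of the 100
iterations, so that a performance measure such as the Area Under the Curve can be
averaged over the iterations.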