LASSO–MOGAT: a multi-omics graph attention framework for cancer classification
Abstract
Applying machine learning (ML) methods to changes in gene expression patterns has recently emerged as a powerful approach in cancer research, improving our understanding of the molecular mechanisms that underpin cancer development and progression. Numerous studies report that combining gene expression data with other types of omics data improves cancer classification. Despite these advances, integrating high-dimensional multi-omics data effectively and capturing the complex relationships across biological layers remain challenging. This article introduces Least Absolute Shrinkage and Selection Operator–Multi-omics Gated Attention (LASSO–MOGAT), a novel graph-based deep learning framework that integrates messenger RNA, microRNA, and DNA methylation data to classify 31 cancer types. LASSO–MOGAT selects features using differential expression analysis (DEG) with Linear Models for Microarray Data (LIMMA) together with LASSO regression, and it uses graph attention networks (GATs) to incorporate protein–protein interaction (PPI) networks, which allows it to capture intricate relationships within multi-omics data. Evaluation with fivefold cross-validation demonstrates the method's precision, reliability, and capacity to provide comprehensive insights into cancer molecular mechanisms. Computing attention coefficients for the edges of the graph, enabled by the proposed PPI-based graph attention architecture, proved beneficial for identifying synergies in multi-omics data for cancer classification.
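To make the idea of edge attention coefficients concrete, the following is a minimal sketch of the standard single-head GAT scoring rule: project each node's features, score each edge with a LeakyReLU over the concatenated projections, and normalize with a softmax over each node's neighbors. It is not the LASSO–MOGAT implementation. The toy graph, feature values, weight matrix, attention vector, and the function name `attention_coefficients` are all hypothetical and chosen only for illustration.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU with the negative slope commonly used in GAT."""
    return x if x > 0 else slope * x


def attention_coefficients(features, edges, weight, attn_vec):
    """Return {(i, j): alpha_ij} for every edge (i, j), where node i attends to node j.

    features: list of per-node feature vectors h_i
    edges:    list of (i, j) index pairs (include (i, i) for self-loops)
    weight:   projection matrix W as a list of rows
    attn_vec: attention vector a, of length 2 * len(weight)
    """
    # Linear projection z_i = W h_i for every node.
    projected = [
        [sum(w * x for w, x in zip(row, h)) for row in weight]
        for h in features
    ]

    # Unnormalized scores e_ij = LeakyReLU(a^T [z_i || z_j]).
    scores = {}
    for i, j in edges:
        concat = projected[i] + projected[j]
        scores[(i, j)] = leaky_relu(sum(a * c for a, c in zip(attn_vec, concat)))

    # Softmax over the neighborhood of each node i (numerically stabilized).
    alphas = {}
    for i in {src for src, _ in edges}:
        neighborhood = [e for e in scores if e[0] == i]
        max_score = max(scores[e] for e in neighborhood)
        exps = {e: math.exp(scores[e] - max_score) for e in neighborhood}
        total = sum(exps.values())
        for e in neighborhood:
            alphas[e] = exps[e] / total
    return alphas


# Hypothetical toy graph: 3 nodes, 3 features per node (illustrative values only).
features = [[0.5, 1.2, 0.1], [0.9, 0.3, 0.7], [0.2, 0.8, 0.4]]
edges = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 0), (2, 2), (2, 0)]
weight = [[0.1, -0.2, 0.3], [0.4, 0.0, -0.1]]  # assumed 2x3 projection
attn_vec = [0.5, -0.3, 0.2, 0.1]               # assumed attention vector

alphas = attention_coefficients(features, edges, weight, attn_vec)
for (i, j), alpha in sorted(alphas.items()):
    print(f"alpha[{i}->{j}] = {alpha:.3f}")
```

In a trained model, `weight` and `attn_vec` would be learned parameters, and the resulting coefficients weight each neighbor's contribution when node representations are aggregated; larger values indicate edges the model relies on more heavily.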