Comprehensive Network Analysis of Lung Cancer Biomarkers: Identifying Key Genes Through RNA-Seq Data and PPI Networks
Abstract
This study addresses the pressing need for improved lung cancer diagnosis and treatment by leveraging computational methods
and omics data analysis. Lung cancer remains a leading cause of cancer-related deaths globally, highlighting the urgency for more
effective diagnostic and therapeutic approaches. Current diagnostic methods, such as imaging and biopsies, suffer from limitations in sensitivity, specificity, and accessibility, often due to factors such as poor data quality, small sample sizes, and variability
in data sources. These limitations underscore the need for advanced noninvasive techniques. Computational
methods utilizing omics data have shown promise in overcoming these challenges by providing a comprehensive view of the
molecular pathways involved in lung cancer. We propose a novel approach that utilizes RNA-Seq data and employs LASSO
regression with attention mechanisms to identify lung cancer biomarkers. Our results demonstrate the effectiveness of this
approach in identifying potential biomarkers for lung cancer, including well-known genes such as TP53, EGFR, KRAS, ALK, and
PIK3CA, validating the model’s ability to uncover key genes associated with lung cancer development and progression. Gene
Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses revealed significant
associations of the identified genes with critical biological processes and pathways, including protein synthesis, folding, cell
adhesion, gene regulation, and immune responses. The protein-protein interaction (PPI) network, constructed using the STRING database and
Cytoscape application, highlighted a highly interconnected interaction landscape, with central hub genes playing pivotal roles in
lung cancer progression. RPSA emerged as a crucial hub gene, consistently identified across different centrality measures. This
study sheds light on the potential of computational methods and omics data analysis in improving lung cancer diagnosis and
treatment, offering new insights for future research directions and personalized medicine strategies.
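
To illustrate what it means for a hub gene to be identified consistently across centrality measures, the following is a minimal sketch and not the pipeline used in this study, which relied on STRING and Cytoscape. It assumes two measures, degree and closeness centrality, since the abstract does not name the measures applied. The edge list and the gene labels G1 to G7 are hypothetical placeholders, not data from the study, and the helper names are introduced here for illustration only.

```python
from collections import deque

# Hypothetical toy PPI edge list. G1..G7 are placeholder labels,
# not genes or interactions reported in the study.
EDGES = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G1", "G5"),
    ("G2", "G3"), ("G4", "G6"), ("G5", "G6"), ("G6", "G7"),
]


def build_graph(edges):
    """Build an undirected adjacency map from an edge list."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable-node count divided by the sum of shortest-path lengths.

    Distances come from a breadth-first search. For a connected graph this
    is the usual (n - 1) / sum-of-distances definition.
    """
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


def top_k(scores, k):
    """Return the k highest-scoring nodes; ties are broken by name."""
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return set(ranked[:k])


def consensus_hubs(edges, k=2):
    """Nodes ranked in the top k by both degree and closeness centrality."""
    graph = build_graph(edges)
    by_degree = top_k(degree_centrality(graph), k)
    by_closeness = top_k(closeness_centrality(graph), k)
    return by_degree & by_closeness


if __name__ == "__main__":
    print(consensus_hubs(EDGES, k=2))
```

Taking the intersection of the top-ranked nodes is one simple way to express consensus: a node that ranks highly under only one measure is excluded, while a node that ranks highly under both is retained as a candidate hub.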