Gene Expression–Guided Deep Hybrid Models for Robust Lung Cancer Classification and Diagnosis

Authors

  • Manoj B. Mandake, Rahul N. Patil, Suchita Walke, Sanjay S. Jadhav, Yogesh B. Mandake

Keywords:

Gene expression, deep learning, hybrid model, CNN, LSTM, lung cancer, genetic algorithm.

Abstract

Lung cancer remains the leading cause of cancer-related death worldwide, accounting for approximately one in five cancer deaths each year. Despite significant progress in diagnostic imaging and molecular profiling, early detection is still difficult because of tumor heterogeneity, overlapping histopathological features, and complex molecular signatures. Gene expression analysis has emerged as a powerful tool for understanding the biological mechanisms of carcinogenesis and for identifying candidate biomarkers for precision diagnostics. However, gene expression datasets are high-dimensional, noisy, and strongly inter-correlated, which limits the performance of conventional statistical and machine learning models.

To address these challenges, this study introduces a Gene Expression–Guided Deep Hybrid Model (GE-DHM) that integrates Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and Genetic Algorithms (GA) for robust, biologically interpretable lung cancer classification. The framework first uses a GA to select an informative subset of genes, reducing redundancy and dimensionality. CNN and LSTM layers then capture spatial and sequential dependencies in the selected gene profiles. By embedding gene expression guidance within the deep learning structure, the model learns biologically relevant features that improve both predictive performance and interpretability.
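The GA-based gene selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the nearest-centroid fitness function, the size penalty, and every name and hyperparameter (make_synthetic, centroid_accuracy, ga_select, pop_size, p_mut, and so on) are assumptions chosen only to show how a binary gene mask can be evolved. In the framework described above, the selected genes would then be passed to the CNN and LSTM layers, which are not shown here.

```python
import random

def make_synthetic(n_samples=60, n_genes=30, informative=(0, 1, 2), seed=0):
    """Toy expression matrix (assumption): informative genes are shifted in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate labels so both classes are balanced
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_train, y_train, X_test, y_test, genes):
    """Held-out accuracy of a nearest-centroid classifier restricted to `genes`."""
    if not genes:
        return 0.0
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    correct = 0
    for x, t in zip(X_test, y_test):
        dists = {c: sum((x[g] - m) ** 2 for g, m in zip(genes, mu))
                 for c, mu in centroids.items()}
        correct += min(dists, key=dists.get) == t
    return correct / len(y_test)

def fitness(mask, data, penalty=0.005):
    """Accuracy minus a small cost per selected gene (favours compact subsets)."""
    genes = [i for i, bit in enumerate(mask) if bit]
    X_tr, y_tr, X_te, y_te = data
    return centroid_accuracy(X_tr, y_tr, X_te, y_te, genes) - penalty * len(genes)

def ga_select(data, n_genes, pop_size=30, generations=20, p_mut=0.05, seed=1):
    """Evolve boolean gene masks with elitism, tournaments, crossover, mutation."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.2 for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, data), reverse=True)
        children = ranked[:2]  # elitism: carry the two best masks over unchanged
        while len(children) < pop_size:
            # Tournament selection (size 3) for each parent.
            p1 = max(rng.sample(pop, 3), key=lambda m: fitness(m, data))
            p2 = max(rng.sample(pop, 3), key=lambda m: fitness(m, data))
            cut = rng.randrange(1, n_genes)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [(not b) if rng.random() < p_mut else b for b in child]
            children.append(child)
        pop = children
    best = max(pop, key=lambda m: fitness(m, data))
    return [i for i, bit in enumerate(best) if bit]

X, y = make_synthetic()
data = (X[:40], y[:40], X[40:], y[40:])  # simple train / held-out split
selected = ga_select(data, n_genes=30)
print("selected gene indices:", selected)
```

The size penalty in the fitness function is one simple way to express the goal of reducing redundancy and dimensionality. A real pipeline would more likely score each mask by cross-validated performance of the downstream classifier.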

Experimental validation on publicly available lung cancer gene expression datasets from The Cancer Genome Atlas (TCGA-LUAD/LUSC) showed that GE-DHM outperforms traditional models, achieving a classification accuracy of 96.4% along with significant improvements in precision, recall, and F1-score. Furthermore, pathway enrichment analysis revealed that the top-ranked genes identified by the model were strongly associated with critical oncogenic signaling pathways, including EGFR, KRAS, and TP53, supporting the model’s biological relevance.
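For reference, the reported metrics follow the standard confusion-matrix definitions. The sketch below computes them for a binary task. The function name binary_metrics, the choice of positive class, and the toy labels are assumptions for illustration only and are not taken from the study's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 from paired true/predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy labels (hypothetical): 3 true positives, 3 true negatives,
# 1 false positive, 1 false negative.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                         [1, 1, 0, 0, 0, 1, 1, 0])
print(metrics)
```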

These findings highlight the potential of hybrid deep learning frameworks to combine molecular-level insight with computational intelligence for reliable cancer diagnosis. GE-DHM provides a robust platform for precision oncology, supporting early detection, personalized treatment strategies, and improved clinical decision-making in lung cancer management.

Published

2025-11-12