Deep Learning for Oral Cancer Detection Using ResNet-50, Bi-LSTM, and Multimodal Fusion

Authors

  • Keshika Jangde, Ranu Pandey

DOI:

https://doi.org/10.64149/J.Ver.8.18s.348-356

Keywords:

Oral Cancer, ResNet-50, Bi-LSTM, Multimodal Fusion, Conditional GAN

Abstract

Oral cancer is a major health concern worldwide. Despite advanced treatment modalities, it still causes high morbidity and mortality, mainly because it is diagnosed late. Traditional detection methods rely on very limited data, which leads to poor performance in the early stages of the disease. We propose a comprehensive deep learning framework that combines a CNN-RNN strategy with transfer learning to make oral cancer detection more robust. The framework involves four novel techniques. First, a pre-trained ResNet-50 model is fine-tuned on cancer-related datasets, improving visual classification accuracy from 85% to 92%. Second, a Bi-LSTM network captures temporal dependencies in sequential data and improves the accuracy of disease progression prediction from 78% to 88%. Third, multimodal fusion combines BERT features extracted from clinical text with ResNet-50 features extracted from histopathological images; fusing textual and visual diagnostic information achieves an overall classification accuracy of 95%. Finally, conditional GANs (cGANs) synthesize cancer images to handle data imbalance and boost model robustness by 5%. Together, these techniques increase early-detection accuracy and reduce false-negative cases by about 10% among early-stage cancers. Compared with traditional techniques, our model, which combines domain-specific transfer learning, sequential data analysis, multimodal fusion, and data augmentation, shows better performance and may offer a new approach to the early diagnosis and treatment of oral cancers.
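
The multimodal fusion step described in the abstract can be illustrated with a minimal sketch of feature-level (concatenation) fusion. This is not the authors' implementation: the abstract does not specify the fusion operator, the BERT variant, or the classifier head. The sketch assumes a 2048-dimensional pooled ResNet-50 image vector, a 768-dimensional BERT-base text vector, a hypothetical two-class output, and random numbers in place of real encoder outputs and trained weights.

```python
import math
import random

IMAGE_DIM = 2048   # size of ResNet-50's pooled feature vector
TEXT_DIM = 768     # size of a BERT-base [CLS] embedding (assumed variant)
NUM_CLASSES = 2    # hypothetical: cancerous vs. non-cancerous


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(image_features, text_features):
    """Feature-level fusion: concatenate the two modality vectors."""
    return list(image_features) + list(text_features)


def classify(fused, weights, biases):
    """One linear layer followed by softmax; returns class probabilities."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


rng = random.Random(0)
# Random stand-ins for the real encoder outputs (assumption).
image_features = [rng.gauss(0, 1) for _ in range(IMAGE_DIM)]
text_features = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]

fused = fuse(image_features, text_features)
# Untrained, randomly initialised classifier head (assumption).
weights = [
    [rng.gauss(0, 0.01) for _ in range(len(fused))]
    for _ in range(NUM_CLASSES)
]
biases = [0.0] * NUM_CLASSES
probs = classify(fused, weights, biases)
```

Here `fused` has 2816 entries (2048 + 768) and `probs` is a probability distribution over the two hypothetical classes. In a trained system the random vectors would be replaced by the encoders' outputs and the head would be learned jointly with, or on top of, the encoders.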

Published

2024-12-08

How to Cite

Deep Learning for Oral Cancer Detection Using ResNet-50, Bi-LSTM, and Multimodal Fusion. (2024). Vascular and Endovascular Review, 7(2), 88-99. https://doi.org/10.64149/J.Ver.8.18s.348-356