Word Prediction from Medical Prescription via Transfer Learning with Pre-trained CNN, GAN and BiLSTM Integration

Abstract: This research presents a novel method for recognizing handwritten medical prescriptions by integrating Bidirectional Long Short-Term Memory (BiLSTM) networks with a Generative Adversarial Network (GAN)-augmented Convolutional Neural Network (CNN). The CNN architecture comprises multiple convolutional layers with ReLU activations and max-pooling, and is trained with the Adam optimizer and categorical cross-entropy loss over multiple epochs in mini-batches. To enhance dataset robustness, a GAN was employed: the Generator produced synthetic handwritten images, and a CNN-based Discriminator learned to distinguish real from synthetic images through adversarial training. High-quality synthetic data from the GAN was combined with the original dataset for CNN training. Through transfer learning, features extracted by the GAN-augmented CNN were passed to two BiLSTM layers to capture sequential dependencies in formatted handwritten text. Evaluations demonstrated significant improvements in recognition metrics, with a maximum accuracy of 93.42%, precision of 93%, recall of 92%, and F1 score of 92%. The framework improves recognition accuracy and robustness to variability in medical handwriting, marking substantial progress in medical text recognition through advanced data augmentation and sequential learning models. The proposed approach surpasses existing models in accuracy and effectiveness for medical text recognition.
Published in: 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON)
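
The abstract reports accuracy, precision, recall, and F1 score but does not state how the per-class values are aggregated. The following is a minimal sketch of how such metrics could be computed for a multi-class word classifier, assuming macro-averaging over classes. The function name `macro_metrics` and the drug-name labels are hypothetical and serve only as illustration; they are not taken from the paper.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Assumption: macro-averaging, i.e. each class contributes equally.
    The paper's actual averaging scheme is not stated in the abstract.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrongly assigned
            fn[truth] += 1   # true class was missed

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / total if total else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical labels for illustration only.
y_true = ["aspirin", "aspirin", "ibuprofen", "ibuprofen", "paracetamol", "paracetamol"]
y_pred = ["aspirin", "ibuprofen", "ibuprofen", "ibuprofen", "paracetamol", "aspirin"]

accuracy, precision, recall, f1 = macro_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

Because F1 is computed per class before averaging, the macro F1 need not equal the harmonic mean of the macro precision and macro recall, which is one reason reported figures of this kind should state the averaging scheme used.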