Analysing Code-Mixed Text in Programming Instruction Through Machine Learning for Feature Extraction
Abstract
In programming education, code-mixed text, which combines multiple languages or dialects within a single utterance, can significantly hinder learning outcomes because traditional systems misinterpret or inadequately process it. For instance, students from bilingual or multilingual backgrounds may struggle with automated code reviews or multilingual coding tutorials if their code-mixed queries are not accurately understood. Motivated by these challenges, this paper proposes a Federated Bi-LSTM Model for feature extraction and classification. The model leverages Bidirectional Long Short-Term Memory (Bi-LSTM) networks within a federated learning framework to accommodate varied code-switching patterns and context-dependent linguistic elements while preserving data security and privacy across distributed sources. The Federated Bi-LSTM Model achieves 99.3% accuracy, nearly 19% higher than traditional techniques such as Support Vector Machines (SVM), Multilayer Perceptron (MLP), and Random Forest (RF). This improvement underscores the model's ability to analyse code-mixed text efficiently and to enhance programming instruction for multilingual learners. However, the model faces limitations in processing highly specialized code-mixed text and in adapting to real-time applications; future research should focus on optimizing the model for these challenges and on exploring its applicability in broader domains of computer-assisted education. Overall, the Federated Bi-LSTM Model represents a substantial advancement in language-aware computing, offering a promising solution for the evolving needs of adaptive and inclusive programming education technologies and significant support for multilingual learners.
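To make the architecture concrete, the sketch below illustrates the two components the abstract names: a Bi-LSTM text classifier and a federated averaging (FedAvg) loop in which only model weights, never raw text, leave each client. This is a minimal illustration under assumed settings, not the paper's implementation; the vocabulary size, hidden dimensions, two-client setup, and synthetic batches are all hypothetical.

```python
# Illustrative sketch only: a Bi-LSTM classifier trained with FedAvg.
# All dimensions, the two-client setup, and the synthetic data are
# hypothetical assumptions, not the authors' configuration.
import copy
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=64, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # bidirectional=True reads the sequence left-to-right and
        # right-to-left, capturing context on both sides of a code-switch point.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embed(token_ids)                     # (batch, seq, embed_dim)
        _, (hidden, _) = self.lstm(embedded)                 # hidden: (2, batch, hidden_dim)
        features = torch.cat([hidden[0], hidden[1]], dim=1)  # fuse both directions
        return self.fc(features)

def local_update(model, data_loader, epochs=1, lr=1e-3):
    """Train a client-side copy on private data; only weights leave the client."""
    local = copy.deepcopy(model)
    opt = torch.optim.Adam(local.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    local.train()
    for _ in range(epochs):
        for tokens, labels in data_loader:
            opt.zero_grad()
            loss_fn(local(tokens), labels).backward()
            opt.step()
    return local.state_dict()

def fed_avg(client_states):
    """FedAvg: element-wise mean of client weights; raw text never moves."""
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in client_states]).mean(dim=0)
    return avg

if __name__ == "__main__":
    # Two hypothetical clients, each with one tiny synthetic batch of token ids.
    global_model = BiLSTMClassifier()
    loaders = [
        [(torch.randint(1, 5000, (8, 20)), torch.randint(0, 2, (8,)))]
        for _ in range(2)
    ]
    for communication_round in range(3):
        states = [local_update(global_model, dl) for dl in loaders]
        global_model.load_state_dict(fed_avg(states))
```

In this arrangement, the privacy property claimed in the abstract comes from the communication pattern: each client trains on its own data and transmits only its parameter tensors, which the server averages into the next global model.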