Predicting Student Performance for Early Intervention using Classification Algorithms in Machine Learning
Subject Areas : Machine learning
Kalaivani K 1 * , Ulagapriya K 2 , Saritha A 3 , Ashutosh Kumar 4
1 - Vels Institute of Science, Technology and Advanced Studies, India
2 - Vels Institute of Science, Technology and Advanced Studies, India
3 - Vels Institute of Science, Technology and Advanced Studies, India
4 - Vels Institute of Science, Technology and Advanced Studies, India
Keywords: Machine Learning, Classification, Supervised Machine Learning, Data Analysis, Naïve Bayes
Abstract :
The Student Performance Prediction System identifies students who may need early intervention before they fail to graduate. It is intended for teaching faculty, who use it to analyze student performance and results. The system stores student details in a database and analyzes the overall performance of a class using a machine learning model built with (i) Python data analysis tools such as Pandas and (ii) data visualization tools such as Seaborn. The proposed system predicts student performance using machine learning algorithms and data mining techniques. The data mining technique used here is classification, which groups students according to their attributes. The front end of the application is built with the React JS library and includes data visualization charts; it connects to a back-end database in which all student records are stored in MongoDB, and the machine learning model is trained and deployed through Flask. In this process, a machine learning algorithm is trained on a dataset to create a model, and predictions are made on the basis of that model. The three types of data used in machine learning are continuous, categorical and binary. This study gives a brief description and comparative analysis of several classification techniques using a student performance dataset. The six classification algorithms compared are Logistic Regression, Decision Tree, K-Nearest Neighbor, Naïve Bayes, Support Vector Machine and Random Forest. The Naïve Bayes classifier scored higher than the other techniques on precision, recall and F1 score, with values of 0.93, 0.92 and 0.92 respectively.
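The three metrics used to compare the classifiers can be illustrated with a minimal sketch. The abstract does not state how the scores were averaged across classes, so the support-weighted average below is an assumption, as are the function names and the toy pass/fail labels, which are hypothetical and not taken from the study's dataset.

```python
from collections import Counter


def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, F1) with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def weighted_scores(y_true, y_pred):
    """Average per-class scores, weighting each class by its support.

    Assumption: support-weighted averaging; the source does not say
    which averaging scheme produced its reported values.
    """
    support = Counter(y_true)
    total = len(y_true)
    sums = [0.0, 0.0, 0.0]
    for label, count in support.items():
        scores = precision_recall_f1(y_true, y_pred, label)
        for i, score in enumerate(scores):
            sums[i] += score * count / total
    return tuple(sums)


# Hypothetical labels for eight students (not data from the study).
y_true = ["pass", "pass", "pass", "fail", "fail", "pass", "fail", "pass"]
y_pred = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "pass"]

precision, recall, f1 = weighted_scores(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Running the same computation on each classifier's predictions over a shared test set would give comparable scores of the kind reported above for the six algorithms.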