Multiple Choice Question Difficulty Level Classification with Multi Class Confusion Matrix in the Online Question Bank of Education Gallery
Abstract
Planning test questions is a critical part of improving the quality of education, because well-designed questions help teachers evaluate student understanding. Question writing must account for difficulty, which is commonly divided into three levels: easy, medium, and difficult. Predicting the difficulty level of a question helps teachers create tests that match students' abilities. In this study, we treat the identification of item difficulty as a classification problem. The data consists of questions from elementary and junior high school, to which we apply several machine learning methods. We tested Random Forest, Logistic Regression, SVM, Gaussian, and a dense neural network (Dense NN), using embedding, lexical, and syntactic features. In the evaluation, Random Forest was the best method for identifying the difficulty level of questions in subjects, with an accuracy of 84%. In the other cases, Random Forest was also the best method, with an accuracy of 80%. Our results show that embedding and TF-IDF features have a significant positive impact on the accuracy of the resulting model.
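As an illustration of the evaluation named in the title, the following is a minimal sketch of a multi-class confusion matrix over the three difficulty levels, with overall accuracy and per-class recall derived from it. The label lists are invented toy data, not the study's dataset, and the helper names (`confusion_matrix`, `accuracy`, `per_class_recall`) are hypothetical; the study's own implementation is not shown in this abstract.

```python
from collections import Counter

# The three difficulty levels named in the abstract.
LABELS = ["easy", "medium", "difficult"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Build a matrix whose rows are true classes and columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Share of all items that lie on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_recall(matrix, labels=LABELS):
    """For each true class, the share of its items that were predicted correctly."""
    return {
        label: (row[i] / sum(row) if sum(row) else 0.0)
        for i, (label, row) in enumerate(zip(labels, matrix))
    }


# Toy data (an assumption for illustration only, not results from the paper).
y_true = ["easy", "easy", "easy", "medium", "medium", "medium",
          "difficult", "difficult", "difficult", "difficult"]
y_pred = ["easy", "easy", "medium", "medium", "medium", "difficult",
          "difficult", "difficult", "difficult", "medium"]

matrix = confusion_matrix(y_true, y_pred)
for label, row in zip(LABELS, matrix):
    print(f"{label:>9}: {row}")
print("accuracy:", accuracy(matrix))
print("recall:", per_class_recall(matrix))
```

Reading the matrix row by row shows which difficulty levels are confused with each other, which a single accuracy figure cannot reveal.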
DOI: https://doi.org/10.47738/jads.v4i4.132
Journal of Applied Data Sciences
ISSN: 2723-6471 (Online)
Organized by: Computer Science and Systems Information Technology, King Abdulaziz University, Kingdom of Saudi Arabia
Website: http://bright-journal.org/JADS
Email: taqwa@amikompurwokerto.ac.id (principal contact), support@bright-journal.org (technical issues)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0