Enhancing SMOTE-ENN Efficacy on Imbalanced Datasets Using Decision Tree Leaf Feature Extraction: A Case Study on Student Employability Data
Abstract
This study addresses the challenge of classifying tabular data that is highly imbalanced and has overlapping classes, a setting in which standard predictive models often lose performance and favor the majority class. A further problem is that many advanced ensemble models are complex and lack transparency. They are often viewed as black boxes, making it difficult for users to explain clearly how each feature contributes to the final prediction. To address both problems, this study proposes a hybrid classification approach that combines rule extraction from decision tree leaves, SMOTE-ENN resampling, and the XGBoost algorithm. The leaf extraction step reorganizes the data by separating overlapping class regions into clearer, more structured groups before synthetic samples are generated. Test results show that the proposed approach outperforms the baseline model, achieving an F1-score of 0.8554, which indicates more effective and better-balanced predictions. Beyond improving performance, the method keeps the model interpretable. Rather than relying only on abstract engineered features, it allows important features to be traced back to the original decision tree rules, so that each model decision can be explained transparently and logically. Overall, the combination of Decision Tree, SMOTE-ENN, and XGBoost is effective in handling extreme class imbalance while producing a clear, stable, and easy-to-understand model, making it more reliable and trustworthy for real-world applications.

Journal of Applied Data Sciences
| ISSN | : | 2723-6471 (Online) |
| Publisher | : | Bright Publisher |
| Website | : | http://bright-journal.org/JADS |
| Email | : | taqwa@amikompurwokerto.ac.id (principal contact) |
| | : | support@bright-journal.org (technical issues) |
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0



