Stacking Ensemble with SMOTE for Robust Agricultural Commodity Price Prediction under Imbalanced Data

Yessica Siagian, Jeperson Hutahaean, Neni Mulyani

Abstract


The volatility of agricultural commodity prices is a substantial obstacle in the agribusiness sector, especially for timely, data-driven decision-making. This volatility is primarily caused by the imbalanced distribution of historical price data and the complex, often nonlinear nature of price patterns. To address this challenge, this study proposes a novel predictive modeling approach that integrates Stacking Ensemble Learning with the Synthetic Minority Over-sampling Technique (SMOTE). The dataset consists of 5,558 records and 9 features, sourced from a publicly available Kaggle dataset. The target variable, daily price, was transformed into three classes (low, medium, and high) using quartile-based discretization to enable multiclass classification. The main objective is to evaluate whether stacking combined with SMOTE improves model performance over baseline models that use individual algorithms. Eight models were constructed and compared: four baseline models using SMOTE only, and four stacking models integrating SMOTE. The experimental results show that the proposed model, Decision Tree Regression with Stacking and SMOTE, achieved the highest performance, with 98.68% accuracy, an F1-score of 0.9868, Cohen’s Kappa of 0.9803, a Matthews correlation coefficient (MCC) of 0.9803, ROC-AUC of 0.9995, and a log loss of 0.0529. Other optimized models also performed well, such as Random Forest (98.37% accuracy) and Gradient Boosting (98.56% accuracy). In contrast, baseline models such as Linear Regression and Decision Tree without stacking achieved only around 67–68% accuracy, with log loss exceeding 0.97. The key contribution of this study is empirical evidence that combining stacking and SMOTE significantly enhances classification accuracy and model robustness on imbalanced datasets.
The novelty lies in applying a deep learning-optimized stacking framework specifically for agricultural commodity price classification, along with a comprehensive multiclass evaluation, offering new insights for practical implementation in agricultural decision support systems.
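
The two preprocessing steps named in the abstract, quartile-based discretization and SMOTE, can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state the exact cut points, so the mapping used here (low at or below Q1, medium up to Q3, high above Q3) is an assumption, and the toy prices, the two-feature records, and the helper names `quartile_labels` and `smote` are all hypothetical. The SMOTE function is a simplified standard-library version of the usual interpolation idea, not a library implementation.

```python
import math
import random
import statistics
from collections import Counter


def quartile_labels(prices):
    """Label each price 'low', 'medium', or 'high' using Q1 and Q3 as cut points.

    Assumed mapping (not stated in the abstract): <= Q1 is low, <= Q3 is medium.
    """
    q1, _, q3 = statistics.quantiles(prices, n=4)
    return ["low" if p <= q1 else "medium" if p <= q3 else "high" for p in prices]


def smote(samples, n_new, k=5, rng=None):
    """Simplified SMOTE: build n_new synthetic points, each interpolated between
    a randomly chosen sample and one of its k nearest same-class neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        others = [s for j, s in enumerate(samples) if j != i]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:k]
        partner = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to partner
        synthetic.append(tuple(b + gap * (p - b) for b, p in zip(base, partner)))
    return synthetic


# Hypothetical toy data: 200 prices, each with two made-up numeric features.
rng = random.Random(42)
prices = [rng.gauss(100.0, 15.0) for _ in range(200)]
features = [(rng.random(), rng.random()) for _ in prices]

labels = quartile_labels(prices)
counts_before = Counter(labels)

# Oversample every class that is smaller than the largest class.
target = max(counts_before.values())
balanced_X, balanced_y = list(features), list(labels)
for cls, count in counts_before.items():
    if count < target:
        members = [x for x, y in zip(features, labels) if y == cls]
        new_points = smote(members, target - count, rng=rng)
        balanced_X.extend(new_points)
        balanced_y.extend([cls] * len(new_points))

counts_after = Counter(balanced_y)
print(counts_before, counts_after)
```

Under this assumed mapping, the medium class holds about half of the records and the low and high classes about a quarter each, so the discretization itself produces the kind of imbalance that oversampling is meant to correct. In practice a library implementation such as imbalanced-learn's `SMOTE`, combined with scikit-learn's `StackingClassifier`, would be the more usual choice; the abstract does not say which tools the authors used or whether oversampling was restricted to the training split.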


Keywords


Agricultural Price Forecasting; Ensemble Machine Learning; Imbalanced Data Handling; Synthetic Oversampling (SMOTE); Stacking Ensemble Regression

Journal of Applied Data Sciences

ISSN : 2723-6471 (Online)
Collaborated with : Computer Science and Systems Information Technology, King Abdulaziz University, Kingdom of Saudi Arabia.
Publisher : Bright Publisher
Website : http://bright-journal.org/JADS
Email : taqwa@amikompurwokerto.ac.id (principal contact)
    support@bright-journal.org (technical issues)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0