Mean-Median Smoothing Backpropagation Neural Network to Forecast Unique Visitors Time Series of Electronic Journal

Unique visitors are first-time visitors from a given IP address within a certain time window, and they are a significant indicator of an electronic journal's performance and accreditation. This study uses a backpropagation neural network (BPNN) to improve visitor prediction. The data come from the KEDS.csv file on the page, which covers January 1, 2018, to December 31, 2018, and contains page views, sessions, visitors, and new visitors. The data are preprocessed using mean and median smoothing. MSE and RMSE are examined and compared with those of the BPNN model without smoothing, which uses the original data. The BPNN model with mean smoothing has the lowest error, with an MSE of 0.00123 and an RMSE of 0.03518 at a learning rate of 0.4 on the 1-2-1 architecture. Integrating mean and median smoothing techniques considerably enhances the BPNN forecasting model. Mean and median smoothing lower data volatility and noise, making predictions more accurate. The BPNN model with smoothing has lower MSE and RMSE than the model without smoothing on the original data. This work is novel in combining mean and median smoothing with a BPNN model to predict unique visits to electronic journals. This research advances time series forecasting by predicting electronic journal visiting patterns. The literature benefits from the evaluation of smoothing strategies and their effects on prediction. The study helps electronic journal practitioners evaluate visitor patterns and journal performance by improving prediction accuracy.


Introduction
Journals are an essential component of the library collection and the world's most important platform for scientific communication [1]. An electronic journal is a periodical publication published in digital format and displayed on a computer screen [2]. Several aspects are tracked in electronic journals, such as page views, visitors, new visitors, and sessions [3]. Sessions, or unique visitors, are the number of visitors from a single IP address in a given time frame [4]. A large average number of unique daily visits to an electronic journal's pages indicates that the journal is in high demand [5] [6]. The more unique visits per day, the more credentials the journal receives. The number of unique visitors is an essential indicator of the success of an electronic journal as a measure of distribution that will speed up the accreditation process for journals [7].
Unique visitor data from electronic journals are recorded in a static web report [8]. Data collected from static web reports are volatile. This instability in the statistical reports of electronic journals is due to specific situations, such as the days leading up to examinations [9]. A forecasting system for electronic journal unique visitor counts evaluates visitor trends. Forecasting is an activity conducted by a researcher to predict future events using a particular scientific methodology [10]. Analyzing the progression of time series data patterns makes it possible to forecast the number of unique visitors to electronic journals [11]. One method that can be used to forecast the number of unique visitors to electronic journals based on time series data is the backpropagation neural network (BPNN) [13]. The advantages of BPNN as a forecasting procedure are that it makes no assumptions about the distribution of the data and that it can handle various time series data patterns, noisy data, missing data, and unstructured data [14]. BPNN can complete work that linear programs cannot perform, the network can continue to operate even when an element fails, and BPNN can identify patterns and trends that are too complicated for humans or other computational techniques to recognize [15]. In addition, BPNN relies heavily on input and output data in building a model, hence data quality is essential for building a good BPNN model [16]. One way to improve data quality is to smooth the data using the mean and median methods [17].
The advantage of smoothing is that the processed data is relatively small but generates satisfactory accuracy and can be used to detect and remove outliers from time series input data [18]. A further advantage of smoothing is that it is easy to understand and implement [19]. This research uses smoothing to improve the quality of the BPNN input data. It is therefore expected to predict time series data for unique visitors to electronic journals with small forecast errors. The novelty of this study lies in applying mean and median smoothing techniques in combination with a BPNN for forecasting the time series data of unique visitors to electronic journals.

1. Integration of Smoothing Techniques: The study integrates mean and median smoothing techniques into the BPNN forecasting model. While BPNN is a well-known method for time series forecasting, smoothing techniques enhance the data quality and reduce fluctuations in the input data. By applying mean and median smoothing, the study aims to improve the accuracy and reliability of the forecasting model.

2. Evaluation of Smoothing Methods: The study evaluates the performance of two different smoothing methods, mean and median, in the context of unique visitor forecasting. It compares the results of the BPNN model with mean smoothing, median smoothing, and the original data. By examining the impact of smoothing on forecasting accuracy, the study contributes to understanding the effectiveness of these techniques in handling time series data for electronic journal visitors.

3. Forecasting Unique Visitors of Electronic Journals: The focus of the study on forecasting unique visitors specifically for electronic journals is another aspect that adds novelty. Unique visitors are an essential indicator of the success and popularity of electronic journals, and accurate forecasting can aid in evaluating visitor trends, assessing journal performance, and accelerating the journal accreditation system. By addressing this specific forecasting task, the study caters to the needs of the electronic journal community and provides insights into predicting visitor patterns.

Literature Review
Visitor forecasting in electronic journals has gained significant attention in recent years due to its relevance in resource allocation, user engagement strategies, and content optimization. Several studies have explored various approaches and techniques to improve the accuracy of visitor predictions. First, the multilayer perceptron (MLP) forecasting method yielded an RMSE of 0.137826 [20]. Second, an MLP with single exponential smoothing for predicting unique visitors yielded an RMSE of 0.05981 [3]. Third, a BPNN yielded a MAPE of 0.301 [21]. Fourth, research involving long short-term memory (LSTM) yielded an RMSE of 13.76 [22]. Finally, LSTM with smoothing yielded a MAPE of 0.08098 [6]. These techniques have been demonstrated to be effective in solving prediction problems. While the existing literature has explored various techniques and approaches for visitor forecasting in electronic journals, there is a need for further research that explicitly investigates the integration of mean and median smoothing techniques with neural networks (NN), specifically BPNN.
BPNN is a popular artificial neural network (ANN) method for forecasting and predicting complicated problems. BPNNs are feedforward NNs with input, hidden, and output layers. A supervised learning method adjusts the network weights and biases depending on the error between expected and actual outputs. BPNN minimizes prediction error by iteratively propagating the error back through the network layers and changing the connection weights [23]. The approach updates network parameters using the gradient of the error function with respect to the weights and biases. The BPNN approximates data patterns and makes accurate predictions by repeating this procedure [24]. BPNN has been applied to pattern identification [25], image processing [26], time-series forecasting [27], and financial prediction [28]. This work attempts to address the research gap by presenting a method that combines mean and median smoothing approaches to preprocess data and strengthen the model's robustness, boosting BPNN's performance and increasing the accuracy of unique visitor predictions. This study was carried out in order to fulfill these goals.

Data Collection
The data used in this research are time series of activity on the Universitas Negeri Malang journal portal for Knowledge Engineering and Data Science (KEDS), as recorded by StatCounter [3]. The dataset was retrieved by downloading the .csv file from the page; the downloaded file is named summary-11304011.csv. The dataset covers January 1, 2018, to December 31, 2018, and attribute selection [29] was applied so that only the sessions attribute, shown in Table 1, was used. The data obtained are essential because they serve as input to the forecasting process. The model was trained on data from January 1 to September 12, 2018, and then evaluated in a separate testing process.
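As a rough illustration of the split described above, the following Python sketch counts the days in each partition. It assumes one observation per calendar day and that testing starts on September 13, 2018 (the date given in the Evaluation section); the variable names are illustrative only and are not taken from the study.

```python
from datetime import date, timedelta

# Assumption: one record per calendar day for the whole of 2018.
start, split, end = date(2018, 1, 1), date(2018, 9, 13), date(2018, 12, 31)

days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
train_days = [d for d in days if d < split]   # January 1 to September 12
test_days = [d for d in days if d >= split]   # September 13 to December 31

print(len(days), len(train_days), len(test_days))
```

Under these assumptions the testing partition contains 110 days, which matches the number of test instances reported in the results.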

Preprocessing
Once the data is collected, it often requires preprocessing to make it suitable for analysis. The goal is to ensure the data is in a consistent and usable format for the subsequent analysis steps. In this research, the preprocessing consisted of smoothing and data normalization. Smoothing is a technique that improves forecasting by aggregating previous values of a time series and reducing their variation [30]. Smoothing is also a time series analysis method for predicting future values by assigning weights to previous observations [31]. The smoothing stage minimizes the fluctuation in the data used [32]. This study employed the mean and median smoothing methods because the two methods have different calculation parameters. Mean and median are commonly used for filtering in image processing [33][34][35]. Figure 2 illustrates the smoothing preprocessing applied to the time series of unique visitors to an electronic journal. In addition, the two methods have their respective advantages depending on the shape of the data [36]. Mean smoothing is appropriate for measuring the central tendency of randomly occurring data, while median smoothing is appropriate for data that has a sequence.

Figure 2. Mean and Median Smoothing
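As an informal illustration of the two smoothing methods, the sketch below applies a moving mean and a moving median to a short series. The window size of 3, the centered window, the shrinking window at the edges, and the sample values are all assumptions made for illustration; the study does not state them here.

```python
from statistics import mean, median

def smooth(series, window=3, reducer=mean):
    """Centered moving-window smoothing; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(reducer(series[lo:hi]))
    return out

# Made-up daily session counts with one spike on the third day.
sessions = [10, 12, 50, 11, 13, 12, 9]
mean_smoothed = smooth(sessions, reducer=mean)
median_smoothed = smooth(sessions, reducer=median)

print(mean_smoothed)
print(median_smoothed)
```

In this toy series the median suppresses the single spike almost entirely, while the mean spreads it across neighboring days, which is one way the two methods can behave differently on the same data.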
Data normalization is required so that the BPNN can process the data as input to its weights [37]. The data are normalized so that the network output corresponds to the employed activation function. The data must be normalized to an interval narrower than the range of the employed activation function [38]. In this research, the sigmoid activation function was used. The sigmoid function is asymptotic (it never reaches 0 or 1) [39], so the data were transformed to the narrower interval [0.1, 0.9] [40]. The data normalization formula used in this study is presented in (1) [41].
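A minimal sketch of scaling a series into [0.1, 0.9] is given below. It assumes a standard min-max mapping in which the smallest value maps to 0.1 and the largest to 0.9; this is a common choice for sigmoid networks but is an assumption here, since the formula itself is not reproduced in the text. The function names and sample values are hypothetical.

```python
def normalize(values, low=0.1, high=0.9):
    """Min-max scale values into [low, high] (assumed form of the mapping)."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        raise ValueError("cannot normalize a constant series")
    return [low + (high - low) * (v - v_min) / span for v in values]

def denormalize(scaled, v_min, v_max, low=0.1, high=0.9):
    """Invert normalize() so forecasts can be read in original units."""
    return [v_min + (s - low) * (v_max - v_min) / (high - low) for s in scaled]

sessions = [20, 35, 50]  # made-up values
scaled = normalize(sessions)
restored = denormalize(scaled, min(sessions), max(sessions))
print(scaled, restored)
```

Keeping the inverse mapping alongside the forward one makes it straightforward to convert network outputs back into visitor counts.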

Forecasting Process
This stage presents the system designed from the results of the data analysis, with the BPNN architecture used as the forecasting method. The network has three layers: the input layer, the hidden layer, and the output layer [42]. The hidden layer helps the network recognize more input patterns than a network without a hidden layer. A BPNN algorithm with a sigmoid activation function was used to construct the NN [43][44]. In the NN, the activation function is used to calculate the output value of the hidden layer and the output value of the output layer. In this study, four architectures were built for each model, namely 1-2-1, 1-4-1, 1-6-1, and 1-8-1 [28] [45]. Three models were built using the BPNN method with different input data, as explained below.
Model 1 uses the original input data directly in the forecasting process with the BPNN. The flow of Model 1 is depicted in Figure 3. The mean, used for smoothing in Model 2, is the sum of all the data values divided by the number of data points [46], as in (2).

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{2}$$

where $x_i$ is the $i$-th data value and $n$ is the number of data points. For the mean smoothing method, the parameters used in this study were the actual data, the amount of data, and the total amount of data available. In Model 3, the data were first processed using median smoothing and then processed using the BPNN. If the number of data points is odd, the median is the value at the midpoint of the ordered sequence. If the number of data points is even, the median is the sum of the two values at the midpoint of the ordered data set [47] divided by two, as in (3).

$$\tilde{x} = \begin{cases} x_{\left(\frac{n+1}{2}\right)}, & n \text{ odd} \\ \dfrac{1}{2}\left(x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}\right), & n \text{ even} \end{cases} \tag{3}$$

where $x_{(k)}$ is the $k$-th smallest of the $n$ data values. For median smoothing, the parameters used in this study were the actual data and their arrangement from the smallest to the largest value.
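The network described in this section can be sketched in plain Python as follows. This is a minimal illustration and not the study's implementation: it assumes that the single input is the previous day's normalized value and the single output is the next day's value, that weights start uniformly in [-0.5, 0.5], and that training uses plain online gradient descent on the squared error without momentum. The class name, the sample series, and the epoch count are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyBPNN:
    """1-h-1 feedforward network with sigmoid units, trained by backpropagation."""

    def __init__(self, hidden=2, learning_rate=0.4, seed=0):
        rng = random.Random(seed)
        self.lr = learning_rate
        self.w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input -> hidden
        self.b1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden -> output
        self.b2 = rng.uniform(-0.5, 0.5)

    def forward(self, x):
        h = [sigmoid(w * x + b) for w, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target):
        h, y = self.forward(x)
        # Output delta for squared error with a sigmoid output unit.
        delta_out = (y - target) * y * (1.0 - y)
        for j, hj in enumerate(h):
            # Hidden delta uses the hidden->output weight before it is updated.
            delta_h = delta_out * self.w2[j] * hj * (1.0 - hj)
            self.w2[j] -= self.lr * delta_out * hj
            self.w1[j] -= self.lr * delta_h * x
            self.b1[j] -= self.lr * delta_h
        self.b2 -= self.lr * delta_out

    def fit(self, pairs, epochs=1000):
        for _ in range(epochs):
            for x, target in pairs:
                self.train_step(x, target)

def network_mse(net, pairs):
    return sum((net.predict(x) - t) ** 2 for x, t in pairs) / len(pairs)

# Made-up normalized series; each value is used to predict the next one.
series = [0.30, 0.35, 0.40, 0.38, 0.45, 0.50, 0.48, 0.55]
pairs = list(zip(series[:-1], series[1:]))

net = TinyBPNN(hidden=2, learning_rate=0.4)
before = network_mse(net, pairs)
net.fit(pairs, epochs=1000)
after = network_mse(net, pairs)
print(before, after)
```

Changing `hidden` to 4, 6, or 8 gives the other architectures considered in the study; because the sigmoid output stays strictly between 0 and 1, targets are expected to be normalized first.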

Evaluation
The testing procedure used data on the number of unique visitors to an electronic journal from September 13, 2018, to December 31, 2018. After testing, the next stage was to evaluate the forecasting results and assess the method's efficacy by calculating the error value [28]. The error value was calculated using MSE, as shown in (4), and RMSE, as shown in (5) [48].

$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2 \tag{4}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2} \tag{5}$$

where $y_t$ is the actual value, $\hat{y}_t$ is the forecast value, and $n$ is the number of test data points. Based on the MSE and RMSE error values, the forecasting model with the best performance was identified. The smaller the MSE and RMSE values, the more accurate the forecasting results and the better the method [49].
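The two error measures can be computed with a few lines of Python. The sketch below uses the standard definitions of MSE and RMSE; the function names and the sample values are illustrative and are not the study's data.

```python
import math

def mse(actual, predicted):
    """Mean of squared differences between actual and forecast values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Square root of the MSE, expressed in the same units as the data."""
    return math.sqrt(mse(actual, predicted))

actual = [0.2, 0.4, 0.6]      # made-up normalized targets
predicted = [0.1, 0.4, 0.8]   # made-up forecasts
print(mse(actual, predicted), rmse(actual, predicted))
```

Because RMSE is the square root of MSE, both measures rank a set of models in the same order; RMSE is simply easier to read on the scale of the data.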

Result and discussion
To forecast the number of unique visitors to electronic journals, it is necessary to collect data on the number of journal visitors. One hundred ten instances of unique visitor data for electronic journals were tested. Table 2 compares the input data, presenting the sample data (days 1-10) used as input to the BPNN. As shown in Table 2, the change (reduction or addition) in the smoothed values of Model 2 with mean smoothing and Model 3 with median smoothing is more stable than in Model 1 with the original data. Figure 6 depicts a more comprehensive comparison of the original and smoothed data.

In Figure 6 (a), the mean-smoothed series stays close to the original data. The trend produced by mean smoothing is similar to that of the original data, although the daily fluctuations are not as high. This shows that smoothing can dampen fluctuations in time series data. Figure 6 (b) reveals that the trend produced by median smoothing is similar to that of mean smoothing, although the mean and median calculations do not produce identical values and have different value ranges. With median smoothing, the daily fluctuations are higher, but still not as high as in the original data. This confirms that smoothing can dampen the fluctuations of the existing time series data.

The model's accuracy was measured using MSE and RMSE with four different architectures, nine learning rates ranging from 0.1 to 0.9 [50], and 10000 epochs. The experiments with Model 1 used the original session data as input, and the results are displayed in Table 3. As shown in Table 3, the 1-8-1 architecture with a learning rate of 0.7 produced the minimum MSE, 0.02619. The second-lowest MSE, 0.02624, came from the 1-6-1 architecture with a learning rate of 0.2. The smallest RMSE, 0.16186, also came from the 1-8-1 architecture with a learning rate of 0.7, followed by 0.16200 from the 1-6-1 architecture with a learning rate of 0.2. The best configuration for Model 1 was therefore the 1-8-1 architecture with a learning rate of 0.7, whose MSE and RMSE, 0.02619 and 0.16186 respectively, were the smallest among all architectures and learning rates. Figure 7 plots the results of the 1-8-1 architecture. It shows that the forecasts of Model 1 tend to lie relatively far from the original data.

Table 4 displays the results of the experiments with Model 2. The smallest MSE, 0.00123, was produced by the 1-2-1 architecture with a learning rate of 0.4. The second-lowest MSE, 0.00129, came from the 1-2-1 architecture with a learning rate of 0.1. The 1-2-1 architecture with a learning rate of 0.4 also has the minimum RMSE, 0.03518, followed by the 1-2-1 architecture with a learning rate of 0.1, at 0.03601. The best configuration for Model 2 was therefore the 1-2-1 architecture with a learning rate of 0.4, whose MSE and RMSE, 0.00123 and 0.03518 respectively, were the smallest among all architectures and learning rates. Figure 8 plots the results of the 1-2-1 architecture. It shows that Model 2 tends to produce smoother forecasts, with the predicted values approximating the actual data.

For Model 3, as shown in Table 5, the 1-8-1 architecture with a learning rate of 0.4 yielded the lowest MSE, 0.00293. The second-lowest MSE, 0.00329, belonged to the 1-2-1 architecture with a learning rate of 0.1. As with MSE, the 1-8-1 architecture with a learning rate of 0.4 had the minimum RMSE, 0.05415, followed by the 1-2-1 architecture with a learning rate of 0.1, at 0.05738. The best configuration for Model 3 was therefore the 1-8-1 architecture with a learning rate of 0.4, whose MSE and RMSE, 0.00293 and 0.05415 respectively, were the smallest among all architectures and learning rates. Figure 9 plots the results of the 1-8-1 architecture.
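The experimental design described above (four architectures and nine learning rates per model) amounts to a small grid search over 36 configurations. The sketch below shows one way to organize it; `fake_evaluate` is a stand-in scoring function invented for this example and does not reproduce the study's results. In practice it would be replaced by a function that trains a BPNN with the given settings and returns its test MSE.

```python
from itertools import product

hidden_sizes = [2, 4, 6, 8]                                   # 1-2-1 ... 1-8-1
learning_rates = [round(0.1 * k, 1) for k in range(1, 10)]    # 0.1 ... 0.9

def grid_search(evaluate):
    """Return (mse, hidden, lr) for the configuration with the lowest MSE.

    `evaluate(hidden, lr)` is a caller-supplied function returning a test MSE.
    """
    results = [(evaluate(h, lr), h, lr)
               for h, lr in product(hidden_sizes, learning_rates)]
    return min(results)

def fake_evaluate(hidden, lr):
    # Hypothetical error surface, for demonstration only.
    return (hidden - 2) ** 2 * 0.001 + (lr - 0.4) ** 2

best_mse, best_hidden, best_lr = grid_search(fake_evaluate)
print(best_mse, best_hidden, best_lr)
```

Taking the minimum over tuples that start with the MSE selects the lowest-error configuration, with ties broken by the smaller hidden size and then the smaller learning rate.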

Table 5. Forecasting Model 3 Results
It can be seen in Figure 9 that Model 3 obtained better forecasting results than Model 1 (Figure 7) and was relatively smooth, although, compared to Figure 8, there was a forecast value at the beginning of the testing data that was relatively far from the original data. BPNN's performance in forecasting the number of unique visitors was relatively good, as demonstrated in Table 6.

Table 6. Comparison of Overall Forecasting Results

Based on the evaluation in Table 6, Model 2 has the best performance compared with Model 1 and Model 3. Model 2's MSE was 0.00123 and its RMSE was 0.03518 with the 1-2-1 architecture. This confirms that the backpropagation neural network with mean-smoothed input data can be used to predict the number of unique visitors.
The following are the forecast results for the next week, the next month, and the next six months from Model 2. Figure 10 illustrates the forecasting results for the next week in Table 7. The graph shows that the trend of the forecast data follows that of the target data: if the target value increases, the forecast value also increases, and if the target value decreases, the forecast value also decreases. This shows that smoothing can support BPNN in forecasting the number of unique visitors, as on days three and five the forecasts differ from the targets only slightly, by 0.025 and 0.096, respectively.

For the six-month forecast in Table 9, the graph shows that the trend of the forecast data has the same pattern of fluctuation as the target data, although the daily fluctuations are not as high as in the original data. If the target value increases, the forecast value also increases, and if the target value decreases, the forecast value also decreases. This shows that smoothing can support BPNN in forecasting the number of unique visitors, as on days one to five the forecasts differ only minimally from the targets, in the range 0.02-0.1, as shown in Table 9.

In conclusion, using smoothing techniques, particularly mean smoothing, combined with the BPNN model offers several advantages for forecasting the number of unique visitors. By reducing noise, extracting underlying patterns, handling outliers, and improving generalization, this approach enhances the accuracy and reliability of the predictions [51]. Applying smoothing techniques with BPNN demonstrates its potential as a valuable tool for forecasting visitor numbers, benefiting electronic journal publishers and administrators in their strategic decision-making processes. The results of the experiments in this study can be compared in Table 10.

The results obtained from this study demonstrate the effectiveness of integrating mean and median smoothing techniques with the BPNN model for unique visitor forecasting in electronic journals. The proposed approach outperformed the baseline models in forecasting accuracy, indicating its potential practical utility in electronic journal management. One of the critical implications of these findings is that integrating smoothing techniques helps address the challenges posed by the volatility and irregularity of visitor data. By smoothing out the noise and fluctuations, the forecasting model becomes more robust and reliable [52], providing more accurate predictions of future visitor counts. This is particularly valuable for electronic journal platforms that rely on visitor traffic for better decision-making in content management, resource allocation, and marketing strategy [53]. By implementing this approach, electronic journal publishers and administrators can enhance their operational efficiency, optimize revenue generation, and provide a better user experience for their readers.

Conclusion
This study aimed to enhance the accuracy of unique visitor forecasting for electronic journals by integrating mean and median smoothing techniques with the BPNN model. The results demonstrated the proposed approach's effectiveness, as it outperformed the baseline models in forecasting accuracy. Model 2 (a BPNN model with mean smoothing) provides the minimum error with a learning rate of 0.4 on the 1-2-1 architecture, with an MSE of 0.00123 and an RMSE of 0.03518. However, several avenues for future research can be pursued to enhance our understanding and refine unique visitor forecasting for electronic journals. Future research can focus on addressing real-time processing (RTP) issues. This includes exploring other techniques such as exponential or wavelet smoothing, which may capture different patterns and variations in visitor data, leading to improved forecasting performance. Additionally, comparative studies with deep learning algorithms such as Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) can provide a better understanding of different modeling approaches for unique visitor forecasting. Evaluating the proposed approach on a larger and more diverse dataset encompassing multiple electronic journals from different disciplines or regions would also be valuable to validate its generalizability and robustness. By pursuing these directions, the field of unique visitor forecasting can advance further, and electronic journal platforms can benefit from improved forecasting accuracy, aiding in content management, resource allocation, and marketing strategies, while also addressing real-time processing challenges.

Figure 1
Figure 1 presents a visual representation of the research methodology, showing the flow and sequence of the stages in the research process. Each stage contributes to the overall research objective and helps ensure the validity and reliability of the findings. A more complete explanation of each stage is given in the following sections.

Figure 3. Forecasting Model 1
For Model 2, the input data is the result of mean smoothing. Figure 4 illustrates the flow for Model 2.

Figure 5 depicts the Model 3 flow diagram.

Figure 7. The Graph of The Best Result Obtained from Forecasting Model 1

Figure 11. One Month Forecasting Results Graph
Figure 11 shows a graph of the forecasting results in Table 8. The graph in Figure 11 indicates a striking change in the fluctuations, particularly from the 10th day to the 20th day, or during the middle of the month, where the existing forecast data is still far away from the highest point of the lowest position of the

Figure 12. Six Months Forecasting Results Graph
Figure 12 illustrates a graphical visualization of the forecasting results in Table 9.
4. Comparison and Performance Evaluation: The study compares the performance of different models (BPNN, BPNN with mean smoothing, and BPNN with median smoothing) using evaluation metrics such as Mean Square Error (MSE) and Root Mean Square Error (RMSE). The evaluation helps identify the model with the smallest error and demonstrates the effectiveness of incorporating smoothing techniques into the BPNN model for forecasting unique visitors.

Table 2. Data Input Comparison

Table 7. Forecasting Results for the Next Week
Table 7 presents the forecasting results for the next week, obtained from the 1-2-1 architecture with a learning rate of 0.4. The forecast for the next week used 358 days of training data and seven days of testing data.

Wibawa et al. / JADS Vol. 4, No. 3, September 2023, pp. 147-162

Table 8. Forecasting Results for the Next Month
Table 8 shows the forecasting results for the next month. The forecast values shown in Table 8 result from the 1-2-1 architecture with a learning rate of 0.4. The forecast for the next month used 335 days of training data and 30 days of testing data.

Table 9. Forecasting Results for the Next Six Months
Table 9 presents the forecasting results for the next six months, obtained from the 1-2-1 architecture with a learning rate of 0.4. The forecast for the next six months used 183 days of training data and 182 days of testing data.

Table 10. Previous Research Model Compared to Proposed BPNN Smoothing
Table 10 compares the previous research models with the proposed BPNN Smoothing technique in terms of MSE and RMSE. The previous research models include MLP and LSTM networks, as well as MLP combined with Single Exponential Smoothing. The results show that the proposed BPNN Smoothing technique outperforms the previous models in terms of both MSE and RMSE. The MSE of the proposed BPNN Smoothing, 0.00123, is considerably lower than the previous best result of 0.01899 achieved by MLP. Similarly, the RMSE of the proposed technique, 0.03518, is substantially lower than the best previous result of 0.13780 achieved by MLP. These findings indicate the superior performance of the proposed BPNN Smoothing technique in predicting visitor counts in electronic journals, highlighting its potential for practical implementation and improved accuracy.