Modeling and Forecasting Long-Term Records of Mean Sea Level at Grand Isle, Louisiana: SARIMA, NARNN, and Mixed SARIMA-NARNN Models

This study demonstrates the role of time series models in modeling and forecasting long-term records of monthly mean sea level from January 1978 to October 2020 at Grand Isle, Louisiana. Following the Box–Jenkins methodology, the ARIMA(1,1,1)(2,0,0)_12 with drift model was selected as the best-fit model for the time series, according to its lowest AIC value. Using the LM algorithm, the results revealed that the NARNN model with 9 neurons in the hidden layer and 6 time delays provided the best performance among the nonlinear autoregressive neural network models, according to its smallest MSE value. The Mixed model, a combination of the SARIMA and NARNN models, has both linear and nonlinear modelling capabilities and can therefore be a better choice for modelling the time series. The comparative results revealed that the Mixed-LM model with 9 neurons in the hidden layer and 3 time delays yielded higher accuracy than both the NARNN-LM model with 9 neurons in the hidden layer and 6 time delays and the ARIMA(1,1,1)(2,0,0)_12 with drift model, according to its lowest MSE in this study. Thus, this study may provide an integrated modelling approach as a decision-support method for formulating local mean sea level forecasts in advance. Understanding past sea level is important for the analysis of current and future sea level changes. Sustaining these observations through research programs that utilize the resulting data should significantly improve our understanding and narrow projections of future sea level rise and variability.


Introduction
Climate change has been addressed intensively, and the issue of sea level rise at an increasing rate has been discussed comprehensively as well. Two major climate-change-related factors cause sea level rise globally: water added to the ocean by melting ice (ice sheets and glaciers), and the thermal expansion of warming waters. Locally, the amount and speed of sea level rise vary by location; in the United States, in particular, the slowing Gulf Stream and sinking land affect some areas at varying rates (https://sealevelrise.org/). The National Oceanic and Atmospheric Administration (NOAA) 2019 Global Climate Annual Report summarized that the global annual temperature has increased at an average rate of 0.07°C (0.13°F) per decade since 1880, and that this rate of increase has doubled since 1981 (+0.18°C / +0.32°F per decade) (https://www.ncdc.noaa.gov/sotc/global/201913).

Data Source
The long-term records of monthly mean sea level from January 1978 to October 2020 at Grand Isle, Louisiana, used for this study are available to the public from NOAA Tides and Currents (https://tidesandcurrents.noaa.gov/). According to NOAA Tides and Currents, the term mean sea level can refer to a tidal datum, which is locally derived based on observations at a tide station and is typically computed over a 19-year period, known as the National Tidal Datum Epoch (NTDE). Mean sea level as a tidal datum is computed as the mean of hourly water level heights observed over 19 years. Monthly means are generated in the datum calculation process and are used to generate the relative local sea level trends observed at a tide station.

Seasonal ARIMA (SARIMA) Model
A time series is a set of observations, each one recorded at a specific time t. The sequence of random variables {y_t : t = 1, 2, ⋯, T} is called a stochastic process and serves as a model for an observed time series. Statistically, the ARIMA(p,d,q) model can be expressed in backshift notation as:

ϕ_p(B)(1 - B)^d y_t = c + θ_q(B)e_t

where p = the order of the autoregressive process, d = the number of differences required to make the time series stationary, q = the order of the moving average process, ϕ_p(B) = (1 - ϕ_1 B - ⋯ - ϕ_p B^p) = (1 - Σ_{i=1}^{p} ϕ_i B^i), θ_q(B) = (1 - θ_1 B - ⋯ - θ_q B^q) = (1 - Σ_{j=1}^{q} θ_j B^j), c is a constant, and e_t is the error term. The purpose of each of these parts is to make the model better fit and predict future points in the time series [24].
The ARIMA(p,d,q)(P,D,Q)_S model is used to represent the SARIMA model, which can be written in backshift notation as:

Φ_P(B^S) ϕ_p(B) (1 - B)^d (1 - B^S)^D y_t = c + Θ_Q(B^S) θ_q(B) e_t

where P = the order of the seasonal autoregressive process, D = the number of seasonal differences applied to the time series, Q = the order of the seasonal moving average process, S = the seasonality of the model (i.e., the number of time steps for a single seasonal period), Φ_P(B^S) = (1 - Φ_1 B^S - ⋯ - Φ_P B^{PS}), and Θ_Q(B^S) = (1 - Θ_1 B^S - ⋯ - Θ_Q B^{QS}). In time series forecasting, the Box–Jenkins methodology [4] refers to a systematic method of identifying, estimating, checking, and forecasting ARIMA models [5] that can be applied to find the best fit for the time series. The Box–Jenkins methodology can also be used as the process to build the SARIMA model in this study, based on the autocorrelation function (ACF) and partial autocorrelation function (PACF) as a means of determining the stationarity of the univariate time series and the lag lengths. Thus, the Box–Jenkins methodology starts from the assumption that the time series is stationary. If the time series is not stationary, it needs to be stationarized through differencing.
Empirically, plots and summary statistics can be used to identify trends and autoregressive elements, giving an idea of the amount of differencing and the size of the lag that will be required for model identification. To determine good parameters for the model, Akaike's Information Criterion (AIC) or the Bayesian Information Criterion (BIC) can be used: the orders of a SARIMA model are obtained by minimizing the AIC or BIC value. In the diagnostic checking step, plots and statistical tests of the residual errors can be used to assess the model fit, to evaluate the fitted model in the context of the available data, and to check where the model may be improved.
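As a sketch of this criterion-based order selection, the loop below fits autoregressions of increasing order by least squares and keeps the order with the lowest Gaussian AIC. The simulated data, the candidate orders, and the AIC form used are illustrative assumptions, not the study's actual computation (which used auto.arima in R).

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares fit of an AR(p) model with intercept; returns (coeffs, rss)."""
    Y = y[p:]
    lags = np.column_stack([y[p - i - 1 : len(y) - i - 1] for i in range(p)])
    X = np.column_stack([np.ones(len(Y)), lags])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    rss = float(np.sum((Y - X @ beta) ** 2))
    return beta, rss

def gaussian_aic(rss, n, k):
    """AIC up to an additive constant: n * ln(RSS / n) + 2k."""
    return n * np.log(rss / n) + 2 * k

# Simulate an AR(1) series so the criterion has a known low-order answer to prefer.
rng = np.random.default_rng(0)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.6 * y[t - 1] + rng.normal()

scores = {}
for p in (1, 2, 3):
    _, rss = fit_ar(y, p)
    scores[p] = float(gaussian_aic(rss, len(y) - p, p + 1))
best_order = min(scores, key=scores.get)   # order with the smallest AIC wins
```

The same minimize-the-criterion logic carries over to the full SARIMA order search; only the candidate models differ.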

Nonlinear Autoregressive Neural Network (NARNN) Model
The idea behind the autoregressive (AR) process is to explain the present value of the time series, y_t, by a function of p past values, (y_{t-1}, y_{t-2}, ⋯, y_{t-p}). Thus, the AR process of order p, AR(p), is defined by the equation:

y_t = ϕ_1 y_{t-1} + ϕ_2 y_{t-2} + ⋯ + ϕ_p y_{t-p} + e_t

where ϕ = (ϕ_1, ϕ_2, ⋯, ϕ_p) is the vector of model coefficients for the autoregressive process, and e_t is white noise, i.e., e_t ~ N(0, σ^2). The NARNN is a natural generalization of the classic linear AR(p) process. The NARNN of order p can be expressed as:

y_t = Φ(y_{t-1}, y_{t-2}, ⋯, y_{t-p}; w) + ɛ_t

where Φ(•) is an unknown function determined by the neural network structure and connection weights, w is a vector of all parameters (weights), and ɛ_t is the error term. Thus, it performs a nonlinear functional mapping from the past observations, (y_{t-1}, y_{t-2}, ⋯, y_{t-p}), to the future value, y_t, which is equivalent to a nonlinear autoregressive model [30].
With time series data, lagged values of the series can be used as inputs to a neural network; this is the so-called NARNN model. Mathematically, the NARNN model [2] can be written in the form:

y_t = a_0 + Σ_{j=1}^{k} w_j Φ(b_{0j} + Σ_{i=1}^{d} w_{ij} y_{t-i}) + e_t

where d is the number of input units, k is the number of hidden units, a_0 is the constant corresponding to the output unit, b_{0j} is the constant corresponding to hidden unit j, w_j is the weight of the connection between hidden unit j and the output unit, w_{ij} is the weight of the connection between input unit i and hidden unit j, and Φ(•) is a nonlinear function, the so-called transfer (activation) function. The logistic function (i.e., the sigmoid) is commonly used as the hidden layer transfer function, that is, Φ(y) = 1 / (1 + exp(-y)).
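A minimal numerical sketch of this single-hidden-layer mapping follows; the weights are random (untrained) and the lag values are arbitrary, standing in for a fitted network only to show how the formula evaluates.

```python
import numpy as np

def logistic(x):
    """Hidden-layer transfer function: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def narnn_output(lags, W_in, b_hidden, w_out, a0):
    """y_t = a0 + sum_j w_j * phi(b_0j + sum_i w_ij * y_{t-i})."""
    hidden = logistic(b_hidden + W_in @ lags)  # k hidden-unit activations
    return a0 + float(w_out @ hidden)          # linear output unit

# Illustrative sizes: d = 3 input lags, k = 5 hidden neurons, random weights.
rng = np.random.default_rng(1)
d, k = 3, 5
W_in = rng.normal(size=(k, d))     # w_ij: input-to-hidden weights
b_hidden = rng.normal(size=k)      # b_0j: hidden-unit constants
w_out = rng.normal(size=k)         # w_j: hidden-to-output weights

y_hat = narnn_output(np.array([0.2, 0.1, 0.05]), W_in, b_hidden, w_out, a0=0.1)
```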
The most common learning rules for the NARNN model are the Levenberg–Marquardt, Bayesian Regularization, and Scaled Conjugate Gradient training algorithms. In this study, the Levenberg–Marquardt (LM) algorithm was chosen because it works without computing the exact Hessian matrix. Instead, it works with the gradient vector and the Jacobian matrix, which increases the training speed and gives stable convergence [15].
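The LM update can be sketched as follows: the Hessian is approximated from the Jacobian as JᵀJ, and a damping factor μ is shrunk when a step reduces the error and inflated otherwise. The toy linear fitting problem below is purely illustrative and not part of the study.

```python
import numpy as np

def lm_step(residual, jacobian, w, mu):
    """One LM update: w - (J^T J + mu I)^{-1} J^T e, avoiding the exact Hessian."""
    e, J = residual(w), jacobian(w)
    return w - np.linalg.solve(J.T @ J + mu * np.eye(len(w)), J.T @ e)

# Toy problem: recover a = 2.0, b = 0.5 in y = a*x + b from noiseless samples.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 0.5

residual = lambda w: w[0] * x + w[1] - y                     # e(w)
jacobian = lambda w: np.column_stack([x, np.ones_like(x)])   # J = de/dw

w, mu = np.zeros(2), 1e-3
for _ in range(20):
    w_new = lm_step(residual, jacobian, w, mu)
    if np.sum(residual(w_new) ** 2) < np.sum(residual(w) ** 2):
        w, mu = w_new, mu * 0.1   # step helped: accept it and reduce damping
    else:
        mu *= 10.0                # step hurt: damp harder, toward gradient descent
```

Small μ makes the step close to Gauss–Newton (fast near a minimum); large μ makes it a short gradient-descent step (safe far from one), which is the source of the algorithm's stable convergence.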

Mixed SARIMA-NARNN (Mixed) Model
The SARIMA and NARNN models are good at modelling linear and nonlinear problems in a time series, respectively. The Mixed model, a combination of the SARIMA and NARNN models, has both linear and nonlinear modelling capabilities and can therefore be a better choice for modelling the time series. Assume that an unknown function describes the relationship between the linear and nonlinear components in the time series:

y_t = f(L_t, N_t)

where the linear component is represented by L_t and the nonlinear component by N_t. Assume further that the linear and nonlinear components in the time series have a simple additive relationship. Zhang [30] states that the time series can then be considered as a combination of a linear and a nonlinear component:

y_t = L_t + N_t

These two components must be estimated from the time series. First, the linear component is modelled by the SARIMA model in this study. The residuals from the SARIMA model then contain only the nonlinear relationship and are obtained by taking the difference of the observed and predicted values:

e_t = y_t - L̂_t

where e_t is the residual of the linear model at time t, and L̂_t is the predicted value for time t. To capture the nonlinear relationship, the residuals are modelled by the NARNN model in this study:

e_t = f(e_{t-1}, e_{t-2}, ⋯, e_{t-n}) + ɛ_t

where f is the transformation function modelled by the NARNN model, and ɛ_t is the random error. The forecasts from the SARIMA and NARNN models are combined to obtain the forecast of the time series, ŷ_t:

ŷ_t = L̂_t + N̂_t

For a non-stationary time series, differencing can be used to transform the series into a stationary one. When both trend and seasonality are present, a non-seasonal first difference and a seasonal difference may both need to be applied. Notice that the graph of the first difference of the time series looked approximately stationary.
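The additive decomposition can be sketched numerically. The code below uses deliberately naive placeholders (a seasonal-lag predictor for the linear part, a lag-1 carryover for the residual part) in place of the fitted SARIMA and NARNN models, only to show how the residuals and the combined forecast fit together.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(120)
y = np.sin(2 * np.pi * t / 12) + 0.1 * rng.normal(size=t.size)  # toy seasonal series

# Step 1: the "linear" component L_hat (placeholder: repeat last season's value).
L_hat = np.roll(y, 12)
L_hat[:12] = y[:12]            # no prediction available for the first season

# Step 2: residuals e_t = y_t - L_hat_t hold whatever the linear part missed.
e = y - L_hat

# Step 3: the "nonlinear" component N_hat modelled on the residuals
# (placeholder: carry the previous residual forward).
N_hat = np.roll(e, 1)
N_hat[0] = 0.0

# Step 4: combined forecast y_hat_t = L_hat_t + N_hat_t.
y_hat = L_hat + N_hat
mse = float(np.mean((y[13:] - y_hat[13:]) ** 2))
```

In the study proper, steps 1 and 3 are the fitted SARIMA and NARNN models; the arithmetic of steps 2 and 4 is identical.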
According to the Augmented Dickey–Fuller test, Dickey–Fuller = -11.944 with lag order = 7, and the p-value of the test was smaller than 0.01. The null hypothesis that the series is non-stationary was rejected, suggesting that the first difference of the time series was stationary. The ACF of the first difference (Figure 4) showed a significant positive spike at the first lag followed by correlations that were statistically significant. The corresponding PACF of the first difference (Figure 5) showed most likely a gradual decrease after the first few lags.
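The significance judgment in ACF plots like these rests on approximate 95% limits of ±1.96/√n around zero. A small sketch of the sample ACF and those limits, applied to a synthetic white-noise series of the same length as the study's record (514 observations):

```python
import numpy as np

def sample_acf(y, nlags):
    """Sample autocorrelations rho_0..rho_nlags (rho_0 = 1 by construction)."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    c0 = np.dot(y, y) / len(y)                    # lag-0 autocovariance
    return np.array([np.dot(y[: len(y) - k], y[k:]) / (len(y) * c0)
                     for k in range(nlags + 1)])

rng = np.random.default_rng(3)
wn = rng.normal(size=514)               # white noise, same length as the study's series
rho = sample_acf(wn, nlags=24)
bound = 1.96 / np.sqrt(len(wn))         # approximate 95% significance limits
n_significant = int(np.sum(np.abs(rho[1:]) > bound))  # expect ~1 of 24 by chance
```

For genuine white noise only about 5% of lags should exceed the limits, which is why residual ACFs staying inside them (as in the diagnostics later in this paper) indicate randomness.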

Seasonal ARIMA (SARIMA) Model
Seasonal differencing is defined as the difference between a value and a value at a lag that is a multiple of the seasonality (S). In this case, S = 12 (months per year) is the span of the periodic seasonal behavior. Figure 5 showed the graph of the 12th difference of the time series, which looked approximately stationary. At the same time, the test statistic of the Augmented Dickey–Fuller test was Dickey–Fuller = -37.923 with lag order = 7, and the p-value of the test was smaller than 0.01. The null hypothesis that the series is non-stationary was rejected, suggesting that the 12th difference of the time series was stationary. The ACF of the 12th difference (Figure 6) showed most likely a steady decay after the first few lags, bouncing between statistically significant positive and negative values. Meanwhile, the PACF of the 12th difference (Figure 7) looked mostly like a steady negative decay of the partial correlations toward zero. This is consistent with the general recommendation that if the autocorrelation at the first seasonal lag is positive, an autoregressive (AR) model should be used rather than a moving average (MA) model. The auto.arima() function from the "forecast" package in R 4.0.2 for Windows was employed to identify both the structure of the time series (stationary or not) and its type (seasonal or not) and to set the model's parameters, taking into account the generated AIC, AICc, or BIC values to determine the best-fit SARIMA model.
Consequently, the ARIMA(1,1,1)(2,0,0)_12 with drift model was selected as the best-fit model for the time series in this study, according to the lowest AIC value (-1612.92). The ARIMA(1,1,1)(2,0,0)_12 with drift model for the time series, the seasonally adjusted monthly mean sea level from January 1978 to October 2020 at Grand Isle, Louisiana, yields the following forecasting equation:

(1 - 0.4627B)(1 - B)(1 - 0.0542B^12 + 0.0647B^24)Ŷ_t = 0.0008 + (1 + 0.9671B)e_t

A common task when building a forecasting model is to check that the residuals satisfy certain assumptions: that they are uncorrelated, normally distributed, etc. The top panel of Figure 8 showed that the residuals from the ARIMA(1,1,1)(2,0,0)_12 with drift model did not violate the assumption of constant location and scale. The bottom right panel of Figure 8 also showed that the residual histogram did not reveal a serious deviation from normality in this case. The ACF plot of the residuals (the bottom left panel of Figure 8) illustrated that all sample autocorrelations were within the threshold limits, indicating that the residuals appeared to be random.
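Expanding the product of backshift polynomials on the left-hand side of this equation gives the weights the model implicitly places on current and lagged values of the series. Polynomial multiplication in B is a discrete convolution of coefficient arrays, so the expansion can be checked numerically; the sketch below uses the coefficients reported above.

```python
import numpy as np

# Backshift polynomials as coefficient arrays [B^0, B^1, ...], taken from the
# fitted ARIMA(1,1,1)(2,0,0)_12 with drift model reported in the text.
ar = np.array([1.0, -0.4627])        # (1 - 0.4627B)
diff = np.array([1.0, -1.0])         # (1 - B)
sar = np.zeros(25)
sar[0], sar[12], sar[24] = 1.0, -0.0542, 0.0647   # (1 - 0.0542B^12 + 0.0647B^24)

# Convolving the coefficient arrays expands the product; the result lists the
# implied weights on y_t, y_{t-1}, ..., y_{t-26}.
lhs = np.convolve(np.convolve(ar, diff), sar)
```

For instance, the coefficient on B^1 is -(1 + 0.4627) = -1.4627, combining the non-seasonal AR term with the first difference.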

Nonlinear Autoregressive Neural Network (NARNN) Model
In MATLAB, the NARNN model applied to time series prediction using past values of a univariate series can be expressed as follows:

y(t) = Φ(y(t-1), y(t-2), ⋯, y(t-d)) + e(t)

where y(t) is the time series value at time t, d is the time delay, and e(t) is the error of the approximation of the time series at time t. The function Φ(•) is an unknown nonlinear function, and the training of the neural network aims to approximate it by optimizing the network weights and neuron biases. Training minimizes the sum of the squared differences between the observed (y_i) and predicted (ŷ_i) values (i.e., MSE = (1/n) Σ_{i=1}^{n} (y_i - ŷ_i)^2) [1].
In this study, the NARNN model was applied to model and forecast the time series of monthly mean sea level from January 1978 to October 2020 at Grand Isle, Louisiana. The logistic sigmoid and linear transfer functions were used at the hidden and output layers, respectively. The extracted features were trained using the LM training algorithm for the target time series in the MATLAB (2019a) Neural Network Toolbox: 514 timesteps of one element, the monthly mean sea level from January 1978 to October 2020 at Grand Isle, Louisiana. The target timesteps were set aside for the training, validation, and testing phases in this case study. The division of the time series in this analytical work was 70% for training, 15% for validation, and 15% for testing. Randomly, the 514 data samples were divided into 360 for training, 77 for validation, and 77 for testing.
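That 70/15/15 division can be reproduced with a random permutation of the sample indices. The sketch below is an assumption about how such a split might be done (the seed and the rounding rule are illustrative, not details taken from the study's MATLAB workflow):

```python
import numpy as np

def split_indices(n, train=0.70, val=0.15, seed=0):
    """Randomly divide n sample indices into training/validation/testing sets."""
    rng = np.random.default_rng(seed)        # fixed seed only for repeatability
    idx = rng.permutation(n)
    n_train, n_val = round(n * train), round(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(514)
sizes = (len(train_idx), len(val_idx), len(test_idx))   # 360 / 77 / 77
```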
The development of the optimal architecture for the NARNN model requires determining the time delays, the number of hidden neurons, and an efficient training algorithm. The optimal number of time delays and hidden neurons was obtained through a trial-and-error procedure. The prediction performance of the models was evaluated by their MSE. The error analysis showed that the NARNN model with 9 neurons in the hidden layer and 6 time delays provided the best performance (MSE = 2.32078e-3) using the LM algorithm (NARNN-LM). The results revealed that the training progress using the LM algorithm stopped when the validation error increased for six consecutive iterations, with Performance = 0.00213, Gradient = 0.000585, and Mu = 1.00e-05 at epoch 9. In terms of processing time, the LM algorithm took 0:00:00 during the training process.
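The trial-and-error search can be organized as a small grid over candidate delays and hidden-layer sizes, keeping the configuration with the lowest validation MSE. In the sketch below, the training-and-scoring step is a hypothetical placeholder (it returns a random number rather than training a network), so only the selection logic is illustrated; the candidate values are likewise assumptions.

```python
import numpy as np

def train_and_score(delays, neurons, rng):
    """Hypothetical placeholder: in practice this would train a NARNN with the
    given time delays and hidden neurons and return its validation MSE."""
    return float(rng.random())

rng = np.random.default_rng(4)
grid = [(d, k) for d in (2, 3, 6) for k in (5, 9, 12)]     # candidate configurations
scores = {cfg: train_and_score(*cfg, rng) for cfg in grid}
best_cfg = min(scores, key=scores.get)   # configuration with the lowest validation MSE
```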
As illustrated in the performance plot, the best validation performance was 0.0028683 at epoch 3 for the NARNN-LM model. The results showed good network performance because the validation and testing errors had similar characteristics, and no significant overfitting appeared to have occurred. As shown in the regression plots, the regression R value was 0.90731 for the training phase, 0.87905 for the validation phase, 0.88299 for the testing phase, and 0.89865 for all samples, indicating good predictive ability of the NARNN-LM model on new datasets.
The dynamic network time-series response plots for the NARNN-LM model (Figure 10) showed that the outputs were distributed evenly on both sides of the response curve, and the errors versus time were small in the training, validation, and testing phases, indicating that the NARNN-LM model was able to predict the time series efficiently over the simulation period. The error autocorrelation function describes how the prediction errors are related in time. For a perfect prediction model, there should be only one nonzero value of the autocorrelation function, and it should occur at zero lag (i.e., the MSE). This would mean that the prediction errors are completely uncorrelated with each other (white noise). For the NARNN-LM model (Figure 11), except for the one at zero lag, almost all correlations fell approximately within the 95% confidence limits around zero, so the model seemed to be adequate.

Mixed SARIMA-NARNN (Mixed) Model
In MATLAB, the Mixed model applied to time series prediction using past residuals from the SARIMA model can be expressed as follows:

e(t) = Φ(e(t-1), e(t-2), ⋯, e(t-d)) + ɛ(t)

where e(t) is the residual of the time series at time t, d is the time delay, and ɛ(t) is the error term. This equation describes how the Mixed model predicts the future residual of a time series, e(t), from the past residuals (e(t-1), e(t-2), ⋯, e(t-d)).
Similarly, the development of the optimal architecture for the Mixed model requires determining the time delays, the number of hidden neurons, and an efficient training algorithm. The error analysis showed that the Mixed model with 9 neurons in the hidden layer and 3 time delays provided the best performance (MSE = 2.29385e-3) with the LM algorithm (Mixed-LM). The training progress using the LM algorithm for the Mixed-LM model stopped when the validation error increased for six consecutive iterations, with Performance = 0.00226, Gradient = 0.000875, and Mu = 1.00e-06 at epoch 9. As illustrated in the performance plot, the best validation performance was 0.0028901 at epoch 3 for the Mixed-LM model. The results again showed good network performance because the validation and testing errors had similar characteristics, and no significant overfitting appeared to have occurred. The dynamic network time-series response plots for the Mixed-LM model (Figure 12) showed that the outputs were distributed evenly on both sides of the response curve, and the errors versus time were small in the training, validation, and testing phases. The results indicated that the Mixed-LM model was also able to predict the time series efficiently over the simulation period. For the Mixed-LM model, all correlations except the one at zero lag fell approximately within the 95% confidence limits around zero, so the model appeared to be adequate (Figure 13).

Conclusion
Many studies have concluded that global sea level is rising at an increasing rate. Although sea level changes are a relatively slow process, understanding past sea level is important for the analysis of current and future sea level changes. Sea level changes are driven by a variety of mechanisms operating at different spatial and temporal scales [19]. Furthermore, forecasting future relative sea level changes at specific locations requires not just an estimate of global mean sea level changes, but also estimates of the different processes contributing to global mean sea level changes, as well as of the processes contributing exclusively to regional or local mean sea level changes [20]. Empirically, the SARIMA and NARNN models are good at modelling linear and nonlinear problems in a time series, respectively. However, the Mixed model, a combination of the SARIMA and NARNN models, has both linear and nonlinear modelling capabilities and can be a better choice for modelling the time series. According to the results of this study, the Mixed SARIMA-NARNN model not only provides richer information, which is important in decision-making processes related to future local mean sea level impacts, but can also be employed in forecasting future local mean sea level change outcomes.