A Seasonal ARIMA Model With Exogenous Variables (SARIMAX) for Elspot Electricity Prices in Sweden
Mengchen Xie University of Southern California
Department of Mathematics 1230 1/2 W27TH Street, 90007 Los Angeles, USA mengchenxie@gmail.com
Abstract--In a spot market, price prediction plays an indispensable role in maximizing the benefit of a producer as well as optimizing the utility of a consumer. This paper develops a seasonal ARIMA model with exogenous variables (SARIMAX) to predict day-ahead electricity prices in the Elspot market, the largest day-ahead market for power trading in the world. Compared with the basic ARIMA model, SARIMAX has two distinct features: 1) a seasonal component is introduced to capture the weekly effect on price fluctuations; 2) exogenous variables that influence electricity prices are incorporated so that price predictions are made in the context of an integrated energy market. A detailed implementation of SARIMAX for the Elspot market in Sweden is presented.
Index Terms--Seasonal ARIMA model, exogenous variables, electricity market, price prediction, time series

Claes Sandels, Kun Zhu, Lars Nordström Royal Institute of Technology
Department of Industrial Information and Control System Osquldasväg 12, 7 tr., 100 44 Stockholm,Sweden

I. INTRODUCTION

As the largest market for electrical energy in the world, measured in volume traded (TWh) and market share [1], Nord Pool Spot operates in Norway, Denmark, Sweden, Finland, Estonia and Lithuania. More than 70% of the total consumption of electrical energy in the Nordic market is traded through Nord Pool Spot. Its membership includes energy producers, energy-intensive industries, large consumers, distributors, funds, investment companies, banks, brokers, utility companies and financial institutions. The majority of its members participate in Elspot, a day-ahead auction where energy is traded for delivery during the next day. Orders are placed hour by hour through Nord Pool Spot's web-based trading system SESAM. When all orders have been submitted, the equilibrium between aggregated supply and demand is established as the system price. Because the Elspot market is divided into several bidding areas, limited transmission capacity can congest the flow of electrical energy between areas, in which case different area prices are established. The system and area prices are normally calculated and published between 12:30 and 12:45 CET with a 3-minute warning.

In the Elspot market, time series models have been widely applied to predict future prices. Their effectiveness has been validated by case studies of the Nord Pool Spot market [2] and some of its bidding areas [3].

A. Purpose
The aim of this paper is to study the Elspot price dynamics in Sweden and their interactions with related variables (wind power, hydropower, gas and crude oil). A seasonal autoregressive integrated moving average model with exogenous variables (SARIMAX) is developed to describe meaningful characteristics of the price dynamics and to predict the next-day price based on historical data of Elspot prices and the related variables.

B. Layout of the paper
The paper is organized as follows. Section II gives a brief overview of SARIMAX, the time series model used in this paper. Section III describes how the model is applied to Elspot in Sweden. Section IV presents the results together with interpretations. Section V gives conclusions and discussion.

II. AN OVERVIEW OF SARIMAX MODEL

A. Time Series
A time series is a set of values taken by a variable over time. A common notation specifying a time series X is

$$ X = \{X_t : t \in T\} \tag{1} $$

where each value is recorded at a specific time t. This paper analyzes discrete time series.

B. Stationarity
The foundation of time series analysis is stationarity. A time series $\{X_t\}$ is strictly stationary if the joint distribution of $\{X_{t_1}, \ldots, X_{t_k}\}$ is identical to that of $\{X_{t_1+t}, \ldots, X_{t_k+t}\}$ for all t. A time series $\{X_t\}$ is weakly stationary if both the mean of $X_t$ and the covariance between $X_t$ and $X_{t-l}$ are time invariant, where l is an arbitrary integer. In practical applications, it is common to require data to be weakly stationary [4], which applies to this paper as well.

C. The ARMA Model
The autoregressive model of order p, AR(p), can be defined as:

$$ X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \ldots + \phi_p X_{t-p} + \varepsilon_t \tag{2} $$

The moving average model of order q, MA(q), can be defined as:

$$ X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \ldots + \theta_q \varepsilon_{t-q} \tag{3} $$

A sequence $\{\varepsilon_t\}$ is called white noise if it consists of uncorrelated random variables with zero mean and variance $\sigma^2$, denoted as $\{\varepsilon_t\} \sim WN(0, \sigma^2)$. In (2) and (3), $\{\varepsilon_t\}$ is white noise. The ARMA(p, q) model can be defined by combining AR(p) and MA(q) as follows:

$$ X_t - \phi_1 X_{t-1} - \ldots - \phi_p X_{t-p} = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \ldots + \theta_q \varepsilon_{t-q} \tag{4} $$

D. Differencing and the Autoregressive Integrated Moving Average Model
The idea of transforming a nonstationary series into a stationary one by considering its change series $Y_t = X_t - X_{t-1}$ is called differencing in the time series literature [4]. The differencing operator is defined as

$$ \Delta X_t = X_t - X_{t-1} = (1 - L) X_t \tag{5} $$

where L is the lag operator, $L X_t = X_{t-1}$. The differencing operator can be applied multiple times if necessary. A time series is said to follow an Autoregressive Integrated Moving Average model, ARIMA(p, d, q), if after differencing d times it follows a stationary and invertible ARMA(p, q) model, where d is called the order of integration.

E. The Seasonal Autoregressive Integrated Moving Average Model
Based on ARIMA, the SARIMA(p,d,q)(P,Q,S) model incorporates seasonal components so as to describe processes with seasonal behavior. In this paper, the weekly effect on electricity prices is studied. Here p denotes the order of AR, q the order of MA, d the order of integration, P the order of seasonal AR, Q the order of seasonal MA, and S the time span of the repeating seasonal pattern.

Seasonal AR:

$$ \phi(L^S) = 1 - \phi_1 L^S - \ldots - \phi_P L^{PS} \tag{6} $$

Seasonal MA:

$$ \theta(L^S) = 1 + \theta_1 L^S + \ldots + \theta_Q L^{QS} \tag{7} $$

F. Model with Exogenous Variables
In order to improve the prediction of future values, another time series that is known to covary with the data at hand can be incorporated into the time series model. Such an external input is called an exogenous variable. For example, the following is an ordinary ARMA model with an exogenous variable, ARMAX:

$$ Y_t = \sum_{i=1}^{p} \phi_i Y_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \sum_{h=1}^{b} \beta_h X_{t-h} + \omega_t \tag{8} $$

where $\{\varepsilon_t\}$ is white noise and $\{X_t\}$ is the exogenous variable.

III. SARIMAX PRICE PREDICTION MODEL FOR ELSPOT IN THE BIDDING AREA OF SWEDEN

A. Data
a) Data Time Range: The price data set used to develop the model consists of 730 daily observations from 2010.01.01 to 2011.12.31. A longer span of data can be useful when analyzing seasonal effects (for example, winter peaks). For this paper, however, the time span is sufficient. Firstly, only the weekly effect is studied here, which frees the research from the need for a large amount of data. Secondly, the most recent data are usually more desirable than earlier data in price prediction, because earlier data might not reflect the current price level due to inflation and other significant changes.

b) Exogenous Variables Selection: Given that the Elspot market price is directly determined by the equilibrium point of aggregated supply and demand, several exogenous variables are chosen by examining the energy production structure in Sweden [5]. The production structure is summarized in Table I. As shown in the table, hydropower and nuclear power are the two primary sources of energy production, followed by thermal power and wind power. Therefore, four exogenous variables are selected: hydropower production, nuclear power production, thermal power production and wind power production. Daily data for the exogenous variables are obtained from the Swedish national grid [6].

Table I. Sweden Energy Production (GWh) 2010-2012
Energy source    2010     2011     2012
Wind Power       3469     6193     6630
Hydropower       67411    67406    72151
Gas & Diesel     167      148      85
Nuclear Power    55778    58146    55469
Thermal Power    12525    9788     6884

c) Data Processing: Before modeling, the data have to be processed. Missing data are filled using spline interpolation, and the logarithm of the data is taken to smooth them.
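The preprocessing step can be sketched as follows. This is a minimal illustration only: linear interpolation is used as a simple stand-in for the spline interpolation mentioned in the text, and the function names and the price values are hypothetical, not taken from the paper's data set.

```python
import math


def fill_missing_linear(values):
    """Fill None gaps by linear interpolation between the nearest observed values.

    Assumption: this is a stand-in for the spline interpolation named in the
    text. Gaps at the start or end are filled with the nearest observed value.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    for i in range(len(filled)):
        if filled[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def log_transform(values):
    """Take the natural logarithm of every (strictly positive) observation."""
    return [math.log(v) for v in values]


# Hypothetical daily prices with missing days marked as None.
prices = [40.0, None, 44.0, 46.0, None, None, 52.0]
clean = fill_missing_linear(prices)
log_prices = log_transform(clean)
```

A spline would give a smoother fill over long gaps; for isolated missing days the two approaches usually differ little.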

B. Stationarity Test
Two hypothesis tests for stationarity are introduced in [7]. The most commonly used stationarity test, the KPSS test, is due to Kwiatkowski, Phillips, Schmidt and Shin (1992). It tests the null hypothesis that a time series is stationary against the alternative that it is non-stationary. The augmented Dickey-Fuller test tests the null hypothesis that a time series is non-stationary with integration order 1 against the alternative that it is stationary, assuming that the dynamics in the data have an ARMA structure. Both the KPSS test and the augmented Dickey-Fuller test are conducted on every variable. The results show that each variable becomes stationary only after it has been differenced once.

C. Seasonality: Weekly Effect
Weekly effects have been observed in Elspot prices. Prices tend to be lower on weekends than on weekdays. An intuitive explanation is that most industries are closed on weekends, so demand drops and prices fall. In time series analysis, the sample autocorrelation function (ACF) and partial autocorrelation function (PACF) are used to determine the order of seasonality. The ACF and PACF of the price differenced once (Fig. 1) show spikes recurring at an interval of 7, indicating a weekly effect on prices [8]. The spikes are not eliminated when the series is differenced once more, indicating that seasonal components should be added. Therefore, the time span of the recurring seasonal pattern is set to 7.
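The ACF-based reasoning above can be illustrated with a small sketch. The series below is synthetic (a slow trend plus a weekend dip) and is an assumption made only for demonstration; the function names are likewise hypothetical. The sketch shows that the sample ACF of the once-differenced series has a large value at lag 7, and that differencing a second time does not remove it.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sample_acf(series, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Synthetic log prices: linear trend plus lower values on two "weekend" days.
weekly_pattern = [0.0, 0.0, 0.0, 0.0, 0.0, -0.2, -0.2]
series = [3.5 + 0.001 * t + weekly_pattern[t % 7] for t in range(140)]

diff1 = difference(series)   # differenced once
diff2 = difference(diff1)    # differenced twice

acf1 = sample_acf(diff1, max_lag=14)
acf2 = sample_acf(diff2, max_lag=14)

# Spikes at multiples of 7 persist in both series.
print("lag 7 after one difference: ", round(acf1[7], 3))
print("lag 7 after two differences:", round(acf2[7], 3))
```

On real price data the spikes are less clean than in this noise-free example, which is why the ACF and PACF plots are inspected together.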

Model Selection
The Akaike Information Criterion (AIC) is used to determine the orders of AR, MA, SAR and SMA. The AIC is a measure for comparing models with each other: it rewards models for good fit and penalizes them for complexity [9]. The AIC is defined as

$$ \mathrm{AIC} = -2 \ln(L) + 2k \tag{9} $$

where L is the maximized value of the likelihood function and k is the number of parameters, so that complexity is penalized through k. In the model selection process, the model with the lowest AIC value is chosen. In this case, SARIMA(1,1,2)(2,2,7) is chosen (Table II).
Table II. AIC Value of Competing SARIMA models
               ARIMA
Seasonality    (1,1,0)   (2,1,0)   (2,1,1)   (1,1,1)   (1,1,2)   (2,1,2)
(0,0,0)        -381.5    -464.9    -465.9    -454.0    -475.7    -474.7
(1,0,7)        -440.9    -503.3    -501.5    -514.2    -505.2    -525.9
(1,1,7)        NA        NA        NA        -541.9    -567.4    -576.5
(2,1,7)        NA        NA        NA        -580.4    -607.4    -608.5
(2,2,7)        NA        NA        NA        -653.5    -666.8    -664.8
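The selection rule can be checked directly against the values reported in Table II. The sketch below assumes that each column of the table corresponds to an ARIMA order and each row to a seasonal order; the function names are hypothetical, and the `aic` helper simply restates (9).

```python
def aic(log_likelihood, n_params):
    """AIC = -2 ln(L) + 2k, where ln(L) is the maximized log-likelihood."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Seasonal orders (P, Q, S) labelling the rows of Table II.
seasonal_orders = [(0, 0, 0), (1, 0, 7), (1, 1, 7), (2, 1, 7), (2, 2, 7)]

# AIC values from Table II, one list per ARIMA order; None marks "NA" cells.
table_ii = {
    (1, 1, 0): [-381.5, -440.9, None, None, None],
    (2, 1, 0): [-464.9, -503.3, None, None, None],
    (2, 1, 1): [-465.9, -501.5, None, None, None],
    (1, 1, 1): [-454.0, -514.2, -541.9, -580.4, -653.5],
    (1, 1, 2): [-475.7, -505.2, -567.4, -607.4, -666.8],
    (2, 1, 2): [-474.7, -525.9, -576.5, -608.5, -664.8],
}


def select_lowest_aic(table, row_labels):
    """Return (aic_value, arima_order, seasonal_order) with the lowest AIC."""
    candidates = [
        (value, arima, seasonal)
        for arima, column in table.items()
        for seasonal, value in zip(row_labels, column)
        if value is not None
    ]
    return min(candidates)


best_aic, best_arima, best_seasonal = select_lowest_aic(table_ii, seasonal_orders)
print(best_aic, best_arima, best_seasonal)
```

Because the AIC is only comparable between models fitted to the same data, the comparison is restricted to the candidates in the table.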

D. Regression of Exogenous Variables
Exogenous variables are added as a complement to explain price movements. The ARIMAX(p, d, q, b) model is shown in (10). Ordinary Linear Regression (OLR) is used to measure the one-day-lagged impact of the exogenous variables.

$$ Y_t = \sum_{i=1}^{p} \phi_i Y_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \sum_{h=1}^{b} \beta_h X_{h,t-1} + \omega_t \tag{10} $$

As shown in (10), $X_1, \ldots, X_b$ are b different exogenous variables. In this paper, b = 4. Exogenous variable data for the day before the target date of the price prediction are added to the model. Further time-lagged effects of the exogenous variables (refer to (8)) are ignored.

Fig. 1 Sample ACF and PACF of prices differenced once and twice, respectively

IV. RESULTS

A. SARIMAX Simulation Results
Data from 2010.01.01 to 2011.12.31 are taken as historical data to fit the model. The fitted SARIMAX model is then used to predict Elspot prices from 2012.01.01 to 2012.07.01, with the results displayed in Fig. 2. In general, the predictions track the actual prices over the same period closely.

B. Error Analysis
The histogram of the simulation errors is shown in Fig. 3. The distribution of the prediction errors approximates a normal distribution with zero mean, which is consistent with the assumptions of the model. The mean absolute percentage error (MAPE) is a measure of accuracy for a fitted time series: it is the mean absolute difference, in percent, between the predictions and the actual values. In this case, the model has a MAPE of 1.95%. The maximum absolute percentage error (MaxMAPE) is 8.85%. It measures the largest forecast error in percent, which is useful for understanding the worst-case scenario for predictions. These statistics are comparable to those of ARIMA models developed by other researchers to forecast electricity prices in similar situations. For example, in [10], an ARIMA model is developed to forecast day-ahead prices on the EPEX exchange; its MAPE is 2.38% and its MaxMAPE is 14.74%.
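The two error statistics can be computed as in the following sketch. The function names and the sample values are hypothetical and serve only to illustrate the definitions; they are not the paper's data.

```python
def absolute_percentage_errors(actual, predicted):
    """Absolute error of each prediction as a percentage of the actual value."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [abs(a - p) / abs(a) * 100.0 for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean absolute percentage error."""
    errors = absolute_percentage_errors(actual, predicted)
    return sum(errors) / len(errors)


def max_ape(actual, predicted):
    """Largest absolute percentage error (called MaxMAPE in the text)."""
    return max(absolute_percentage_errors(actual, predicted))


# Hypothetical actual and predicted prices.
actual = [40.0, 50.0, 80.0]
predicted = [41.0, 49.0, 76.0]

print("MAPE:", mape(actual, predicted))
print("Max APE:", max_ape(actual, predicted))
```

Both measures are undefined when an actual value is zero, which is not a concern for the strictly positive prices considered here.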

V. CONCLUSION AND DISCUSSION

The SARIMAX model performs satisfactorily. Further work aimed at improving model accuracy and practicability could address the following aspects.

In this paper, the SARIMAX model is developed on the basis of historical data from 2010.01.01 to 2011.12.31 to predict prices from 2012.01.01 to 2012.07.01. In practice, model parameters should be updated periodically so as to keep up with the trend of price movements.

Seasonality could be analyzed more comprehensively given a longer time span of data. Limited by the number of observations, the sample ACF/PACF shows only an indication of the weekly cycle. With more observations, yearly seasonality would likely become much more apparent.

The SARIMAX model has inherent limitations. An obvious one is that it deals solely with instantaneous impacts of exogenous variables without taking time-lagged effects into consideration. However, time-lagged effects are inevitable because it usually takes time for a market to react to external changes. A vector autoregressive model could be introduced to deal with delayed impacts. Furthermore, a vector error correction model could be developed on the basis of the vector autoregressive model if long-term equilibria exist among level stationary variables [11].

REFERENCES
[1] http://www.nordpoolspot.com/TAS/Day-ahead-market-Elspot
[2] J. Lindberg, "A Time Series Forecast of the Electrical Spot Price," Umeå Universitet, Sep. 2010.
[3] A. Løland, X. K. Dimakos, "Modeling Nord Pool's NO1 area price," The Journal of Energy Markets, vol. 3, pp. 73-92, Spring 2010.
[4] R. Tsay, Analysis of Financial Time Series (3rd ed.). New Jersey: Wiley, 2010, p. 81.
[5] Swedish Energy Agency, Energy in Sweden. [Online]. Available: http://www.energimyndigheten.se/en/
[6] Svenska Kraftnät, http://svk.se/Energimarknaden/El/Statistik/Elstatistikfor-hela-Sverige/
[7] E. Zivot, J. Wang, "Modeling Financial Time Series with S-Plus," New York: Springer-Verlag, 2003.
[8] R. Nau, "Lecture Notes for Statistical Forecasting," [Online]. Available: http://people.duke.edu/~rnau/seasarim.htm
[9] S. Ng, P. Perron, "A Note on the Selection of Time Series Models," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 67(1), pp. 115-134, 2005.
[10] T. Jacasa, I. Androcec, P. Sprcic, "Electricity price forecasting-ARIMA model approach," 8th International Conference on the European Energy Market, May 2011.
[11] K. C. Pradhan, K. S. Bhat, "An Empirical Analysis of Price Discovery, Causality and Forecasting in the Nifty Futures Markets," International Research Journal of Finance and Economics, Issue 26, 2009.

Fig. 2. Actual prices and SARIMAX predictions

Fig. 3. Histogram of prediction errors
