Forecasting Gold Prices Using a Linear Regression Model and ARIMA
DIANE MAHAMEDOU Department of Economics, Business and Finance, Brooklyn College, 2900 Bedford Avenue Brooklyn, N.Y. 11210, USA
Instructor: Prof. Yusheng Peng
Abstract: Forecasting is a management function that supports decision making. It arises whenever future unknowns must be estimated, such as the prices of commodities, GDP, or the unemployment rate for the coming period, and accurate prediction of such quantities relies on time series estimation. Gold is a precious yellow commodity that was once used as money. Illegal to hold some years ago, it is once again accepted as a potential currency because of the fall of the dollar against the Euro and rising uncertainty in the geopolitical environment. The objective of this study is to develop a forecasting model for gold prices based on the movements of two exchange rates and of the oil price. Following the decline of the US dollar, investors have been putting their money into gold because it acts as a stabilizing influence in investment portfolios. With demand for gold increasing around the world, we have found it necessary to develop a linear regression model that reflects the structure and pattern of the gold market and forecasts the movement of the gold price. The approach adopted here is multiple linear regression (MLR), which studies the relationship between a single dependent variable and one or more independent variables. The fitted MLR model will be used to predict future gold prices. Variables are defined as follows:
Average Gold Prices in USD: GP
Exchange rate of Euro to USD: EUROUSD
Exchange rate of Japanese Yen to USD: JPYUSD
Average oil Prices: OP
Introduction
Price forecasting is an integral part of economic decision making. Forecasts may be used in numerous ways and for different purposes, such as to gain from speculative activities, to determine optimal government policies, or to make business decisions. Like that of any other good, the price of gold depends on supply and demand. Gold is the most traded precious metal and plays an important role in shaping the economy. It is also a safe haven against depreciation risk. In contrast to other commodities, gold is storable and its supply has accumulated over centuries. Today gold, like other commodities, is predominantly quoted in U.S. dollars. From 1944 to 1971, U.S. dollars were convertible into gold in order to prevent trade imbalances between countries, and the price of gold was fixed at $35 per troy ounce. After 1971, when the convertibility of the dollar into gold was cancelled, the price stability of this commodity vanished. Gold behaves less like a commodity and more like a long-lived asset such as a stock or a bond: gold prices are forward looking, and today's price depends heavily on future supply and demand. Thus, the forecast of the gold price depends on the market's psychological perception of the value of gold, which in turn depends on a myriad of interrelated variables, including the inflation rate, currency fluctuations and political turmoil, for instance the Iraq war of 2003 and the subprime crisis of 2008-2009. In this study we first present a forecasting model for predicting future gold prices using the Multiple Linear Regression method. We then apply some technical and chart analysis and end with a discussion of the selected model.
Problem statement: Gold prices form a time series of prices fixed twice a day in London. Many factors influence gold prices, so for the sake of accuracy we need a careful and selective study to ensure that the model developed is significant; otherwise we will end up with a biased result. It is common practice in the gold trade to use the London PM Fix as the reference for pricing gold, and it has become the published benchmark price used by producers, consumers, investors and central banks. In this study, we propose a forecasting model for predicting future gold prices using Multiple Linear Regression (MLR). The data used are the Gold Prices (GP) from the London PM Fix (the afternoon fixing), which we have converted into monthly average Gold Prices. GP is the single dependent variable in this model. Based on our own experience as "Chartists", we have identified three independent variables that influence gold prices: the exchange rate of Euro to USD (EUROUSD), the exchange rate of Japanese Yen to USD (JPYUSD), and average oil prices (OP). However, these are not the only factors influencing gold prices. The data used in this study were downloaded from several sources at the addresses shown below:
Now our main purpose is to determine the regression model that most accurately predicts the gold price. Running the regressions in R gives 7 candidate models. Let us define the variables as follows: Y-GP, X1-EUROUSD, X2-JPYUSD, X3-OP. A first-order equation is hypothesized to be:
Y = β0 + β1X1 + β2X2 + β3X3 + ε
Model 1 (includes all the potential independent variables chosen for our study): Yhat = 4212.1509 - 944.6394X1 - 25.8220X2 + 7.6330X3; R-squared: 0.8742; Residual standard error: 121.4; 59 degrees of freedom.
Model 2: Yhat = 3875.748 + 56.071X1 - 30.061X2; R-squared: 0.7196; Residual standard error: 179.7; 60 degrees of freedom.
Model 3: Yhat = 3999.541 - 2654.096X1 + 10.643X3; R-squared: 0.5029; Residual standard error: 239.3; 60 degrees of freedom.
Model 4: Yhat = 3472.5478 - 30.2774X2 + 5.7808X3; R-squared: 0.8388; Residual standard error: 136.3; 60 degrees of freedom.
Model 5: Yhat = 3446.0 - 1580.5X1; R-squared: 0.1817; Residual standard error: 304.5; 61 degrees of freedom.
Model 6: Yhat = 828.244 + 5.198X3; R-squared: 0.09663; Residual standard error: 319.9; 61 degrees of freedom.
Model 7: Yhat = 3928.317 - 29.787X2; R-squared: 0.7195; Residual standard error: 178.3; 61 degrees of freedom.
After assessing all 7 models, we find that Model 1 is the best fit for gold price forecasting, with an R-squared of 0.8742. This means that 87.42% of the variation of gold prices around the mean price is explained by the time series equation Yt = 4212.1509 - 944.6394X1t - 25.8220X2t + 7.6330X3t + εt. Table I shows the actual prices of gold compared with the predicted prices.
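The comparison of the seven candidate models can be sketched in R as follows. This is only an illustration under stated assumptions: the data frame gold_sim is simulated (it is not the paper's data set), and the coefficients used to generate it are arbitrary round numbers. Because R-squared can never decrease when a regressor is added, the sketch also reports adjusted R-squared, which may be a fairer basis for comparing models of different size.

```r
# Sketch: fit the seven candidate regressions and compare their fit.
# ASSUMPTION: `gold_sim` is simulated stand-in data, not the paper's data.
set.seed(1)
n <- 63
gold_sim <- data.frame(
  EUROUSD = runif(n, 1.22, 1.58),
  JPYUSD  = runif(n, 76, 110),
  OP      = runif(n, 39, 134)
)
# Arbitrary illustrative coefficients plus noise.
gold_sim$GP <- 4200 - 900 * gold_sim$EUROUSD - 26 * gold_sim$JPYUSD +
  7.5 * gold_sim$OP + rnorm(n, sd = 120)

# One formula per candidate model, numbered as in the text.
formulas <- list(
  model1 = GP ~ EUROUSD + JPYUSD + OP,
  model2 = GP ~ EUROUSD + JPYUSD,
  model3 = GP ~ EUROUSD + OP,
  model4 = GP ~ JPYUSD + OP,
  model5 = GP ~ EUROUSD,
  model6 = GP ~ OP,
  model7 = GP ~ JPYUSD
)

fits <- lapply(formulas, function(f) lm(f, data = gold_sim))

# Collect the fit statistics quoted in the text for each model.
comparison <- data.frame(
  r_squared     = sapply(fits, function(m) summary(m)$r.squared),
  adj_r_squared = sapply(fits, function(m) summary(m)$adj.r.squared),
  resid_se      = sapply(fits, function(m) summary(m)$sigma),
  df_resid      = sapply(fits, df.residual)
)
print(round(comparison, 4))
```

With 63 observations the residual degrees of freedom come out as 59, 60 and 61 for the three-, two- and one-regressor models, which matches the values reported above.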
Analyzing the trend, we see that the predicted price tracks the actual price of gold closely. Our model can serve as a basis for projecting future gold prices, but unfortunately we do not know the future values of the explanatory variables. We therefore decided to base the forecast solely on the gold price series itself, using the ARIMA method. This may be a little ambitious but, once done, it can serve as a guide for economic operators and even for policymakers. Chart and technical analysis may also be helpful.
Analysis of Data
1. The basic ARIMA model analysis of the historical Gold prices
To perform the basic ARIMA time series analysis on the historical Gold prices, we first plot the raw data, i.e. the monthly average prices of Gold over the period. The plot is shown below:
This plot shows that the average price of Gold has generally increased over the past five years. However, there is no apparent pattern in the movement of the Gold price. From period 1 to period 20 the price stayed in a trading range. After period 20 (August 2009), Gold prices moved up sharply and made a high around $1800 during period 45 (September 2011). Since then prices have entered a new congestion zone between $1800 and $1600. These observations suggest that the variance of Gold prices is not constant over time. To correct the non-constant variance we transform the data, and we then plot the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the first differences of the transformed data:
From these plots, neither the ACF nor the PACF shows a clear pattern. However, the plot of the first-differenced data and the decaying partial autocorrelations in the PACF plot suggest a tentative ARIMA(1, 1, 0) model. The forecasts obtained for the next 12 monthly periods are shown below:
This graph shows two major trends: a downward trend below the $1400 price line and an upward trend above it. We therefore need a little technical and chart analysis to determine in which direction Gold prices will move in the coming periods.
2. Technical and Chart Analysis
We plot the monthly Gold price chart obtained from investing.com, showing price movements from 2005 up to now. Since the end of 2003 the price of gold continued to appreciate until a major top was reached during July and August 2011 at around $1900.00. The $1900.00 level (resistance) was tested twice, after which the market fell and entered a congestion zone where prices oscillated between $1800 and $1556. The Harami (a reversal pattern in technical and chart analysis) that formed during August and September 2012 was already an important reversal signal. It led to the fall of the market, and the support level at 1556, which had been tested twice, was finally broken. The new target is the support level around 1178.11, which according to the ARIMA forecast could be reached during periods 74 and 75 (February and March 2014).
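The ARIMA steps described in section 1 (transform, difference, inspect the ACF and PACF, fit ARIMA(1, 1, 0), forecast 12 periods) can be sketched in base R as follows. Several things here are assumptions rather than the paper's own choices: the series is simulated, the transformation is taken to be the natural logarithm (the text does not name the transformation used), and the start date of January 2008 is inferred from the period numbering.

```r
# Sketch: ARIMA(1, 1, 0) workflow on a SIMULATED monthly series.
# ASSUMPTIONS: simulated data, log transformation, start date January 2008.
set.seed(42)
n <- 63
# Random walk with drift on the log scale, starting near log(890).
log_gp <- log(890) + cumsum(c(0, rnorm(n - 1, mean = 0.01, sd = 0.05)))
gp <- ts(exp(log_gp), start = c(2008, 1), frequency = 12)

# First differences of the log-transformed series.
d_log_gp <- diff(log(gp))

# ACF and PACF of the differenced series; plot = FALSE returns the values
# (drop the argument to draw the correlograms instead).
acf_vals  <- acf(d_log_gp, lag.max = 12, plot = FALSE)
pacf_vals <- pacf(d_log_gp, lag.max = 12, plot = FALSE)

# ARIMA(1, 1, 0) on the log series; `order` handles the differencing.
fit <- arima(log(gp), order = c(1, 1, 0))

# 12-step-ahead forecast with approximate 95% bounds,
# back-transformed to price units.
fc <- predict(fit, n.ahead = 12)
forecast_price <- exp(fc$pred)
lower <- exp(fc$pred - 1.96 * fc$se)
upper <- exp(fc$pred + 1.96 * fc$se)
print(round(cbind(forecast_price, lower, upper), 1))
```

The forecast interval widens with the horizon, which is one reason ARIMA forecasts are usually read as short-term guidance only.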
Discussion
Price forecasting is an important component of many economic decisions. Forecasts may be used in numerous ways, and in this study we have proposed forecasting models based on multiple linear regression (MLR). Initially, we included all the potential independent variables. In the final analysis, we concluded that Model 1 is the best fit for gold price forecasting: Yhat = 4212.1509 - 944.6394X1 - 25.8220X2 + 7.6330X3, where Y-GP (gold price), X1-EUROUSD (exchange rate of Euro to USD), X2-JPYUSD (exchange rate of Japanese Yen to USD), X3-OP (average oil prices). This model seems appropriate because it explains about 87.42% of the variance. Linear regression equations forecast the dependent variable by plugging likely values of the independent variables into the estimated equation and calculating a predicted value of Y; the prediction thus rests on the independent variables and on their estimated coefficients. ARIMA is an increasingly popular forecasting technique that completely ignores independent variables in making forecasts. It is a highly refined curve-fitting device that uses current and past values of the dependent variable to produce often accurate short-term forecasts of that variable. The use of ARIMA is appropriate when little or nothing is known about the dependent variable to be forecast, when the independent variables known to be important cannot be forecast effectively, or when all that is needed is a one- or two-period forecast. ARIMA has the potential to provide short-term forecasts that are superior to those of more theoretically satisfying regression models.
Conclusion
The gold market is expanding rapidly today because of buoyant gold prices and demand from countries like India and China. With gold demand rising, forecasting the price of gold is seen as essential but difficult. This paper attempts to forecast the price of gold in the short run through time series modeling of monthly gold prices. ARIMA combined with some technical analysis instruments could be a useful tool for forecasting the prices of commodities, currencies and so on. The 12-period-ahead forecasts give a new target at the support level around 1178.11, which according to the ARIMA forecast could be reached between periods 74 and 75 (February and March 2014). We also note another forecasting method, Elliott wave theory combined with Gann angles, which is mainly used by market technicians or chartists.
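The plug-in use of the regression equation described in the Discussion can be illustrated with a short worked example. The coefficients are those reported for Model 1 and the inputs are the sample means printed by summary(GOLD) in the appendix; the helper name predict_gp is hypothetical and introduced only for this illustration.

```r
# Coefficients of Model 1 as reported in the text.
beta <- c(intercept = 4212.1509, EUROUSD = -944.6394,
          JPYUSD = -25.8220, OP = 7.6330)

# Hypothetical helper (not from the paper): point prediction from Model 1.
predict_gp <- function(eurousd, jpyusd, op) {
  unname(beta["intercept"] + beta["EUROUSD"] * eurousd +
           beta["JPYUSD"] * jpyusd + beta["OP"] * op)
}

# Inputs: sample means of the regressors from summary(GOLD).
yhat_at_means <- predict_gp(eurousd = 1.372, jpyusd = 89.00, op = 86.39)
print(round(yhat_at_means, 2))  # 1277.36
```

The result, about 1277.36, agrees with the reported mean gold price of 1277.3. This is expected, since a least-squares line with an intercept passes through the sample means, and it gives a quick consistency check on the reported coefficients. A real forecast would replace the means with assumed future values of EUROUSD, JPYUSD and OP.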
Appendix (R code)
Models estimation

> GOLD <- ...
> str(GOLD)
'data.frame':   63 obs. of  5 variables:
 $ periods: int  1 2 3 4 5 6 7 8 9 10 ...
 $ GP     : num  890 922 968 910 889 ...
 $ EUROUSD: num  1.47 1.48 1.55 1.58 1.56 ...
 $ JPYUSD : num  108 107 101 103 104 ...
 $ OP     : num  93 95.3 105.6 112.6 125.4 ...
> summary(GOLD)
    periods           GP            EUROUSD          JPYUSD
 Min.   : 1.0   Min.   : 760.9   Min.   :1.222   Min.   : 76.64
 1st Qu.:16.5   1st Qu.: 941.5   1st Qu.:1.308   1st Qu.: 80.78
 Median :32.0   Median :1232.9   Median :1.356   Median : 89.27
 Mean   :32.0   Mean   :1277.3   Mean   :1.372   Mean   : 89.00
 3rd Qu.:47.5   3rd Qu.:1611.4   3rd Qu.:1.434   3rd Qu.: 94.83
 Max.   :63.0   Max.   :1771.9   Max.   :1.576   Max.   :109.36
       OP
 Min.   : 39.16
 1st Qu.: 76.09
 Median : 87.93
 Mean   : 86.39
 3rd Qu.: 97.20
 Max.   :133.93
> out1 <- lm(GP ~ EUROUSD + JPYUSD + OP, data = GOLD)
> summary(out1)

Call:
lm(formula = GP ~ EUROUSD + JPYUSD + OP, data = GOLD)
Residual standard error: 319.9 on 61 degrees of freedom
Multiple R-squared:  0.09663,   Adjusted R-squared:  0.08182
F-statistic: 6.525 on 1 and 61 DF,  p-value: 0.01315

> anova(out6)
Analysis of Variance Table

Response: GP
   Df Sum Sq Mean Sq F value  Pr(>F)
OP  1 667834  667834  6.5251 0.01315 *