Multivariate Time Series in Macroeconomics

Gold is one of the most popular commodities and investment alternatives. Gold prices are thought to be influenced by several other factors, such as the US Dollar, oil prices, the inflation rate, and the stock market, so the gold price cannot be modeled from its own history alone. This research was conducted to determine the best forecasting model and to find out which factors influence the price of gold. The price of gold was modeled with a multivariate approach, and univariate modeling was reviewed as a comparison. Univariate modeling was done using an ARIMA model, whose results indicate that gold price fluctuations are white noise. Multivariate modeling was done using a Vector Error Correction Model (VECM) with the gold price, oil price, US Dollar index, Dow Jones index, and inflation rate as predictors. The results showed that the VECM was able to model the gold price well and that all the factors studied influenced the gold price. The US Dollar and oil prices are negatively correlated with gold prices, while the inflation rate is positively correlated with gold prices. The Dow Jones index was positively correlated with gold prices in only the first two periods.


INTRODUCTION
Gold investment can take various forms, such as gold bullion, gold jewelry, gold coins, gold certificates, and shares in gold companies. Many people, and almost every country in the world, keep wealth in the form of gold. In 2015 the United States held 8,133.5 tons of gold, followed by Germany with 3,381 tons, Italy with 2,451.8 tons, France with 2,335.7 tons, and China with 1,797.5 tons (Holmes, 2016). People's tendency to keep their wealth in gold is due to the long-term increase in gold prices and gold's ability to withstand inflationary turmoil (Kusuma, 2015). However, this trend does not necessarily make gold the top investment choice. Forecasts of future gold prices are taken into consideration before a decision to invest is made.
Shankari and Manimaran (2015), in research conducted specifically on the price of gold in India, found that the price of gold follows a random walk. These findings indicate that the gold price studied could not be modeled using only the gold data itself. The price of a commodity is influenced by factors other than its own historical value, such as the price of substitute goods, inflation, economic conditions, and politics. Sepanek (2014) stated that the gold price is influenced by several factors, including the US Dollar and the inflation rate. Economic and political conditions also contribute to the gold price (Friend of Pawnshop, 2016). Gilroy (2014) stated that there is an inverse relationship between the price of gold and the world oil price. The influence of these factors can be modeled with the multivariate methods discussed in this study. Previous research has modeled the price of gold univariately, using only the price of gold itself. Multivariate modeling traces and models the influence of other variables on the price of gold, so it is expected to provide more accurate results.
This research models the price of gold with a multivariate approach and reviews univariate gold price modeling as a comparison. Multivariate modeling is done with the aim of obtaining a more accurate model and measuring the magnitude of the influence of other variables on the price of gold. The factors analyzed in this study are the US Dollar, the inflation rate, the Dow Jones Index, and world crude oil prices.

LITERATURE REVIEW
Model Building Strategy
Finding the right model for time series data is not easy. A well-established strategy for building models was put forward by Box and Jenkins (1976). The strategy is divided into three main stages: model specification, model fitting, and model diagnostics (Chan & Cryer, 2008).

Stationarity
Stationarity is a very important assumption in time series modeling. The basic idea of stationarity is that the probability laws governing the behavior of the process do not change with time. A process $\{Y_t\}$ is said to be strictly stationary if the joint distribution of $Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}$ is the same as the joint distribution of $Y_{t_1-k}, Y_{t_2-k}, \ldots, Y_{t_n-k}$ for all choices of time points $t_1, t_2, \ldots, t_n$ and all lags $k$. If $n = 1$, the univariate distribution of $Y_t$ equals that of $Y_{t-k}$ for all $t$ and $k$, so $E(Y_t) = E(Y_{t-k})$ for all $t$ and $k$ and the mean function is constant over time. The variance of $Y_t$ likewise equals the variance of $Y_{t-k}$ for all $t$ and $k$, so the variance is constant over time. A process $\{Y_t\}$ is said to be weakly stationary when the mean function is constant over time and $\mathrm{Cov}(Y_t, Y_{t-k}) = \mathrm{Cov}(Y_0, Y_k)$ for all $t$ and $k$.

Autocorrelation and Partial Autocorrelation
Autocorrelation is the correlation between values of a time series separated by $k$ time lags, and it measures the linear relationship between them. The autocorrelation function is formulated as follows:
$$\rho_k = \frac{\mathrm{Cov}(Y_t, Y_{t-k})}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_{t-k})}} \qquad (1)$$
Partial autocorrelation measures the correlation that remains between two values once the effect of the values between them has been removed. The partial autocorrelation at lag $k$ is the correlation between periods $t$ and $t + k$ with the effect of the intervening periods removed.
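As an illustration, the sample autocorrelation at lag $k$ can be computed directly from a series. The sketch below is a minimal pure-Python version using the usual sample estimator (lagged cross-products of deviations from the mean, divided by the lag-0 sum of squares). The function name and the toy series are hypothetical and are not taken from the data of this study.

```python
def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_0, ..., r_max_lag of a numeric list."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squared deviations
    acf = []
    for k in range(max_lag + 1):
        # Sum of products of deviations that are k periods apart.
        ck = sum(dev[t] * dev[t - k] for t in range(k, n))
        acf.append(ck / c0)
    return acf


# Hypothetical toy series (not gold price data).
toy = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sample_acf(toy, 2))  # -> [1.0, 0.4, -0.1]
```

The estimator is undefined for a constant series, because the lag-0 sum of squares is then zero.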

Augmented Dickey Fuller Test
The Augmented Dickey-Fuller (ADF) test is used to test the stationarity of a time series. The test regresses the first difference of the series on the first lag of the series and on $k$ lags of the first difference. The regression equation can be seen in (2).
$$\Delta Y_t = a\,Y_{t-1} + \sum_{j=1}^{k} \phi_j\,\Delta Y_{t-j} + e_t \qquad (2)$$
The null hypothesis of this test is $a = 0$, which means that the series is not stationary. The alternative hypothesis is $a < 0$, which means that the series is stationary (Chan & Cryer, 2008).

White Noise
A very important example of a stationary process is the white noise process, which is defined as a sequence of independent and identically distributed (i.i.d.) random variables. White noise is usually assumed to have mean 0 and variance $\sigma^2$. White noise is often used as a building block for many other useful processes (Chan & Cryer, 2008).

Information Criteria
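Information criteria have proven effective in choosing a statistical model. Several criteria have been proposed in the time series literature. All of them are in principle based on the likelihood and contain two components. The first component measures the goodness of fit of the model to the data, while the second component penalizes more complex models more heavily. Some types of information criteria are as follows:
$$AIC(\ell) = \ln|\hat\Sigma_\ell| + \frac{2}{T}\,\ell k^2$$
$$BIC(\ell) = \ln|\hat\Sigma_\ell| + \frac{\ln T}{T}\,\ell k^2$$
$$HQ(\ell) = \ln|\hat\Sigma_\ell| + \frac{2 \ln(\ln T)}{T}\,\ell k^2$$
where $T$ is the sample size, $\ell k^2$ is the number of parameters (with $\ell$ the lag order and $k$ the number of variables), and $\hat\Sigma_\ell = \frac{1}{T}\sum_{t} \hat a_t \hat a_t'$ is the maximum likelihood estimate of $\Sigma_a$ (the error covariance matrix), in which $\hat a_t$ and $\hat a_t'$ are the residual vector and its transpose at time $t$.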
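The trade-off between fit and penalty can be sketched in a few lines of Python. The sketch assumes the log-determinant forms of AIC, BIC, and HQ commonly used for vector autoregressions, in which the penalty counts $\ell k^2$ coefficients for lag order $\ell$ and $k$ variables. The function names and the log-determinant values are hypothetical and are not results from this study.

```python
import math


def information_criteria(log_det_sigma, lag, k, T):
    """Return AIC, BIC and HQ for a fitted VAR(lag) with k variables and T observations.

    log_det_sigma is ln|Sigma_hat|, the log-determinant of the estimated
    residual covariance matrix (assumed to be computed elsewhere).
    """
    n_params = lag * k * k  # number of autoregressive coefficients
    return {
        "AIC": log_det_sigma + 2.0 * n_params / T,
        "BIC": log_det_sigma + math.log(T) * n_params / T,
        "HQ": log_det_sigma + 2.0 * math.log(math.log(T)) * n_params / T,
    }


def best_lag(log_dets, k, T, criterion="AIC"):
    """Pick the lag order with the smallest value of the chosen criterion."""
    scores = {lag: information_criteria(ld, lag, k, T)[criterion]
              for lag, ld in log_dets.items()}
    return min(scores, key=scores.get)


# Hypothetical log-determinants for lag orders 1 to 3 (illustrative only).
log_dets = {1: 2.0, 2: 1.5, 3: 1.45}
print(best_lag(log_dets, k=2, T=100, criterion="AIC"))  # -> 2
```

In this toy case the third lag improves the fit only slightly, so the penalty outweighs the gain and the second lag is selected.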

Univariate Time Series
Moving Average Processes
A moving average process is denoted $MA(q)$, where $q$ is the parameter that determines the order of the model, and is formulated as follows:
$$Y_t = e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q}$$
where $e_t$ is the error term at time $t$. The terminology arises from the fact that $Y_t$ is obtained by applying the weights $1, -\theta_1, -\theta_2, \ldots, -\theta_q$ to the variables $e_t, e_{t-1}, e_{t-2}, \ldots, e_{t-q}$, and then moving the weights and applying them to $e_{t+1}, e_t, e_{t-1}, \ldots, e_{t-q+1}$ to obtain $Y_{t+1}$, and so on. This model was first introduced by Slutsky (1927) and Wold (1938) (Chan & Cryer, 2008).

Autoregressive Processes
An autoregressive process, as the name implies, is a regression of the series on itself. Specifically, a $p$-th order autoregressive process $\{Y_t\}$ satisfies the equation:
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + e_t$$
The current value $Y_t$ is a linear combination of the $p$ most recent past values plus an "innovation" term $e_t$ that incorporates everything new at time $t$ that is not explained by the past values. The innovation $e_t$ is independent of $Y_{t-1}, Y_{t-2}, Y_{t-3}$, and so on. This model was introduced by Yule (1926) and is denoted $AR(p)$ (Chan & Cryer, 2008).

Mixed Autoregressive Moving Average
The mixed autoregressive moving average model is used if a series is assumed to be partly autoregressive and partly moving average. The model is formulated as follows:
$$Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + e_t - \theta_1 e_{t-1} - \cdots - \theta_q e_{t-q}$$
where $\{Y_t\}$ is called a mixed autoregressive moving average process and the model is denoted $ARMA(p, q)$. The model assumes that the series is stationary, meaning that it has a constant mean and variance over time, and that it is invertible. ARMA was then extended to handle series with a nonstationary mean, giving the Integrated Autoregressive Moving Average (ARIMA) model (Chan & Cryer, 2008).
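The way the autoregressive and moving average parts combine can be traced with a short recursion. The sketch below assumes the sign convention $Y_t = \sum_i \phi_i Y_{t-i} + e_t - \sum_j \theta_j e_{t-j}$ and treats all pre-sample values as zero. The function name, the coefficients, and the innovation sequence are hypothetical.

```python
def arma_path(phi, theta, innovations):
    """Generate Y_t = sum_i phi_i*Y_{t-i} + e_t - sum_j theta_j*e_{t-j}.

    phi and theta are lists of AR and MA coefficients; values before the
    start of the sample are treated as zero.
    """
    y = []
    for t, e_t in enumerate(innovations):
        ar_part = sum(p * y[t - i]
                      for i, p in enumerate(phi, start=1) if t - i >= 0)
        ma_part = sum(th * innovations[t - j]
                      for j, th in enumerate(theta, start=1) if t - j >= 0)
        y.append(ar_part + e_t - ma_part)
    return y


# A single unit innovation at t = 0 (hypothetical) traces the ARMA(1,1) response.
path = arma_path(phi=[0.5], theta=[0.4], innovations=[1.0, 0.0, 0.0, 0.0])
print(path)
```

Setting `theta=[]` gives a pure AR process and setting `phi=[]` gives a pure MA process, so the same function covers the two preceding models.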

Integrated Autoregressive Moving Average (ARIMA)
ARIMA is a model for differenced series: if $W_t = \Delta^d Y_t$ is a stationary ARMA process following the $ARMA(p, q)$ model, then $Y_t$ is an $ARIMA(p, d, q)$ process. The parameter $d$ commonly used in practice is 1 or at most 2.
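Differencing itself is a simple operation, sketched below in Python. The example also shows why a random walk corresponds to ARIMA(0,1,0): one difference of the series returns the underlying shocks. The function name and the numbers are hypothetical.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times to a list of numbers."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


# A hypothetical random walk: each value is the previous value plus a shock.
shocks = [0.5, -1.0, 2.0, 0.25]
walk = [10.0]
for e in shocks:
    walk.append(walk[-1] + e)

print(difference(walk))                   # -> [0.5, -1.0, 2.0, 0.25]
print(difference([1, 4, 9, 16], d=2))     # -> [2, 2]
```

Each round of differencing shortens the series by one observation.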

Multivariate Time Series
Multivariate time series analysis considers more than one variable (time series) jointly and models the influence of each variable on the others. It is part of multivariate statistical analysis but deals specifically with dependent data. The multivariate forecasting models used in this study are the vector autoregressive (VAR) model and the vector error correction model (VECM).

Granger Causality
Granger Causality is a formulation of causality presented by Granger (1969). The idea is that if variable $Y$ can be predicted better by using the historical data of both $X$ and $Y$ than by using the historical data of $Y$ only, then $X$ is said to Granger-cause $Y$. The second definition of causality expressed by Granger is that if $\sigma^2(Y \mid U) < \sigma^2(Y \mid U - X)$ (the variance of the prediction error of $Y$ using all available information $U$ is smaller than the variance using all information except $X$), then $X$ Granger-causes $Y$. A feedback system occurs when $X$ Granger-causes $Y$ and $Y$ Granger-causes $X$, noted $X \leftrightarrow Y$. The Granger Causality test equation can be seen in equation (11).
$$Y_t = D_t + \sum_{j=1}^{p} \alpha_j Y_{t-j} + \sum_{j=1}^{p} \beta_j X_{t-j} + \varepsilon_t \qquad (11)$$
$D_t$ is the deterministic term, $\varepsilon_t$ is the random error term, $\alpha_j$ is the coefficient of $Y_{t-j}$, and $\beta_j$ is the coefficient of $X_{t-j}$. The null hypothesis of the Granger Causality test is $\beta_1 = \beta_2 = \cdots = \beta_p = 0$ (non-causality). Equation (11) is called the unrestricted model because it contains lagged values of another variable ($X$), while the restricted model contains only lagged values of the dependent variable ($Y$). The quality of the prediction is measured by the mean-square error, and the test is applied to stationary data (Tsay, 2013).

Toda and Yamamoto Test
Toda and Yamamoto (1995) developed an alternative to the Granger Causality test known as the Toda and Yamamoto Augmented Granger Causality test. This test can be used on nonstationary data regardless of the order of integration $I(d)$, and regardless of whether the data are cointegrated. The test is based on the following equation:
$$Y_t = \alpha + \sum_{i=1}^{k+d} \beta_i Y_{t-i} + \sum_{j=1}^{h+d} \gamma_j X_{t-j} + u_t \qquad (12)$$
In equation (12), $d$ represents the maximum order of integration, while $h$ and $k$ are the optimal lag lengths obtained from the information criteria. The null hypothesis of this test is $H_0: \sum_{j=1}^{h+d} \gamma_j = 0$, or $X$ does not Granger-cause $Y$ (Josheski & Bardarova, 2013).

Vector Autoregressive Model
The vector autoregressive model is the most commonly used multivariate time series model, especially in econometrics, because it is relatively easy to estimate and its properties have been widely studied in the literature. The model is denoted $VAR(p)$ and is formulated as follows:
$$z_t = \phi_0 + \sum_{i=1}^{p} \phi_i z_{t-i} + a_t$$
where $a_t$ is a sequence of independent and identically distributed random vectors with mean 0 and positive-definite covariance matrix $\Sigma_a$. This model can be used if the time series data are stationary (Tsay, 2013).
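Once the coefficients of a VAR are known, forecasts follow by iterating the model equation with the future innovations set to their mean of zero. The Python sketch below assumes a model of the form $z_t = \phi_0 + \sum_i \phi_i z_{t-i} + a_t$. The function names and the coefficient values are hypothetical and are not estimates from this study.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def var_forecast(phi0, phis, history, steps):
    """Iterate z_t = phi0 + sum_i phis[i-1] @ z_{t-i} with innovations set to zero.

    history is a list of observed vectors, oldest first, with at least
    len(phis) entries.
    """
    p = len(phis)
    path = [list(z) for z in history[-p:]]
    forecasts = []
    for _ in range(steps):
        z_next = list(phi0)
        for i, phi in enumerate(phis, start=1):
            contrib = mat_vec(phi, path[-i])  # contribution of lag i
            z_next = [a + b for a, b in zip(z_next, contrib)]
        path.append(z_next)
        forecasts.append(z_next)
    return forecasts


# Hypothetical bivariate VAR(1).
phi0 = [1.0, 0.0]
phi1 = [[0.5, 0.0],
        [0.25, 0.5]]
print(var_forecast(phi0, [phi1], history=[[2.0, 4.0]], steps=2))
# -> [[2.0, 2.5], [2.0, 1.75]]
```

The off-diagonal entry 0.25 is what lets the first variable help predict the second, which is the mechanism that Granger causality tests look for.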

Cointegration
Nonstationary time series data can be cointegrated. Nonstationary data that are not cointegrated can be modeled using VAR after differencing. Nonstationary data that are cointegrated cannot be modeled using VAR, as this will produce spurious regression. Such data can be modeled using the Vector Error Correction Model.
An $(n \times 1)$ vector $Y_t$ is called cointegrated when there is an $(n \times 1)$ vector $\beta = (\beta_1, \ldots, \beta_n)'$ such that the linear combination $\beta' Y_t$ is stationary, or $I(0)$. The vector $\beta$ is called the cointegrating vector. If the $(n \times 1)$ vector $Y_t$ is cointegrated, there will be $0 < r < n$ cointegrating vectors, where $r$ is the rank of cointegration. The value of $r$ can be tested using the Johansen procedure (Wang & Zivot, 2006).

Vector Error Correction Model
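Time series data that are nonstationary and cointegrated can be modeled using the Vector Error Correction Model as in equation (15).
$$\Delta Y_t = c + \Pi Y_{t-1} + \Gamma_1 \Delta Y_{t-1} + \cdots + \Gamma_{p-1} \Delta Y_{t-p+1} + \varepsilon_t \qquad (15)$$
$\Delta Y_t$ is the $(n \times 1)$ vector of first differences of the time series, $c$ is a constant vector, and $\Pi = \alpha \beta'$ is an $(n \times n)$ matrix that is the product of the $(n \times r)$ matrix of adjustment coefficients ($\alpha$) and the $(r \times n)$ matrix of cointegrating vectors ($\beta'$) (Wang & Zivot, 2006).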
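The factorization of the long-run matrix into adjustment coefficients and cointegrating vectors can be checked numerically. The Python sketch below assumes the decomposition $\Pi = \alpha \beta'$ with two variables and one cointegrating vector. All names and numbers are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


# Hypothetical system: n = 2 variables, r = 1 cointegrating vector.
alpha = [[-0.5], [0.25]]   # (2 x 1) adjustment coefficients
beta_t = [[1.0, -2.0]]     # (1 x 2) transposed cointegrating vector
pi = mat_mul(alpha, beta_t)
print(pi)                  # -> [[-0.5, 1.0], [0.25, -0.5]]

y_prev = [10.0, 4.0]
ect = mat_vec(beta_t, y_prev)        # deviation from long-run equilibrium
adjustment = mat_vec(alpha, ect)     # correction applied to each variable
print(ect, adjustment)               # -> [2.0] [-1.0, 0.5]
print(mat_vec(pi, y_prev))           # -> [-1.0, 0.5], the same correction
```

Here the first variable is 2 units above its equilibrium relation with the second, so it is pushed down by 1.0 while the second is pushed up by 0.5 in the next period.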

Johansen Procedure
Johansen formulated a likelihood ratio ($LR$) statistic to determine the rank of the $\Pi$ matrix. The test is based on the estimated eigenvalues of the matrix. The null hypothesis of this test is that there is no cointegration in the data ($r = 0$) and the alternative hypothesis is that there is cointegration in the data ($r > 0$). The test statistic is calculated using equation (16) and is then compared with the critical value. A statistic smaller than the critical value means that the null hypothesis is not rejected.
$$LR_{trace}(r_0) = -T \sum_{i=r_0+1}^{n} \ln(1 - \hat\lambda_i) \qquad (16)$$
Here $T$ is the sample size, $\hat\lambda_i$ are the estimated eigenvalues in decreasing order, and $r_0$ is the rank under the null hypothesis. Rejection of the null hypothesis indicates that there is cointegration ($r > 0$), so the test is continued by testing whether $r = 1$ and so on until the null hypothesis is no longer rejected (Wang & Zivot, 2006).
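The sequential logic of the procedure can be sketched as follows. The Python code assumes the trace form of the likelihood ratio statistic, $-T \sum_{i > r_0} \ln(1 - \hat\lambda_i)$, and stops at the first rank whose null hypothesis is not rejected. The eigenvalues and critical values are hypothetical placeholders. Real applications take critical values from published tables or from statistical software.

```python
import math


def trace_statistics(eigenvalues, T):
    """Return the trace statistic for each null rank r0 = 0, ..., n-1."""
    lam = sorted(eigenvalues, reverse=True)
    return [-T * sum(math.log(1.0 - l) for l in lam[r0:])
            for r0 in range(len(lam))]


def select_rank(trace_stats, critical_values):
    """Return the first rank r0 whose null hypothesis is not rejected."""
    for r0, (stat, crit) in enumerate(zip(trace_stats, critical_values)):
        if stat <= crit:
            return r0
    return len(trace_stats)  # every null was rejected


# Hypothetical eigenvalues and placeholder critical values (not tabulated ones).
stats = trace_statistics([0.5, 0.3, 0.01], T=100)
rank = select_rank(stats, critical_values=[30.0, 15.0, 4.0])
print([round(s, 2) for s in stats], rank)
```

In this toy case the nulls $r = 0$ and $r = 1$ are rejected and $r = 2$ is not, so two cointegrating vectors are selected.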

Diagnostic Test
A diagnostic test checks whether the fitted model is valid so that it can be used for forecasting. A model is said to be valid if its residuals are independent, identically distributed, and normally distributed. Diagnostic testing in multivariate time series models is done with the Portmanteau test, a heteroscedasticity test, and the Jarque-Bera test. The null hypothesis of the Portmanteau test is that the residuals are not serially correlated (independent).
The heteroscedasticity test checks whether the residuals are identically distributed, with the null hypothesis that they are. The Jarque-Bera test is a normality test whose null hypothesis is that the residuals are normally distributed, with no skewness and a kurtosis of 3.
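The normality check can be illustrated for a single residual series. The Python sketch below uses the standard univariate Jarque-Bera statistic, $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$, where $S$ is the sample skewness and $K$ the sample kurtosis. It is a simplification: a multivariate model would normally be tested with a multivariate version of the statistic, which is not shown here. The function name and the data are hypothetical.

```python
def jarque_bera(residuals):
    """Return (skewness, kurtosis, JB statistic) for one residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments of order 2, 3 and 4 (divisor n).
    m2 = sum((x - mean) ** 2 for x in residuals) / n
    m3 = sum((x - mean) ** 3 for x in residuals) / n
    m4 = sum((x - mean) ** 4 for x in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb


# Hypothetical symmetric residuals: the skewness is exactly zero.
skew, kurt, jb = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
print(skew, round(kurt, 4), round(jb, 4))
```

Under the null hypothesis of normality the statistic is compared with a chi-square distribution with 2 degrees of freedom, and large values lead to rejection.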

Impulse Response Function
The Impulse Response Function (IRF) is an approach for examining the relationships between variables. The impulse response is a dynamic function that tracks the effect of a shock in one variable on the other variables.
The IRF calculates the amount of change experienced by all variables at $t > 0$ if there is a shock in one variable at time $t = 0$. The calculation assumes $E(z_t) = 0$, because the mean has no effect on how the series respond to the shock. Mathematically, the purpose of the IRF is to find the effect of a change in $z_{1t}$ on $z_{t+j}$ for $j > 0$ while every other element is held unchanged, so it can be assumed that $t = 0$, $z_t = 0$ for $t \le 0$, and $a_0 = (1, 0, 0, \ldots)'$. The values $z_j$, $j > 0$, can be obtained from the $MA(\infty)$ representation of the $VAR(p)$ model (Tsay, 2013).
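For a first-order model the calculation reduces to repeated multiplication by the coefficient matrix. The Python sketch below assumes a zero-mean VAR(1), $z_t = \phi z_{t-1} + a_t$, with a unit shock to one innovation at $t = 0$ and no further shocks, so that the response at horizon $j$ is $\phi^j a_0$. The shock is not orthogonalized. The function names and the coefficient matrix are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def impulse_responses(phi, shock_index, horizon):
    """Responses of all variables to a unit shock in one variable at t = 0.

    Assumes z_t = phi @ z_{t-1} + a_t with z_t = 0 for t < 0 and no
    further shocks after t = 0.
    """
    n = len(phi)
    response = [1.0 if i == shock_index else 0.0 for i in range(n)]
    out = [response]
    for _ in range(horizon):
        response = mat_vec(phi, response)  # propagate one period ahead
        out.append(response)
    return out


# Hypothetical VAR(1): the second variable feeds into the first.
phi = [[0.5, 0.25],
       [0.0, 0.5]]
for j, z in enumerate(impulse_responses(phi, shock_index=1, horizon=2)):
    print(j, z)
# 0 [0.0, 1.0]
# 1 [0.25, 0.5]
# 2 [0.25, 0.25]
```

The first variable does not move at impact but responds from the next period on, because the shock reaches it only through the lagged coefficient 0.25.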

METHOD
The literature study consists of independent study of books and journals to find information that supports the research. It provides an overview of the research to be done and is the basis for determining the research gap, which is identified by reviewing related research and what earlier studies have explored. The data collected for this research are historical gold prices and the factors that affect gold prices, namely stocks, oil, the US Dollar, and the inflation rate. The data are monthly observations from the last few years. The available data are then processed so that they can be used in this research. Data analysis in this study consists of univariate time series modeling, diagnostic testing, multivariate time series modeling, measuring forecasting performance, and drawing conclusions.

RESULTS AND DISCUSSIONS
Multivariate Time Series Modeling
Multivariate time series modeling uses data from several other variables to predict the value of a variable. It is done because economic variables are rarely independent of one another. The multivariate models used in this research are the Vector Autoregressive Model and the Vector Error Correction Model, and the choice between them depends on the characteristics of the data. Multivariate gold price modeling uses past gold prices and other influential variables as predictors.
The study uses the US Dollar index, the Dow Jones index, oil prices, and the inflation rate in modeling gold prices. Only variables that are shown by testing to affect the price of gold are used in the model. The test is conducted using the Granger Causality test, which determines whether the historical values of one variable help forecast the values of another variable. The variables to be tested cannot be chosen arbitrarily. They are chosen based on a strong suspicion that they influence the price of gold. The variables tested here are the US Dollar index, the Dow Jones index, crude oil prices, and the inflation rate in the United States. The US Dollar is regarded by investors as an alternative asset to gold. A slump in the value of the US Dollar makes investors switch to buying gold, so the price of gold increases (Seputar Forex, 2016). The Dow Jones index is one of the largest stock market indices in America and the world, and it measures the performance of industrial components in the American stock market (Seputar Forex, 2018). Stock indexes can represent market conditions, which are influenced by economic, political, social, and other factors. World oil prices are closely related to inflation (Jannah, 2015), and rising inflation rates encourage investors to buy gold to maintain their purchasing power.
Granger Causality testing of these variables was conducted using the Toda and Yamamoto method. This test was chosen because it can be used on data that are not stationary and can also be applied to cointegrated data. The test uses a Vector Autoregressive model to establish causality, where the model used is $VAR(p + d)$. The parameter $p$ is obtained from information criteria such as AIC. The parameter $d$ is the largest number of differences needed to make any of the variables stationary, and it is obtained by applying the ADF test to all variables to be tested.
The ADF test results can be seen in Table 1. They show that none of the variables is stationary in levels (p-value ≥ 0.05), so differencing is needed to make the data stationary. All series are stationary after differencing once (p-value < 0.05). Since every variable needs one difference to become stationary, the value of $d$ is 1. The parameter $p$ is obtained from the information criteria using the Akaike Information Criterion, which suggests 2 lags, so $p = 2$ in the causality test. Granger Causality testing is then done using $VAR(p + d)$, namely $VAR(3)$. The null hypothesis of this test is that the variables have no effect on gold, and the alternative hypothesis is that the variables have an effect on (Granger-cause) the price of gold. The results of this test can be seen in Table 2. The test results showed that the US Dollar index, the Dow Jones index, crude oil prices, and the inflation rate in America Granger-cause gold prices. This means that adding these variables to the model can make the forecasting of gold prices more accurate. It also means that the crude oil price variable does not need to be discarded and can be used to model the price of gold.
The next step is to determine which model to use for multivariate modeling. The Vector Autoregressive (VAR) model can be used if the data are stationary. The tests above show that the variables to be modeled are not stationary, so the data cannot be modeled directly with a VAR model, because this would produce spurious regression. Nonstationary data can be modeled using VAR after taking the first difference if the data are not cointegrated. Data that are nonstationary and cointegrated can be modeled using the Vector Error Correction Model (VECM).
Cointegration testing in this study was conducted using the Johansen Cointegration Test. This test determines whether the data are cointegrated and gives the rank ($r$), which is the number of cointegrating vectors. The maximum rank is $n - 1$, where $n$ is the number of variables (5 in this study). The null hypothesis in this test is $r = 0$, meaning there is no cointegration, and the alternative hypothesis is cointegration ($r > 0$). Rejection of the null hypothesis ($r = 0$) is followed by further testing to determine the rank (the number of cointegrating vectors). The Johansen Cointegration Test results can be seen in Table 3. The null hypothesis is rejected if the test statistic is greater than the critical value at a significance level of 5%. The test results showed that the data are cointegrated because the null hypothesis was rejected. The results also showed that the rank is 2, which means there are 2 cointegrating vectors to be used in modeling the data with VECM. A rank of 2 means that there are two linearly independent combinations of the variables that are stationary. The next step is to fit the VECM. The parameters of the model are estimated with the help of software.
In the estimated model, Gold denotes the price of gold, DJ denotes the Dow Jones index, Crude denotes crude oil prices, $ denotes the US Dollar index, and the remaining variable denotes the inflation rate in America. $c$ is a constant vector that represents the mean value, $\alpha$ is a $(5 \times 2)$ matrix that contains the adjustment coefficients, and $\beta'$ is a $(2 \times 5)$ matrix that contains the cointegrating vectors. The cointegrating vectors describe the long-term equilibrium relationship between the variables, and the adjustment coefficients give the speed at which deviations are corrected in order to return to equilibrium. $\Gamma_1$ and $\Gamma_2$ are $(5 \times 5)$ matrices containing the autoregressive coefficients of the first and second lags. Residual testing and forecasting are carried out after first transforming the model into VAR form. Diagnostic tests are carried out to ascertain whether the model is valid and can be used for forecasting.
Converting the VECM into VAR form gives a $VAR(3)$ model, in which the constant is a $(5 \times 1)$ vector that specifies the mean and the coefficients of the first, second, and third lags are $(5 \times 5)$ matrices. The next step is to perform diagnostic tests on the residuals of the model, consisting of a serial correlation test, a heteroscedasticity test, and a normality test. Diagnostic tests are required as validation to ensure that the errors of the model are independent, identically, and normally distributed (IIDN). The test results for serial correlation can be seen in Table 4.
Heteroscedasticity testing determines whether the residuals produced by the model are identically distributed. Identically distributed residuals are called homoscedastic, while residuals that are not identically distributed are called heteroscedastic. The hypotheses of this test are:
$H_0$: residuals are identical (homoscedasticity)
$H_1$: residuals are not identical (heteroscedasticity)
The test results showed that the residuals are identical because the null hypothesis was not rejected (p-value > 5%).
The next test is the normality test on the residuals, whose results can be seen in Figure 1. Normality is tested using the Jarque-Bera test with the following hypotheses:
$H_0$: residuals are normally distributed
$H_1$: residuals are not normally distributed
The normality test results showed that the residuals are normally distributed (null hypothesis not rejected). The Jarque-Bera test also has two component tests, for skewness and for kurtosis. A normal distribution has a skewness of 0 and a kurtosis of 3. The null hypothesis of the skewness test is skewness = 0, while the null hypothesis of the kurtosis test is kurtosis = 3. Neither hypothesis was rejected, so the residuals are consistent with a skewness of zero, a kurtosis of 3, and a normal distribution. The diagnostic test results show that the residuals satisfy the model assumptions, so the model is suitable for forecasting. The results of model forecasting can be seen in Figure 2.
The model that has been built is evaluated using data for the period 2018.10 to 2020.08. The gold price forecast in Figure 2 covers this period and is compared with the actual price of gold. The forecast results can be seen in the Fitted column, while the actual gold price data can be seen in the Actual column. The forecast shows a trend of increasing gold prices until August 2020, while the actual price of gold has increased over the long run. The actual gold price lies within the 95% confidence interval of the model forecast.
The forecast error measure is 74.2, or 6.1% of the mean actual gold price. These results show that the forecasting model has been well constructed and gives fairly accurate results. The graph of the gold price forecast can be seen in Figure 3.

Figure 3 Gold Price Forecasting
The forecast produced by the model is shown by the red line in Figure 3. The actual price of gold is shown by the blue line, while the lower and upper limits of the forecast are shown by the gray and orange lines. The chart shows that the actual gold price lies within the forecast confidence interval.

Impulse Response Function
The Granger causality test shows that the Dow Jones index, the US Dollar, US inflation, and crude oil help in predicting the gold price, but it does not show the size and direction of each variable's influence. The influence of each variable can be found using Impulse Response Function (IRF) analysis. The Impulse Response Function calculates the amount of change experienced by all variables if there is a change or shock in one variable. The impulse response results for 10 periods ahead, with the gold price as the response, can be seen in Figure 4.

Figure 4 Impulse Response Function
Figure 4 shows the gold price response over the next 10 periods to a 1-point increase in each of the other variables and in the gold price itself.
The IRF calculations show that a 1-point increase in the US Dollar index now will be followed by a decline in the gold price of $6.24 per troy ounce after 1 month, $10.75 after 2 months, and $12.3 after 10 months. A 1% increase in the inflation rate now would increase the price of gold by $20.2 after 1 month and by $52.8 after 10 months. A 1-dollar increase in the price of oil now will lower the gold price by $1.73 after 1 month and by $3.43 after 10 months. A 1-point increase in the Dow Jones index will cause gold prices to rise in the first and second months, but will lower the price of gold after 2 months.
The impulse response analysis showed that the US Dollar index and the crude oil price have a negative correlation with the gold price, while the inflation rate has a positive correlation with the gold price. The Dow Jones index has a positive correlation with gold in the first and second periods and a negative correlation thereafter. The negative correlation between the US Dollar and gold indicates that a weakening US Dollar encourages economic actors to keep their money in gold in order to maintain purchasing power. The positive correlation between gold prices and inflation indicates that the value of gold is not eroded by inflation, which supports the claim that gold investment is able to withstand inflationary turmoil (Kusuma, 2015).

CONCLUSION
Univariate gold price modeling shows that the gold price follows the ARIMA(0,1,0) model, which means that fluctuations in gold prices are white noise. This means that the price of gold cannot be modeled using only its own historical data. This finding is supported by botha (1980), as well as Shankari & Manimaran (2015). Multivariate modeling of gold prices is done using the Vector Error Correction Model (VECM) because the variables used are nonstationary and cointegrated. The results showed that gold prices are influenced by the US Dollar index, the Dow Jones index, crude oil prices, and the inflation rate. The US Dollar index and crude oil prices have a negative correlation with gold prices, while the inflation rate is positively correlated with the gold price. The Dow Jones index is positively correlated with gold prices only at the first and second lags.