 Review
 Open access
VAR, ARIMAX and ARIMA models for nowcasting unemployment rate in Ghana using Google trends
Journal of Electrical Systems and Information Technology volume 10, Article number: 12 (2023)
Abstract
The analysis of the high volume of data generated daily by web search engines allows scholars to examine the relationship between users' search preferences and forthcoming events. This approach can be applied in a variety of economic contexts. The purpose of this study is to determine whether the unemployment rate can be anticipated by examining search behavior. The method uses a cross-correlation technique to combine data from Google Trends with the World Bank's unemployment rate. Autoregressive Integrated Moving Average (ARIMA), Autoregressive Integrated Moving Average with eXogenous variables (ARIMAX) and Vector Autoregression (VAR) models for unemployment rate prediction are fitted to the analyzed data. The models were assessed with mean squared error (MSE), root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), median absolute error (MedAE), and maximum error (ME). Averaged over the evaluation metrics, the models achieved low errors: ARIMA (MSE = 0.26, RMSE = 0.38, MAE = 0.30, MAPE = 7.07, MedAE = 0.25, ME = 0.77), ARIMAX (MSE = 0.22, RMSE = 0.25, MAE = 0.29, MAPE = 6.94, MedAE = 0.25, ME = 0.75), and VAR (MSE = 0.09, RMSE = 0.09, MAE = 0.20, MAPE = 4.65, MedAE = 0.20, ME = 0.42). The outcome demonstrates that Google Trends estimators reduced error across the board when compared to models without them.
Introduction
The vast amount of information available on the internet, from sources such as Google [1, 2], Twitter [3], social media [4], or combinations of web-based data sources [5, 6], has been used extensively in recent decades to explore the potential of digital information for predictions in a wide range of sectors. Studies report that Google handles over 92% of all online search requests in the world [7], and its data have been shown to be valid [8], valuable [9], accurate [10], and beneficial [11] for predictions. Google Trends has proven to be a dependable source of trend data for online searches and is used extensively by researchers around the world, mostly for real-time prediction of macroeconomic trends [12, 13].
Information that people provide through the internet describes their current state and offers a good understanding of economic processes, particularly unemployment [14, 15]. Despite these useful online sources, the availability of high-frequency data, and recent technological advancement, statistical information on unemployment published by nations is released with delays and may still be revised [16, 17]. Gathering data for unemployment estimation is slow, so it is possible to know only how the economy was performing several months or years ago, not how it is performing now. This challenge is common to almost all countries, and Ghana is no exception. As a result, policymakers make real-time assessments using inadequate information, whereas knowing the present state of unemployment could help them better understand whether an economy is contracting or expanding and respond accordingly [18]. This paper addresses the problem by using real-time Google Trends data for the prediction of unemployment claims in Ghana.
According to the Ghana Statistical Service's most recent census, Ghana's unemployment rate (UER) increased to 13.4% in 2021, up from 6% in 2010, with 32.8% of Ghanaians aged 15 to 24 unemployed. Ghana faces a severe economic downturn, and the economy's robust growth over the last two decades has not translated into job creation or improved employment conditions [19]. This situation and the pressure on jobs have resulted in the loss of hundreds of jobs [20]. It would therefore be in the public interest to produce real-time estimates of the unemployment rate to support policy making. The novelty of this paper is as follows:

This paper considers the use of ARIMA, ARIMAX, and VAR in predicting the unemployment rate in Ghana.

The paper uses Google Trends indicators to predict the unemployment rate in Ghana, an approach that can in turn be applied to the West African sub-region.

The paper contributes current unemployment rate predictions to the literature.

The paper provides strategies and benchmarks for governments, agencies and organizations to make informed decisions on unemployment in Ghana, Africa, and the world as a whole.
The rest of the paper is organized as follows. The next section discusses related literature on forecasting using online search data. Section “Methodology” describes the methodology used to identify a large number of keywords that may help in the prediction of unemployment claims and provides a brief overview of the models used for comparison of results. The results of the models are discussed in section “Results and discussion.” Section “Conclusion and future works” gives the conclusion and discusses the importance of using different categories of keywords for the prediction of unemployment claims.
Related works
Online search engines are frequently used for real-time research. Motivated by the huge volume of daily search queries, Ettredge et al. [21] were the first to examine how real-time forecasting may be done using the Internet, and their findings reveal a strong link between Internet-related web search activity and the unemployment rate in the USA. Choi and Varian [22, 23] continued by looking at how web search data, particularly from Google, could be used to improve forecasting of a range of economic parameters, such as jobless claims, retail sales, real estate demand, and vacation destination preferences. Several studies of real-time forecasting using internet data, particularly Google Trends (GT) data, have been published since these papers, but this work focuses on unemployment prediction.
To anticipate UERs during the COVID-19 pandemic in Indonesia, Rizky et al. [2] used the GT query share for the keyword "phk" (work termination) and earlier series from the official labor force survey performed by Badan Pusat Statistik (Statistics Indonesia). Because the GT query index serves as an exogenous variable that captures current conditions, predicting the open UER with ARIMAX during the COVID-19 period generated forecast values that are reliable and close to reality. Petropoulos et al. [24] used text mining algorithms to develop a financial lexicon based on a collection of 10,000 Central Bank speeches. According to the authors, Google queries can predict future market volatility over a short horizon (one month). Tuhkuri [25] used the ETLAnow model and Google search data to estimate the official UER in the European Union (EU-28) countries. Google Inc.'s Google Trends database and Eurostat's Labor Force Statistics are the model's primary data sources. Findings suggest that Google searches are linked to the EU UER, even after controlling for country-level, delayed, and seasonal effects.
Tuhkuri [26] used the GT database from Google Inc. and Labor Force Statistics from the Current Population Survey and US Bureau of Labor Statistics. Results reveal that the predictive ability of Google searches is limited to short-term forecasting, that the utility of Google data for forecasting purposes is occasional, and that gains in forecasting accuracy are relatively modest. Mulero and García-Hiernaux [1] used data from GT and the Spanish State Employment Service to examine a large number of potential explanatory factors for UERs. The results reveal an increase in forecast accuracy of 10% to 25%.
Lasso and Snijders [27] adopted a GT-based method to forecast Brazil's UER. The findings reveal that Google search volumes for job-related phrases have significant predictive power, with biweekly search data forecasting the direction of the UER with over 80% accuracy, exceeding baseline methodologies based on seasonal trends by over 15%. Brake and Ramos [28] estimated the UER in the Netherlands using a variable based on the volume of Google search keywords. The predictive capability of the Google Indicator is determined by comparing the accuracy of a benchmark model to that of a model augmented with the Google Indicator. According to the results, the Google-augmented models produce up to 27.8% more accurate estimates at a one-month forecast horizon.
Simionescu and Zimmermann [14] looked into how internet usage information is used in various industries, with unemployment modeling being a particular area of interest. The results show considerable potential that should be investigated further. A vast majority of nations base their unemployment estimation and modeling on internet data. However, forecast accuracy depends on each country's internet penetration, the age distribution of online users, and the stability of the generated internet variables. Maas [29] studied whether Google search data and other more traditional predictors may be used to anticipate the UER in the USA. The findings indicate that the GT forecasting methods proposed in that study are most beneficial in the short term.
Jung and Hwang [30] constructed unemployment prediction models for specific age groups (people in their 30s and 40s) using related Google search queries and known unemployment statistics from Statistics Korea. The findings demonstrate that web search queries are useful for improving unemployment prediction models for Korea. Smit [31] investigated whether and to what extent Google search data may be used to forecast the US UER and concluded that GT data enhance the accuracy of all the forecasting approaches considered.
Methodology
The study explored the effectiveness of Google Trends by adopting several testing techniques. Figure 1 displays the detailed procedure for the experiment. The steps below explain Fig. 1 in detail.

1. To start, data from GT were joined with interpolated World Bank (WB) UER data to create a single dataset for the visualization and study of UER in Ghana using Granger causality.

2. The time series (TS) data were split into training and test sets.

3. The training and test sets were used to train and evaluate the models (ARIMA, ARIMAX, and VAR).
Data
The World Bank (WB) and Google Trends (GT) provided the data for this study. Google launched the Google Trends website for search analysis in 2006. GT offers search trends starting from 2004 and shows how frequently a given search phrase is entered into Google's search engine over time, relative to the site's overall search traffic.
GT shows changes in internet interest for any term in any nation or location over a selected period of time, such as one year, several years, four months, three weeks, thirty days, seven days, four hours, or one hour. Additionally, several terms from various places can be compared simultaneously. The GT and World Bank data can be downloaded in ".csv" format. In short, GT calculates the number of searches as represented mathematically in Eqs. 1, 2, and 3 [32]:
where i = terms or expressions of the study, k = possible terms to search on Google, and m = months of the study. Additionally, \(S(e)_{\mathrm{tot},m}\) = total searches on Google for one month m in a particular country, \(S(e)_{i,m}\) = total searches on Google for a term i of our study for a month m and a country, \(Qs(e)_{i,m}\) = query share of a term in a certain month and country, and \(\mathrm{RSV}(e)_{i,m}\) = relative search volume of a term in a certain month and country.
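The equations themselves are not reproduced in this text. As a rough illustration of the quantities defined above, the following sketch assumes the commonly described normalization: the query share is a term's searches divided by total searches, and the relative search volume rescales the query share so that its peak equals 100. The function name and the sample counts are hypothetical, and Google's actual sampling and rounding are not modeled.

```python
def relative_search_volume(term_counts, total_counts):
    """Return (query_share, rsv) lists for a single search term.

    term_counts[m]  : searches for the term in month m (S_{i,m})
    total_counts[m] : all searches in month m (S_{tot,m}), assumed non-zero
    """
    if len(term_counts) != len(total_counts):
        raise ValueError("series must have the same length")
    # Query share: fraction of all searches in the month that were for this term.
    query_share = [s / tot for s, tot in zip(term_counts, total_counts)]
    peak = max(query_share)
    # RSV: query share rescaled so that the peak month equals 100 (assumed form).
    rsv = [100.0 * (q / peak) if peak > 0 else 0.0 for q in query_share]
    return query_share, rsv


# Hypothetical monthly counts, for illustration only.
qs, rsv = relative_search_volume([50, 120, 80], [10_000, 12_000, 16_000])
```

Because the index is relative to the peak within the chosen window, the same term can receive different RSV values when the window changes.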
Our sample comprises 50 Google Trends search terms, chosen based on the methodology as shown in Table 1. Our data window is restricted to 2010–2020, since this is the period from which Ghanaians migrated to using the internet. The variable of interest is the unemployment rate data for Ghana downloaded from the World Bank website.
Interpolation
To extract high-frequency data (such as monthly or weekly data) from low-frequency data (such as annual data), the Chow-Lin approach, a temporal disaggregation method, is used [33]. The method ensures that the high-frequency series' average, first, and last values correspond to those of the low-frequency series. According to [33], the general temporal disaggregation framework for developing a high-frequency estimate has the following two-step additive structure. Equation 4 describes the Chow-Lin approach.
First, make a preliminary high-frequency series \(\overline{\upsilon}_{j}\) using auxiliary data from several indicator series; a generalized least squares regression is frequently used to incorporate these data. Second, compute the residuals between the observed low-frequency series and the preliminary high-frequency series aggregated to the low-frequency scale (through the aggregation matrix \(H\in \mathbb{R}^{n\times m}\)). Then, create a temporally consistent high-frequency version \(y_{i}\) by distributing these residuals among the high-frequency periods using the distribution matrix \(F\in \mathbb{R}^{m\times n}\).
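The two-step additive structure can be sketched with NumPy. This is a simplified illustration, not the full Chow-Lin estimator: it assumes the preliminary series is already given and uses an identity error covariance, so each low-frequency residual is spread evenly over its own block. The function name and the sample values are hypothetical.

```python
import numpy as np


def disaggregate(low, prelim, periods_per_low):
    """Two-step additive disaggregation with an averaging constraint.

    low             : low-frequency observations (e.g. annual rates), length n
    prelim          : preliminary high-frequency series, length n * periods_per_low
    periods_per_low : number of high-frequency periods per low-frequency period
    """
    low = np.asarray(low, dtype=float)
    prelim = np.asarray(prelim, dtype=float)
    n = low.size
    # Aggregation matrix H (n x m): averages each block of high-frequency values.
    H = np.kron(np.eye(n), np.full((1, periods_per_low), 1.0 / periods_per_low))
    # Residuals between the observed and the aggregated preliminary series.
    resid = low - H @ prelim
    # Distribution matrix F (m x n) for the identity-covariance special case.
    F = H.T @ np.linalg.inv(H @ H.T)
    return prelim + F @ resid


# Hypothetical example: two annual rates split into four periods each.
high = disaggregate([5.3, 6.8], np.linspace(5.0, 7.0, 8), periods_per_low=4)
```

By construction, averaging the returned series over each block reproduces the low-frequency observations, which is the temporal consistency property described above.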
Causality (Granger causality (GC))
The GC test examines the connection between the current value of one variable and the historical values of another variable to find a causal direction between two or more time series [34]. According to [35], GC indexes of two series Y and X can be computed from the variance of the error samples. If X and Y are independent, then X(\(var(\varepsilon )\)) = Y(\(var(\varepsilon )\)), where \(var(\varepsilon )\) denotes the variance of the error \(\varepsilon\). Otherwise, the equality does not hold. For example, if X is the cause of Y, then X(\(var(\varepsilon )\)) > Y(\(var(\varepsilon )\)). This can be represented by the formula in Eq. 5 [36]:
If \({F}_{(X\to Y)}\) ≥ 0 and \({F}_{(Y\to X)}\) ≥ 0 then the indexes of causality can be analyzed. Specifically, if \({F}_{(X\to Y)}\) >\({F}_{(Y\to X)}\), then X is the cause of Y, or the information flowing from X to Y is more than that from Y to X; if \({F}_{(X\to Y)}\) <\({F}_{(Y\to X)}\), then Y is the cause of X.
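Equation 5 is not reproduced in this text. The sketch below assumes the common log-ratio form of the Granger index, in which the residual variance of a model using only the past of Y is compared with that of a model that also uses the past of X. The helper names and the simulated data are hypothetical, and a formal test would additionally compute an F statistic and p-value.

```python
import numpy as np


def _residual_variance(target, regressors):
    """Variance of OLS residuals of target on the regressors plus an intercept."""
    X = np.column_stack([np.ones(len(target))] + regressors)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.var(target - X @ beta)


def granger_index(x, y, lag=1):
    """Assumed form: F_{X->Y} = ln(var(restricted residuals) / var(full residuals))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[lag:]
    # Lagged copies aligned with target: element t holds the value at t - k.
    y_lags = [y[lag - k: len(y) - k] for k in range(1, lag + 1)]
    x_lags = [x[lag - k: len(x) - k] for k in range(1, lag + 1)]
    restricted = _residual_variance(target, y_lags)    # past of Y only
    full = _residual_variance(target, y_lags + x_lags)  # past of Y and past of X
    return float(np.log(restricted / full))


# Simulated example in which x drives y with a one-period delay.
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = np.zeros(300)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=299)
f_xy = granger_index(x, y)
f_yx = granger_index(y, x)
```

Because the full model nests the restricted one, the index is non-negative up to rounding, and the direction with the larger index is read as the dominant direction of information flow.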
Training and test
The overall dataset was split into training and test sets: close to 80% of the data, covering 2010 to 2018, was used for training, and the remaining 20%, covering 2019 to 2020, was held out for testing. Table 2 shows the specific splitting procedure. In a second step, the two-year test set was further divided into yearly (Y1), half-yearly, quarterly, and monthly windows so that UER predictions were tested over different time frames.
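A chronological split of this kind can be sketched with the standard library alone. The weekly dates and placeholder values below are assumptions made for illustration; the point is that the split is by calendar year rather than by random sampling, so no future observation leaks into training.

```python
from datetime import date, timedelta


def split_by_year(observations, last_train_year):
    """Chronological split: years up to last_train_year go to the training set."""
    train = [(d, v) for d, v in observations if d.year <= last_train_year]
    test = [(d, v) for d, v in observations if d.year > last_train_year]
    return train, test


# Hypothetical weekly series starting in January 2010 with placeholder values.
start = date(2010, 1, 3)
series = [(start + timedelta(weeks=i), 0.0) for i in range(574)]
train, test = split_by_year(series, last_train_year=2018)
train_share = len(train) / len(series)

# Shorter evaluation windows are prefixes of the test set, e.g. the first 13 weeks.
first_quarter = test[:13]
```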
Models
TS forecasting is crucial for many processes that unfold over time. It is a practical method for determining how past data influence present results, and it allows short- and long-term projections and pattern spotting using previous data. The TS models used were ARIMA, VAR and ARIMAX.
VAR
VAR is a forecasting method that can be used when two or more TS interact, that is, when the TS in question have a two-way relationship. VAR models can be used to assess and predict multivariate TS data, which sets them apart from univariate autoregressive models. VAR models are often used in economics for systems with a large number of interconnected TS variables. Equation 6 represents the VAR model:
where c is the intercept, \(\phi\) denotes the coefficients of the lags of y up to order p, and ɛ is the error. It is written as a system of equations with one equation per TS variable. VAR is adaptable, requires less time and information [37], and makes it simple to integrate additional data [38]. VAR models, however, have the drawback of being unable to account for variance that changes across the time series (heteroskedasticity) [39].
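To make the system-of-equations view concrete, the following sketch fits a VAR(p) by ordinary least squares with NumPy, using the standard form y_t = c + phi_1 y_{t-1} + ... + phi_p y_{t-p} + e_t. It is a minimal illustration under that assumption, not the estimation routine used in the study, and the function names are hypothetical.

```python
import numpy as np


def fit_var(data, p):
    """Least-squares fit of y_t = c + phi_1 y_{t-1} + ... + phi_p y_{t-p} + e_t.

    data : array of shape (T, k), one column per time series
    Returns (c, phis) with c of shape (k,) and phis of shape (p, k, k).
    """
    data = np.asarray(data, dtype=float)
    T, k = data.shape
    # Each design-matrix row holds [1, y_{t-1}, ..., y_{t-p}] for one target y_t.
    lagged = [data[p - j: T - j] for j in range(1, p + 1)]
    Z = np.hstack([np.ones((T - p, 1))] + lagged)
    B, *_ = np.linalg.lstsq(Z, data[p:], rcond=None)  # shape (1 + k * p, k)
    c = B[0]
    # Transpose each k x k block so that phis[j] @ y_{t-j-1} gives its contribution.
    phis = np.stack([B[1 + j * k: 1 + (j + 1) * k].T for j in range(p)])
    return c, phis


def forecast_one_step(data, c, phis):
    """One-step-ahead forecast from the last p observations."""
    data = np.asarray(data, dtype=float)
    return c + sum(phi @ data[-(j + 1)] for j, phi in enumerate(phis))
```

Each row of a coefficient matrix corresponds to one equation of the system, so every series is predicted from the lags of all series, which is what allows search indicators to inform the unemployment equation.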
ARIMA
ARIMA combines the ideas of autoregression and moving average to provide forecasts that are linear combinations of previous values of the variable and previous forecast errors. ARIMA is characterized by three parameters: p, d, and q, which denote the number of lagged (previous) observations considered for autoregression, the number of times the raw observations are differenced, and the size of the moving average window, respectively.
The forecasting equation is structured in Eq. 7 as follows:
where \(F_{t}\) = forecast at time t, \(L_{t}\) = level at time t (a straight-line approximation of all the data at one time point; in ARIMA it is calculated from the mean of the differenced data times the smoothing constants), \(D^{\prime}_{t-p}\) = previous differenced observations, \(E_{t-q}\) = prediction errors on previous data points, and \(\Omega\) and β are smoothing constants.
Many scholars who have worked with time series recently have explored ARIMA. However, the ARIMA model applies to only one variable, does not adequately describe some turning points in the data, and cannot convey relationships between variables [40, 41]. As a result, it is insufficient to describe real-world problems.
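The role of the differencing order d can be illustrated with NumPy. The sketch shows only the "integrated" step of ARIMA, not parameter estimation, and the sample rates are hypothetical values chosen for illustration.

```python
import numpy as np


def difference(series, d):
    """Apply first differencing d times (the 'I' step of ARIMA)."""
    return np.diff(np.asarray(series, dtype=float), n=d)


def undifference_once(diffed, first_value):
    """Invert one round of differencing given the first original value."""
    return np.concatenate([[first_value], first_value + np.cumsum(diffed)])


# Hypothetical rates with a steadily accelerating trend.
rates = np.array([5.3, 5.5, 5.9, 6.5, 7.3])
d1 = difference(rates, 1)  # period-to-period changes
d2 = difference(rates, 2)  # changes of the changes (d = 2)
restored = undifference_once(d1, rates[0])
```

Each round of differencing shortens the series by one observation, and forecasts made on the differenced scale have to be cumulated back in this way to return to the original units.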
ARIMAX
The ARIMAX model is an extension of the ARIMA model. The X added to the end stands for “exogenous variables”: the model includes additional independent variables from outside the series to help explain the endogenous variable.
Equation 8 is structured as follows:
where Pt and Pt−1 represent the values in the current period and one period ago, respectively. Similarly, ϵt and ϵt−1 are the error terms for the same two periods. C is a baseline constant. ϕ1 and θ1 express what share of the value Pt−1 and the error ϵt−1 from the last period is relevant in estimating the current one. β is a coefficient estimated from the selected model and the data. X is the exogenous variable of interest. ARIMAX is helpful since it combines the time series and regression components into one model. However, it can be challenging to interpret the effect of the independent variable on the result.
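Equation 8 is not reproduced in this text. A first-order form consistent with the symbol definitions above, offered as a reconstruction from those definitions rather than the authors' exact typesetting, is:

```latex
P_t = C + \beta X + \phi_1 P_{t-1} + \theta_1 \epsilon_{t-1} + \epsilon_t
```

Setting β = 0 removes the exogenous term and leaves the corresponding first-order autoregressive moving average equation, which is the sense in which ARIMAX extends ARIMA.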
Evaluation metrics
We compute the mean squared error (MSE), mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), median absolute error (MedAE), and maximum error (ME) to assess the forecasting accuracy of each model. Equations 9 to 14 represent these evaluation metrics.
where y denotes the actual UER and \(\widehat{y}\) is the predicted UER. Our study used six (6) different evaluation metrics to evaluate the models. By employing several evaluation metrics, we were able to choose the best-performing approach while also confirming that each model was able to complete the underlying prediction task.
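The six metrics can be computed with the Python standard library, using their standard definitions. The function name and the sample values are hypothetical, and MAPE is expressed in percent on the assumption that no actual value is zero.

```python
import math
import statistics


def forecast_errors(actual, predicted):
    """Return MSE, RMSE, MAE, MAPE (percent), MedAE and ME as a dict."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    n = len(abs_err)
    mse = sum(e * e for e in abs_err) / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs_err) / n,
        # Assumes no actual value is zero.
        "MAPE": 100.0 * sum(e / abs(a) for e, a in zip(abs_err, actual)) / n,
        "MedAE": statistics.median(abs_err),
        "ME": max(abs_err),
    }


# Hypothetical actual and predicted rates, for illustration only.
scores = forecast_errors([4.0, 5.0, 4.0], [4.5, 4.5, 5.0])
```

For a single set of forecasts, RMSE is the square root of MSE and ME is never smaller than MAE, which gives a quick consistency check on any one set of results.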
Results and discussion
Interpolation
The basic goal of temporal disaggregation methods is to create a new TS that is coherent with the low-frequency data while preserving the short-term behavior of higher-frequency indicator series. Standard descriptive statistics for the UER and the interpolated UER are provided in Table 3. The table demonstrates that the UER and the interpolated UER are nearly equal. Visual representations of the UER and interpolated UER are shown in Fig. 2.
The graph shows that the interpolated unemployment rate, which comes from a dataset of 574 records, and the actual unemployment rate, which comes from a dataset of 11 records, vary in the same way over time, indicating that the two series are nearly equal in mean and standard deviation.
Cross-correlation function (CCF) analysis
Table 4 outlines the keywords whose trends were highly linked with UER according to the Granger causality test (GCT). We compiled from Table 4 a list of terms associated with the UER whose p-value at lag 1 is below 0.05.
The table demonstrates how 14 of the 50 GT estimators (x1 to x50) for Ghana are related to the WB UER series (y). The cells in the table with p-values less than 0.05 for the first lag were chosen. Figure 3 displays a graph of the GT estimators with p < 0.05. The graph shows that the correlations range between +1 and −1, where +1 represents total positive correlation, 0 represents the absence of any correlation, and −1 represents total negative correlation. The lags and past values of the 14 indicators are statistically significant in the equation and in predicting future values of the unemployment rate.
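A lagged cross-correlation of the kind summarized above can be sketched with NumPy. The function name is hypothetical, and the sketch uses the plain Pearson correlation of the overlapping segments, which may differ slightly from the normalization used by a particular statistics package.

```python
import numpy as np


def cross_correlation(x, y, lag):
    """Pearson correlation between x at time t - lag and y at time t."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag > 0:
        x, y = x[:-lag], y[lag:]   # x leads y by `lag` periods
    elif lag < 0:
        x, y = x[-lag:], y[:lag]   # y leads x by `-lag` periods
    return float(np.corrcoef(x, y)[0, 1])
```

Evaluating the function over a range of lags for a search indicator x and the unemployment series y shows at which delay, if any, the indicator moves ahead of the rate.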
Model result
Following the experimental design described above, detailed experiments were conducted with different univariate and multivariate TS models. The models and orders used are ARIMA (1, 2, 1), ARIMAX (4, 1, 3) and VAR (3, 0). Table 5 presents the evaluation metrics for the models.
Evaluation of the models
The selected significant prospective determinants of the unemployment rate are taken into account with the aid of various evaluation metrics. The significant predictors chosen for the unemployment rate were considered in all periods. Table 5 provides an overview of the performance metrics MSE, RMSE, MAE, MAPE, MedAE, and ME for all the periods. The results show that over the first five measurement periods, the models were able to forecast with little error. Additionally, for all models, the error margin rises as the forecast period grows. Furthermore, for nearly all periods and evaluation metrics, VAR forecast with the smallest error.
To choose the best model for the forecast, we averaged the results of each evaluation metric for all models, as shown in Table 5. The VAR model had the lowest average error values, with MSE = 0.09, RMSE = 0.09, MAE = 0.20, MAPE = 4.65, MedAE = 0.20, and ME = 0.42, as demonstrated by the average findings in Fig. 4. This shows how much better the proposed VAR model (multivariate TS) with GT estimators performs compared to ARIMA and ARIMAX. The VAR was able to detect a minor increase even though the models did not follow the major trend of UER change. The graph demonstrates that the VAR model is more effective than the other models.
Figure 5 shows the actual UER for Ghana as well as the predictions of each model over the two-year time frame. According to the figure, no model tracked the UER except VAR, which was somewhat in line with it and reflected the modest shift. The VAR model outperforms the other models (ARIMA and ARIMAX). Most models for approximating economic conditions perform well in a stable environment, but they lack the ability to foresee hidden economic change. In both steady and dynamic settings, the VAR model, which links input factors derived from rich, timely, high-frequency variables, performs better in predicting UER.
Conclusion and future works
The issue is not a dearth of data, but rather a dearth of information that can be used for planning, strategy, and decision-making. Using big data, such as Google Trends, can assist the entire government system. Google Trends provides access to a huge unfiltered collection of actual Google search requests. People use Google for a wide range of informational and topical searches, making it a valuable search engine. 50 words or phrases were of interest. Google Trends (GT) search query data were used to derive values for searches relating to jobs, society, social services, and economic indicators. The study identified a number of factors that influence the unemployment rate, including "how to make money," "how to start a business," "jobs in Ghana," "jobs in the USA," "online money," "nurse application," "visa application," and "police recruiting." This study proposes a technique that first applies preprocessing to overcome the difficulty of handling the vast data, and provides an in-depth look at the use of ARIMA, ARIMAX and VAR in nowcasting unemployment in Ghana as a use case.
In terms of prediction accuracy, error margin, and model reliability, results show that the VAR method surpassed all other techniques. VAR (MSE = 0.09, RMSE = 0.09, MAE = 0.20, MAPE = 4.65, MedAE = 0.20, ME = 0.42) achieved the lowest error margins. This is compelling evidence that real-time UER forecasting at a daily level of granularity is possible. Most models for approximating economic conditions perform well in a stable environment, but they lack the ability to foresee hidden economic change. In both steady and dynamic settings, the VAR model, which links input factors derived from rich, timely, high-frequency variables, performs better in predicting UER. The objective of successful citizen care management can be attained with the use of Google Trends by offering effective data-driven services to citizens and predicting their needs based on the analysis of surveys taken among various groups of citizens. In future, more data will be collected to train artificial intelligence techniques to generate decision support systems.
In the current study, we have highlighted a few predictor variables that contribute to the nation's unemployment rate and are crucial in understanding unemployment. The government can also use this study's information to make data-driven decisions. The government will be assisted in strengthening technical and vocational institutions. These will then bring in revenue and be put toward development. Additionally, it will be useful in establishing the state of the economy while formulating monetary policy. We recommend using machine learning models in future work.
Availability of data and materials
The data presented in this study are publicly available through the figshare repository via Afrifa, Stephen (2022): unemployment_data.csv. figshare. Dataset. https://doi.org/10.6084/m9.figshare.20311167.v1.
References
Mulero R, García-Hiernaux A (2021) Forecasting Spanish unemployment with Google Trends and dimension reduction techniques. SERIEs 12(3):329–349. https://doi.org/10.1007/s13209-021-00231-x
Rizky O, Fajar M, Prasetyo OR, Nonalisa S (2020) Forecasting unemployment rate in the time of COVID-19 pandemic using Google Trends Data (Case of Indonesia). Munich Pers. RePEc Arch, no. 105042
Nirmala CR, Roopa GM, Kumar KRN (2015) Twitter data analysis for unemployment crisis. In: Proceedings of 2015 international conference applications theoretical computer communications and technology. iCATccT 2015, pp 420–423. https://doi.org/10.1109/ICATCCT.2015.7456920
Ryu PM (2018) Predicting the unemployment rate using social media analysis. J Inf Process Syst 14(4):904–915. https://doi.org/10.3745/JIPS.04.0079
Mavragani A, Ochoa G, Tsagarakis KP (2018) Assessing the methods, tools, and statistical approaches in Google trends research: Systematic review. J Med Internet Res 20(11):1–20. https://doi.org/10.2196/jmir.9366
Twumasi E, Frimpong EA, Kwegyir D, Folitse D (2021) Improvement of grey system model using particle swarm optimization. J Electr Syst Inf Technol. https://doi.org/10.1186/s43067-021-00036-9
Naccarato A, Falorsi S, Loriga S, Pierini A (2018) Combining official and Google Trends data to forecast the Italian youth unemployment rate. Technol Forecast Soc Change 130:114–122
McCallum ML, Bury GW (2014) Public interest in the environment is falling: a response to Ficetola (2013). Biodivers Conserv 23(4):1057–1062
Jun SP, Park DH (2016) Consumer information search behavior and purchasing decisions: empirical evidence from Korea. Technol Forecast Soc Change 107:97–111. https://doi.org/10.1016/j.techfore.2016.03.021
Han SC, Chung H, Kang BH (2012) It is time to prepare for the future: forecasting social trends. In: Kim Th, Ma J, Fang Wc, Zhang Y, Cuzzocrea A (eds) Computer applications for database, education, and ubiquitous computing. EL DTA 2012. Communicat. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35603-2_48
Vosen S, Schmidt T (2011) Forecasting private consumption: Survey-based indicators vs. Google trends. J Forecast 30(6):565–578. https://doi.org/10.1002/for.1213
Kundu S, Singhania R (2020) Forecasting the United States unemployment rate by using recurrent neural networks with Google Trends data. 11(6). https://doi.org/10.18178/ijtef.2020.11.6.679
Heidary J, Rastegar H (2022) A novel computational technique using coefficient diagram method for load frequency control in an interconnected power system. J Electr Syst Inf Technol 9(1):1–24. https://doi.org/10.1186/s43067-022-00062-1
Simionescu M, Zimmermann KF (2017) “Big Data and Unemployment Analysis,” GLO Discuss. Pap., p. No. 81
Hacıevliyagil N, Drachal K, Eksi IH (2022) Predicting house prices using DMA method: evidence from Turkey. Economies 10(3):1–27. https://doi.org/10.3390/economies10030064
Naccarato A, Pierini A, Falorsi S (2015) Using Google Trend data to predict the Italian unemployment rate. Dep. Work. Pap. Econ.  Univ. “Roma Tre
Junior MA, Appiahene P, Appiah O (2022) Forex market forecasting with two-layer stacked Long Short-Term Memory neural network (LSTM) and correlation analysis. J Electr Syst Inf Technol 1:1–24. https://doi.org/10.1186/s43067-022-00054-1
Simionescu M, Cifuentes-Faura J (2022) Forecasting National and Regional Youth Unemployment in Spain Using Google Trends. Soc Indic Res 164(3):1187–1216. https://doi.org/10.1007/s11205-022-02984-9
Simionescu M, Cifuentes-Faura J (2022) Can unemployment forecasts based on Google Trends help government design better policies? An investigation based on Spain and Portugal. J Policy Model 44(1):1–21. https://doi.org/10.1016/j.jpolmod.2021.09.011
Şentürk G (2022) Can Google search data improve the unemployment rate forecasting model? An empirical analysis for Turkey. J Econ Policy Res 9(2):229–244. https://doi.org/10.26650/jepr963438
Ettredge M, Gerdes J, Karuga G (2005) Using web-based search data to predict macroeconomic statistics. Commun ACM 48(11):87–92. https://doi.org/10.1145/1096000.1096010
Choi H, Varian H (2009) Predicting the present with Google Trends. Tech. report, Google. [Cited 1 April 2012.]
Choi H, Varian H (2009) Predicting initial claims for unemployment insurance using Google Trends. Tech. report, Google. [Cited 1 April 2012.]
Petropoulos A, Siakoulis V, Stavroulakis E, Lazaris P, Vlachogiannakis N (2021) Employing Google Trends and deep learning in forecasting financial market turbulence. J Behav Financ. https://doi.org/10.1080/15427560.2021.1913160
Tuhkuri J (2016) ETLAnow: a model for forecasting with Big Data forecasting unemployment with Google Searches. ETLA Reports 54, no. 54, p 20
Tuhkuri J (2016) Forecasting unemployment with Google Searches. ETLA Work. Pap. No 35
Lasso F, Snijders S (2016) The power of Google search data: an alternative approach to the measurement of unemployment in Brazil
te Brake G, Ramos R (2017) Unemployment? Google it! Analyzing the usability of Google queries in order to predict unemployment
Maas B (2019) Short-term forecasting of the US unemployment rate. J Forecast. https://doi.org/10.1002/for.2630
Jung JU, Hwang J (2019) Application of Google Search queries for predicting the unemployment rate for Koreans in their 30s and 40s. 17(9):135–145
Smit AOO (2018) Unemployment rate forecasting using Google trends. Bachelor Thesis in Econometrics & Operations Research, Erasmus University Rotterdam, Erasmus School of Economics, pp 1–22
Jimenez A, Santed-Germán MA, Ramos V (2020) Google Searches and Suicide Rates in Spain, 2004–2013: Correlation Study. JMIR Public Heal Surveill 6(2):2004–2013. https://doi.org/10.2196/10919
Mosley L, Eckley I, Gibberd A (2021) Sparse temporal disaggregation, no. 2019, pp 1–33
Ghouali S et al (2017) The Granger causality effect between cardiorespiratory hemodynamic signals. HAL Id: hal-01573108. https://doi.org/10.5176/22511911
Chen B, Ma R, Yu S, Du S, Qin J (2019) Granger causality analysis based on quantized minimum error entropy criterion. IEEE Signal Process Lett 26(2):347–351. https://doi.org/10.1109/LSP.2019.2890973
Bressler SL, Seth AK (2011) Wiener–Granger causality: a well established methodology. Neuroimage 58(2):323–329. https://doi.org/10.1016/j.neuroimage.2010.02.059
Bai P, Safikhani A, Michailidis G (2022) Multiple change point detection in reduced rank high dimensional vector autoregressive models. J Am Stat Assoc. https://doi.org/10.1080/01621459.2022.2079514
Odekina GO, Adedotun AF, Imaga OF (2022) Modeling and forecasting the third wave of Covid-19 incidence rate in Nigeria using vector autoregressive model approach. J Niger Soc Phys Sci 4(1):117–122. https://doi.org/10.46481/jnsps.2022.431
Cho H, Maeng H, Eckley IA, Fearnhead P (2022) High-dimensional time series segmentation via factor-adjusted vector autoregressive modelling, pp 1–62
Victor-Edema UA, Essi PID (2016) Autoregressive integrated moving average with exogenous variable (ARIMAX) model for Nigerian Non Oil Export 8(2014):2010–2015
Yucesan M, Gul M, Celik E (2018) Performance comparison between ARIMAX, ANN and ARIMAX-ANN hybridization in sales forecasting for furniture industry. RES Gate. https://doi.org/10.5552/drind.2018.1770
Acknowledgements
We express our sincere gratitude to Mrs. Nancy Addia who encouraged and motivated us throughout the research. Finally, we would like to thank Google and World Bank, for making the data available.
Funding
Not applicable.
Author information
Contributions
Conceptualization, WKA and PA; methodology, SA and WKA; software, WKA; validation, WKA, PA and SA; formal analysis, WKA; investigation, WKA, PA and SA; resources, PA; data curation, WKA; writing (original draft preparation), WKA; writing (review and editing), PA; visualization, WKA; supervision, PA; project administration, PA; funding acquisition, PA. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
Competing interest statement declared by the corresponding author on behalf of all authors. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Adu, W.K., Appiahene, P. & Afrifa, S. VAR, ARIMAX and ARIMA models for nowcasting unemployment rate in Ghana using Google trends. Journal of Electrical Systems and Inf Technol 10, 12 (2023). https://doi.org/10.1186/s43067-023-00078-1
DOI: https://doi.org/10.1186/s43067-023-00078-1