Forex market forecasting with two-layer stacked Long Short-Term Memory neural network (LSTM) and correlation analysis

Ayitey Junior, Michael; Appiahene, Peter; Appiah, Obed

doi:10.1186/s43067-022-00054-1

Research
Open access
Published: 30 June 2022

Journal of Electrical Systems and Information Technology volume 9, Article number: 14 (2022)

Abstract

As one of the world's most significant financial markets, the foreign exchange (Forex) market has attracted a large number of investors. Accurately anticipating the Forex trend to support traders' decisions has remained a popular but difficult problem. Because of the market's tremendous complexity, how precise a Forex prediction can be is always in question. The fast advancement of machine learning in recent decades has allowed artificial neural networks to be effectively adapted to several areas, including the Forex market. As a result, a large number of research articles aimed at improving the accuracy of currency forecasting have been published. The Long Short-Term Memory (LSTM) neural network, a special kind of artificial neural network developed for time series data analysis, is frequently used. Owing to its high learning capacity, the LSTM neural network is increasingly being used to predict Forex movements from historical data. This model can, however, be improved by stacking. The goal of this study is to choose a dataset using the Hurst exponent, then use a two-layer stacked Long Short-Term Memory (TLS-LSTM) neural network to forecast the trend, and finally conduct a correlation analysis. The Hurst exponent (h) was used to determine the predictability of the Australian Dollar and United States Dollar (AUD/USD) dataset. The TLS-LSTM algorithm is presented to improve the accuracy of Forex trend prediction for AUD/USD. A correlation study was performed between AUD/USD, the Euro and the Australian Dollar (EUR/AUD), and the Australian Dollar and the Japanese Yen (AUD/JPY) to see how AUD/USD movement affects EUR/AUD and AUD/JPY. The model was compared with Single-Layer Long Short-Term Memory (SL-LSTM), Multilayer Perceptron (MLP), and Complete Ensemble Empirical Mode Decomposition with Adaptive Noise–Improved Firefly Algorithm Long Short-Term Memory (CEEMDAN-IFALSTM). Based on the evaluation metrics Mean Square Error (MSE), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE), the suggested TLS-LSTM, whose data selection is based on a Hurst exponent (h) value of 0.6026, outperforms SL-LSTM, MLP, and CEEMDAN-IFALSTM. The correlation analysis shows both positive and negative relations between AUD/USD, EUR/AUD, and AUD/JPY, which means that a change in AUD/USD will affect EUR/AUD and AUD/JPY to a degree that depends on the magnitude of the correlation coefficient (r).

Introduction

Forex (foreign exchange) is a worldwide, unregulated, and extremely stable market for trading currency pairs [1]. Forex trading involves speculating on the relative strength of one currency against another [62]. The liquidity of the FX market exceeds that of the stock market, per the 2019 Triennial Central Bank Survey of FX and Over-the-Counter (OTC) Derivatives Markets, with a daily turnover of $6.6 trillion [6]. It is unique among financial markets in that it is open 24 h a day, five days a week. Unlike the stock market, the forex market is one of the most sophisticated markets because of its high volatility, strong nonlinearity, and irregularity [1]. The features of Forex are distinct from those of the stock, bond, and other money markets, and these distinctions give forex traders additional trading options to benefit from. The advantages of trading in the forex market include no commissions, no middlemen, no minimum lot size, low transaction costs, abundant liquidity, nearly instantaneous transactions, low margins or high leverage, and 24-h availability. The market is divided into four overlapping sessions that take place in separate time zones, with the following hours stated in GMT + 1: London (8 am–4 pm); New York (1 pm–9 pm); Sydney (10 pm–6 am); Tokyo (midnight–8 am) [9]. Online trading is available, insider trading is prohibited, and regulations are few [63]. The majority of research in foreign currency forecasting focuses on well-known markets [16]. The forex market is dominated by retail traders, central banks, commercial banks, investment firms, and hedge funds. According to [33], individuals and institutions are motivated to participate in the Forex market by their financial activities and economic demands. Large investors use the FX market to manage their portfolios and hedge exchange rate risk, whereas retail traders mostly use it to profit from short-term changes in currency rates. 
Fundamental and technical analysis are the two most prevalent approaches to evaluating the currency market. Fundamental analysis uses forex news, including inflation, interest rates, and economic growth, to forecast market trends [1]. Technical analysis, on the other hand, relies on historical price charts to give a blueprint of past price action; a technical analyst predicts the future by looking at the past. The most crucial decision in the Forex market is anticipating the trend of a currency pair's movement. Correctly predicting currency fluctuations can be highly advantageous to traders, while incorrect predictions can be costly. According to Gonz and Herman [18], the availability of data, the rise in computing power, and the popularization of machine learning algorithms have increased the use of technical analysis in forecasting the forex market. While many machine learning and deep learning methodologies are employed in finance, traders are constantly seeking fresh techniques to outsmart the market [16]. Financial time series are particularly noisy from a modeling standpoint because their fluctuations may be influenced by factors that are difficult to measure and to define precisely enough to be used as independent variables [37]. Because of the multiple uncertainties, nonlinearities, and non-Gaussian disturbances they contain, forex prices are particularly difficult to analyze [71]. Deep neural networks can solve tough prediction problems in a range of fields [15]. A key element of their effectiveness is their capacity to extract abstract features from raw data. Saiful Islam and Hossain [53] used a hybrid of the Gated Recurrent Unit and Long Short-Term Memory (GRU-LSTM) to build a deep learning framework for predicting the forex market. 
Deep learning has shown to be a game-changing technology in a range of industries, including pictures, audio, and natural speech processing, and it is now being used to track price movements in the financial sector. Zhao and Khushi [69] analyzed the Forex premises as a digital image, retrieving visual representation with convolutional neural networks (CNN) and subsequent processing with a Light gradient boosting machine (LightGBM).

In the forex market, the most important decision is whether to buy or sell based on the forecasted direction. In this study, the Hurst exponent is used to assess whether the dataset is trending before it is fed to the two-layer stacked LSTM model. According to Qian and Rasheed [43], the Hurst exponent is a numerical metric used to categorize time series and hence provides a measure of predictability. The forecasting is based on a daily timescale dataset as input, which may be useful for trend traders. Trend trading is a method of attempting to capture profits by analyzing a currency pair's movement in a specific direction. When the currency pair is heading upward, trend traders take a long position, and when the pair is trending downward, they take a short position. In trend trading, a trader stays in the market longer in order to realize a profit. As a result, anticipating the direction of a currency pair's movement, as well as the correlation between the forecasted pair and other currencies in the market, is crucial. Correlation is a statistical term that reflects the degree and direction of a link between two random variables [36]. The correlation analysis informs the trader how to trade another currency pair based on the direction of the predicted pair. Pearson’s correlation analysis is performed between the predicted pair and two other currency pairs to find the level of correlation between them. Only a small fraction of the world's official currencies is heavily traded on the foreign exchange market: eight currencies, most notably the US dollar, account for more than 95 percent of all global Forex operations in a single day. The United States dollar (USD), the Euro (EUR), the British pound (GBP), the Japanese yen (JPY), the Swiss franc (CHF), the New Zealand dollar (NZD), the Australian dollar (AUD), and the Canadian dollar (CAD) are the most commonly traded currencies. 
Since these currencies are associated with political stability, well-regarded central banks, and stable prices, this concentration of liquidity is understandable. The seven major pairs comprise the four majors, namely EUR/USD, USD/JPY, GBP/USD, and USD/CHF, and the three commodity pairs, AUD/USD, USD/CAD, and NZD/USD. For these liquidity reasons, FX portfolio trading has frequently focused, in some cases for more than a decade, on this group of seven major currency pairs, with the addition of a few other currencies of European Union region states [37]. According to Vyklyuk and Vuković [60], the most popular currency pair is USD/EUR, with a 28 percent share of the entire FX market, followed by USD/JPY with 14 percent and USD/GBP with 9 percent. The AUD/USD pair, popularly known as the "Aussie," shows traders how many US dollars (the quotation currency) are required to buy one Australian dollar (the base currency). The other two currency pairs, EUR/AUD and AUD/JPY, were chosen for Pearson’s correlation analysis against AUD/USD. The correlation was computed on the closing price of each dataset. This analysis gives a value that measures the degree of linear relationship between the pairs, that is, how a directional movement in AUD/USD will impact EUR/AUD and AUD/JPY. To make a good profit from the forex market, we set our correlation threshold at $\pm 0.80$, at which a change in AUD/USD is expected to be accompanied by a change in EUR/AUD and AUD/JPY. The sign of the correlation determines the direction of that change.

The enormous intricacy of time series problems such as FX trading can be handled by the proposed model, which combines the Hurst exponent, a two-layer stacked Long Short-Term Memory (LSTM) network, and correlation analysis. How two-layer stacked LSTM neural networks work for currency trading forecasting is examined in detail, and their efficiency is compared against other artificial neural networks: Multilayer Perceptron, single-layer LSTM (Vanilla LSTM), and CEEMDAN-IFALSTM. We looked at how to use the two-layer stacked LSTM technique to anticipate the Foreign Exchange (Forex) market trend in order to make long-term gains. The study's goal is to anticipate the trend rather than estimate the exchange rate price. According to Baasher and Fakhr [4], there are five measured rates every day: "Open," "Close," "Low," "High," and "Volume". The goal of this study is to forecast the direction of AUDUSD based on these five measured rates.

The current study contributes to knowledge as follows:

1.
A comprehensive and detailed analysis of selecting Forex dataset based on Hurst exponent
2.
The study also proposed a conceptual framework for forecasting the Forex market using two-layer stacked LSTM
3.
The study also performed a correlation analysis between different currency pairs.

The remaining sections of the paper are structured as follows. The “Literature Review” section delves deeper into time series and artificial neural networks. The “Methodology” section presents the conceptual framework, the methods, and how data were collected and analyzed. The “Analysis and Discussion” section presents the results and a detailed discussion of the outcomes. The “Conclusion and Recommendation” section summarizes the findings and suggests future studies.

Literature review

A time series is a chronological or time-oriented set of observations on a variable of interest [32]. Time series analysis is essential because it can be used to estimate future values from current ones; a frequently used real-world illustration is projecting a country's population growth from an appraisal of its current population [42]. Time series prediction assumes that patterns that occurred in the past will recur, based on the findings of the time sequence plot. Activity in the foreign exchange market can be treated as a time series. Because it operates as a conduit for international money, enabling worldwide business and investment, the foreign exchange (FOREX) market has become a significant institution [17]. It is among the world's most significant financial markets: a global marketplace where currencies can be bought or sold [8]. The FOREX market, according to Geromichalos and Jung [17], is an over-the-counter market with high bid-ask spreads and intermediation. The difference between the prices at which a dealer will buy and sell a currency is known as the bid-ask spread, and it varies from one broker to the next in the retail market. The authors of [12] used quotations from an FX dealer to examine the link between order sizes and spreads in the FX market and concluded that spreads are unrelated to order sizes in the inter-dealer market but inversely related to them in the customer market.

Artificial neural network

Artificial neural networks attempt to mimic human learning abilities by modeling brain neurons with computer models. Neurons are the nerve cells that make up the cortical layer of the nervous system [49]. Based on their architecture and the interaction between neurons, neural networks are divided into two groups: feedforward networks and feedback (recurrent) networks. A feedforward network is static: it is a collection of linked neurons that represents a nonlinear function of its inputs, and information flows only forward, from inputs to outputs [57]. Neural networks are trained to reduce the value of the loss function, which measures the overall difference between the model's output and the true label. Recurrent neural networks (RNN) are a valuable approach in deep learning. These models closely resemble how humans learn and digest information. Unlike typical feedforward neural networks, RNNs have memory [11].

Long short-term memory neural network

Sepp Hochreiter and Jürgen Schmidhuber first proposed the LSTM model in 1997. It is a recurrent neural network capable of both long- and short-term memory. According to Ulina et al. [58], LSTM cells typically include four layers, three of which are "gates" that optionally allow information to pass through. The three gates are the forget, input, and output gates. In the forget gate layer, the model decides what information from prior states to keep. At the input gate layer, the model chooses which values to update. Finally, the output gate layer determines the cell's output. Figure 1 represents the structure of a typical LSTM [3]. The ability of this network to facilitate sequential learning is due to its memory cell and its capacity to process long-term dependencies [23].

$${i}_{t}(input\, gate)= \sigma ({W}_{i}.[{h}_{t-1},{x}_{t}]+{b}_{i})$$

(1)

$${f}_{t}(forget\, gate)= \sigma ({W}_{f}.[{h}_{t-1},{x}_{t}]+{b}_{f})$$

(2)

$${O}_{t}(output\, gate)= \sigma \left({W}_{o}.\left[{h}_{t-1},{x}_{t}\right]+{b}_{O}\right)$$

(3)

$${\tilde{c }}_{t}\left(cell\right)=\mathrm{tanh}\left({W}_{c}.\left[{h}_{t-1},{x}_{t}\right]+{b}_{c}\right)$$

(4)

$${c}_{t}(cell)={f}_{t}. {c}_{t-1}+{i}_{t} .\mathrm{tanh}\left({W}_{c}.\left[{h}_{t-1},{x}_{t}\right]+{b}_{c}\right)$$

(5)

$${h}_{t}={O}_{t} .\mathrm{tanh}\left({c}_{t}\right),$$

(6)

where ${W}_{f},{W}_{o},{W}_{c} ,{W}_{i}$ represent the weights. ${b}_{i},{b}_{f},{b}_{O},{b}_{c}$ represent the bias.

$\sigma$ is the sigmoid activation function. It takes the previous hidden state and the current input and produces values between 0 and 1, such as ${f}_{t}$, that govern how much information is retained. The input gate ${i}_{t}$ is combined with the candidate value ${\tilde{c}}_{t}$ from the $\mathrm{tanh}$ layer to update the state, so that the old cell state ${c}_{t-1}$, which represents long-term memory, is replaced by the new state ${c}_{t}$. The product of ${O}_{t}$ and $\mathrm{tanh}({c}_{t})$ yields ${h}_{t}$, which is used as input to the subsequent LSTM cell.
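As an illustration of Eqs. (1)–(6), the following is a minimal pure-Python sketch of a single LSTM time step. It is not the implementation used in the study: the helper names (`affine`, `lstm_step`, `init_params`), the layer sizes, the random weight initialization, and the sample input row are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W.v + b, with W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; comments refer to the equation numbers in the text."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    i_t = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]        # Eq. (1)
    f_t = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]        # Eq. (2)
    o_t = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]        # Eq. (3)
    c_tilde = [math.tanh(a) for a in affine(params["W_c"], z, params["b_c"])]  # Eq. (4)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]     # Eq. (5)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                         # Eq. (6)
    return h_t, c_t


def init_params(n_in, n_hidden, seed=0):
    """Hypothetical initializer: small random weights, zero biases."""
    rng = random.Random(seed)
    params = {}
    for gate in "ifoc":
        params["W_" + gate] = [
            [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + n_in)]
            for _ in range(n_hidden)
        ]
        params["b_" + gate] = [0.0] * n_hidden
    return params


# Made-up, already-scaled row of five features (Open, High, Low, Close, Volume).
params = init_params(n_in=5, n_hidden=3)
h, c = [0.0] * 3, [0.0] * 3
x = [0.71, 0.72, 0.70, 0.715, 0.30]
h, c = lstm_step(x, h, c, params)
print(h, c)
```

Because ${h}_{t}$ is a gate value in (0, 1) multiplied by a $\mathrm{tanh}$ output, every component of `h` stays strictly between −1 and 1 regardless of the weights.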

Dobrovolny et al. [13] used a Long Short-Term Memory neural network to anticipate the EUR/USD price, evaluating and proposing the best time block for prediction based on daily FOREX data, and concluded that a single LSTM layer performs well.

According to Escudero et al. [14], assessing the future behavior of a currency is a primary concern for governments, financial institutions, and investors, who use this type of analysis to understand a country's economic situation and to determine when to sell and buy goods or services from that country. Time series of this sort can be forecasted with acceptable accuracy using a variety of models. However, achieving strong predictive performance is difficult due to the unpredictable character of these time series. Wei and Li [61] developed a Multi-Channel LSTM network for predicting foreign exchange rates utilizing data from different time scales. Features gleaned from several channels provide complementary information, which is used to forecast foreign exchange changes.

Stacked long short-term memory networks

A stacked LSTM is a variation of the single-hidden-layer LSTM model: it contains multiple hidden LSTM layers, each having many memory cells [15]. Stacking is what makes the model "deep", since the model becomes progressively more sophisticated with each additional hidden layer. The complexity of neural networks is typically credited with their success on a wide range of difficult prediction tasks. Stacked LSTMs have been validated on tough sequence prediction problems: as a deep recurrent neural network, a stacked LSTM uses model parameters efficiently, converges quickly, and outperforms deep feedforward neural networks [16]. Figure 2 illustrates a stacked LSTM architecture with several LSTM layers [68].
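The data flow of a two-layer stack can be sketched as follows: the first layer emits one hidden state per time step, and that sequence becomes the input sequence of the second layer. This is only an untrained forward pass under stated assumptions. The layer widths (4 and 3), the window length (10), the random weights, and the helper names `make_layer` and `run_layer` are hypothetical and are not taken from the study.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_in, n_hidden, rng):
    """Random weights and zero biases for the i, f, o gates and the candidate c."""
    return {
        name: (
            [[rng.uniform(-0.3, 0.3) for _ in range(n_hidden + n_in)]
             for _ in range(n_hidden)],
            [0.0] * n_hidden,
        )
        for name in "ifoc"
    }


def lstm_step(x_t, h_prev, c_prev, layer):
    """Standard LSTM update for one time step."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    pre = {
        name: [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(W, bias)]
        for name, (W, bias) in layer.items()
    }
    i_t = [sigmoid(a) for a in pre["i"]]
    f_t = [sigmoid(a) for a in pre["f"]]
    o_t = [sigmoid(a) for a in pre["o"]]
    c_tilde = [math.tanh(a) for a in pre["c"]]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def run_layer(sequence, layer, n_hidden):
    """Run one LSTM layer over a sequence and return every hidden state."""
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    hidden_states = []
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, layer)
        hidden_states.append(h)
    return hidden_states


rng = random.Random(42)
layer1 = make_layer(n_in=5, n_hidden=4, rng=rng)  # reads the five input features
layer2 = make_layer(n_in=4, n_hidden=3, rng=rng)  # reads layer 1's hidden states

# Made-up window of 10 days with 5 scaled features per day.
window = [[rng.random() for _ in range(5)] for _ in range(10)]

h1_seq = run_layer(window, layer1, n_hidden=4)   # full sequence from layer 1
h2_seq = run_layer(h1_seq, layer2, n_hidden=3)   # layer 2 consumes that sequence
top_state = h2_seq[-1]                           # last state of the top layer
print(top_state)
```

In a trained model, `top_state` would typically be passed to a small output layer that produces the forecast; that readout and the training loop are omitted here.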

Hurst exponent

Harold Edwin Hurst was a hydrologist who devoted nearly his entire career to reservoir control problems in Egypt [50]. He looked at how the reservoir's range moved around its average level and, assuming that successive inflows were random (i.e., statistically independent), at how that range would grow with time [50]. Working with Nile River data, Hurst obtained a dimensionless statistic by dividing the adjusted range by the standard deviation of the observations. This method is known as rescaled range analysis (R/S analysis) [50]. The Hurst exponent is a measure of fractality and long-term memory in a time series [43]. The first question to address in time series forecasting is whether the time series under consideration is predictable. If the time series is random, all approaches are likely to fail, so identifying time series with some predictability is key [43]. Because the Hurst exponent is robust and makes few assumptions about the underlying system, it is useful for time series analysis. The strength of the trending tendency increases as H approaches 1.0, so time series with a large H are more predictable than those with a small H. According to Raimundo and Okamoto Jr [52], the Hurst exponent is a statistic that can offer information on correlation and persistence in a time series. The Hurst exponent is not constant over time; it fluctuates [31].
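A minimal sketch of rescaled range analysis is given below, assuming the common textbook variant: the series is split into non-overlapping windows of doubling size, R/S is averaged per window size, and H is taken as the slope of log(R/S) against log(window size). The window schedule, the use of the population standard deviation, the function names, and the synthetic test series are assumptions for the example and may differ from the procedure used in the study.

```python
import math
import random


def rescaled_range(x):
    """R/S of one window: range of cumulative mean-adjusted sums over the std."""
    n = len(x)
    m = sum(x) / n
    y = [v - m for v in x]              # mean-adjusted series
    z, running = [], 0.0
    for v in y:
        running += v
        z.append(running)               # cumulative deviations from the mean
    r = max(z) - min(z)                 # range
    s = math.sqrt(sum(v * v for v in y) / n)  # population standard deviation
    return r / s if s > 0 else None


def hurst_exponent(series, min_window=8):
    """Estimate H as the least-squares slope of log(R/S) on log(window size)."""
    n_total = len(series)
    log_n, log_rs = [], []
    window = min_window
    while window <= n_total // 2:
        values = []
        for start in range(0, n_total - window + 1, window):
            rs = rescaled_range(series[start:start + window])
            if rs is not None:
                values.append(rs)
        if values:
            log_n.append(math.log(window))
            log_rs.append(math.log(sum(values) / len(values)))
        window *= 2
    if len(log_n) < 2:
        raise ValueError("series too short for the chosen window sizes")
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    sxx = sum((a - mean_x) ** 2 for a in log_n)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_n, log_rs))
    return sxy / sxx


# Synthetic data: independent noise, and its running sum (a persistent series).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
walk, level = [], 0.0
for step in noise:
    level += step
    walk.append(level)

h_noise = hurst_exponent(noise)
h_walk = hurst_exponent(walk)
print(h_noise, h_walk)
```

On the synthetic data, the persistent series should yield a clearly larger estimate than the independent noise, which mirrors the interpretation above: the closer H is to 1.0, the stronger the trending behavior. Short-window R/S estimates are known to be biased upward, so values for purely random data tend to land somewhat above 0.5.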

Correlation analysis

In finance, correlation is a statistical measure of the relationship between two variables, with correlation coefficients ranging from −1 to +1. A coefficient of −1 indicates a perfect negative correlation, +1 a perfect positive correlation, and zero no correlation between the two variables. Some currencies tend to move in the same direction, while others do not [47]. For someone who trades many currency pairs, this is essential knowledge: it allows the trader to hedge, diversify, or capitalize on a double position. Pearson’s correlation, Kendall rank correlation, Spearman correlation, and the Point-Biserial correlation are the four types of correlation commonly assessed in statistics [41]. The strength of the relationship is estimated by comparing the positive and negative trends. Traders use longer and multiple time frames to anticipate future market moves with Matrix Connection for Indicator Setup [19].

Pearson’s correlation

The Pearson’s product–moment correlation coefficient is a dimensionless indicator that remains unchanged when either variable is linearly transformed. In 1895, Pearson devised the mathematical formula for this crucial metric: this or a basic algebraic variant of it is the most popular formula in introductory statistics textbooks. The numerator is centered by subtracting the mean of each variable from the raw scores, and the sum of cross-products of the centered variables is obtained. According to Rodgers and Wander [25], the denominator balances the scales of the variables so that they have the same number of units. The equation defines correlation (r) as the centered and normalized sum of the cross-products of two variables. Nagpure [34] used Pearson’s correlation to calculate the correlation coefficient for the currencies USD, EUR, JPY, GBP, AUD, CAD, CHF, CNY, SEK, NZD, MXN, and INR using data from 30 to 39 years up to December 2018. The result is a matrix with the correlation coefficient for each currency pair. Currencies are linked and have an impact on one another. Currencies move in the same direction most of the time during the observed interval when they have a strong positive correlation (close to 1) and in opposite directions most of the time during the observed interval with a strong negative correlation (close to −1) [70].
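The centered and normalized sum of cross-products described above can be sketched in a few lines of Python. The closing prices below are made up for illustration and are not from the study's dataset; the 0.80 threshold is the one stated in the Introduction, and the function name `pearson_r` is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: centered cross-products over the product of scales."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least two points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]                 # center each variable
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant series")
    return numerator / denominator


# Made-up daily closing prices, for illustration only.
aud_usd = [0.700, 0.710, 0.720, 0.715, 0.730]
aud_jpy = [76.0, 76.8, 77.9, 77.2, 78.5]
eur_aud = [1.600, 1.590, 1.575, 1.580, 1.560]

THRESHOLD = 0.80  # correlation threshold quoted in the Introduction

for name, closes in (("AUD/JPY", aud_jpy), ("EUR/AUD", eur_aud)):
    r = pearson_r(aud_usd, closes)
    direction = "same direction" if r > 0 else "opposite direction"
    strong = abs(r) >= THRESHOLD
    print(f"AUD/USD vs {name}: r = {r:.3f}, {direction}, strong = {strong}")
```

Rescaling either series (for example, quoting a pair in pips rather than units) leaves `r` unchanged, which is the invariance to linear transformation noted above.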

Related research

Zanc et al. [66] developed a conceptual design for an Intelligent Trading System based on a stacked LSTM architecture module and centered on a Financial Forecasting Module. They tested the proposed LSTM-based Forecasting Module using datasets containing the evolution of the forex and cryptocurrency financial markets, finding that even if the forecasted values' error is low, the forecast cannot help the Intelligent Trading System because it is just a shifted filtered version of the actual price.

The Chinese Yuan (CNY) exchange rate is difficult to anticipate because of its complex coupling structure. This includes market-level coupling resulting from interactions with many financial markets, macro-level coupling resulting from interactions with economic fundamentals, and deep coupling resulting from the combination of the two. The authors of [7] propose DC-LSTM, a deep coupled Long Short-Term Memory (LSTM) technique for forecasting the USD/CNY exchange rate that captures these complicated couplings. To model the intricate connections, the approach uses a deep structure made up of stacked LSTMs. According to experimental results based on 10 years of data, the technique beats seven existing benchmarks. A discussion of profitability shows DC-LSTM to be a beneficial tool for making prudent investment decisions. The goal of that study is to explain why coupling learning is important for exchange rate forecasting and why a deep coupled model is effective at capturing the couplings [7].

Wei and Li [61] present a Multi-Channel LSTM network for foreign exchange rate prediction using time series data from various time scales. Features derived from the various channels provide complementary information, which is combined to forecast foreign exchange changes. To analyze the model's statistical and financial performance, they used real market data for EURUSD and EURAUD. According to the results, the Multi-Channel LSTM model outperforms baseline models in statistical measures and provides a dependable trading strategy in terms of profitability. For a successful monetary policy, the capacity to accurately forecast exchange rates is critical. When trained on input features properly constructed by domain experts, machine learning methods such as shallow neural networks provide better forecast accuracy than time series models [16]. Stacked Long Short-Term Memory (LSTM) is a deep recurrent neural network that uses model parameters efficiently, converges quickly, and performs better than deep feedforward neural networks.

According to Li et al. [24], deep neural network models such as convolutional neural networks (CNN), RNN, and Transformers outperform ordinary machine learning in many prediction tasks. Deep neural network (DNN)-based financial forecasting research, however, still has room for improvement due to the irregularity, ambiguity, and volatility of financial markets, as well as the complexity of financial time series data. The study examines 15-min timeframe (M15) historical data for the AUDUSD and EURUSD pairs to estimate future highs and lows within five timestamps. The "model stacking" method proceeds as follows: XGBoost, Random Forest, LightGBM, LSTM, and GRU were chosen as candidate models; the model stacking approach was applied to every subset of the candidate models and the results were reviewed; and the most effective subset was selected as the final components of the stack. According to the findings, stacked models outperform single models.

Lin et al. [28] proposed a new hybrid model that combines Multilayer Long Short-Term Memory (MLSTM) networks with complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN). It addresses the drawbacks of prior approaches. Using ensemble empirical mode decomposition (EEMD), data were rebuilt with less computation. To obtain an objective evaluation, they compared the suggested strategy with numerous mainstream approaches and other hybrid models. According to the findings of the study, CEEMDAN-MLSTM beats many well-known and acknowledged models in a range of test evaluations. The MLSTM model can learn more complex dependencies from exchange rate data than the standard model, and it can even learn temporal sequences. Many tests have been carried out to assess the proposed system's performance (Fig. 3).

Methodology

In line with the study objectives, the methodology covers the selection of a good dataset using the Hurst exponent, the use of a two-layer stacked Long Short-Term Memory neural network to forecast the trend, and a correlation analysis of the linear relationship between currency pairs. The methodology is an adaptation of a traditional machine learning problem-solving approach and follows the proposed conceptual framework. The basic goal of this framework is to satisfy all of the implementation needs of the proposed work and to analyze the relevance, subject matter, and context of the problem in order to understand it better. An extensive literature review identified three gaps. First, many researchers chose their datasets arbitrarily, without any clear reason. Second, LSTM has been used extensively in Forex forecasting, but its performance could be increased by adding another layer to the single layer. Third, movement in one currency affects another, just as a change in one economy affects another, hence the need to look at how a change in one currency affects another. Numerous modern machine learning platforms allow researchers to test and run various models on diverse quantities of data. After careful analysis of the research objectives, one such computing platform with strong processing capabilities was chosen for the study. Platforms with extensive machine learning and data science libraries, as well as significant online computing power, should be given special consideration. Because machine learning implementations are usually computationally intensive, an online platform that connects to supercomputing platforms for machine learning is required.

Data collection

A fundamental problem in the machine learning research paradigm is obtaining and using correct and clean data for both the training and testing phases of the process. Null columns or entries, as well as other noisy characteristics, are trimmed out during this stage to make the data cleaner. Several data cleaning processes are used depending on the scenario; see [5] for more on how to cope with missing data. According to the scientific community, these tactics offer both benefits and drawbacks. Instance selection, for example, may be used to deal with noise and to overcome the difficulties of learning with large datasets. In essence, instance selection is a problem of optimizing mining quality while reducing sample size [29]. Feature subset selection is a method for finding and removing duplicate and unnecessary features [64]. It reduces the dimensionality of the dataset, allowing for faster processing and more efficiency. When many features are unduly dependent on one another, feature transformation or construction can be used to tackle the problem. The newly created features might lead to more concise and accurate information [30]. Autoencoders or label-encoders could be used as part of data preparation to convert some features into categorical data formats for simple adaptation to the models. We used the daily time frame to provide a more trending view of the market, without any technical indicators, for a more reliable picture of price movement. Our research employed daily (1D) exchange rate datasets for the AUDUSD, EURUSD, and AUDJPY collected from FXpro (broker) through the Metatrader 4 trading platform, which is the most extensively used trading platform among brokers according to [19]. The forecasting is done with AUDUSD. For the Pearson’s correlation analysis, the close price of each dataset is retrieved. Each dataset has 1772 records, with a period spanning from 2013/05/02 to 2020/03/01.

Features of the dataset

1.
Date: the date on which the price action happened in the market, in this case covering a 24-h period.
2.
Open: the starting price for the period. Since it is the consensus price after all interested parties have had time to "sleep on it," the open is especially important when reviewing daily data.
3.
Close: the price of the most recent trade made within the period. This is the price most often considered in analysis. Most specialists regard the relationship between the open and close prices as noteworthy.
4.
High: the asset's highest price during the period. It is the point at which sellers outnumbered buyers.
5.
Low: the asset's lowest price during the period. It is the point at which buyers outnumbered sellers.
6.
Volume: the total quantity of the asset traded within the period. The relationship between price and volume is critical; for example, price increases are often accompanied by increases in volume [62].

Hurst exponent

The Hurst exponent is a statistic that can be used to detect whether a time series is correlated and persistent [52]. Its value is used to examine whether or not a financial time series is predictable [31]. The Hurst exponent generally takes values between 0 and 1. According to the literature, it can be calculated in several ways; this study adopts rescaled range (R/S) analysis, which is based on the ratio of the range to the standard deviation. The approach works by examining the average rescaled range of cumulative deviations from the series' mean [31]. The procedure for computing the R/S statistic of a financial data series is as follows:

In terms of the rescaled range, the Hurst exponent, H, is defined as follows:

$$E\left[\frac{R(n)}{S(n)}\right]=C{n}^{H}\quad \text{as } n\to \infty$$

(7)

where $\frac{R(n)}{S(n)}$ is the rescaled range, $E\left[x\right]$ denotes the expected value, $n$ is the number of observations (so that ${X}_{n}$ is the last observation in the input time series), and $C$ is a constant.

The Hurst exponent is a measure of autocorrelation (persistence and long memory). For a time series

$$X={X}_{1}, {X}_{2}, {X}_{3},\dots , {X}_{n},$$

the rescaled range is computed as follows:

1.
Calculate the mean:
$$m=\frac{1}{n}\sum_{i=1}^{n}{X}_{i}$$
(8)
2.
Create the mean-adjusted series:
$${Y}_{t}={X}_{t}-m\quad \text{for } t=1,2,\dots ,n$$
(9)
3.
Calculate the cumulative deviate series Z:
$${Z}_{t}=\sum_{i=1}^{t}{Y}_{i}\quad \text{for } t=1,2,\dots ,n$$
(10)
4.
Compute the range series R:
$${R}_{t}=\max\left({Z}_{1},{Z}_{2},\dots ,{Z}_{t}\right)-\min\left({Z}_{1},{Z}_{2},\dots ,{Z}_{t}\right)\quad \text{for } t=1,2,\dots ,n$$
(11)
5.
Compute the standard deviation series S:
$${S}_{t}=\sqrt{\frac{1}{t}\sum_{i=1}^{t}{\left({X}_{i}-u\right)}^{2}}\quad \text{for } t=1,2,\dots ,n$$
(12)
where $u$ is the mean of the time series values ${X}_{1}, {X}_{2}, \dots , {X}_{t}$.
6.
Determine the rescaled range series (R/S):
$${\left(R/S\right)}_{t}=\frac{{R}_{t}}{{S}_{t}}\quad \text{for } t=1,2,\dots ,n.$$
(13)
7.
Estimate the Hurst exponent by fitting the power-law model
$$E\left[\frac{R\left(n\right)}{S\left(n\right)}\right]=C\times {n}^{H}$$
(14)
to the data.
8.
This is done by taking logarithms of both sides and fitting a straight line; the slope of the line is the estimate of the Hurst exponent H.
9.
This approach is known to produce a biased estimate of the power-law exponent: for small $n$ there is a significant deviation from the slope of 0.5. The theoretical value of the R/S statistic for white noise, according to Anis and Lloyd, is:
$$E\left[\frac{R\left(n\right)}{S\left(n\right)}\right]=\left\{\begin{array}{ll}\frac{\Gamma \left(\frac{n-1}{2}\right)}{\sqrt{\pi }\,\Gamma \left(\frac{n}{2}\right)}\sum_{i=1}^{n-1}\sqrt{\frac{n-i}{i}}, & \text{for } n\le 340\\ \frac{1}{\sqrt{n\frac{\pi }{2}}}\sum_{i=1}^{n-1}\sqrt{\frac{n-i}{i}}, & \text{for } n>340\end{array}\right.$$
(15)

where Γ is the Euler gamma function.
10.
While an asymptotic distribution theory has yet to be derived for most Hurst exponent estimators, an approximate functional form is available for the Anis-Lloyd-corrected R/S estimate. The lower (LL) and upper (UL) bounds of the 95 percent confidence interval are:
$$LL=0.5-\frac{{e}^{4.21}}{{\left(\mathrm{ln}\,M\right)}^{7.33}}$$
(16)
$$UL=0.5+\frac{{e}^{4.77}}{{\left(\mathrm{ln}\,M\right)}^{3.10}}$$
(17)

where $M={\mathrm{log}}_{2}N$ and $N$ is the series length.
11.
Lastly, the Anis-Lloyd-corrected R/S Hurst exponent is calculated as 0.5 plus the slope of
$$\frac{R\left(n\right)}{S\left(n\right)}-E\left[\frac{R\left(n\right)}{S\left(n\right)}\right]$$
(18)

This procedure follows Hurst et al. [21] and Shah and Parikh [56].
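
The R/S steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. It assumes the classical windowing scheme, in which the R/S statistic is averaged over non-overlapping windows of doubling size, and it omits the Anis-Lloyd correction. The function names, the `min_window` parameter, and the synthetic test series are all hypothetical.

```python
import itertools
import math
import random


def rescaled_range(window):
    """Return R/S for one window (steps 1-6), or None if the window is constant."""
    n = len(window)
    mean = sum(window) / n                                 # step 1: mean m
    deviations = [x - mean for x in window]                # step 2: Y_t
    cumulative = list(itertools.accumulate(deviations))    # step 3: Z_t
    value_range = max(cumulative) - min(cumulative)        # step 4: R
    std = math.sqrt(sum(d * d for d in deviations) / n)    # step 5: S
    return value_range / std if std > 0 else None          # step 6: R/S


def hurst_rs(series, min_window=8):
    """Estimate H as the slope of log(mean R/S) against log(window size)."""
    log_sizes, log_rs = [], []
    size = min_window
    while size <= len(series) // 2:
        # Non-overlapping windows of the current size (an assumed scheme).
        windows = [series[i:i + size]
                   for i in range(0, len(series) - size + 1, size)]
        values = [rs for rs in map(rescaled_range, windows) if rs is not None]
        if values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
        size *= 2
    if len(log_sizes) < 2:
        raise ValueError("series too short for an R/S fit")
    # Steps 7-8: least-squares slope of the log-log line.
    mean_x = sum(log_sizes) / len(log_sizes)
    mean_y = sum(log_rs) / len(log_rs)
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in zip(log_sizes, log_rs))
    denominator = sum((x - mean_x) ** 2 for x in log_sizes)
    return numerator / denominator


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(2048)]  # synthetic, uncorrelated
walk = list(itertools.accumulate(noise))                # synthetic, persistent
h_noise, h_walk = hurst_rs(noise), hurst_rs(walk)
print(f"H(noise) = {h_noise:.3f}, H(walk) = {h_walk:.3f}")
```

On synthetic data the uncorrelated series should give an estimate near 0.5 (somewhat above it, because of the small-sample bias noted in step 9), while the cumulative-sum series should give a clearly larger value.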

Two-layer stacked LSTM

As shown in Fig. 4, the LSTM network architecture [24] used in the numerical experiments consists of one input layer, two LSTM layers, and one output layer. The output layer has one neuron, whereas each of the two LSTM layers has 128 LSTM cells. The predicted exchange rate ${x}_{k}$ is determined by ${h}_{k-1}$:

$${x}_{k}={W}_{s}{h}_{k-1}+{b}_{s}$$

(19)

${W}_{s}$ denotes the weight, while ${b}_{s}$ denotes the bias. For each LSTM cell, ${h}_{k-1}$ depends on ${x}_{1},{x}_{2},\dots ,{x}_{k-1}$ from the input layer and on the earlier cell outputs ${h}_{1},{h}_{2},\dots ,{h}_{k-2}$.

$${h}_{k}={O}_{k}\cdot \mathrm{tanh}\left({C}_{k}\right),$$

(20)

where

$${C}_{k}\;(\text{cell})={F}_{k}\cdot {C}_{k-1}+{I}_{k}\cdot \mathrm{tanh}\left({W}_{x}^{C}{x}_{k}+{W}_{h}^{C}{h}_{k-1}+{b}_{C}\right)$$

(21)

$${F}_{k}\;(\text{forget gate})=\sigma \left({W}_{x}^{F}{x}_{k}+{W}_{h}^{F}{h}_{k-1}+{b}_{F}\right)$$

(22)

$${I}_{k}\;(\text{input gate})=\sigma \left({W}_{x}^{I}{x}_{k}+{W}_{h}^{I}{h}_{k-1}+{b}_{I}\right)$$

(23)

$${O}_{k}\;(\text{output gate})=\sigma \left({W}_{x}^{O}{x}_{k}+{W}_{h}^{O}{h}_{k-1}+{b}_{O}\right)$$

(24)

Here ${W}_{x}^{F}$, ${W}_{h}^{F}$, ${W}_{x}^{I}$, ${W}_{h}^{I}$, ${W}_{x}^{O}$, ${W}_{h}^{O}$, ${W}_{x}^{C}$, and ${W}_{h}^{C}$ represent the weights, and ${b}_{F}$, ${b}_{I}$, ${b}_{O}$, and ${b}_{C}$ represent the biases.

$\sigma$ is the sigmoid activation function. The forget gate ${F}_{k}$ looks at the state from the previous step and produces a value between 0 and 1 that governs how much of the old information is retained. The input gate ${I}_{k}$ is combined with the candidate value produced by the $\mathrm{tanh}$ layer to update the state, so that the old cell state ${C}_{k-1}$, which represents long-term memory, is replaced by the new state ${C}_{k}$. The product of ${O}_{k}$ and $\mathrm{tanh}({C}_{k})$ yields ${h}_{k}$, which is used as the input to the subsequent LSTM cell.
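
To make the gate equations concrete, the sketch below runs Eqs. (19) to (24) for a two-layer stack in plain Python. It is an illustration only: each layer is reduced to a single hidden unit (the model in the text uses 128 per layer) so that every weight is a scalar, and the class name, weights, and input values are hypothetical rather than taken from the trained model.

```python
import math
from dataclasses import dataclass


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


@dataclass
class ScalarLSTMCell:
    """One-unit LSTM cell; each gate holds its (w_x, w_h, b) parameters."""
    forget_gate: tuple
    input_gate: tuple
    output_gate: tuple
    candidate: tuple

    def step(self, x, h_prev, c_prev):
        def affine(params):
            w_x, w_h, b = params
            return w_x * x + w_h * h_prev + b

        f = sigmoid(affine(self.forget_gate))                    # Eq. (22)
        i = sigmoid(affine(self.input_gate))                     # Eq. (23)
        o = sigmoid(affine(self.output_gate))                    # Eq. (24)
        c = f * c_prev + i * math.tanh(affine(self.candidate))   # Eq. (21)
        h = o * math.tanh(c)                                     # Eq. (20)
        return h, c


def stacked_forecast(inputs, layer1, layer2, w_s, b_s):
    """Feed a sequence through two stacked cells and apply Eq. (19)."""
    h1 = c1 = h2 = c2 = 0.0
    for x in inputs:
        h1, c1 = layer1.step(x, h1, c1)
        h2, c2 = layer2.step(h1, h2, c2)  # layer 2 reads layer 1's output
    return w_s * h2 + b_s


# Hypothetical weights and scaled closing prices, for illustration only.
layer1 = ScalarLSTMCell((0.5, 0.1, 0.0), (0.6, -0.2, 0.1),
                        (0.3, 0.2, 0.0), (0.9, 0.4, 0.0))
layer2 = ScalarLSTMCell((0.2, 0.3, 0.1), (0.7, 0.1, 0.0),
                        (0.5, -0.1, 0.0), (0.8, 0.2, 0.0))
scaled_closes = [0.42, 0.45, 0.44, 0.47]
forecast = stacked_forecast(scaled_closes, layer1, layer2, w_s=1.0, b_s=0.0)
print(round(forecast, 4))
```

The only difference between a single-layer and a stacked network in this sketch is the second `step` call, which treats the first layer's hidden state as its input at every time step.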

Pearson’s correlation

Pearson’s correlation coefficient r is calculated as follows:

$$r=\frac{\sum ({X}_{i}-\overline{X })({Y}_{i}-\overline{Y })}{[\sum ({X}_{i}-\overline{X }{)}^{2}\sum ({Y}_{i}-\overline{Y }{)}^{2}{]}^{1/2}}$$

(25)

Here $r$ denotes the correlation coefficient, ${X}_{i}$ and ${Y}_{i}$ are the individual sample values of the two variables, and $\overline{X}$ and $\overline{Y}$ are their respective sample means.

The 'Close' feature of each of the AUDUSD, EURAUD, and AUDJPY datasets was extracted and used to calculate the correlation between the x and y values of each pair.
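
As a worked illustration of Eq. (25), the sketch below computes r for two short lists of closing prices. The function name and the price values are hypothetical and are not taken from the study's datasets.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences, Eq. (25)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical closing prices for two pairs over five days.
close_a = [0.700, 0.705, 0.710, 0.708, 0.715]
close_b = [75.10, 75.60, 76.00, 75.90, 76.40]
print(round(pearson_r(close_a, close_b), 4))
```

The same function applied to the full 'Close' columns of two datasets would give one entry of a correlation table such as Table 2.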

Decision making

The correlation coefficient (r) is used to advise a trader on whether to trade EUR/AUD and AUD/JPY, depending on how strongly each is related to AUD/USD. To favor high performance and a good return, we encourage trading the correlated pair only if the correlation coefficient (r) is ± 0.8 or stronger, as stated in Fig. 3.

Evaluation metrics

Evaluation metrics are needed to judge whether a model is adequate. A survey of the available evaluation metrics can reveal which ones suit a particular dataset or problem. Regression is one of the most common types of machine learning problem, and several metrics can be used to evaluate regression predictions. Some are:

1.
Mean Squared Error
2.
Root Mean Square Error
3.
Mean Absolute Error

Mean Square Error

In statistics, the Mean Square Error (MSE), or Mean Squared Deviation (MSD), of an estimator (a procedure for estimating an unobserved quantity) measures the average of the squared errors, that is, the average squared difference between the estimated values and the actual values. MSE is a risk function corresponding to the expected value of the squared error loss. Because of randomness, or because the estimator does not account for information that could yield a more precise estimate, MSE is almost always strictly positive (rather than zero). Mathematically, it is defined as follows:

$$MSE=\frac{1}{n}\sum_{i=1}^{n}{\left({Y}_{i}-{\widehat{Y}}_{i}\right)}^{2}$$

(26)

where ${Y}_{i}$ is the actual (measured) value, ${\widehat{Y}}_{i}$ is the predicted value, and n is the sample size.

Root Mean Square Error

The Root Mean Square Error (RMSE) is the standard deviation of the residuals (prediction errors). Residuals measure how far the data points lie from the regression line, and the RMSE measures how spread out these residuals are. In other words, it shows how tightly the data are concentrated around the line of best fit. The RMSE is commonly used to analyze experimental results in climatology, forecasting, and regression analysis. It is defined as follows:

$$\mathrm{RMSE}={\left[\frac{1}{n}\sum_{i=1}^{n}{\left({Y}_{i}-{\widehat{Y}}_{i}\right)}^{2}\right]}^{\frac{1}{2}}$$

(27)

where ${Y}_{i}$ is the actual (measured) value, ${\widehat{Y}}_{i}$ is the predicted value, and n is the sample size.

Mean Absolute Error

The Mean Absolute Error (MAE) is a metric for evaluating regression models. The MAE of a model on a test set is the average of the absolute values of the individual prediction errors over all instances in the test set, where the prediction error of an instance is the difference between its actual and predicted values. It may be expressed as follows:

$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|{y}_{i}-{\widehat{y}}_{i}\right|$$

(28)

where ${y}_{i}$ and ${\widehat{y}}_{i}$ are the actual and predicted values of the data, respectively, and n is the sample size [24].
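
The three metrics can be computed directly from their standard definitions, as in the sketch below. The function names and the sample values are hypothetical and serve only to show the arithmetic.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))


def mean_absolute_error(actual, predicted):
    """Average of the absolute prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and predicted values.
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(mean_squared_error(actual, predicted))       # 4/3
print(root_mean_squared_error(actual, predicted))  # sqrt(4/3)
print(mean_absolute_error(actual, predicted))      # 2/3
```

Because the errors are squared before averaging, MSE and RMSE penalize the single large error in this example more heavily than MAE does.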

Analysis and discussion

To obtain the desired results, this section implements and tests the proposed method. The data cleaning procedure was tuned with an understanding of the AUD/USD dataset under consideration, so that the data are as clean as possible for the underlying approach. Colab, Google's online machine learning platform, is used for a range of purposes, including research that requires substantial computing capacity and many machine learning libraries. It allows machine learning practitioners to create and run Python programs on Google's cloud servers with little setup. With features such as simple sharing and free access to online Graphics Processing Units (GPUs), the platform is an excellent fit for our project. It also bundles the main machine learning libraries, which made the decision even easier.

Of the dataset, 75% was randomly chosen for training, and the remaining 25% was set aside for testing. Furthermore, feature scaling is used to bring all features to the same order of magnitude; that is, the data are rescaled to the range 0 to 1.

Data scaling: MinMaxScaler

MinMaxScaler rescales each feature to a fixed range, [0, 1] by default; another range, such as [−1, 1], can be chosen if required. When the data contain large outliers, this scaling can compress the inliers into a very narrow range (for example, [0, 0.005]). Missing data were handled with interpolation, and min-max scaling was used to scale the data. Each feature is scaled and translated individually so that it lies in the specified range on the training dataset [66].

The conversion is given by:

$$MinMaxScaler({x}_{i})=\frac{{x}_{i}-\mathrm{min}(x)}{\mathrm{max}\left(x\right)-\mathrm{min}(x)}$$

(29)

This transformation is often used as an alternative to zero-mean, unit-variance scaling. Because small, bounded values are easier for backpropagation to handle, the data are scaled to between 0 and 1.
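
Eq. (29) can be sketched without any library, as below. The example fits the minimum and maximum on the training values only and reuses them for the test values, which is one common convention and is assumed here rather than confirmed by the text. The function names and prices are hypothetical.

```python
def fit_min_max(train_values):
    """Return the (min, max) pair learned from the training values."""
    lo, hi = min(train_values), max(train_values)
    if hi == lo:
        raise ValueError("cannot scale a constant feature")
    return lo, hi


def min_max_scale(values, lo, hi):
    """Apply Eq. (29) using a previously fitted minimum and maximum."""
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical closing prices.
train_close = [0.70, 0.72, 0.75, 0.80]
test_close = [0.78, 0.82]

lo, hi = fit_min_max(train_close)
scaled_train = min_max_scale(train_close, lo, hi)
scaled_test = min_max_scale(test_close, lo, hi)
print(scaled_train)
print(scaled_test)
```

Training values always land in [0, 1], but a test value above the training maximum (0.82 here) scales to slightly more than 1, so the [0, 1] guarantee holds only for the data the scaler was fitted on.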

Results and discussion

Hurst exponent

Applying the Hurst exponent to our dataset gave us assurance that the dataset was suitable for forecasting (Fig. 5).

The Hurst exponent, H, has values ranging from 0 to 1, and a time series can be categorized into one of three groups based on its value:

1.
A random series is indicated by H = 0.5, labeled ‘gbm’ (geometric Brownian motion);
2.
An anti-persistent series is indicated by 0 < H < 0.5, labeled ‘mean_rev’ (mean reverting);
3.
A persistent series is indicated by 0.5 < H < 1, labeled ‘trending’.

The 'mean reversion' property of an anti-persistent series suggests that an upward trend is likely to be followed by a downward trend, and vice versa. The strength of the mean reversion grows as H approaches 0.0. A persistent series, on the other hand, is trend reinforcing, meaning that the direction of future values (up or down) will most likely be the same as that of the current values [44].

The slope of the fitted line in the graph gives the Hurst exponent. For our dataset of 1772 records and five features, the Hurst exponent is 0.6026, indicating that the dataset is trending and suitable for forecasting in the forex market. The Closing price, which is the final price of the day, was used as the target feature (the dependent variable), and the trend is determined from it (Fig. 6).
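
The three-way classification can be written as a small helper, sketched below. It assumes the usual reading of the labels, in which 'gbm' denotes a random walk, 'mean_rev' an anti-persistent series, and 'trending' a persistent one. The function name and the `tol` parameter are hypothetical.

```python
def classify_hurst(h, tol=0.0):
    """Label a Hurst exponent as 'mean_rev', 'gbm', or 'trending'."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("the Hurst exponent must lie in [0, 1]")
    if abs(h - 0.5) <= tol:
        return "gbm"          # random series
    if h < 0.5:
        return "mean_rev"     # anti-persistent series
    return "trending"         # persistent series


print(classify_hurst(0.6026))  # value reported for the AUD/USD dataset
```

In practice an estimate is never exactly 0.5, so a nonzero `tol` (or the confidence bounds of Eqs. (16) and (17)) would be needed to decide when a series is indistinguishable from random.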

Two-layer stacked LSTM

The first LSTM layer of the stack has 128 hidden units and a rectified linear unit ('relu') activation function, and the second LSTM layer likewise has 128 units and a 'relu' activation. Activation functions add nonlinearity to the model, allowing deep learning models to learn nonlinear prediction boundaries. One of the disadvantages of neural networks is their tendency to overfit, and regularization is a technique for preventing overfitting. Two basic regularization techniques are dropout and early stopping; we used a dropout rate of 0.2 when training our model. The number of epochs, that is, the number of times the full training dataset is presented to the neural network, was set to 32. The batch size, the number of training examples processed in one batch, was set to 100. Optimizers are methods for reducing the loss by updating attributes of the neural network such as its weights and learning rate; they are used to solve the underlying minimization problem. We chose the Adam optimizer for our model because it has a lower training cost.
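
The size of such a network and the amount of training implied by these settings can be worked out with simple arithmetic, as sketched below. The count follows the gate structure of Eqs. (21) to (24), where each of the four blocks has input weights, recurrent weights, and a bias. The input dimension of 1 is an assumption (the text does not state how many features are fed to the network), and any look-back windowing, which would slightly reduce the number of training samples, is ignored.

```python
import math


def lstm_parameter_count(input_dim, units):
    """Trainable parameters of one LSTM layer: four blocks of W_x, W_h, and b."""
    per_block = units * input_dim + units * units + units
    return 4 * per_block


UNITS = 128            # units per LSTM layer (from the text)
INPUT_DIM = 1          # assumption: one input feature per time step
EPOCHS = 32            # from the text
BATCH_SIZE = 100       # from the text
N_RECORDS = 1772       # from the text
TRAIN_FRACTION = 0.75  # from the text

layer1 = lstm_parameter_count(INPUT_DIM, UNITS)
layer2 = lstm_parameter_count(UNITS, UNITS)
output_layer = UNITS + 1          # one neuron: 128 weights and 1 bias
total = layer1 + layer2 + output_layer

n_train = int(N_RECORDS * TRAIN_FRACTION)
batches_per_epoch = math.ceil(n_train / BATCH_SIZE)
weight_updates = EPOCHS * batches_per_epoch

print(layer1, layer2, output_layer, total)
print(n_train, batches_per_epoch, weight_updates)
```

Under these assumptions the second LSTM layer holds about twice as many parameters as the first, because its input is the 128-dimensional output of the first layer rather than a single feature.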

Figure 7 plots the true trend against the predicted trend, indicating a downward trend of AUDUSD. A trader can therefore take a sell position and, if the price continues to fall, gain from the market. Our proposed two-layer LSTM is compared with the baseline models using the evaluation metrics.

From the MSE, RMSE, and MAE values in Table 1, we can conclude that our proposed model outperforms the baseline MLP, LSTM, and CEEMDAN-IFA-LSTM models. The first section of the table compares the proposed model with the baseline MLP and LSTM models on our chosen AUD/USD dataset from April 1, 2013 to December 30, 2020; the bolded results show that the proposed model beats the baselines. Table 1 also compares the proposed model with the work of Ulina et al. [58], who proposed CEEMDAN-IFA-LSTM for forecasting AUD/USD on a daily dataset spanning January 1, 2010, to December 30, 2019. Their dataset was obtained from www.investing.com and fed to our proposed model, and the results show that our model performed better in terms of RMSE and MAE, as shown in Table 1.

Table 1 Proposed model comparison with baseline models based on evaluation metrics


Correlation analysis

Table 2 shows the results of Pearson's correlation analysis between the Closing prices of the AUDJPY, AUDUSD, and EURAUD datasets. From the table, the r value as stated in Fig. 3 between AUDJPY and AUDUSD is 0.798979, indicating a positive relationship between these two pairs: an uptrend in AUDJPY tends to coincide with an uptrend in AUDUSD, and a downtrend in AUDUSD with a downtrend in AUDJPY. The r value between AUDUSD and EURAUD is −0.639265, showing that the relationship between these pairs is negative: a rise in AUDUSD tends to coincide with a fall in EURAUD. For a positive correlation, the closer the value is to 1, the stronger the relationship; for a negative correlation, the closer the value is to −1, the stronger the relationship. A negative correlation implies that when selling one pair the trader should buy the other. A positive correlation implies that when selling one pair the trader should also sell the other, and when buying one pair should also buy the other. Since we set the threshold on the magnitude of the correlation at $\pm 0.80$, we would not advise a trader who is trading AUDUSD to also trade AUDJPY or EURAUD: neither correlation met our threshold, and we are interested only in strong correlations that increase the chances of profiting from the market.
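
The decision rule can be expressed as a short function and applied to the two coefficients reported above. This is a sketch of the rule as described in the text, not code from the study. The function name and the wording of the returned advice are hypothetical, and the second pair is labeled EURAUD following the caption of Table 2.

```python
def trade_advice(r, threshold=0.80):
    """Map a correlation with AUD/USD to advice for the second pair."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    if abs(r) < threshold:
        return "do not trade"
    if r > 0:
        return "trade in the same direction"
    return "trade in the opposite direction"


# Correlation coefficients with AUD/USD quoted in the text.
reported = {"AUDJPY": 0.798979, "EURAUD": -0.639265}
for pair, r in reported.items():
    print(pair, r, trade_advice(r))
```

Both reported coefficients fall below the 0.80 threshold in magnitude, so the rule returns "do not trade" for each, although AUDJPY misses the threshold only narrowly.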

Table 2 Pearson’s correlation coefficient between AUDJPY, AUDUSD, EURAUD


Conclusion and recommendation

Conclusions

The paper presented a conceptual framework centered on a Forex forecasting module built from the Hurst exponent, a two-layer stacked LSTM architecture, and correlation analysis. Using AUDUSD datasets, we assessed the proposed forecasting module and ran a Pearson’s correlation study on AUD/USD, EUR/AUD, and AUD/JPY. Because of the considerable income and economic benefits it brings, Forex forecasting is an appealing research field, and academics are interested in it as a difficult time series problem. Numerous studies, using both statistical and machine learning approaches, have been undertaken on currency forecasting. First, a literature review on currency forecasting was carried out in this paper. The suitability of the proposed model for forex market time series was then compared with that of SL-LSTM, MLP, and CEEMDAN-IFA-LSTM. The module outperformed MLP, single-layer LSTM, and CEEMDAN-IFA-LSTM in terms of forgetting, remembering, and updating information. The proposed framework is well suited to learning from experience to classify, analyze, and predict time series with long time lags of unknown duration between important events. Time series prediction depends not only on the present value of the observed variable but also on its past values. The outcomes of the study are therefore encouraging: the proposed model beats the standard single-layer LSTM, MLP, and CEEMDAN-IFA-LSTM by reducing the Mean Square Error, Root Mean Square Error, and Mean Absolute Error. This research shows that using the Hurst exponent to select the dataset and adding another layer to the LSTM can outperform other models. The Hurst exponent is an important tool in financial time series analysis. It gave us an indication that our dataset was trending and suitable for prediction, and thus that it could be relied upon when making key decisions. Correlation was used to examine the linear relationship between instruments. Highly correlated currency pairs can significantly influence one another, and a high correlation can be used to advise traders on whether to trade one pair based on what is happening with the other.

Recommendations

The following recommendations are made, based on the findings of the study, to further improve performance.

1.
While this study used the AUD/USD pair as an example for Forex forecasting, the model may also be applied to other major currency pairs, such as EUR/USD, the world's most traded currency pair.
2.
The size of the AUD/USD dataset could be varied to assess the model's performance on much larger data. This could also help to reduce overfitting.
3.
For future work, the performance of the model may be enhanced for forex time series forecasting by using a hybrid of a two-layer stacked LSTM and a GRU (Gated Recurrent Unit), for example to forecast EURUSD.
4.
Future studies can carry out a comparative analysis of deep learning models for Forex forecasting to see whether other models can outperform the stacked LSTM.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

FOREX: Foreign Exchange
FX: Foreign Exchange
LSTM: Long Short-Term Memory Network
AUD/USD: Australian Dollar and United States Dollar
EUR/AUD: Euro and Australian Dollar
AUD/JPY: Australian Dollar and Japanese Yen
H: Hurst exponent
CEEMDAN-IFA-LSTM: Complete Ensemble Empirical Mode Decomposition with Adaptive Noise–Improved Firefly Algorithm Long Short-Term Memory
ANN: Artificial Neural Network
DNN: Deep Neural Network
FNN: Feedforward Neural Network
RNN: Recurrent Neural Network
SRNN: Simple Recurrent Neural Network
GRU: Gated Recurrent Unit
CNN: Convolutional Neural Network
MLP: Multilayer Perceptron
MSE: Mean Squared Error
RMSE: Root Mean Square Error
MAE: Mean Absolute Error
r: Correlation Coefficient
ReLU: Rectified Linear Unit
OTC: Over-the-Counter
HCRBFNON: Hybrid Chaotic Radial Basis Function Neural Oscillatory Network Model

References

Ahmed S, Hassan SU, Aljohani NR, Nawaz R (2020) FLF-LSTM: a novel prediction system using Forex Loss Function. Appl Soft Comput J 97:106780. https://doi.org/10.1016/j.asoc.2020.106780
Alameer Z, Elaziz MA, Ewees AA, Ye H, Jianhua Z (2019) Forecasting gold price fluctuations using improved multilayer perceptron neural network and whale optimization algorithm. Resour Policy 61(January):250–260. https://doi.org/10.1016/j.resourpol.2019.02.014
Aryal S, Nadarajah D, Kasthurirathna D, Rupasinghe L, Jayawardena C (2019) Comparative analysis of the application of Deep Learning techniques for Forex Rate prediction 329(1):329–333
Baasher AA, Fakhr MW (2011) Forex trend classification using machine learning techniques. In: Proceedings of the 11th WSEAS international conference on applied computer science, January 2011, pp 41–47. http://www.wseas.us/e-library/conferences/2011/Penang/ACRE/ACRE-05.pdf
Batista GEAPA, Monard MC (2003) An analysis of four missing data treatment methods for supervised learning. Appl Artif Intell 17(5–6):519–533. https://doi.org/10.1080/713827181
BIS (2019) Foreign exchange turnover in April 2019: preliminary global result. Triennial Central Bank Survey, September, 24. https://www.bis.org/statistics/rpfx19_fx.pdf
Cao W, Zhu W, Wang W, Demazeau Y, Zhang C (2020) A deep coupled LSTM approach for USD/CNY exchange rate forecasting. IEEE Intell Syst 35(2):43–53. https://doi.org/10.1109/MIS.2020.2977283
Contreras AV, Llanes A, Pérez-Bernabeu A, Navarro S, Pérez-Sánchez H, López-Espín JJ, Cecilia JM (2018) ENMX: an elastic network model to predict the FOREX market evolution. Simul Model Pract Theory 86:1–10. https://doi.org/10.1016/j.simpat.2018.04.008
Czarnowski I, Caballero AM, Howlett RJ, Jain LC (2016) Preface. Smart Innov Syst Technol 56:v. https://doi.org/10.1007/978-3-319-39627-9
D’Lima N, Khan SS (2015a) FOREX rate prediction using a Hybrid System. 3:4–8
Dautel AJ, Härdle WK, Lessmann S, Seow H-V (2020) Forex exchange rate forecasting using deep recurrent neural networks. Digital Finance 2(1–2):69–96. https://doi.org/10.1007/s42521-020-00019-x
Ding L (2009) BID-ask spread and order size in the foreign exchange market: An empirical investigation. Int Rev Econ Finance 14(1):98–105. https://doi.org/10.1002/ijfe.365
Dobrovolny M, Soukal I, Lim KC, Selamat A, Krejcar O (2020) Forecasting of FOREX price trend using recurrent neural network - long short-term memory. Proc Int Sci Conf Hradec Econ Days 2020 10:95–103. https://doi.org/10.36689/uhk/hed/2020-01-011
Escudero P, Alcocer W, Paredes J (2021) Recurrent neural networks and ARIMA models for euro/dollar exchange rate forecasting. Appl Sci (Switzerland) 11(12):1. https://doi.org/10.3390/app11125658
Galeshchuk S (2017) Deep networks for predicting direction of change in foreign exchange rates. April 2016. https://doi.org/10.1002/isaf.1404
Galeshchuk S, Mukherjee S (2017) Deep learning for predictions in emerging currency markets. In: ICAART 2017 - Proceedings of the 9th International Conference on Agents and Artificial Intelligence 2:681–686. https://doi.org/10.5220/0006250506810686
Geromichalos A, Jung KM (2018) An over-the-counter approach to the forex market. Int Econ Rev 59(2):859–905. https://doi.org/10.1111/iere.12290
Gonz C, Herman M (2018) Foreign exchange forecasting via machine learning
Handayani I, Rahardja U, Febriyanto E, Yulius H, Aini Q (2019) Longer time frame concept for foreign exchange trading indicator using matrix correlation technique. In: Proceedings of 2019 4th international conference on informatics and computing, ICIC 2019. https://doi.org/10.1109/ICIC47613.2019.8985709
Jung G, Choi S (2021) Autoencoder-LSTM Techniques
Hurst T, Hurst HE, Otto L (2010) Hurst exponent Generalized exponent, pp 4–5
Kondratenko VV, Kuperin YA (2003) Using Recurrent Neural Networks To Forecasting of Forex. http://arxiv.org/abs/cond-mat/0304469
Kumar K, Haider MTU (2021) Enhanced prediction of intra-day stock market using metaheuristic optimization on RNN–LSTM network. In: New Generation Computing (vol 39, Issue 1). Ohmsha. https://doi.org/10.1007/s00354-020-00104-0
Lee CI, Chang CH, Hwang FN (2019) Currency exchange rate prediction with long short-term memory networks based on attention and news sentiment analysis. In: Proceedings - 2019 international conference on technologies and applications of artificial intelligence, TAAI 2019, March. https://doi.org/10.1109/TAAI48200.2019.8959884
Lee Rodgers J, Wander AN (1988) Thirteen ways to look at the correlation coefficient. Am Stat 42(1):59–66. https://doi.org/10.1080/00031305.1988.10475524
Leslie Tiong Ching Ow DCLN, Y L (2016) Prediction of forex trend movement using. 2(2):117–140
Li Y, Xie Y, Yu C, Yu F, Jiang B, Khushi M (n.d.) Feature importance recap and stacking models for forex price prediction
Lin H, Sun Q, Chen SQ (2020) Reducing exchange rate risks in international trade: A hybrid forecasting approach of CEEMDAN and multilayer LSTM. Sustain (Switzerland) 12(6):1. https://doi.org/10.3390/su12062451
Liu H, Motoda H (2001) Data Reduction via Instance Selection. In: Instance Selection and Construction for Data Mining. p. 3–20. https://doi.org/10.1007/978-1-4757-3359-4_1
Markovitch S, Rosenstein D (2002) Feature generation using general constructor functions. Mach Learn 49(1):59–98. https://doi.org/10.1023/A:1014046307775
Mitra SK (2012) Is Hurst exponent value useful in forecasting financial time series? Asian Soc Sci 8(8):111–120. https://doi.org/10.5539/ass.v8n8p111
Montgomery DC, Jennings CL, Kulahci M (2015) Introduction time series analysis and forecasting. Wiley, p 671.
Munkhdalai L, Munkhdalai T, Park KH, Lee HG, Li M, Ryu KH (2019) Mixture of activation functions with extended min-max normalization for forex market prediction. IEEE Access 7:183680–183691. https://doi.org/10.1109/ACCESS.2019.2959789
Nagpure AR (2019) Prediction of multi-currency exchange rates using deep learning. Int J Innov Technol Explor Eng 8(6):316–322
Ni L, Li Y, Wang X, Zhang J, Yu J, Qi C (2019) Forecasting of forex time series data based on deep learning. Procedia Comput Sci 147:647–652. https://doi.org/10.1016/j.procs.2019.01.189
Pang S, Song L, Kasabov N (2011) Correlation-aided support vector regression for forex time series prediction. pp 1193–1203. https://doi.org/10.1007/s00521-010-0482-5
Petropoulos A, Chatzis SP, Siakoulis V, Vlachogiannakis N (2017) PT US CR. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2017.08.011
Philip AA (2011) Artificial Neural Network Model for Forecasting Foreign Exchange Rate 1(3):110–118
Preeti BR, Singh RP (2019) Financial and non-stationary time series forecasting using LSTM recurrent neural network for short and long horizon. In: 2019 10th International conference on computing, communication and networking technologies, ICCCNT 2019, 1–7. https://doi.org/10.1109/ICCCNT45670.2019.8944624
Primananda SB, Isa SM (2021) Forecasting gold price in rupiah using multivariate analysis with LSTM and GRU neural networks. Adv Sci Technol Eng Syst J 6(2):245–253. https://doi.org/10.25046/aj060227
Putra ARP, Permanasari AE, Fauziati S (2017) I forex trend prediction technique using multiple indicators and multiple pairs correlations DSS: a software design. In: Proceedings of 2016 8th International Conference on Information Technology and Electrical Engineering: Empowering Technology for Better Future, ICITEE 2016. https://doi.org/10.1109/ICITEED.2016.7863248
Putri KS, Halim S (2020) Currency movement forecasting using time series analysis and long short-term memory. Int J Ind Optim 1(2):71. https://doi.org/10.12928/ijio.v1i2.2490
Qian B, Rasheed K (2004) Hurst exponent and financial market predictability. In: Proceedings of the Second IASTED International Conference on Financial Engineering and Applications, pp 203–209
Qian B, Rasheed K (2010) Foreign exchange market prediction with multiple classifiers. J Forecast 29(3):271–284. https://doi.org/10.1002/for.1124
Qiu TYF, Yuan AYC, Chen PZ, Lee RST (2019) Hybrid Chaotic Radial Basis Function Neural Oscillatory Network (HCRBFNON) for financial forecast and trading system. In: 2019 IEEE Symposium Series on Computational Intelligence, SSCI 2019, September, 2799–2806. https://doi.org/10.1109/SSCI44817.2019.9002781
Qu Y, Zhao X (2019) Application of LSTM Neural Network in Forecasting Foreign Exchange Price. J Phys: Conf Ser 1237(4):1. https://doi.org/10.1088/1742-6596/1237/4/042036
Ramadhani IJ, Rismala R (2016) Prediction of multi currency exchange rates using correlation analysis and backpropagation. 2016 International Conference on ICT for Smart Society, ICISS 2016, July, 63–68. https://doi.org/10.1109/ICTSS.2016.7792850
Ranjit S, Shrestha S, Subedi S, Shakya S (2018) Comparison of algorithms in foreign exchange rate prediction. In: Proceedings on 2018 IEEE 3rd international conference on computing, communication and security, ICCCS 2018, December 2020, 9–13. https://doi.org/10.1109/CCCS.2018.8586826
Reddy SK, B A, (2015) Exchange rate forecasting using ARIMA, neural network and fuzzy neuron. J Stock Forex Trad 04(03):1. https://doi.org/10.4172/2168-9458.1000155
Resta M (2012) Send orders of reprints at bspsaif@emirates.net.ae Recent Patents on. In Computer Science (vol 5).
Rundo F (2019) applied sciences Deep LSTM with Reinforcement Learning Layer for Financial Trend Prediction in FX High Frequency Trading Systems
Raimundo M, Okamoto J Jr (2018) Application of Hurst Exponent (H) and the R/S Analysis in the Classification of FOREX Securities. Int J Model Optim 8(2):116–124. https://doi.org/10.7763/ijmo.2018.v8.635
Saiful Islam M, Hossain E (2020) Foreign exchange currency rate prediction using a GRU-LSTM Hybrid Network. Soft Computing Letters. https://doi.org/10.1016/j.socl.2020.100009
Samarawickrama AJP, Fernando TGI (2019) Multi-step-ahead prediction of exchange rates using artificial neural networks: a study on selected Sri Lankan foreign exchange rates. 2019 IEEE 14th International Conference on Industrial and Information Systems: Engineering for Innovations for Industry 4.0, ICIIS 2019 - Proceedings, 488–493. https://doi.org/10.1109/ICIIS47346.2019.9063310
Silva DA, Dylan M, Tiago D (2021) Forex price prediction using LSTM ’ s
Shah V, Parikh K (2018) Exploring the predictability of different asset class using exponents in multifractal analysis
Tealab A, Hefny H, Badr A (2017) Forecasting of nonlinear time series using ANN. Fut Comput Inf J 2(1):39–47. https://doi.org/10.1016/j.fcij.2017.05.001
Ulina M, Purba R, Halim A, Putri KS, Halim S (2020) Foreign exchange prediction using CEEMDAN and improved FA-LSTM. Int J Ind Optim 1(2):71. https://doi.org/10.12928/ijio.v1i2.2490
Wang H, Ma C, Zhou L (2009) A brief review of machine learning and its application. Proceedings - 2009 International Conference on Information Engineering and Computer Science, ICIECS 2009. https://doi.org/10.1109/ICIECS.2009.5362936
Weerathunga HPSD, Silva ATP (2018) DRNN-ARIMA approach to short-term trend forecasting in forex market. 2018 18th International Conference on Advances in ICT for Emerging Regions (ICTer), pp 287–293
Wei W, Li P (2019) Multi-channel LSTM with different time scales for foreign exchange rate prediction. ACM Int Conf Proc Ser. https://doi.org/10.1145/3373477.3373693
Vyklyuk Y, Darko Vuković AJ (2013) Forex prediction with neural network: Usd/Eur. Actual Problems Econ 10(10):251–261
Yıldırım DC, Toroslu IH, Fiore U (2021) Forecasting directional movement of Forex data using LSTM with technical and macroeconomic indicators. Financ Innov 7(1):1–36. https://doi.org/10.1186/s40854-020-00220-2
Yu L, Liu H (2004) Efficient feature selection via analysis of relevance and redundancy. J Mach Learn Res 5:1205–1224
Zanc R, Cioara T, Anghel I (2019) Forecasting financial markets using deep learning. In: Proceedings - 2019 IEEE 15th International Conference on Intelligent Computer Communication and Processing, ICCP 2019, September, pp 459–466. https://doi.org/10.1109/ICCP48234.2019.8959715
Zhang B (2018) Foreign exchange rates forecasting with an EMD-LSTM neural networks model. J Phys: Conf Ser 1053(1). https://doi.org/10.1088/1742-6596/1053/1/012005
Zhang K, Jiang Y, Liu D, Song H (2020) Spatio-temporal data mining for aviation delay prediction. In: 2020 IEEE 39th international performance computing and communications conference, IPCCC 2020. https://doi.org/10.1109/IPCCC50635.2020.9391561
Zhao Y, Khushi M (2020) Wavelet Denoised-ResNet CNN and LightGBM method to predict forex rate of change. In: IEEE International Conference on Data Mining Workshops, ICDMW, 385–391. https://doi.org/10.1109/ICDMW51313.2020.00060
Zhelev S, Avresky DR (2019) Using LSTM neural network for time series predictions in financial markets. 2019 IEEE 18th International Symposium on Network Computing and Applications. NCA 2019:1–5. https://doi.org/10.1109/NCA.2019.8935009
Zhou T (2020) Forex trend forecasting based on long short term memory and its variations with hybrid activation functions. https://bura.brunel.ac.uk/handle/2438/20942

Acknowledgements

I would like to thank everyone who contributed in any way, as well as all the researchers who supported my research; their papers were a great source of inspiration.

Funding

No funding was received from any source.

Author information

Authors and Affiliations

Department of Computer Science and Informatics, School of Sciences, University of Energy and Natural Resources, Sunyani, Ghana
Michael Ayitey Junior, Peter Appiahene & Obed Appiah

Authors

Michael Ayitey Junior
Peter Appiahene
Obed Appiah

Contributions

M.A.J. obtained the AUD/USD, EUR/AUD, and AUD/JPY data from FXpro and computed the Hurst exponent on AUD/USD to determine whether the series is trending. P.A. performed the Pearson’s correlation analysis to obtain the coefficients and was a major contributor in writing the manuscript. O.A. wrote the Python code implementing the two-layer stacked Long Short-Term Memory neural network. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Michael Ayitey Junior.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ayitey Junior, M., Appiahene, P. & Appiah, O. Forex market forecasting with two-layer stacked Long Short-Term Memory neural network (LSTM) and correlation analysis. Journal of Electrical Systems and Information Technology 9, 14 (2022). https://doi.org/10.1186/s43067-022-00054-1

Received: 03 February 2022
Accepted: 23 May 2022
Published: 30 June 2022
DOI: https://doi.org/10.1186/s43067-022-00054-1
