Forex market forecasting with two-layer stacked Long Short-Term Memory neural network (LSTM) and correlation analysis

As one of the world's most significant financial markets, the foreign exchange (Forex) market has attracted a large number of investors. Accurately anticipating the Forex trend to support traders' decisions remains a popular but difficult problem, and the market's tremendous complexity always raises the question of how precise a Forex prediction can be. The rapid advancement of machine learning in recent decades has allowed artificial neural networks to be applied effectively in several areas, including the Forex market, and many research articles aimed at improving the accuracy of currency forecasting have been published as a result. The Long Short-Term Memory (LSTM) neural network, a special kind of recurrent neural network designed for sequential data such as time series, is frequently used. Due to its high learning capacity, the LSTM neural network is increasingly being used to predict Forex movements from historical data. This model can, however, be improved by stacking. The goal of this study is to select a dataset using the Hurst exponent, then use a two-layer stacked Long Short-Term Memory (TLS-LSTM) neural network to forecast the trend, and finally conduct a correlation analysis. The Hurst exponent (h) was used to determine the predictability of the Australian Dollar and United States Dollar (AUD/USD) dataset. The TLS-LSTM algorithm is presented to improve the accuracy of AUD/USD trend prediction. A correlation study was performed between AUD/USD, the Euro and the Australian Dollar (EUR/AUD), and the Australian Dollar and the Japanese Yen (AUD/JPY) to see how AUD/USD movement affects EUR/AUD and AUD/JPY. The model was compared with Single-Layer Long Short-Term Memory (SL-LSTM), Multilayer Perceptron (MLP), and Complete Ensemble Empirical Mode Decomposition with Adaptive Noise–Improved Firefly Algorithm Long Short-Term Memory (CEEMDAN-IFALSTM).
Based on the evaluation metrics Mean Square Error (MSE), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE), the proposed TLS-LSTM, whose data selection is based on a Hurst exponent (h) value of 0.6026, outperforms SL-LSTM, MLP, and CEEMDAN-IFALSTM. The correlation analysis shows both positive and negative relationships between AUD/USD, EUR/AUD, and AUD/JPY, which means that a change in AUD/USD will affect EUR/AUD and AUD/JPY to a degree that depends on the magnitude of the correlation coefficient (r).


Introduction
Forex (foreign exchange) is a worldwide, unregulated, and extremely stable market for trading currency pairs [1]. Forex trading involves speculating on the strength of one currency versus another [62]. According to the 2019 Triennial Central Bank Survey of FX and Over-the-Counter (OTC) Derivatives Markets, the liquidity of the FX market exceeds that of the stock market, with a daily turnover of $6.6 trillion [6]. It is unique among financial markets in that it is open 24 h a day, five days a week, closing only on weekends. Unlike the stock market, the Forex market is one of the most complex markets because of its high volatility, nonlinearity, and irregularity [1]. The features of Forex are distinct from those of the stock, bond, and other money markets. Because of these distinctions, Forex traders have additional trading options to benefit from. The advantages of trading in the Forex market include no commissions, no middlemen, no minimum lot size, low transaction costs, abundant liquidity, nearly instantaneous transactions, low margins or high leverage, and 24-h availability. The market is divided into four overlapping sessions that take place in separate time zones, with the following hours stated in GMT + 1: London (8 am-4 pm); New York (1 pm-9 pm); Sydney (10 pm-6 am); Tokyo (midnight-8 am) [9]. Online trading is available, insider trading is prohibited, and regulations are few [63]. The majority of research in foreign currency forecasting focuses on well-known markets [16]. The Forex market is dominated by retail traders, central banks, commercial banks, investment firms, and hedge funds. According to [33], individuals and institutions are motivated to participate in the Forex market by their financial activities and economic demands. Large investors use the FX market to manage their portfolios and hedge exchange rate risk. Retail traders mostly use the FX market to profit from short-term currency rate changes.
Fundamental and technical analysis are the two most prevalent approaches to evaluating the currency market. Fundamental analysis uses Forex news, including inflation, interest rates, and economic growth, to forecast market trends [1]. Technical analysis, on the other hand, relies on historical price charts as a record of past price action; a technical analyst predicts the future by looking at the past. The most crucial decision in the Forex market is anticipating the direction of a currency pair's movement. Correctly predicting currency fluctuations can be very profitable for traders, while incorrect predictions can be costly. According to Gonz and Herman [18], the availability of data, the rise in computing power, and the popularization of machine learning algorithms have increased the use of technical analysis in forecasting the Forex market. While many machine learning and deep learning methodologies are employed in finance, traders are constantly seeking fresh techniques to outsmart the market [16]. Financial time series are particularly noisy because their fluctuations may be influenced by factors that are difficult to measure and to define precisely enough to be used as independent variables [37]. Because of the multiple uncertainties, nonlinearities, and non-Gaussian disturbances present in Forex prices, they are particularly difficult to analyze [71]. Difficult prediction problems in a range of fields may be solved with deep neural networks [15]. A key element of their effectiveness is their capacity to extract abstract features from raw data. Saiful Islam and Hossain [53] used a hybrid of the Gated Recurrent Unit and Long Short-Term Memory (GRU-LSTM) to build a deep learning framework for predicting the Forex market.
Deep learning has proven to be a game-changing technology in a range of fields, including image, audio, and natural language processing, and it is now being used to track price movements in the financial sector. Zhao and Khushi [69] treated Forex data as a digital image, extracting visual representations with convolutional neural networks (CNN) and processing them further with a Light Gradient Boosting Machine (LightGBM).
In the Forex market, the most important decision is whether to buy or sell based on the forecasted direction. In this study, the Hurst exponent is used to assess whether the dataset is trending before it is passed to the two-layer stacked LSTM model. According to Qian and Rasheed [43], the Hurst exponent is a numerical metric used to categorize time series and hence provides a measure of predictability. The forecasting is based on a daily-timescale dataset as input, which may be useful for trend traders. Trend trading is a method of attempting to capture profits by following a currency pair's movement in a specific direction. When the currency pair is trending upward, trend traders take a long position, and when the pair is trending downward, they take a short position. In trend trading, a trader stays in the market longer in order to realize a profit. As a result, anticipating the direction of a currency pair's movement, as well as the correlation between the forecasted pair and other pairs in the market, is crucial. Correlation is a statistical term that reflects the degree and direction of a link between two random variables [36]. The correlation analysis is performed to inform the trader how to trade another currency pair based on the direction of the predicted pair. Pearson's correlation analysis is performed between the predicted pair and two other currency pairs to find the level of correlation between them. Only a tiny fraction of the world's official currencies are traded on the foreign exchange market: eight currencies, notably the US dollar, account for more than 95 percent of all global Forex operations in a single day. The United States dollar (USD), the Euro (EUR), the British pound (GBP), the Japanese yen (JPY), the Swiss franc (CHF), the New Zealand dollar (NZD), the Australian dollar (AUD), and the Canadian dollar (CAD) are the most commonly traded currencies.
Since these currencies are associated with stable politics, well-regarded central banks, and stable prices, this concentration of liquidity is understandable. The seven major pairs comprise the four majors, namely EUR/USD, USD/JPY, GBP/USD, and USD/CHF, and the three commodity pairs, AUD/USD, USD/CAD, and NZD/USD. For these liquidity reasons, FX portfolio trading frequently focuses on this group of seven major currency pairs, sometimes for more than a decade, with the addition of a couple of other currencies belonging to European Union region states [37]. According to Vyklyuk and Vuković [60], the most popular currency pair is USD/EUR, with a 28 percent share of the entire FX market, followed by USD/JPY with 14 percent and USD/GBP with 9 percent. The AUD/USD pair, popularly known as the "Aussie," shows traders how many US dollars (the quote currency) are required to buy one Australian dollar (the base currency). The other two currency pairs, EUR/AUD and AUD/JPY, were chosen for Pearson's correlation analysis against AUD/USD. The correlation was computed on the closing price of each dataset. This analysis gives a value that measures the degree of linear relationship between the pairs, that is, how a directional movement in AUD/USD will impact EUR/AUD and AUD/JPY. To make a good profit from the Forex market, we set our correlation threshold at ±0.80, which means that a change in AUD/USD will see a change in EUR/AUD and AUD/JPY. The main contributions of this study are as follows:
1. A comprehensive and detailed analysis of selecting a Forex dataset based on the Hurst exponent.
2. A conceptual framework for forecasting the Forex market using a two-layer stacked LSTM.
3. A correlation analysis between different currency pairs.
The remaining sections of the paper are structured as follows. The "Literature review" section delves deeper into time series and artificial neural networks. The "Methodology" section presents the conceptual framework, the methods, and how data were collected and analyzed. The "Analysis and Discussion" section presents the results and a detailed discussion of the outcomes. The "Conclusion and Recommendation" section presents the summary of findings and possible future studies.

Literature review
A time series is a chronological or time-oriented set of observations on a variable of interest [32]. Time series analysis is essential because it can be used to estimate future values of a quantity and is frequently used in practice; for example, a country's population growth can be projected from an appraisal of its current population [42]. Time series prediction assumes that patterns observed in the past will recur, based on the findings of the time sequence plot. Activities in the foreign exchange market can be classified as a time series. Because it operates as a conduit for international money, enabling worldwide business and investment, the foreign exchange (Forex) market has become a significant institution [17]. The Forex market is among the world's most significant financial markets. It is an investment marketplace where currencies are bought and sold at the prevailing exchange rates [8]. According to Geromichalos and Jung [17], the Forex market is an over-the-counter market with high bid-ask spreads and intermediation. The difference between the prices at which a dealer will buy and sell a currency is known as the bid-ask spread, and it varies from one broker to the next in the retail market. Using quotations from an FX dealer, [12] examined the link between order sizes and spreads in the FX market and concluded that spreads are unrelated to order sizes in the inter-dealer market but inversely related to them in the customer market.

Artificial neural network
Artificial neural networks attempt to mimic human learning abilities by modeling brain neurons with computer models. Neurons are the nerve cells that make up the cortical layer of the nervous system [49]. Based on their architecture and the interaction between neurons, neural networks are divided into two groups: feedforward networks and feedback (recurrent) networks. A feedforward network is a static network: a collection of linked neurons that represents a nonlinear function of its inputs, in which information flows only forward, from inputs to outputs [57]. Neural networks are trained to reduce the value of the loss function, which measures the overall difference between the model's output and the true label. Recurrent neural networks (RNN) are a valuable approach in deep learning. These models closely resemble how humans learn and digest information.

Long short-term memory neural network
Sepp Hochreiter and Jürgen Schmidhuber first proposed the LSTM model in 1997. It is a recurrent neural network with the capability for both long-term and short-term memory. According to Ulina et al. [58], LSTM cells typically include four layers, three of which are "gates" that let information pass through selectively. The three gates are the forget, input, and output gates. Figure 1 represents the structure of a typical LSTM [3]. In the forget gate layer, the model decides what information from prior states to keep. The model chooses which values to update at the input gate layer. Finally, the output of the cell is determined by the output gate layer. The ability of this network to facilitate sequential learning is due to its memory cell and its capability to process long-term dependencies [23].
In the LSTM cell, $\sigma$ is the activation function that takes the state from the previous step and produces a value $f_t$ between 0 and 1 to govern how much information is retained. $i_t$ is the input gate, which is combined with the candidate value $\tilde{c}_t$ produced by the tanh layer to update the state, so that the old cell state $c_{t-1}$, which represents long-term memory, can be replaced by the new one, $c_t$. The product of $o_t$ and $\tanh(c_t)$ yields $h_t$, which is used as the input for the subsequent LSTM cell.
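For reference, the gate computations described in this section are commonly written as follows. This is the standard LSTM formulation rather than a transcription of the authors' equations; the weight matrices $W$, $U$, the bias vectors $b$, and the element-wise product $\odot$ are notation assumed here.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

The fourth line shows how the forget gate scales the old cell state while the input gate scales the candidate value, which is the update the paragraph above describes in words.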
Dobrovolny et al. [13] used a Long Short-Term Memory neural network to anticipate the EUR/USD price, evaluating and proposing the best time block for prediction based on daily Forex data, and concluded that a single LSTM layer performs well.
According to Escudero et al. [14], assessing the future behavior of a currency is a primary concern for governments, financial institutions, and investors, who use this type of analysis to understand a country's economic situation and to determine when to buy or sell goods or services from that country. This sort of time series is forecasted with acceptable accuracy using a variety of models. However, achieving strong predictive performance is difficult due to the unpredictable character of these time series. Wei and Li [61] developed a Multi-Channel LSTM network for predicting foreign exchange rates using data from different time scales. Features extracted from the several channels provide complementary information, which is used to forecast foreign exchange changes.

Stacked long short-term memory networks
A stacked LSTM is a variation of the single-hidden-layer LSTM model that contains multiple hidden LSTM layers, each having many memory cells [15]. Because the numerous hidden layers make the model progressively deeper, a stacked LSTM is more accurately described as a deep learning technique. The depth of neural networks is typically credited with their success on a wide range of difficult prediction tasks, and stacked LSTMs have now been validated for difficult sequence prediction problems. This is because the stacked LSTM is a deep recurrent neural network that uses model parameters efficiently, converges quickly, and outperforms deep feedforward neural networks [16]. Figure 2 illustrates a stacked LSTM architecture with multiple LSTM layers [68].

Hurst exponent
Harold Edwin Hurst was a hydrologist who devoted nearly his entire career to reservoir control problems in Egypt [50]. He examined how the reservoir's range varied around its average level, assuming that successive inflows were random (i.e., statistically independent) and that the range would grow with time [50]. To test this against the Nile River data, Hurst created a dimensionless statistical exponent by dividing the adjusted range by the standard deviation of the observations. This method is called rescaled range analysis (R/S analysis) [50]. The Hurst exponent is a measure of fractality and long-term memory in a time series [43]. The first question to address in time series forecasting is whether the time series under consideration is predictable. If the time series is random, all approaches are likely to fail, so identifying time series with some predictability is key [43]. Since the Hurst exponent is robust and makes few assumptions about the underlying system, it is useful for time series analysis. A value of H = 0.5 indicates a random series, a value between 0.5 and 1 indicates a persistent (trending) series, and a value below 0.5 indicates an anti-persistent (mean-reverting) series. The strength of the trending tendency increases as H approaches 1.0, indicating that time series with a large H are more predictable than those with a small H. According to Raimundo and Okamoto Jr [52], the Hurst exponent is a statistic that can offer information on correlation and persistence in a time series. The Hurst exponent is not constant over time; it fluctuates [31].
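The rescaled range idea described above can be sketched in a few lines of Python. This is a minimal illustration, assuming non-overlapping windows whose length doubles at each step and no bias correction; the function names `rescaled_range` and `hurst_rs`, the window sizes, and the synthetic test series are assumptions made here, not the authors' implementation.

```python
import math
import random


def rescaled_range(chunk):
    """R/S statistic of one window: range of cumulative deviations over std."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, z = 0.0, []
    for x in chunk:
        cumulative += x - mean              # cumulative deviate series Z_t
        z.append(cumulative)
    r = max(z) - min(z)                     # range R(n)
    s = math.sqrt(sum((x - mean) ** 2 for x in chunk) / n)  # std S(n)
    return r / s if s > 0 else None         # undefined for a constant window


def hurst_rs(series, min_window=8):
    """Estimate H as the slope of log(mean R/S) against log(window length).

    Assumes the series is long enough to give at least two window sizes.
    """
    log_n, log_rs = [], []
    n = min_window
    while n <= len(series) // 2:
        chunks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        values = [v for v in map(rescaled_range, chunks) if v is not None]
        if values:
            log_n.append(math.log(n))
            log_rs.append(math.log(sum(values) / len(values)))
        n *= 2
    # Ordinary least-squares slope of log_rs on log_n.
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_rs))
    den = sum((x - mean_x) ** 2 for x in log_n)
    return num / den


# Synthetic check: independent Gaussian noise should not look strongly trending.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2048)]
h_noise = hurst_rs(noise)
```

Whether the series passed to `hurst_rs` should be raw prices or returns is a modelling choice that the text does not specify, so the sketch leaves it to the caller.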

Correlation analysis
In finance, correlation is a statistical measure of the relationship between two variables, with correlation coefficients ranging from −1 to +1. A coefficient of −1 is a perfect negative correlation, +1 is a perfect positive correlation, and zero means there is no correlation between the two variables. Some currencies tend to move in the same direction, while others do not [47]. For someone who trades many currency pairs, this is essential knowledge, because it allows the trader to hedge, diversify, or benefit from doubling a position. Pearson's correlation, Kendall rank correlation, Spearman correlation, and the point-biserial correlation are the four types of correlation commonly used in statistics [41]. The strength is estimated by comparing the positive and negative trends. Traders use longer and multiple time frames to anticipate future market moves with Matrix Connection for Indicator Setup [19].

Pearson's correlation
The Pearson's product-moment correlation coefficient is a dimensionless indicator that remains unchanged when either variable is linearly transformed. In 1895, Pearson devised the mathematical formula for this crucial metric: this or a basic algebraic variant of it is the most popular formula in introductory statistics textbooks. The numerator is centered by subtracting the mean of each variable from the raw scores, and the sum of cross-products of the centered variables is obtained. According to Rodgers and Wander [25], the denominator balances the scales of the variables so that they have the same number of units. The equation defines correlation (r) as the centered and normalized sum of the cross-products of two variables. Nagpure [34] used Pearson's correlation to calculate the correlation coefficient for the currencies USD, EUR, JPY, GBP, AUD, CAD, CHF, CNY, SEK, NZD, MXN, and INR using data from 30 to 39 years up to December 2018. The result is a matrix with the correlation coefficient for each currency pair. Currencies are linked and have an impact on one another. Currencies move in the same direction most of the time during the observed interval when they have a strong positive correlation (close to 1) and in opposite directions most of the time during the observed interval with a strong negative correlation (close to −1) [70].

Related research
Zanc et al. [66] developed a conceptual design for an Intelligent Trading System based on a stacked LSTM architecture and centered on a Financial Forecasting Module. They tested the proposed LSTM-based Forecasting Module using datasets containing the evolution of the Forex and cryptocurrency markets, and found that even when the error of the forecasted values is low, the forecast cannot help the Intelligent Trading System because it is just a shifted, filtered version of the actual price. According to [7], the Chinese Yuan (CNY) exchange rate is difficult to anticipate because of its complex coupling structure. This includes market-level coupling resulting from interactions with many financial markets, macro-level coupling resulting from interactions with economic fundamentals, and deep coupling resulting from the combination of the two. That work proposes DC-LSTM, a deep coupled Long Short-Term Memory (LSTM) technique for forecasting the USD/CNY exchange rate that captures these complicated couplings. To model the intricate connections, the approach uses a deep structure made up of stacked LSTMs. According to experimental results based on 10 years of data, the technique beats seven existing benchmarks, and a profitability discussion shows DC-LSTM to be a useful tool for making prudent investment decisions. The goal of that study is to explain why coupling learning is important for exchange rate forecasting and why a deep coupled model is effective for capturing the couplings [7]. Wei and Li [61] present a Multi-Channel LSTM network for foreign exchange rate prediction using time series data from various time scales. Features derived from the various channels provide complementary information, which is combined to forecast foreign exchange changes. To analyze the model's statistical and financial performance, they used real market data for EURUSD and EURAUD.
According to the results, the Multi-Channel LSTM model outperforms baseline models in statistical measures and provides a dependable trading strategy in terms of profitability. For a successful monetary policy, the capacity to accurately forecast exchange rates is critical. When trained on input features carefully constructed by domain experts, machine learning methods such as shallow neural networks provide better forecast accuracy than time series models [16]. Stacked Long Short-Term Memory (LSTM) is a deep recurrent neural network that uses model parameters efficiently, converges quickly, and performs better than deep feedforward neural networks.
According to Li et al. [24], deep neural network models such as convolutional neural networks (CNN), RNN, and Transformers outperform ordinary machine learning in many prediction tasks. However, deep neural network (DNN)-based financial forecasting research still has room for improvement because of the irregularity, ambiguity, and volatility of financial markets, as well as the complexity of financial time series data. The study examines 15-min timeframe (M15) historical data for the AUDUSD and EURUSD pairs to estimate future highs and lows within five timestamps. The "model stacking" method is as follows: XGBoost, Random Forest, LightGBM, LSTM, and GRU were chosen as candidate models; the model stacking approach was applied to every subset of the candidate models and the results were reviewed; and the most effective model subset was selected as the final components of the stack. According to the findings, stacked models outperform single models.
Lin et al. [28] proposed a new hybrid model that combines Multilayer Long Short-Term Memory (MLSTM) networks with complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN). It addresses the drawbacks of prior approaches. Data were reconstructed with less computation using ensemble empirical mode decomposition (EEMD). To obtain an objective evaluation, they compared the proposed strategy with numerous mainstream approaches and other hybrid models. According to the findings, CEEMDAN-MLSTM beats many well-known and accepted models in a range of test evaluations. The MLSTM model can learn more complex dependencies from exchange rate data than the standard model, and it can also learn temporal sequences. Many tests have been carried out to assess the proposed system's performance (Fig. 3).

Methodology
The methodology seeks to ensure the selection of a good dataset using the Hurst exponent, the efficiency of the two-layer stacked Long Short-Term Memory neural network in forecasting the trend, and a correlation analysis of the linear relationship between currency pairs, as described in the study objectives. The methodology is an adaptation of a traditional machine learning problem-solving approach and follows the proposed conceptual framework. The basic goal of this framework is to satisfy all of the implementation needs of the proposed work and to analyze the relevance, subject matter, and context of the problem in order to understand it better. The extensive literature review identified three gaps. First, many researchers chose their datasets arbitrarily, without any clear reason. Second, LSTM has been extensively used in Forex forecasting, but its performance could be increased by adding another layer to the single layer. Third, movement in one currency affects another just as a change in one economy affects another, hence the need to examine how a change in one currency pair affects another. Numerous modern machine learning platforms allow researchers to test and run various models on diverse quantities of data. One of these computing platforms with strong processing capabilities was chosen for the study after careful analysis of the research objectives. Platforms with extensive machine learning and data science libraries, as well as significant online computing power, should be given special consideration. Because machine learning implementations are usually computationally intensive, an online platform that connects to high-performance computing resources is required.

Data collection
A fundamental problem in the machine learning research paradigm is obtaining and using correct and clean data for both the training and testing phases. Null columns or entries, as well as other noisy characteristics, are trimmed out at this stage to make the data cleaner. Several data cleaning processes are used depending on the scenario; see [5] for more on how to cope with missing data. According to the scientific community, these tactics have both benefits and drawbacks. Instance selection, for example, may be used to deal with noise and to overcome the difficulties of learning from large datasets. In essence, instance selection is a problem of optimizing mining quality while reducing sample size [29]. A variety of other approaches may be investigated, including feature subset selection, a method for finding and removing duplicate and unnecessary features [64]. It reduces the dimensionality of the dataset, allowing faster and more efficient processing. When many features are strongly dependent on one another, feature transformation or construction can be used to address the problem. The newly created features may lead to more concise and accurate information [30]. Autoencoders or label encoders could be used as part of data preparation to convert some features into categorical data formats that the models can readily use. We used the daily timeframe to provide a more trend-oriented view of the market, without any technical indicators, for a more reliable view of price movement. Our research employed daily (1D) exchange rate datasets for AUDUSD, EURAUD, and AUDJPY collected from FXpro (broker) through the MetaTrader 4 trading platform, which according to [19] is the most extensively used trading platform among brokers. The forecasting is done with AUDUSD. For the Pearson's correlation analysis, the close price of each dataset is used.
Each dataset has 1772 records with a period spanning from 2013/05/02 to 2020/03/01.

Features of the dataset
1. Date: the date on which the price action happened in the market, in this case over a 24-h period.
2. Open: the starting price for the period. Since it is the consensus price after all interested parties have had time to "sleep on it," the open is especially important when reviewing daily data.
3. Close: the price of the last trade made within the period. This is the price most often used in analysis. Most specialists regard the relationship between the open and close prices as noteworthy.
4. High: the asset's highest price during the period. At this point there were more sellers than buyers.
5. Low: the asset's lowest price during the period. At this point there were more buyers than sellers.
6. Volume: the total quantity of the asset traded within the period. The relationship between price and volume is important; for example, price increases are accompanied by increases in volume [62].
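As an illustration of how records with the fields listed above might be read and cleaned, the sketch below parses a few rows in a comma-separated layout (date, time, open, high, low, close, volume) and drops incomplete rows. The layout, the sample values, and the function name `load_daily_bars` are assumptions made for illustration; the export format actually used in the study is not specified.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows; the prices and volumes are made up.
# The third row is missing its "high" value on purpose.
SAMPLE = """\
2013.05.02,00:00,1.02650,1.02810,1.02210,1.02400,41250
2013.05.03,00:00,1.02400,1.03240,1.02190,1.03110,39873
2013.05.06,00:00,1.03110,,1.02200,1.02560,35510
2013.05.07,00:00,1.02560,1.02620,1.01480,1.01760,44102
"""

FIELDS = ["date", "time", "open", "high", "low", "close", "volume"]


def load_daily_bars(text):
    """Parse rows into dicts, dropping any row with a missing field."""
    bars = []
    for row in csv.DictReader(io.StringIO(text), fieldnames=FIELDS):
        if any(row[name] in (None, "") for name in FIELDS):
            continue  # skip incomplete records instead of imputing them
        bars.append({
            "date": datetime.strptime(row["date"], "%Y.%m.%d").date(),
            "open": float(row["open"]),
            "high": float(row["high"]),
            "low": float(row["low"]),
            "close": float(row["close"]),
            "volume": int(row["volume"]),
        })
    return bars


bars = load_daily_bars(SAMPLE)
closes = [bar["close"] for bar in bars]  # the feature used for correlation
```

Dropping incomplete rows is only one of the cleaning options mentioned in the data collection discussion; imputing the missing value would be an alternative.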

Hurst exponent
The Hurst exponent is a statistic that may be used to detect if a time series is correlated and persistent [52]. Its value is used to examine whether or not financial time series are predictable [31]. Generally Hurst exponent has a mathematical output ranging from 0 to 1. According to literature, calculation of Hurst exponent of the time series can be in various ways, such as the range to standard deviation ratio which was adapted in this current study. The approach works by examining the average rescaled range of cumulative departures from a series' mean values [31]. The procedure for computing the R/S statistic of financial data series is as follows: In terms of the rescaled range, the Hurst exponent, H, is defined as follows: where the rescaled range is E R(n) S(n) . The anticipated value is E[x] . The time of the last observation (e.g., X n in the input time series data) is represented by n . C represents a constant.
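The Hurst exponent is a measure of autocorrelation (persistence and long memory). For a time series $X = X_1, X_2, \ldots, X_n$, the rescaled range is determined as follows:

1. Calculate the mean: $m = \frac{1}{n}\sum_{i=1}^{n} X_i$.
2. Create a mean-adjusted series: $Y_t = X_t - m$, for $t = 1, 2, \ldots, n$.
3. Calculate the cumulative deviate series $Z$: $Z_t = \sum_{i=1}^{t} Y_i$, for $t = 1, 2, \ldots, n$.
4. Create a range series $R$: $R_t = \max(Z_1, \ldots, Z_t) - \min(Z_1, \ldots, Z_t)$.
5. Create a standard deviation series $S$: $S_t = \sqrt{\frac{1}{t}\sum_{i=1}^{t}(X_i - u)^2}$, where $u$ is the mean of $X_1, \ldots, X_t$.
6. Determine the rescaled range series (R/S): $(R/S)_t = R_t / S_t$.
7. The rescaled range scales with time as a power law: $(R/S)_t = c\,t^{H}$.
8. $H$ is obtained by fitting a straight line to the logarithms of both sides; the slope of the line is the Hurst exponent estimate.
9. The approach above is known to yield a biased estimate of the power-law exponent: for small data sets the slope departs from 0.5 even for white noise. According to Anis and Lloyd, the theoretical value of the R/S statistic for white noise is

$$E\left[\frac{R(n)}{S(n)}\right] = \begin{cases} \dfrac{\Gamma\left(\frac{n-1}{2}\right)}{\sqrt{\pi}\,\Gamma\left(\frac{n}{2}\right)} \displaystyle\sum_{i=1}^{n-1}\sqrt{\frac{n-i}{i}}, & n \le 340, \\[2ex] \dfrac{1}{\sqrt{n\pi/2}} \displaystyle\sum_{i=1}^{n-1}\sqrt{\frac{n-i}{i}}, & n > 340, \end{cases}$$

where $\Gamma$ is the Euler Gamma function.
10. While no asymptotic distribution theory has yet been derived for most Hurst exponent estimators, an approximate functional form is available for the Anis-Lloyd-corrected R/S estimate. For a 95 percent confidence interval, the bounds are as follows:
11. Lower bound: $0.5 - \exp(-7.33\,\log(\log M) + 4.21)$.
12. Upper bound: $\exp(-7.20\,\log(\log M) + 4.04) + 0.5$, where $M = \log_2 N$ and $N$ is the series length.
13. Lastly, the Anis-Lloyd-corrected R/S Hurst exponent is calculated as 0.5 plus the slope of $R(n)/S(n) - E[R(n)/S(n)]$ (Hurst et al. [21]; Shah and Parikh [56]).

Two-layer stacked LSTM
As shown in Fig. 4, the LSTM network architecture [24] used for the numerical research consists of one input layer, two LSTM layers, and one output layer. The output layer has one neuron, whereas the first and second LSTM layers each have 128 LSTM cells. The anticipated exchange rate $x_k$ is determined by $h_{k-1}$. At step $k$, the LSTM cell computes

$$F_k = \sigma\left(W_F\,[h_{k-1}, x_k] + b_F\right),$$
$$I_k = \sigma\left(W_I\,[h_{k-1}, x_k] + b_I\right),$$
$$\tilde{c}_k = \tanh\left(W_c\,[h_{k-1}, x_k] + b_c\right),$$
$$c_k = F_k \odot c_{k-1} + I_k \odot \tilde{c}_k,$$
$$O_k = \sigma\left(W_O\,[h_{k-1}, x_k] + b_O\right),$$
$$h_k = O_k \odot \tanh(c_k),$$

where $W$ and $b$ denote the weight matrices and bias vectors of each gate, $[h_{k-1}, x_k]$ is the concatenation of the previous hidden state and the current input, and $\sigma$ is the sigmoid activation function. The forget gate $F_k$ examines the state from the previous step and outputs values between 0 and 1 that govern how much information is retained. $I_k$ is the input gate, which is combined with the candidate value $\tilde{c}_k$ produced by the tanh layer to update the state, so that the old cell state $c_{k-1}$, which represents long-term memory, is replaced by the new one, $c_k$. The product of $O_k$ and $\tanh(c_k)$ yields $h_k$, which is used as the input for the subsequent LSTM cell.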
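To make the gate description concrete, here is a minimal NumPy sketch of the forward pass through two stacked LSTM layers of 128 cells followed by a single output neuron. It follows the standard LSTM formulation with sigmoid gates and tanh state activations; the random weights, the window of 10 steps, the single input feature, and the bias-free output neuron are placeholder assumptions, not the trained model from the paper.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_layer(input_dim, units, rng):
    """Random placeholder weights for one LSTM layer (four gates stacked row-wise)."""
    W = rng.normal(0.0, 0.1, size=(4 * units, units + input_dim))
    b = np.zeros(4 * units)
    return W, b

def lstm_layer(xs, W, b, units):
    """Run one LSTM layer over a sequence xs of shape (steps, input_dim)."""
    h = np.zeros(units)   # hidden state h_{k-1}
    c = np.zeros(units)   # cell state c_{k-1} (long-term memory)
    hs = []
    for x in xs:
        z = W @ np.concatenate([h, x]) + b
        f = sigmoid(z[:units])                 # forget gate F_k
        i = sigmoid(z[units:2 * units])        # input gate I_k
        g = np.tanh(z[2 * units:3 * units])    # candidate cell state
        o = sigmoid(z[3 * units:])             # output gate O_k
        c = f * c + i * g                      # updated cell state c_k
        h = o * np.tanh(c)                     # new hidden state h_k
        hs.append(h)
    return np.array(hs)

rng = np.random.default_rng(0)
units = 128
window = rng.random((10, 1))        # hypothetical: 10 time steps, 1 scaled feature
W1, b1 = init_layer(1, units, rng)
W2, b2 = init_layer(units, units, rng)
w_out = rng.normal(0.0, 0.1, size=units)

h1 = lstm_layer(window, W1, b1, units)   # first layer returns the full sequence
h2 = lstm_layer(h1, W2, b2, units)       # second layer consumes that sequence
prediction = float(w_out @ h2[-1])       # single output neuron on the last step
```

Stacking works because the first layer passes its hidden state at every time step to the second layer, which then treats that sequence as its input.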

Pearson's correlation
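Pearson's correlation coefficient $r$ is calculated as follows:

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}},$$

where $\bar{x}$ and $\bar{y}$ are the means of the $x$ and $y$ values. The 'Close' feature of each of the AUDUSD, EURAUD, and AUDJPY datasets was extracted and used to calculate the correlation value between the $x$ and $y$ values of each pair.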
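A minimal sketch of computing Pearson's $r$ for two equally long series of closing prices is given here. The function name and the sample values are made up for illustration and are not taken from the datasets.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)

# Made-up closing prices for two pairs over the same five days.
close_a = [0.70, 0.71, 0.72, 0.71, 0.73]
close_b = [75.0, 75.8, 76.9, 76.1, 77.5]
r = pearson_r(close_a, close_b)
```

The two series must be aligned on the same dates before the coefficient is computed, otherwise the pairing of $x_i$ with $y_i$ is meaningless.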

Decision making
The resulting correlation coefficient (r) is used to advise a potential trader on whether to trade EUR/AUD and AUD/JPY, depending on the degree of their relation to AUD/USD. For high performance and a good return, we encourage traders to trade the correlated pair only if the correlation coefficient (r) is ±0.8 or stronger, as stated in Fig. 3.
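The decision rule can be sketched as a small function. The function name and the wording of the returned advice are hypothetical; only the ±0.8 threshold comes from the text.

```python
def advise(r, threshold=0.8):
    """Turn a correlation coefficient into a coarse trading hint."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    if abs(r) < threshold:
        return "do not trade on this correlation"
    if r > 0:
        return "trade the second pair in the same direction"
    return "trade the second pair in the opposite direction"

# Hypothetical coefficients, for illustration only.
for value in (0.85, -0.90, 0.50):
    print(value, "->", advise(value))
```

Comparing the magnitude `abs(r)` against the threshold treats strong negative and strong positive correlations symmetrically, which matches the ± in the rule.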

Evaluation metrics
Evaluation metrics are needed to judge whether a model is adequate. A survey of the available metrics helps to identify those that suit a particular dataset or problem. Regression is one of the most common machine learning problems, and several metrics can be used to evaluate its predictions. Some of them are:

Mean Square Error
In statistics, the Mean Square Error (MSE), or Mean Squared Deviation (MSD), of an estimator (a procedure for estimating an unobserved quantity) measures the average of the squared errors, that is, the average squared difference between the estimated values and the actual values. MSE is a risk function corresponding to the expected value of the squared error loss. Because of randomness, or because the estimator does not account for information that could yield a more precise estimate, MSE is almost always strictly positive (rather than zero). Mathematically, it is as follows:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

Root Mean Square Error
The Root Mean Square Error (RMSE) is the standard deviation of the residuals (prediction errors). Residuals indicate how far the data points lie from the regression line, whereas the RMSE indicates how spread out these residuals are. In other words, it shows how concentrated the data are around the line of best fit. The Root Mean Square Error is commonly used to analyze experimental results in climatology, forecasting, and regression research. It may be stated quantitatively as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

Mean Absolute Error
The Mean Absolute Error (MAE) is a statistic for evaluating regression models. The prediction error is the difference between the real and predicted values for each instance, and the MAE of a model on a test set is the average of the absolute values of these errors over all instances in the test set. It may be expressed as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

where $y_i$ and $\hat{y}_i$ are the real and anticipated values of the data, respectively, and $n$ is the sample size [24].
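The three metrics can be sketched in a few lines of plain Python. The function names and the sample values are illustrative and are not results from the study.

```python
import math

def mse(y_true, y_pred):
    """Mean of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of the MSE, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean of the absolute prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up values: the errors are -0.5, 0.0, 0.5 and -1.0.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 5.0]

print(mse(y_true, y_pred))    # 0.375
print(rmse(y_true, y_pred))   # about 0.612
print(mae(y_true, y_pred))    # 0.5
```

Because the errors are squared before averaging, MSE and RMSE penalize the single large error (1.0) more heavily than MAE does.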

Analysis and discussion
This section implements and tests the method described above to obtain the desired results. The data-cleaning procedure was tuned with the characteristics of the AUD/USD dataset in mind, so that the cleaned data are well suited to the underlying approach. Google Colab, Google's free online machine learning platform, is used for a range of purposes, including research that requires considerable computing capacity and many machine learning libraries. It allows machine learning practitioners to write and run Python programs on Google's cloud servers with little setup. With features such as simple sharing and free access to online Graphics Processing Units (GPUs), this platform is an excellent fit for our project. It also comes with the major machine learning libraries, which made the decision even easier.
Of the dataset, 75% was randomly chosen for training, with the remaining 25% designated for testing. Furthermore, feature scaling is used to bring all features to the same order of magnitude, which means that the data are rescaled to the range 0 to 1.
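A minimal sketch of a random 75/25 split over record indices follows. The record count of 1772 is the figure reported for the dataset; the seed and the variable names are arbitrary choices for this example, and the source does not say how the split interacts with any windowing of the series.

```python
import numpy as np

n_records = 1772                      # dataset size reported in the paper
rng = np.random.default_rng(42)       # arbitrary seed, for reproducibility only
indices = rng.permutation(n_records)  # random ordering of the record indices

n_train = int(0.75 * n_records)       # 1329 training records
train_idx = np.sort(indices[:n_train])
test_idx = np.sort(indices[n_train:])

print(len(train_idx), len(test_idx))  # 1329 443
```

Sorting the selected indices keeps each subset in chronological order, which matters when sequences are later built from consecutive records.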

Data scaling: MinMaxScaler
MinMaxScaler scales all data features to the range [0, 1] by default; a different target range, such as [−1, 1], can also be specified. When the data contain large outliers, this scaling can compress the inliers into a narrow range such as [0, 0.005]. Missing data were handled by interpolation, and min-max scaling was then used to scale the data. Each feature is transformed by scaling it to a given range: every feature is scaled and translated individually so that it lies in the given range on the training dataset [66].
The conversion is given by:

$$X_{\text{std}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}, \qquad X_{\text{scaled}} = X_{\text{std}}\,(\max - \min) + \min,$$

where $X_{\min}$ and $X_{\max}$ are the minimum and maximum of the feature and $[\min, \max]$ is the target range. This conversion is often used as an alternative to zero-mean, unit-variance scaling. Because smaller numbers are easier for backpropagation to handle, the data were scaled between 0 and 1.
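The min-max conversion can be sketched directly in NumPy. This is an illustrative re-implementation of the formula rather than the library class itself; the function names and the sample values are assumptions. The minimum and maximum are taken from the training data only and then reused for the test data.

```python
import numpy as np

def fit_min_max(train):
    """Per-feature minimum and maximum, computed on the training data only."""
    train = np.asarray(train, dtype=float)
    return train.min(axis=0), train.max(axis=0)

def min_max_transform(x, data_min, data_max, feature_range=(0.0, 1.0)):
    """Scale each feature into feature_range using the fitted min and max."""
    lo, hi = feature_range
    # Note: a constant feature (max == min) would divide by zero here.
    x_std = (np.asarray(x, dtype=float) - data_min) / (data_max - data_min)
    return x_std * (hi - lo) + lo

# Made-up values: two features, three training rows, one test row.
train = np.array([[0.70, 75.0], [0.72, 77.0], [0.76, 80.0]])
test = np.array([[0.73, 78.5]])

data_min, data_max = fit_min_max(train)
train_scaled = min_max_transform(train, data_min, data_max)
test_scaled = min_max_transform(test, data_min, data_max)   # approx. [[0.5, 0.7]]
```

Test values that fall outside the training range will map outside [0, 1], which is expected with this scheme.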

Hurst exponent
Applying the Hurst exponent to our dataset gave us assurance that it was suitable for forecasting (Fig. 5).
The Hurst exponent, H, has values ranging from 0 to 1, and a time series can be categorized into one of three groups based on its value: 1. A random series is indicated by H = 0.5, labelled 'gbm' (geometric Brownian motion); 2. An anti-persistent series is indicated by 0 < H < 0.5, labelled 'mean_rev'; 3. A persistent series is indicated by 0.5 < H < 1, labelled 'trending'. The 'mean reversion' property of an anti-persistent series suggests that an upward trend is likely to be followed by a downward trend, and vice versa. The strength of the mean reversion grows as H approaches 0.0. A persistent series, on the other hand, is trend reinforcing, meaning that the direction of future values (up or down) will most likely be the same as that of the current values [44].
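The three-way categorization can be sketched as a small helper. The function name and the tolerance around 0.5 are assumptions made for this example, since an estimated H will almost never equal 0.5 exactly; the labels are the ones used in the text.

```python
def classify_hurst(h, tol=0.01):
    """Map a Hurst exponent to one of the three regimes described above."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("H is expected to lie in [0, 1]")
    if abs(h - 0.5) <= tol:
        return "gbm"          # random series
    if h < 0.5:
        return "mean_rev"     # anti-persistent series
    return "trending"         # persistent series

print(classify_hurst(0.6026))   # trending
```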
From the graph, the slope gives the Hurst exponent value. For our dataset of 1772 records and five features, we obtained a Hurst exponent of 0.6026, indicating that the dataset is trending and suitable for forecasting in the forex market. The Closing price, which is the final price of the day, was used as the target feature (the dependent variable), and the trend is determined based on it (Fig. 6).

Two-layer stacked LSTM
The first LSTM layer of the stack has 128 hidden units and a rectified linear unit ('relu') activation function, and the second LSTM layer likewise has 128 units and a 'relu' activation. Activation functions add nonlinearity to the model, allowing deep learning models to learn nonlinear prediction boundaries. One disadvantage of neural networks is their susceptibility to overfitting, which regularization helps to prevent. Two basic techniques are dropout and early stopping; we used a dropout rate of 0.2 during model training. The number of epochs, that is, the number of times the full training dataset is presented to the neural network, was set to 32. The batch size, that is, the number of training examples in one batch, was set to 100. Optimizers are methods for reducing the loss by adjusting the parameters of a neural network, such as the weights and learning rate, and are employed to solve the optimization problem by minimizing the loss function. We chose the Adam optimizer for our model because it has a lower training cost.
Figure 7 plots the true trend against the predicted trend. Both indicate a downward trend of AUDUSD, meaning that a trader can take a sell position and, if the price continues to fall, gain from the market. Our proposed two-layer LSTM is compared with the baseline models based on the evaluation metrics.
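As a rough sizing sketch, the stated hyperparameters imply the following parameter and update counts. The parameter formula is the one for a standard LSTM layer (four gates, each with input weights, recurrent weights, and a bias); the single input feature and the assumption that all 1329 training records become training samples are illustrative guesses, not figures from the paper.

```python
import math

def lstm_params(input_dim, units):
    """Trainable parameters of a standard LSTM layer."""
    return 4 * (units * (input_dim + units) + units)

units = 128
n_features = 1                              # assumption: one input feature per step

layer1 = lstm_params(n_features, units)     # 66560
layer2 = lstm_params(units, units)          # 131584
dense = units + 1                           # 129: one output neuron with a bias
total_params = layer1 + layer2 + dense      # 198273

n_train = int(0.75 * 1772)                  # 1329 records before any windowing
batch_size = 100
epochs = 32
steps_per_epoch = math.ceil(n_train / batch_size)   # 14 batches per epoch
total_updates = steps_per_epoch * epochs            # 448 weight updates
```

The second layer is about twice as large as the first because its input is the 128-dimensional hidden state of the first layer rather than the raw feature.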
From the values of MSE, RMSE, and MAE in Table 1, we can conclude that our suggested model outperforms the baseline models of MLP, LSTM, and CEEMDAN-IFA-LSTM. The first section of the table compares results on our chosen dataset, AUD/USD from April 1, 2013 to December 30, 2020, against the baseline MLP and LSTM models; the bolded results clearly illustrate that the proposed model beats the baseline models. In Table 1, the proposed model is also compared with the work of Ulina et al. [58].

Correlation analysis
Table 2 shows the results of Pearson's correlation analysis between the Closing prices of the datasets AUDJPY, AUDUSD, and EURAUD. From the table, the r value (see Fig. 3) between AUDJPY and AUDUSD is 0.798979, indicating a positive relationship between these two pairs: an uptrend in AUDUSD is expected to coincide with an uptrend in AUDJPY, and a downtrend in AUDUSD with a downtrend in AUDJPY. Also from the table, the r value between AUDUSD and EURAUD is −0.639265, showing that the relationship between these pairs is negative, which means that a rise in AUDUSD is expected to coincide with a fall in EURAUD. For a positive correlation, the closer the value is to 1, the stronger the positive relationship; for a negative correlation, the closer the value is to −1, the stronger the negative relationship. A negative correlation indicates that when selling one pair a trader should buy the other. A positive correlation implies that when selling one pair a trader should also sell the other, and when buying one pair should also buy the other. Since we set the threshold for the magnitude of the correlation value at ±0.80, we would not advise a trader who is trading AUDUSD to also trade AUDJPY or EURAUD on the basis of this correlation, because neither correlation value met the threshold; we are interested in a high correlation that increases the chances of profiting from the market.

Conclusions
The paper offered a conceptual framework centered on a Forex forecasting module built using the Hurst exponent, a two-layer stacked LSTM architecture, and correlation analysis. Using the AUDUSD dataset, we assessed the proposed framework-based forecasting module and ran a Pearson's correlation study with AUD/USD, EUR/AUD, and AUD/JPY. Because of the considerable income and economic benefits it brings, Forex forecasting is an appealing research field, and academics are interested in it because it is a difficult time series problem. Numerous studies, using both statistical and machine learning approaches, have been undertaken on currency forecasting. First, a literature review on currency forecasting was undertaken in this paper. The suitability of the proposed model for forex market time series was then explored against SL-LSTM, MLP, and CEEMDAN-IFA-LSTM. The module outperformed MLP, single-layer LSTM, and CEEMDAN-IFA-LSTM in terms of forgetting, remembering, and updating information. The suggested framework is well suited to learning from experience to classify, analyze, and predict time series with long, unpredictable time lags between critical events. Time series prediction studies patterns that evolve over time, and the appropriate response at a given moment depends not only on the present value of the observable but also on its past values. The outcomes of the study are encouraging: the proposed model beats the standard single-layer LSTM, MLP, and CEEMDAN-IFA-LSTM by reducing the Mean Square Error, Root Mean Square Error, and Mean Absolute Error. This research shows that using the Hurst exponent to select the dataset and adding another layer to the LSTM can outperform other models. The Hurst exponent is a widely used statistic in financial time series analysis. It indicated that our dataset was trending and suitable for prediction, and hence that it could be trusted when making key decisions. Correlation is used to examine the linear relationship between instruments. Highly correlated currency pairs tend to move together, which can be used to advise traders to trade one currency pair based on what is happening with the other.

Recommendations
The following recommendations were made based on the findings of the study in order to further improve performance.