DEEP LEARNING LSTMs. Target: predict the price of an equity (not the trend)

LSTMs (Long Short-Term Memory networks, a type of recurrent neural network) may not produce accurate results when the price data does not meet all of the model's requirements. LSTMs can still be helpful, but it is important to be aware of their limitations. To improve performance, it is recommended to tune the hyperparameters (settings chosen before training, such as the number of neurons or the dropout ratio) with a systematic search evaluated by cross-validation, rather than adjusting them manually.

We will try to improve the prediction without overfitting:

  • Testing features.
  • Modifying hyperparameters: number of neurons, activation functions (sigmoid, relu, tanh, softmax) and dropout ratio.
  • Tuning hyperparameters with cross-validation (computationally expensive).
  • Stationarity tests.
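
For time-ordered price data, cross-validation is usually done with walk-forward folds so that validation data always comes after the training data. The sketch below only builds the folds and the list of candidate configurations. The function name, the grid values and the number of folds are illustrative assumptions, not settings taken from this study.

```python
from itertools import product


def walk_forward_splits(n_samples, n_folds):
    """Yield (train_indices, val_indices); validation always follows training."""
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        # The last fold absorbs the remainder of the series.
        val_end = n_samples if k == n_folds else train_end + fold_size
        yield range(0, train_end), range(train_end, val_end)


# Hypothetical search space built from the hyperparameters listed above.
param_grid = {
    "neurons": [32, 64, 128],
    "activation": ["tanh", "relu"],
    "dropout": [0.1, 0.2, 0.3],
}
candidates = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]

splits = list(walk_forward_splits(n_samples=684, n_folds=4))
for train_idx, val_idx in splits:
    print(f"train 0..{train_idx[-1]}  validate {val_idx[0]}..{val_idx[-1]}")
print(len(candidates), "candidate configurations")
```

Each candidate would be trained once per fold and ranked by its average validation loss, which is why the search is expensive: the cost grows with the number of candidates times the number of folds.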

  • We select EURUSD

    Look for abnormal returns

    We split the data into TRAIN and TEST, and then we scale it.
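
A minimal sketch of this step, assuming a chronological 70/30 split and min-max scaling fitted on the training rows only (the text does not say which scaler was used). The helper names and the synthetic two-column data are hypothetical.

```python
def chronological_split(rows, train_fraction=0.7):
    """Split time-ordered rows without shuffling: oldest rows go to train."""
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def fit_min_max(train_rows):
    """Learn per-column minimum and maximum from the training rows only."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lows, highs):
    """Scale each column with the statistics learned on the training set."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Synthetic stand-in for 977 sessions with two columns (price, volume).
rows = [[1.0 + 0.0001 * i, 1000.0 + i] for i in range(977)]

train, test = chronological_split(rows)
lows, highs = fit_min_max(train)
train_scaled = apply_min_max(train, lows, highs)
test_scaled = apply_min_max(test, lows, highs)
print(len(train), len(test))
```

Fitting the scaler on the training rows alone keeps information from the test period out of training. A side effect is that scaled test values can fall outside the [0, 1] range.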

  • Data divided into Train 70.0% / 684 sessions and Test 30.0% / 293 sessions.
  • Length: X_train 684 data, from 2020-04-19 to 2022-05-24
  • Length: X_test 293 data, from 2022-05-25 to 2023-04-19
  • Length: y_train 684 data, from 2020-04-19 to 2022-05-24
  • Length: y_test 293 data, from 2022-05-25 to 2023-04-19
  • For the train data we have created a history where each example consists of a window of consecutive sessions whose length is the timestep (for example, 20 days).
  • Dimensions of train data (input and target features): (664, 20, 5)
  • The first, 664, is the number of examples: one 20-day window for each step through the history
  • The second, 20, is the timestep: 20 days per window
  • The third, 5, is the number of columns or features: OHLC and Vol
  • We move on to the LSTM (Long Short-Term Memory) model: model training
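
Before training, the windowing described in the list above can be sketched as follows. It assumes the columns are ordered open, high, low, close, volume and that the target is the close of the session right after each window. Both are assumptions, as is the helper name.

```python
def make_windows(rows, timestep=20, target_col=3):
    """Build overlapping windows of `timestep` rows and next-session targets.

    target_col=3 assumes the column order O, H, L, C, Vol (close at index 3).
    """
    X, y = [], []
    for end in range(timestep, len(rows)):
        X.append(rows[end - timestep:end])  # the previous `timestep` sessions
        y.append(rows[end][target_col])     # close of the session after the window
    return X, y


# Synthetic stand-in for the 684 scaled training sessions with 5 features.
rows = [[float(i)] * 5 for i in range(684)]

X_train, y_train = make_windows(rows, timestep=20)
print((len(X_train), len(X_train[0]), len(X_train[0][0])))
```

With 684 training sessions and a timestep of 20, the first 20 sessions have no complete window behind them, which leaves 684 - 20 = 664 examples.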

    Epoch 1/100; 54/54 [==============================] - ETA: 0s - loss: 0.2595
    Epoch 1: val_loss improved from inf to 0.06297, saving model to LSTM_weights_best.hdf5
    54/54 [==============================] - 9s 63ms/step - loss: 0.2595 - val_loss: 0.0630
    Epoch 2/100 ; 54/54 [==============================] - ETA: 0s - loss: 0.0550
    Epoch 2: val_loss improved from 0.06297 to 0.01917, saving model to LSTM_weights_best.hdf5
    54/54 [==============================] - 2s 45ms/step - loss: 0.0550 - val_loss: 0.0192
    ...
    Epoch 99/100; 54/54 [==============================] - ETA: 0s - loss: 1.3925e-04
    Epoch 99: val_loss did not improve from 0.00007
    54/54 [==============================] - 2s 37ms/step - loss: 1.3925e-04 - val_loss: 8.2696e-05
    Epoch 100/100; 54/54 [==============================] - ETA: 0s - loss: 1.2493e-04
    Epoch 100: val_loss did not improve from 0.00007
    54/54 [==============================] - 2s 40ms/step - loss: 1.2493e-04 - val_loss: 1.0062e-04

    Model Loss

    Close price prediction

    RMSE - Unacceptable Error: The RMSE (root mean square error) of the model is 0.06.

    R2 - Unacceptable Model: The model is unable to explain the variation in the test dataset, as measured by the R-squared statistic. The model exhibits low reliability.
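
For reference, both metrics can be computed as in the sketch below. The four price pairs are made-up illustrative values, not results from this model. They show how a constant error of 0.06 on a series that moves within a narrow range gives a negative R-squared, meaning the prediction is worse than simply using the mean of the actual prices.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the prices."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """1 - SS_res / SS_tot; negative when the model is worse than the mean."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Illustrative values only: every prediction is 0.06 below the actual price.
actual = [1.10, 1.08, 1.06, 1.04]
predicted = [1.04, 1.02, 1.00, 0.98]

model_rmse = rmse(actual, predicted)
model_r2 = r_squared(actual, predicted)
print(round(model_rmse, 4), round(model_r2, 2))
```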

    Simple strategy: based on prediction and real price

    · +1 if predicted price > real price
    · -1 if predicted price <= real price


    Risk

    This strategy simply compares the predicted price with the real price, so it always holds a position (+1 or -1).
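
The rule above can be sketched in a few lines. How trades were counted in this study is not stated, so the counting convention here (the opening position plus every later change of sign) is an assumption, and the prices are made-up examples.

```python
def simple_signal(predicted, real):
    """+1 if predicted price > real price, otherwise -1 (always in the market)."""
    return [1 if p > r else -1 for p, r in zip(predicted, real)]


def count_trades(signals):
    """Count the opening position plus every later change of position."""
    if not signals:
        return 0
    flips = sum(1 for prev, cur in zip(signals, signals[1:]) if cur != prev)
    return 1 + flips


# Illustrative prices only.
predicted = [1.10, 1.11, 1.09, 1.08]
real = [1.09, 1.10, 1.10, 1.09]

signals = simple_signal(predicted, real)
n_trades = count_trades(signals)
print(signals, n_trades)
```

Because the position only changes when the sign of the difference flips, a prediction that stays on one side of the real price for long stretches produces very few trades.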

    Number of trades: 2. Here are the first:

    More complex strategy:

    Based on mean reversion after the spread moves more than "s" standard deviations above or below its mean (with s=2 the bands correspond to Bollinger Bands)
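
One possible reading of this rule is sketched below. It assumes the bands are computed on the spread (real price minus predicted price) over a rolling window of the previous 20 sessions, and that a break above the upper band is treated as a top (short) and a break below the lower band as a bottom (long). The window length, the sign convention and the example data are all assumptions.

```python
from statistics import mean, stdev


def band_signal(real, predicted, window=20, s=2.0):
    """Return -1 (short), +1 (long) or 0 (flat) for each session."""
    spread = [r - p for r, p in zip(real, predicted)]
    signals = []
    for t in range(len(spread)):
        if t < window:
            signals.append(0)  # not enough history to build the bands
            continue
        history = spread[t - window:t]  # previous sessions only, no look-ahead
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and spread[t] > mu + s * sigma:
            signals.append(-1)  # price far above prediction: possible top
        elif sigma > 0 and spread[t] < mu - s * sigma:
            signals.append(1)   # price far below prediction: possible bottom
        else:
            signals.append(0)
    return signals


# Illustrative data: a flat prediction, small noise, one spike and one dip.
predicted = [1.10] * 30
real = [1.10 + (0.001 if i % 2 == 0 else -0.001) for i in range(30)]
real[25] = 1.12
real[28] = 1.08

signals = band_signal(real, predicted)
print([(t, sig) for t, sig in enumerate(signals) if sig != 0])
```

Unlike the simple strategy, this one stays flat most of the time and only acts on extreme deviations.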

    Note: It would be interesting to run a stationarity test (not to be confused with seasonality) to check for mean reversion (we have the code; updating it should not take more than a couple of hours). This would give the mean-reversion assumption stronger statistical support and strengthen the causal argument behind the strategy.
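
As an illustration of what such a test involves (this is not the code mentioned in the note), the sketch below computes a basic Dickey-Fuller statistic by regressing the change in the series on its previous level plus a constant. It omits the lagged differences used in the augmented version, and the 5% critical value of about -2.86 is the asymptotic value for a regression with a constant. In practice a library routine such as `adfuller` from statsmodels would normally be used. The two series are synthetic.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic of gamma in: delta_y[t] = alpha + gamma * y[t-1] + error."""
    x = series[:-1]
    d = [b - a for a, b in zip(series, series[1:])]
    n = len(x)
    mean_x, mean_d = sum(x) / n, sum(d) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    gamma = sxy / sxx
    alpha = mean_d - gamma * mean_x
    ssr = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, d))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / std_err


CRITICAL_5PCT = -2.86  # asymptotic Dickey-Fuller value, constant, no trend

# Synthetic examples: a random walk and a mean-reverting AR(1) series.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(500)]
random_walk, mean_reverting = [0.0], [0.0]
for e in noise:
    random_walk.append(random_walk[-1] + e)
    mean_reverting.append(0.5 * mean_reverting[-1] + e)

stat_walk = dickey_fuller_stat(random_walk)
stat_mr = dickey_fuller_stat(mean_reverting)
print("mean-reverting series rejects a unit root:", stat_mr < CRITICAL_5PCT)
```

A statistic below the critical value rejects the unit-root hypothesis, which is the evidence of mean reversion that the note asks for.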

    Theory: The underlying theory, the basis of causality, is that the spread between the predicted and actual price exhibits mean reversion. This seems clear, although it needs to be tested. The question is whether this mean reversion of the spread is reflected in real price reversals. If it is, the strategy is using the LSTM to detect price tops and bottoms.

    Number of trades: 39. Here are the first ones: