Predicting how the stock market will perform is one of the hardest things to do. But what about using AI for it?
This article focuses on the difference between the real problem statement and the problem statement that is actually being solved.
Here, we predict NIFTY prices and then measure the returns that trading on those predictions would have produced. We find that although the price predictions look good, they do not translate into good returns. The real problem statement is the returns; the problem statement being solved is price prediction.
NIFTY price data from 2014 to 2017 is used for training, and 2018 data is used for testing. To predict the close price for each day, we use the closing prices of the previous 60 days. The prices are scaled using min-max scaling. Our train data is therefore a 3-D array with 917 samples (the batch dimension), 60 time steps, and 1 input dimension. We then train an LSTM model with the following architecture to predict the prices:
To measure model performance, we rescale both the train and test predictions back to the original price scale and compare them with the actual close prices. Since this is a regression problem, we look at the RMSE and MAPE metrics.
Below are the train and test metrics, along with plots of the predictions against the actual prices. Judging by these metrics and plots alone, the predictions look useful for trading.
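As a minimal sketch of the data preparation described above (not of the model architecture itself), the windowing and scaling might look like this. The function name `make_windows` and the synthetic price series are assumptions for illustration; the actual NIFTY data is not reproduced here.

```python
import numpy as np

def make_windows(close, lookback=60):
    """Min-max scale a 1-D price series and cut it into (samples, lookback, 1) windows."""
    close = np.asarray(close, dtype=float)
    lo, hi = close.min(), close.max()
    scaled = (close - lo) / (hi - lo)  # min-max scaling to [0, 1]
    # Each sample holds the previous `lookback` scaled closes.
    X = np.stack([scaled[i - lookback:i] for i in range(lookback, len(scaled))])
    y = scaled[lookback:]  # target: the close that follows each window
    return X[..., np.newaxis], y, (lo, hi)

# Synthetic prices stand in for the NIFTY closes (assumption, not real data).
rng = np.random.default_rng(0)
prices = 8000 + np.cumsum(rng.normal(0, 50, size=977))
X_train, y_train, (lo, hi) = make_windows(prices)
print(X_train.shape)  # (917, 60, 1)
```

A series of 977 closes yields 977 - 60 = 917 windows, which matches the shape quoted above. In practice the scaling bounds would normally be taken from the training period only and reused on the test period.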
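The rescaling and the two metrics can be sketched as follows. The helper names and the toy numbers are made up for illustration and are not the article's results.

```python
import numpy as np

def inverse_min_max(scaled, lo, hi):
    """Undo min-max scaling so predictions are back in price units."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the prices."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

# Toy example: three actual closes and three scaled predictions (made-up values).
actual = np.array([100.0, 110.0, 120.0])
pred_scaled = np.array([0.1, 0.4, 1.0])
predicted = inverse_min_max(pred_scaled, lo=100.0, hi=120.0)
print(rmse(actual, predicted), mape(actual, predicted))
```

RMSE is expressed in price units, while MAPE is relative to the price level, so a small MAPE on an index trading in the thousands can still hide errors larger than a typical daily move.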
Generating Signals to Trade:
Now that we have the predictions, let us put them to the test and find the returns we would have made by trading on them. To trade, we need signals that tell us whether to buy or sell.
We generate signals from today’s close price and tomorrow’s predicted price. We can use them to calculate the returns and hence obtain the equity curve of the model. The equity curve is simply the value of the initial investment as it changes with the daily returns. It shows us the value of the investment on each day.
We have two kinds of signals: 1 to stay on the upside and -1 to stay on the downside.
Whenever our upside or downside signal matches the actual move, we make a positive return. This assumes that we can make money in either direction, as long as the direction is predicted correctly.
A sample of the data frame of signals and the equity curve is given below:
The equity curves of both the train and test predictions show that we ended up with negative returns.
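One way the signal and equity-curve logic above could be written is sketched below. The compounding of daily returns, the absence of transaction costs, and the treatment of ties as a downside signal are assumptions of this sketch, not details given in the article.

```python
import numpy as np

def signals_and_equity(close, predicted_next, initial=1.0):
    """close[t] is today's close; predicted_next[t] is the prediction for day t+1."""
    close = np.asarray(close, dtype=float)
    predicted_next = np.asarray(predicted_next, dtype=float)
    # +1 if the model expects tomorrow's close above today's, otherwise -1.
    signal = np.where(predicted_next[:-1] > close[:-1], 1, -1)
    actual_ret = close[1:] / close[:-1] - 1   # realized next-day return
    strategy_ret = signal * actual_ret        # positive when the direction was right
    equity = initial * np.cumprod(1 + strategy_ret)
    return signal, strategy_ret, equity

# Toy prices and predictions (made-up values); the last prediction is unused.
close = [100.0, 102.0, 101.0, 103.0]
predicted_next = [101.0, 103.0, 100.0, 0.0]
signal, strategy_ret, equity = signals_and_equity(close, predicted_next)
print(signal, equity)
```

In this toy case the second and third signals are wrong, so the equity curve rises on the first day and then falls below the initial investment.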
Also shown below are the returns from the model.
If we look closely, the problem is not just the negative returns but also the uncertainty of not knowing how much we would gain or lose each day. To quantify how bad this uncertainty was, we use a measure called the Sharpe Ratio, which is the mean of the returns divided by their standard deviation. The full Sharpe Ratio also subtracts a risk-free rate from the returns; we use the simplified version for this experiment.
To understand the Sharpe Ratio, consider fixed deposits. Since the returns are fixed for the given period, there is no uncertainty: the standard deviation is zero and the simplified Sharpe Ratio is infinite. The higher the Sharpe Ratio, the better the model. Usually, a model with a Sharpe Ratio of 1 is considered good. So, once the returns are satisfactory, we also check their Sharpe Ratio.
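A minimal sketch of the simplified Sharpe Ratio follows, with no risk-free rate as described above. The function name is hypothetical, and no annualization is applied because the article does not say whether one was used.

```python
import numpy as np

def simple_sharpe(daily_returns):
    """Mean of the returns divided by their standard deviation (no risk-free rate)."""
    r = np.asarray(daily_returns, dtype=float)
    if r.max() == r.min():
        # No variation at all, as with a fixed deposit.
        return float("inf") if r.mean() > 0 else float("nan")
    return float(r.mean() / r.std(ddof=1))

fixed_deposit = [0.0002] * 250            # same return every day (made-up rate)
strategy = [0.01, -0.01, 0.03, 0.01]      # made-up daily strategy returns
print(simple_sharpe(fixed_deposit), simple_sharpe(strategy))
```

The fixed-deposit series gives an infinite ratio, while the volatile series gives a finite one even though its mean return is higher.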
Below are the statistical and business metrics of the model:
We can conclude that although the model metrics are good, they did not translate into the required business results.
To understand what was missing from the model-building part, we ran two further experiments. The first one assumes that tomorrow’s close price is simply today’s close price.
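The price-error side of this baseline can be sketched as below. How the article turned this baseline into trading signals is not stated, so the sketch stops at the forecast and its MAPE; the function name and the toy prices are assumptions.

```python
import numpy as np

def persistence_forecast(close):
    """Predict each day's close as the previous day's close."""
    close = np.asarray(close, dtype=float)
    actual = close[1:]
    predicted = close[:-1]
    return actual, predicted

close = [100.0, 102.0, 101.0, 103.0]  # made-up closes
actual, predicted = persistence_forecast(close)
mape = float(np.mean(np.abs((actual - predicted) / actual)) * 100)
print(mape)
```

Because prices move only a little from one day to the next, this baseline gets a low percentage error without learning anything from the data.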
Below are the Test Predictions vs Actual Prices plot, the equity curve plot, the business results, and the model results.
All aspects seem to perform well. However, this held only for this period and for NIFTY; when tested on other periods and other stocks, the results were not as good.
Let’s do the next experiment of generating the signals at random.
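A random-signal experiment of this kind might be sketched as follows. The seed, the synthetic prices, and the compounding of returns are assumptions of the sketch.

```python
import numpy as np

def random_signal_equity(close, seed=0, initial=1.0):
    """Trade on coin-flip signals (+1 or -1) and return the resulting equity curve."""
    close = np.asarray(close, dtype=float)
    rng = np.random.default_rng(seed)
    signal = rng.choice([-1, 1], size=len(close) - 1)  # one random signal per trading day
    actual_ret = close[1:] / close[:-1] - 1
    equity = initial * np.cumprod(1 + signal * actual_ret)
    return signal, equity

# Synthetic closes stand in for the index (assumption, not real data).
prices = 10000 * np.cumprod(1 + np.random.default_rng(1).normal(0, 0.01, size=250))
rand_signal, rand_equity = random_signal_equity(prices, seed=0)
print(rand_equity[-1])
```

Running this with several seeds gives a rough sense of the spread of outcomes that chance alone produces, which is a useful yardstick for any model-driven equity curve.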
Below are the equity curve plot, the business results, and the model results:
Correlation between the actual and predicted prices, and between the daily returns of the actual and predicted prices:
Looking at the correlation between the prices, we would conclude that the predictions should have helped us. But the business metrics depend on returns, and the correlation between the daily returns of the actual and predicted prices tells us why the model results did not translate into business results. Also, if we look at the MAPE of the predicted daily returns against the actual daily returns on the train data, we find that we were off by 1.3 to 1.5 standard deviations of daily returns on average, which is a very large error.
We were trying to predict prices to maximize profits without realizing that returns are the real prediction problem. Now that we understand that returns are what we are after, what if we train the model on returns instead of prices?
Looking at the predictions, we cannot use them to trade, because the prediction for any day is roughly the mean of the daily returns. This illustrates a fundamental difficulty in predicting daily returns with regression techniques.
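The gap between the two correlations can be illustrated on synthetic data. In the sketch below, the "predicted" series is a hypothetical stand-in that simply repeats the previous day's price; it is not the article's LSTM output.

```python
import numpy as np

def daily_returns(prices):
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1] - 1

# Synthetic random-walk prices (assumption, not NIFTY data).
rng = np.random.default_rng(42)
actual = 10000 * np.cumprod(1 + rng.normal(0, 0.01, size=1000))

# Hypothetical prediction: yesterday's price repeated as today's forecast.
predicted = np.empty_like(actual)
predicted[0] = actual[0]
predicted[1:] = actual[:-1]

price_corr = np.corrcoef(actual, predicted)[0, 1]
return_corr = np.corrcoef(daily_returns(actual), daily_returns(predicted))[0, 1]
print(price_corr, return_corr)
```

The price levels are almost perfectly correlated, yet the daily returns implied by the two series are close to uncorrelated, which is the pattern described above.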
What might help?
- Turning it into a classification problem
- Adding some more variables to explain the daily returns
- Stationarizing the input variables
Some kinds of variables that can be used for prediction:
- Macro-economic data
- Industry-specific data
- Company-specific data
- Technical indicators
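As a rough sketch of the first and third suggestions, prices can be converted to daily returns (one common way of making a price series closer to stationary) and then to up/down labels for a classifier. The function names and toy prices are assumptions for illustration.

```python
import numpy as np

def to_returns(close):
    """Daily percentage returns; usually far closer to stationary than price levels."""
    close = np.asarray(close, dtype=float)
    return close[1:] / close[:-1] - 1

def direction_labels(close):
    """1 if the close rose from the previous day, 0 otherwise."""
    return (to_returns(close) > 0).astype(int)

close = [100.0, 102.0, 101.0, 103.0]  # made-up closes
returns = to_returns(close)
labels = direction_labels(close)
print(returns, labels)
```

With labels like these, the model is asked only for the direction of the next move, which is what the trading signals need.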