Overview
I took Google stock data from Yahoo Finance from 2004 onwards and kept only the open price for each day, 4533 rows in total. To predict the next day's price I use the previous 60 days of data: those 60 values go into an array as the input features, and today's open price is the value we are going to predict.
For this I am using a “Long Short-Term Memory” (LSTM) network, but even a simple neural network with 2 dense layers will give you relatively good results.
Code
Imports
Read the stock prices
Extract the Open price column – That’s all we need
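Here is a minimal sketch of the read-and-extract step using only the standard library. The inline sample rows are made up for illustration, and the helper name `read_open_prices` is hypothetical; only the `Open` column name comes from the Yahoo Finance export.

```python
import csv
import io

# Made-up sample rows in the column layout of a Yahoo Finance export.
# In practice you would pass an open file handle for the downloaded CSV
# instead of this in-memory string.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2004-08-19,50.0,52.0,48.0,50.2,50.2,1000
2004-08-20,50.5,54.5,50.3,54.2,54.2,2000
2004-08-23,55.4,56.7,54.5,54.7,54.7,3000
"""


def read_open_prices(handle):
    """Return the 'Open' column as a list of floats, in file order."""
    reader = csv.DictReader(handle)
    return [float(row["Open"]) for row in reader]


open_prices = read_open_prices(io.StringIO(SAMPLE_CSV))
print(open_prices)  # [50.0, 50.5, 55.4]
```

Everything else in the file (High, Low, Close, Volume) is simply ignored.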
We will predict the prices for the last 180 days. The older data serves as the training set, and the last 180 days are predicted and compared against the actual values.
We need to scale this data (generally good practice, and it brings all values onto the same scale)
All data is now between 0 and 1. No worries, we will un-scale the predictions later on.
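The scaling and the later un-scaling can be sketched by hand with NumPy as below. The function names and the sample prices are made up for illustration; scikit-learn's `MinMaxScaler` offers the same transform through `fit_transform` and `inverse_transform`, if you prefer a library call.

```python
import numpy as np


def fit_min_max(values):
    """Return (min, max) of the data, kept so we can un-scale later."""
    return float(np.min(values)), float(np.max(values))


def scale(values, lo, hi):
    """Map values linearly so that lo becomes 0 and hi becomes 1."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)


def unscale(scaled, lo, hi):
    """Invert scale(): map the 0..1 range back to original prices."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo


prices = np.array([50.0, 75.0, 100.0, 150.0])  # made-up prices
lo, hi = fit_min_max(prices)
scaled = scale(prices, lo, hi)       # [0.0, 0.25, 0.5, 1.0]
restored = unscale(scaled, lo, hi)   # back to the original prices
```

The same `lo` and `hi` must be reused when un-scaling the predictions, otherwise the values will not map back to real prices.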
For each day we take the previous 60 values and put them in a list as one training sample in “X”, and that day's opening price is the matching “y”. This yields an array of arrays, which we simply reshape at the end. Remember, X is the input and y is the value we are trying to predict.
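A minimal sketch of this windowing step, assuming the scaled prices are a one-dimensional array. The function name `make_windows` is hypothetical, and the evenly spaced `series` only stands in for the real data. The final reshape gives the `(samples, timesteps, features)` layout that a Keras LSTM layer expects.

```python
import numpy as np


def make_windows(series, window=60):
    """Build (X, y): each X row holds `window` past values, y the next one."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for i in range(window, len(series)):
        X.append(series[i - window:i])  # the previous `window` values
        y.append(series[i])             # the value we want to predict
    # Reshape to (samples, timesteps, features) with a single feature.
    X = np.array(X).reshape(-1, window, 1)
    y = np.array(y)
    return X, y


series = np.linspace(0.0, 1.0, 4533)  # stand-in for 4533 scaled prices
X, y = make_windows(series, window=60)
print(X.shape, y.shape)  # (4473, 60, 1) (4473,)
```

The first 60 days cannot be used as targets because they have no full 60-day history, which is why 4533 rows give 4473 samples.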
Split the data into a training set and a test set. The training data has 4293 rows (4533 rows, minus the first 60 that have no full 60-day history, minus the 180 held out), and the test data, which shows how good the predictions are, has 180 rows.
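The split is a plain slice in time order, with no shuffling, so the test set really is the most recent 180 days. A small sketch, with a hypothetical helper `split_last` and zero-filled arrays standing in for the real windows:

```python
import numpy as np


def split_last(X, y, test_size=180):
    """Hold out the final `test_size` samples as the test set."""
    return X[:-test_size], X[-test_size:], y[:-test_size], y[-test_size:]


# Placeholder arrays with the same shapes as the windowed data.
X = np.zeros((4473, 60, 1))
y = np.zeros(4473)

X_train, X_test, y_train, y_test = split_last(X, y)
print(X_train.shape, X_test.shape)  # (4293, 60, 1) (180, 60, 1)
```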
This is how we build the model, using LSTM, although simple Dense layers would also have done for this purpose. The last line here trains the model.
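For readers who want something to start from, here is one possible shape for such a model in Keras. This is a sketch and not the exact network from the post: the single LSTM layer, the 50 units, the Adam optimizer, the mean squared error loss and the batch size are all assumptions. The demo trains for 2 epochs on random data only to show the calls; the post ran 100 epochs on the real training set.

```python
import numpy as np
from tensorflow import keras


def build_model(window=60, units=50):
    """One LSTM layer followed by a Dense layer with a single output."""
    model = keras.Sequential([
        keras.Input(shape=(window, 1)),  # (timesteps, features)
        keras.layers.LSTM(units),
        keras.layers.Dense(1),           # the predicted (scaled) price
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


# Tiny random data just to show the call shapes; these are not prices.
rng = np.random.default_rng(0)
X_demo = rng.random((32, 60, 1))
y_demo = rng.random(32)

model = build_model()
history = model.fit(X_demo, y_demo, epochs=2, batch_size=16, verbose=0)
preds = model.predict(X_demo, verbose=0)  # shape (32, 1)
```

`history.history["loss"]` holds one loss value per epoch, which is what you would plot to see how quickly the loss drops.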
This is the model being trained. See how the loss quickly dropped to near zero, signifying rapid learning.
At the end of 100 epochs
This shows how fast the loss dropped. We could have stopped earlier, but OK.
Predictions: we will use the 180 rows we saved to see what values the model comes up with. y_test is the actual data from the last 180 days.
Plotting our predictions next to the actual values for the last 180 days. The blue line is the prediction and the red line is the actual values. We un-scaled both the predictions and the actual values.
Here the downward trend around days 80-110 is predicted properly and is not swayed by the small gains on a couple of days; the stock is under pressure and those gains are not going to rescue it.
This scatter plot is another way to view the discrepancies; the roughly diagonal line shows the predictions were actually good. For a perfect prediction, all dots would fall exactly on the diagonal.
What did we get?
Casinos run an entire business on a small edge. They know the odds are stacked in favor of the house, however slightly, and eventually the house wins. This prediction at least gives a good trend: most of the time it correctly predicts when the stock will go down the next day (based on the last 60 days). This makes you aware of a possible dip or uptick. It does not have to be 100% correct. Even if we are 55% correct, it's good.
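If you want to put a number on "55% correct", one way is to count how often the predicted direction matches the actual direction. The sketch below is one possible definition, not the calculation behind the post: a day counts as "up" when the price is above the previous day's actual price. The function name and the sample numbers are made up.

```python
import numpy as np


def direction_accuracy(actual, predicted):
    """Fraction of days where predicted and actual moves agree in direction.

    Both moves are measured against the previous day's actual price.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    prev = actual[:-1]                    # yesterday's actual price
    actual_up = actual[1:] > prev         # did the price really rise?
    predicted_up = predicted[1:] > prev   # did we predict a rise?
    return float(np.mean(actual_up == predicted_up))


actual = [100.0, 102.0, 101.0, 103.0, 104.0]     # made-up prices
predicted = [100.0, 101.0, 103.0, 102.5, 103.5]  # made-up predictions
print(direction_accuracy(actual, predicted))  # 0.75
```

Run on the un-scaled y_test and the un-scaled predictions, this gives a single hit rate to compare against the 55% mark.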
So there it is, the whole thing end-to-end, no secrets, no second episode. Let me know if you see potential here for further analysis, and where you want to take this. I did do a lot of other maths here, but the focus is the big picture. Small upticks or dips against the trend should not be counted, as it is the trend that makes the money in the long term.
Cheers – Amit Tomar