JoeKurokawa

Machine Learning part VI: Building more models in Regression

Last week we built a linear regression and a polynomial regression from historical Bitcoin prices: http://jkurokawa.com/2018/04/25/learning-ml-part-v-creating-a-model-using-scikit-learn/. This week we will build more models from the historical prices of other cryptocurrencies: Litecoin, Ethereum and Ripple. Let's see if we can fit a linear model to the data first. I am only analyzing this year's prices so as to exclude some of last year's market volatility, which would make it hard to come up with an accurate model.

Like last week, we will read the rows of the extracted CSV using loadtxt, put the datetimes and corresponding prices into arrays and store them as variables.
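The original listing is not reproduced here, so the following is only a minimal sketch of the loading step. The column layout (`date,close`), the date format, the helper name `load_prices` and the sample values are all assumptions made for illustration, not the real data. Dates are converted to Unix timestamps, which is a guess at how the time axis was encoded.

```python
import io
from datetime import datetime, timezone

import numpy as np

# Placeholder rows standing in for the extracted CSV (made-up values, not real prices).
SAMPLE_CSV = """date,close
2018-01-01,229.03
2018-01-02,255.68
2018-01-03,245.37
"""


def load_prices(source):
    """Read a two-column CSV (date, closing price) into two NumPy arrays."""
    # Read everything as strings first, skipping the header row.
    raw = np.loadtxt(source, delimiter=",", skiprows=1, dtype=str, ndmin=2)
    # Convert each date string to a Unix timestamp (seconds, UTC).
    timestamps = np.array([
        datetime.strptime(d, "%Y-%m-%d").replace(tzinfo=timezone.utc).timestamp()
        for d in raw[:, 0]
    ])
    prices = raw[:, 1].astype(float)
    return timestamps, prices


timestamps, prices = load_prices(io.StringIO(SAMPLE_CSV))
print(timestamps.shape, prices.shape)
```

In practice you would pass a file path instead of the `io.StringIO` object and call the function once per coin.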

Then we take all but the last 10 data points as the training set and the last 10 points as the testing set. We do this so that the model created from the training set can be validated against data it has not seen. Strictly speaking this is a simple hold-out split rather than full cross-validation. We print the three plots, one for each of the coins.
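As a rough sketch of the split-and-fit step described above, assuming scikit-learn's `LinearRegression` and `mean_squared_error`. The helper name `fit_and_score` and the synthetic price series are made up for illustration; the real CSV data is not used here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error


def fit_and_score(x, y, n_test=10):
    """Fit a line on all but the last n_test points and score it on the rest."""
    X = np.asarray(x, dtype=float).reshape(-1, 1)  # scikit-learn expects a 2-D feature array
    y = np.asarray(y, dtype=float)
    X_train, X_test = X[:-n_test], X[-n_test:]
    y_train, y_test = y[:-n_test], y[-n_test:]
    model = LinearRegression().fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    return model, mse


# Synthetic stand-in for one coin: a downward trend plus noise (not real prices).
rng = np.random.default_rng(0)
days = np.arange(100, dtype=float)
prices = 200.0 - 0.5 * days + rng.normal(scale=2.0, size=days.size)

model, mse = fit_and_score(days, prices)
print("The MSE is:", mse)
print("The coef. is:", model.coef_)
```

Because the rows are ordered in time, slicing off the last 10 points keeps the test set strictly after the training set, which a random shuffle would not.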

The console output is this:

The MSE for LTC is: 145.5845122372679
The coef. for LTC is: [-7.26416342e-06]
The MSE for ETH is: 16752.089816773823
The coef. for ETH is: [-7.52252359e-05]
The MSE for RIP is: 0.12567067740217847
The coef. for RIP is: [-2.76329702e-07]

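We see that the error for ETH is high, so the linear model does not predict the price of Ethereum well. The errors for the LTC and RIP models are much smaller, so they follow a more linear path. Keep in mind that MSE is measured in squared price units, so part of the gap simply reflects how much higher ETH prices are than LTC or RIP prices.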
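One way to make these errors easier to read is to take the square root of each MSE, which gives the RMSE in the same units as the price. This small sketch only reuses the three MSE values printed in the console output; the variable names are my own.

```python
import math

# MSE values copied from the console output above.
reported_mse = {
    "LTC": 145.5845122372679,
    "ETH": 16752.089816773823,
    "RIP": 0.12567067740217847,
}

# RMSE is the square root of MSE, expressed in the same units as the price.
rmse = {coin: math.sqrt(mse) for coin, mse in reported_mse.items()}

for coin, value in rmse.items():
    print(f"The RMSE for {coin} is: {value:.4f}")
```

Dividing each RMSE by that coin's typical price would give a unit-free figure that is fairer to compare across coins, but that needs the price series itself.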

The graph above shows just the linear model plotted against the testing data.
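A plot like that could be produced along these lines. This is a sketch assuming matplotlib, with a hypothetical helper `plot_model_vs_test` and a synthetic series in place of the real LTC prices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression


def plot_model_vs_test(x_train, y_train, x_test, y_test, label):
    """Fit a line on the training data and draw it over the test points."""
    model = LinearRegression().fit(np.reshape(x_train, (-1, 1)), y_train)
    fig, ax = plt.subplots()
    ax.scatter(x_test, y_test, color="black", label=f"{label} test prices")
    ax.plot(x_test, model.predict(np.reshape(x_test, (-1, 1))),
            color="blue", label="linear model")
    ax.set_xlabel("time")
    ax.set_ylabel("price")
    ax.set_title(label)
    ax.legend()
    return fig


# Synthetic stand-in series (not real prices).
days = np.arange(60, dtype=float)
prices = 150.0 - 0.4 * days + np.sin(days)
fig = plot_model_vs_test(days[:-10], prices[:-10], days[-10:], prices[-10:],
                         "LTC (synthetic)")
```

Call `fig.savefig("ltc.png")` to write the figure to disk, or drop the `matplotlib.use("Agg")` line and call `plt.show()` when working interactively.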

We see that the LTC graph does not follow the model very well. Let's examine the LTC prices using polynomial regression instead:

 

Just like last week, we will cycle through the various polynomial degrees and make a model as we go along. At every degree we will print out the MSE and plot the model on our graph.
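The loop over degrees might look roughly like the sketch below. It assumes scikit-learn's `PolynomialFeatures` chained with `LinearRegression`, which may differ from the code that produced the output in this post. The helper name and the synthetic data are made up, and the sketch scores each model on the same data it was fit on for brevity. The x values are scaled to the range 0 to 1 here, since raising raw timestamps to the sixth power gives very large numbers and a poorly conditioned fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def polynomial_mse_by_degree(x, y, degrees=range(2, 7)):
    """Fit one polynomial model per degree and return {degree: MSE}."""
    X = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float)
    results = {}
    for degree in degrees:
        # Expand x into [1, x, x^2, ..., x^degree], then fit ordinary least squares.
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        model.fit(X, y)
        results[degree] = mean_squared_error(y, model.predict(X))
    return results


# Synthetic curved series standing in for LTC prices (not real data).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 120)
y = 200.0 + 50.0 * np.sin(4.0 * x) + rng.normal(scale=3.0, size=x.size)

results = polynomial_mse_by_degree(x, y)
for degree, mse in results.items():
    print(f"the mean square error for degree {degree} is: {mse}")
```

To mirror the earlier hold-out split, fit on all but the last 10 points and compute the MSE on the last 10 instead.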

the coefficients for the degree 2 is: [ 1.38845504e+05 -1.74472013e-04 5.47667364e-14]
the mean square error for degree 2 is: 1024.9864388458207

the coefficients for the degree 3 is: [ 3.62199542e+02 4.07698308e-03 -5.36497513e-12 1.76491388e-21]
the mean square error for degree 3 is: 924.1754899788522

the coefficients for the degree 4 is: [ 1.33493162e+03 1.16188108e-30 2.68213895e-12 -3.53062801e-21
1.16166650e-30]
the mean square error for degree 4 is: 924.6724109907932

the coefficients for the degree 5 is: [ 6.82213034e+02 1.61019028e-02 -2.10148658e-11 1.38657496e-21
7.12197734e-30 -2.31815648e-39]
the mean square error for degree 5 is: 921.2089860046844

the coefficients for the degree 6 is: [-6.10565773e-01 -8.30229727e-46 3.78262115e-36 2.90815712e-18
-5.74468520e-27 3.78262283e-36 -8.30229711e-46]
the mean square error for degree 6 is: 515.7131605348862

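As you can see, the MSE generally decreases as the degree increases, apart from a slight uptick from degree 3 to degree 4. So as we go higher in polynomial degree, the model fits the data more and more closely. The polynomial of degree 6 seems to be the best fit, with the lowest MSE of the five.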
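The comparison can also be checked mechanically. This short sketch only reuses the five MSE values printed above; the variable names are my own.

```python
# MSE per polynomial degree, copied from the console output above.
reported_mse = {
    2: 1024.9864388458207,
    3: 924.1754899788522,
    4: 924.6724109907932,
    5: 921.2089860046844,
    6: 515.7131605348862,
}

# The degree with the smallest MSE.
best_degree = min(reported_mse, key=reported_mse.get)

# Consecutive degree pairs where the MSE went up instead of down.
degrees = sorted(reported_mse)
increases = [(a, b) for a, b in zip(degrees, degrees[1:])
             if reported_mse[b] > reported_mse[a]]

print(best_degree)  # 6
print(increases)    # [(3, 4)]
```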

Congrats! You made regressions in scikit-learn!
