Thank you, Peter.

Jan 13, 2020

Thank you, Peter. Cross-validation is done through GridSearchCV with 10 folds. There is also a separate split that holds out 20% of the data as unseen test data. It’s common to use training, validation, and testing sets. Sorry that wasn’t clear.
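To make that setup concrete, here is a minimal sketch of how a 20% hold-out and a 10-fold GridSearchCV can fit together. The synthetic dataset, the LogisticRegression model, and the parameter grid are placeholders I am assuming for illustration, not the ones from the article.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption: the real dataset is not shown here).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out 20% of the data as unseen test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Scaling lives inside the pipeline so it is refit on each training fold.
pipe = Pipeline([
    ("scale", MinMaxScaler()),
    ("clf", LogisticRegression(max_iter=1000)),  # placeholder model
])

# Placeholder grid; cv=10 gives the 10-fold cross-validation on the training part.
param_grid = {"clf__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipe, param_grid, cv=10)
search.fit(X_train, y_train)

# The test set is only touched once, after the search has picked its parameters.
test_score = search.score(X_test, y_test)
print(search.best_params_, test_score)
```

In this arrangement the validation folds come out of the 80% training portion, and the 20% test portion plays no part in choosing the parameters.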

I actually tried different forms of scaling and standardizing. I went with the MinMaxScaler because it gave the best diagnostic. It is true that the outliers are not reduced as much as they would be by standardizing the data; however, the data stays more intact. The usual recommendation is to try MinMaxScaler first. Here is the article on that.

https://towardsdatascience.com/scale-standardize-or-normalize-with-scikit-learn-6ccc7d176a02
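For a quick feel for the difference, here is a small sketch that applies both scalers to a toy column with one outlier. The numbers are made up for illustration. Both scalers are linear rescalings, so neither removes the outlier; they only change the range the values end up in.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Toy feature column with one obvious outlier (made-up values).
x = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

# MinMaxScaler maps the column onto [0, 1]: (x - min) / (max - min).
minmax = MinMaxScaler().fit_transform(x)

# StandardScaler centers to mean 0 and scales to unit variance: (x - mean) / std.
standard = StandardScaler().fit_transform(x)

print("min-max: ", minmax.ravel())
print("standard:", standard.ravel())
```

Running each candidate scaler through the same cross-validated search is one way to compare them on the diagnostic you care about.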

Written by Steven Smiley
