Data Analysis with Lee Hawthorn

Data Analytics modelling, why tune by hand?

Topics: Analysis

Once we’ve got clean, transformed data, the next step in an analysis is to create a model.

There are many types of model to choose from, depending on the kind of analysis or prediction being made: for instance, predicting a class, predicting a value, or finding unusual points.

Within each collection of models I’d really like to be able to spin through the candidates and apply each one to my dataset. I want to see Accuracy, p-Value, Sensitivity, Specificity and so on for every model, ranked.

With the model algorithms already pre-baked, why can’t we just consume them in a fairly efficient way?

Of course we can do this by hand with Python or R, but it would be much better if the software handled this type of plumbing and set-up.

Here’s an example of doing it with R from Suraj V Vidyadaran. He cycles through 17 classification algorithms, applying each one and outputting a confusion matrix for it. This is a great resource for learning R, but in my opinion it also shows that there are patterns in the modelling that can be abstracted away.
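To make the "spin through the models and rank them" idea concrete, here is a minimal Python sketch of the pattern. It is not a port of the R example: the three toy classifiers, the tiny dataset and the names (rank_models, confusion_metrics and so on) are all made up for illustration, and it uses only the standard library. It reports Accuracy, Sensitivity and Specificity from the confusion-matrix counts and leaves the p-Value out.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Tuple[float, ...]
Predictor = Callable[[Row], int]
# Hypothetical convention: a "model" is a function that trains on (X, y)
# and returns a predict function.
Trainer = Callable[[Sequence[Row], Sequence[int]], Predictor]


def sq_dist(a: Row, b: Row) -> float:
    """Squared Euclidean distance between two rows."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def majority_class(X: Sequence[Row], y: Sequence[int]) -> Predictor:
    """Baseline: always predict the most common training label."""
    label = 1 if sum(y) * 2 >= len(y) else 0
    return lambda row: label


def nearest_neighbour(X: Sequence[Row], y: Sequence[int]) -> Predictor:
    """1-nearest-neighbour: copy the label of the closest training row."""
    def predict(row: Row) -> int:
        dists = [sq_dist(row, x) for x in X]
        return y[dists.index(min(dists))]
    return predict


def nearest_centroid(X: Sequence[Row], y: Sequence[int]) -> Predictor:
    """Predict the class whose mean point is closest."""
    centroids: Dict[int, Row] = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = tuple(sum(col) / len(rows) for col in zip(*rows))
    return lambda row: min(centroids, key=lambda lab: sq_dist(row, centroids[lab]))


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


def rank_models(
    models: Dict[str, Trainer],
    X_train: Sequence[Row], y_train: Sequence[int],
    X_test: Sequence[Row], y_test: Sequence[int],
    by: str = "accuracy",
) -> List[Tuple[str, Dict[str, float]]]:
    """Train every model, score it on the test set, and sort best-first."""
    results = []
    for name, trainer in models.items():
        predict = trainer(X_train, y_train)
        y_pred = [predict(row) for row in X_test]
        results.append((name, confusion_metrics(y_test, y_pred)))
    return sorted(results, key=lambda r: r[1][by], reverse=True)


# Made-up data: two well-separated clusters, labelled 0 and 1.
X_train = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (1.2, 1.5),
           (6.0, 6.0), (7.0, 5.5), (6.5, 7.0)]
y_train = [0, 0, 0, 0, 1, 1, 1]
X_test = [(1.0, 2.0), (2.0, 2.0), (6.0, 5.0), (7.0, 7.0)]
y_test = [0, 0, 1, 1]

MODELS: Dict[str, Trainer] = {
    "majority_class": majority_class,
    "nearest_neighbour": nearest_neighbour,
    "nearest_centroid": nearest_centroid,
}

ranking = rank_models(MODELS, X_train, y_train, X_test, y_test)
for name, metrics in ranking:
    print(name, metrics)
```

The point is that the loop in rank_models never changes. Adding another algorithm is one more entry in MODELS, which is exactly the sort of plumbing I’d like the software to own.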
