Data science and analytics are transforming businesses. They have penetrated every department, be it Finance, Marketing, Operations, HR, or Design. It is becoming increasingly important for B-school students to have analytical skills and be well versed in Machine Learning and Statistics. Data is being called the new gold. The fastest-growing companies in the coming years will be the ones that can make the most sense of the data they collect, because with data a business can do targeted marketing and transform the way it converts sales and satisfies demand.
But there is a catch. Machine Learning is complex, and for those meeting it for the first time in B-school, the concepts are tough to grasp alongside a hectic schedule. For a B-school student with no prior coding experience, machine learning is difficult: one gets lost among all the different algorithms and the branches of supervised vs unsupervised learning. The mathematics behind them is tough to understand and has a steep learning curve. To begin with, Python or R itself seems like a rough sea that requires dedicated practice. Still, it is critically important for a business manager to know these things. The new generation of MBAs is learning it, and the older generation should learn it too.
My blog series aims to explain these algorithms in a simple, easy-to-understand manner, so that someone with basic knowledge of Python can implement them and benefit in their lives and businesses.
So I decided to ditch the mathematics and dive right into how each algorithm works, why it is different from the others, and why I, as a businessman, should bother about it. In this article I will briefly introduce the 11 branches of machine learning. In the upcoming articles we will look at a detailed description of each node, the differences among them, and the use cases of each.
What is Machine Learning?
Machine Learning is the sub-field of computer science that gives “computers the ability to learn without being explicitly programmed.” ~ Arthur Samuel
It is Netflix telling you which movie to watch next, Spotify playing good songs without you touching your phone, the keyboard on your phone, and the way companies predict next year's sales. Machine learning in its simplest form is learning from data and then making predictions or dividing the data into meaningful parts, so that it is easier to make sense of and use.
Your computer can learn from data using algorithms that rely on mathematics and statistics to perform the required function. Algorithms find patterns and apply them to the data, trying to minimize the loss of accuracy in predictions under each pattern, and then give us back the best pattern they could learn from the data.
If you tell your algorithm what each data point means, it is called a supervised learning algorithm, whereas if you do not give any labels, the algorithm tries to find patterns itself, and it is called unsupervised machine learning.
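To make "minimizing the loss" a little more concrete, here is a minimal sketch. The data, the candidate list, and the names `loss` and `best_w` are all made up for illustration; real algorithms search far more cleverly than trying a handful of guesses.

```python
# Toy data (hypothetical): hours of practice vs. marks scored
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

def loss(w):
    """Mean squared error when we predict y as w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Each candidate w is one possible "pattern"; keep the one with the lowest loss
candidates = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
best_w = min(candidates, key=loss)

print(best_w, loss(best_w))  # prints: 2.0 0.0
```

The algorithm never "knows" the answer; it just scores each pattern against the data and hands back the one that loses the least accuracy.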
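The difference between the two can be sketched with a toy example. The heights, labels, and function names below are hypothetical and exist only to show what "with labels" and "without labels" look like in code.

```python
# Supervised: every data point comes with a label telling us what it means
labelled = [(150, "short"), (155, "short"), (180, "tall"), (185, "tall")]

# Unsupervised: the same measurements, but no labels at all
unlabelled = [150, 155, 180, 185]

def supervised_threshold(data):
    """Use the labels: put the cut-off midway between the two group averages."""
    short = [h for h, label in data if label == "short"]
    tall = [h for h, label in data if label == "tall"]
    return (sum(short) / len(short) + sum(tall) / len(tall)) / 2

def unsupervised_split(values):
    """No labels: split the sorted values at the largest gap between neighbours."""
    s = sorted(values)
    gaps = [(s[i + 1] - s[i], i) for i in range(len(s) - 1)]
    _, i = max(gaps)
    return s[: i + 1], s[i + 1:]

print(supervised_threshold(labelled))   # prints: 167.5
print(unsupervised_split(unlabelled))   # prints: ([150, 155], [180, 185])
```

In the first case we told the algorithm which heights are "short" and "tall"; in the second it had to discover the two groups on its own.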
63 Machine Learning Algorithms
The 11 Branches
Machine Learning algorithms can be divided into 11 branches, based on the underlying mathematical model:
Bayesian: Bayesian machine learning models are based on Bayes' theorem, which is nothing but calculating the probability of something happening given that something else has happened, e.g. the probability that Yuvraj (the cricketer) will hit six sixes given that he ate curry-rice today. We use machine learning to apply Bayesian statistics to our data, and in some of these algorithms, such as Naive Bayes, we assume that the input variables are independent of each other. These models start with some belief about the data and then update that belief based on the data. Bayesian statistics has various applications in classification, as in my Twitter Project using a Naive Bayes Classifier. In business, it can be used to calculate the probability of success of a marketing plan based on data points and historical parameters of other marketing strategies.
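Bayes' theorem itself fits in one line of Python. The sketch below uses invented numbers for a marketing campaign (none of them come from real data), and the variable names are only for illustration.

```python
def posterior(prior, likelihood, evidence):
    """Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# Hypothetical numbers: 30% of past campaigns succeeded (our starting belief).
p_success = 0.3
# A discount was offered in 80% of successful and 40% of failed campaigns.
p_discount_given_success = 0.8
p_discount_given_failure = 0.4

# Overall probability that a campaign offered a discount
p_discount = (p_discount_given_success * p_success
              + p_discount_given_failure * (1 - p_success))

# Updated belief: probability of success given that we offer a discount
p_success_given_discount = posterior(p_success, p_discount_given_success, p_discount)
print(round(p_success_given_discount, 2))  # roughly 0.46
```

We started by believing a campaign succeeds 30% of the time, and after seeing the "discount" evidence the belief moved up to about 46%. That update is the heart of every Bayesian model.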
Decision Tree: A decision tree, as the name suggests, is used to come to a decision using a tree. It uses estimates and probabilities to calculate the likely outcomes. The tree's structure has a root node, which divides into internal nodes and then leaves. The root and internal nodes hold the variables the data is split on. Our model learns from our labelled data and finds the best variables to split the data on so as to minimize the classification error. It can either classify our data points or predict a value for them, based on what it learned from the training data. Decision trees are used in finance for option pricing, and in marketing and business planning to find the best plan or the overall impact of various possibilities on the business.
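The "find the best variable to split on" step can be sketched as a one-level tree (often called a decision stump). The customer data and helper names here are hypothetical, and a real decision tree library would use a measure such as Gini impurity and keep splitting recursively; this sketch only counts mistakes for a single split.

```python
# Hypothetical customers: (age, income in thousands) -> bought (1) or not (0)
rows = [
    ((25, 30), 0), ((30, 40), 0), ((35, 80), 1),
    ((45, 90), 1), ((50, 35), 0), ((28, 85), 1),
]

def errors(feature, threshold):
    """Mistakes made if we predict 'bought' whenever the feature exceeds the threshold."""
    return sum((x[feature] > threshold) != bool(y) for x, y in rows)

def best_split():
    """Try every observed value of every feature as a threshold; keep the best one."""
    candidates = [(f, x[f]) for f in range(2) for x, _ in rows]
    return min(candidates, key=lambda c: errors(*c))

feature, threshold = best_split()
print(feature, threshold, errors(feature, threshold))  # prints: 1 40 0
```

Here the tree learns that income (feature 1) above 40 separates buyers from non-buyers perfectly, while no age threshold does.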
Dimensionality Reduction: Imagine you have data with 1000 features, or you conducted a survey with 25 questions and are having a hard time making sense of which question is answering what. That is where the family of dimensionality reduction algorithms comes into the picture. As the name suggests, they help us reduce the dimensions of our data, which in turn can reduce over-fitting and high variance in our model, so that we can make better predictions on our test set. In market research surveys it is often used to group questions into topics, which are then easier to make sense of.
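As a rough illustration, here is a sketch of the idea behind principal component analysis (one common dimensionality reduction technique, which I am picking here as an example) on just two features. The survey scores and function names are made up, and the closed-form angle only works for two dimensions; libraries handle the general case.

```python
import math

def first_component(points):
    """Return the mean of 2-D data and the direction along which it varies the most."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    var_x = sum((p[0] - mx) ** 2 for p in points) / n
    var_y = sum((p[1] - my) ** 2 for p in points) / n
    cov = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed-form angle of the main axis of a 2x2 covariance matrix
    theta = 0.5 * math.atan2(2 * cov, var_x - var_y)
    return (mx, my), (math.cos(theta), math.sin(theta))

def reduce_to_1d(points):
    """Replace each 2-D point by a single number: its position along the main direction."""
    (mx, my), (dx, dy) = first_component(points)
    return [(p[0] - mx) * dx + (p[1] - my) * dy for p in points]

# Hypothetical answers (1 to 5) to two survey questions that ask almost the same thing
survey = [(1, 2), (2, 2), (3, 4), (4, 4), (5, 5)]
print(reduce_to_1d(survey))
```

Two near-duplicate questions collapse into one score per respondent, which is what we want when 25 questions are really about five or six topics.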
Instance Based: These supervised machine learning algorithms make predictions by comparing new instances with training instances that are stored in memory. They are called instance based because they work directly from the instances in the training data. k-nearest neighbors is one such example: for a new data point, it finds the k stored training points closest to it (k being the number of neighbors we choose) and predicts from those neighbors, for example by majority vote. Websites recommend new products or movies to us using these instance-based algorithms mixed with some crazier ones.
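A bare-bones k-nearest neighbors classifier can be sketched in a few lines. The stored points, the genre labels, and the name `knn_predict` are hypothetical; this version predicts by majority vote among the k closest stored instances.

```python
import math
from collections import Counter

# Stored training instances (hypothetical): (two taste scores) -> favourite genre
training = [
    ((1.0, 1.0), "comedy"), ((1.5, 2.0), "comedy"),
    ((5.0, 5.0), "action"), ((6.0, 5.5), "action"), ((5.5, 6.0), "action"),
]

def knn_predict(point, k=3):
    """Find the k stored instances closest to the new point and take a majority vote."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict((5.0, 5.5)))  # prints: action
print(knn_predict((1.2, 1.4)))  # prints: comedy
```

Notice that there is no training step at all: the "model" is just the stored data, and all the work happens when a new point arrives.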
Clustering: Making bunches of similar things is called clustering. Here, we cluster points based on the data we have. This is unsupervised machine learning, where the algorithm itself makes sense of whatever gibberish we give it. The algorithm clusters the data based on those inputs, and then we can find out which things or points fit together best. Business applications include bundling products based on customers' purchase data, and clustering consumers into different categories on the basis of their reviews of a service or product. These insights help in business decisions.
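One popular clustering algorithm is k-means (my choice of example here). A minimal one-dimensional sketch, with invented spending figures and starting guesses, looks like this:

```python
def kmeans_1d(values, centers, rounds=10):
    """Repeat: assign each value to its nearest center, then move each center to its group's mean."""
    groups = [[] for _ in centers]
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its old center
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical monthly spend of six customers, with two starting guesses for the centers
spend = [10, 12, 11, 50, 52, 51]
centers, groups = kmeans_1d(spend, centers=[10, 50])
print(centers)  # prints: [11.0, 51.0]
print(groups)   # prints: [[10, 12, 11], [50, 52, 51]]
```

We never told the algorithm who the "low spenders" and "high spenders" are; it found the two bunches from the numbers alone.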
Regression: In statistics we often come across problems that require us to find a relationship between two variables in our data. We explore how a change in one variable affects the other. That is where we use regression. In these algorithms the machine tries to find the line that best fits our data, choosing the slope that minimizes the error. We can then use this line to make predictions, be it in the form of values or in the form of probabilities.
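Finding the best-fitting line can be sketched with the standard least-squares formulas. The advertising and sales numbers are hypothetical, and `fit_line` is just a name chosen for this illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of the line that best fits the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend vs. sales (same made-up units)
ad_spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]

slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)       # prints: 2.0 1.0
print(slope * 5 + intercept)  # predicted sales for a spend of 5 -> 11.0
```

Once the line is known, predicting for a new value is just plugging it into slope times x plus intercept.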
Rule System: Rule-based machine learning algorithms work on a set of rules that are either predefined by us or developed by the algorithms themselves. These algorithms are less flexible when creating a model or making predictions from it, but because of that they are often faster at what they are set to do. They are used to analyze huge chunks of data, or even data that is constantly growing. They are also used for classification and can work faster than other classification algorithms. Accuracy might take a hit here, but in machine learning there is often a trade-off between accuracy and speed.
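Here is a sketch of the simplest case, where we predefine the rules ourselves. The rules, thresholds, and customer fields are all invented for illustration; rule-learning algorithms would derive such rules from data instead.

```python
# Hypothetical, hand-written rules: (condition, label). The first matching rule wins.
rules = [
    (lambda c: c["late_payments"] >= 3, "high risk"),
    (lambda c: c["income"] < 20000, "medium risk"),
]

def classify(customer, default="low risk"):
    """Check the rules in order and return the label of the first one that applies."""
    for condition, label in rules:
        if condition(customer):
            return label
    return default

print(classify({"late_payments": 4, "income": 50000}))  # prints: high risk
print(classify({"late_payments": 0, "income": 15000}))  # prints: medium risk
print(classify({"late_payments": 1, "income": 60000}))  # prints: low risk
```

Each prediction is just a few comparisons, which is why this style of model is quick even on very large or constantly growing data.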
Regularization: These techniques are used in conjunction with regression or classification algorithms to reduce over-fitting. Tuning them lets us find the right balance between fitting the training data well and predicting well on new data. Many times we have too many variables, or their effect on the model is too large; in those cases regularization works to reduce the high variance in our model.
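To see the "balance" in action, here is a sketch of ridge regularization (one common variant, picked as an example) for a line through the origin. The data and the name `ridge_slope` are hypothetical; the `penalty` knob is what we would tune.

```python
def ridge_slope(xs, ys, penalty):
    """Slope w minimizing sum((y - w*x)^2) + penalty * w^2, for a line through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)

xs = [1, 2, 3]
ys = [2, 4, 6]

print(ridge_slope(xs, ys, penalty=0))   # no regularization -> 2.0
print(ridge_slope(xs, ys, penalty=14))  # stronger penalty shrinks the slope -> 1.0
```

A penalty of zero gives the ordinary best-fit slope; raising the penalty pulls the slope towards zero, so the model leans less heavily on any single variable.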
Ensemble: This method of machine learning combines various models to produce one optimal predictive model. Ensembles are usually better than single models because they combine different models to achieve higher accuracy. More like a perfect life partner. The only drawback is that they might be slow to run. When speed matters more than accuracy, we can switch to rule-based algorithms or regression.
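The simplest way to combine models is a majority vote. In the sketch below the three "models" are hypothetical one-line rules standing in for real trained models, and all field names and thresholds are made up.

```python
from collections import Counter

# Three deliberately simple, hypothetical models, each looking at a different signal
def model_a(x):
    return "buy" if x["income"] > 50 else "skip"

def model_b(x):
    return "buy" if x["age"] < 40 else "skip"

def model_c(x):
    return "buy" if x["visits"] >= 3 else "skip"

def ensemble_predict(x):
    """Let every model vote and return the most common answer."""
    votes = Counter(model(x) for model in (model_a, model_b, model_c))
    return votes.most_common(1)[0][0]

customer = {"income": 60, "age": 45, "visits": 5}
print(ensemble_predict(customer))  # two of three models say "buy" -> buy
```

One model got it "wrong" here, but the other two outvoted it. The price is that every prediction now runs three models instead of one, which is where the slowness comes from.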
Neural Networks: Based on the working principle of neurons in the brain, neural networks are complex algorithms that work in layers. Each layer takes input from the previous layer and processes it. More layers can increase accuracy but make the algorithm slower. They often work better than other algorithms, but because they are computationally expensive they did not gain popularity in the past. Now they are back in business as processors have improved. They are being used for sales forecasting, financial predictions, anomaly detection in data, and language processing.
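The "layers feeding layers" idea can be sketched as a tiny forward pass. The weights below are invented by hand purely for illustration; in a real network they are learned from data, which is the expensive part.

```python
import math

def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1 / (1 + math.exp(-z))

def layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of its inputs, then applies sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Hypothetical hand-picked weights: 2 inputs -> 2 hidden neurons -> 1 output neuron
hidden_weights = [[0.5, -0.2], [0.3, 0.8]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

def forward(inputs):
    """Pass the inputs through the hidden layer, then feed its output to the output layer."""
    hidden = layer(inputs, hidden_weights, hidden_biases)
    return layer(hidden, output_weights, output_biases)[0]

print(forward([1.0, 2.0]))  # a single number between 0 and 1
```

Adding more layers just means chaining more `layer` calls, and every extra layer adds more multiplications, hence the slowdown.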
Deep Learning: Deep learning algorithms use neural networks with many layers and keep improving the model as they are trained on more data. They learn and better themselves, somewhat like a human being would. Self-driving cars are based on these algorithms. I know what you are thinking: this is what AI is based on. The real Terminator will be based on these algorithms, but we are a long way from that. There are entire businesses running on deep learning algorithms. New delivery systems that use these algorithms are under development, and Google's AlphaGo is another example. Deep learning structures algorithms in layers and uses them to make decisions.