So, I am coming to the end of my second week of teaching myself machine learning. So far I have been trying to get the higher-level concepts down. I recently signed up for a new Coursera course called Machine Learning Foundations: A Case Study Approach, offered by the University of Washington. The course instructors do a pretty good job of explaining the concepts. The downside is that they use a proprietary Python library called GraphLab by Dato, which is free for students but hides some of the machine learning techniques inside a black box.
What have I learned so far?
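Machine learning falls into two main categories: supervised and unsupervised learning. Supervised learning is the task of learning to make predictions from a data set of labeled examples, where each example pairs some input features with a known outcome. It is most common in classification problems (predicting a category), and it also covers regression problems (predicting a number).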
Example supervised learning problems are:
A credit card company deciding whether or not to accept an applicant based on certain characteristics.
Who was most likely to survive the sinking of the Titanic? https://www.kaggle.com/c/titanic
Estimating what your house is worth.
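To make the idea of learning from labeled examples concrete, here is a minimal sketch of the credit card example using a 1-nearest-neighbor classifier in plain Python. The applicant data, the two features (income and debt), and the function names are all made up for illustration. This is not how the course or GraphLab does it, just one of the simplest supervised techniques I could write by hand.

```python
# Hypothetical labeled training data (invented for this sketch):
# (annual income in $1000s, existing debt in $1000s) -> decision
training_data = [
    ((85.0, 5.0), "accept"),
    ((60.0, 10.0), "accept"),
    ((72.0, 8.0), "accept"),
    ((30.0, 25.0), "reject"),
    ((22.0, 18.0), "reject"),
    ((40.0, 35.0), "reject"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features, examples):
    """Return the label of the training example closest to `features`."""
    _, nearest_label = min(
        examples, key=lambda example: squared_distance(example[0], features)
    )
    return nearest_label


# New applicants the model has never seen before.
print(predict((78.0, 6.0), training_data))
print(predict((28.0, 20.0), training_data))
```

The important part is that every training example comes with a known answer ("accept" or "reject"), and the prediction for a new applicant is borrowed from the most similar labeled example.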
In unsupervised learning, the goal is to extract structure from data. We are basically telling the computer to learn something about data that has no class labels.
Examples of Unsupervised Learning:
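Figuring out if a person’s Yelp review is good or not based on its text (sentiment analysis) when the reviews come without good/bad labels; if labeled reviews are used for training, this becomes a supervised problem. http://www.yelp.com/dataset_challenge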
Google News grouping similar types of news stories
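Here is a minimal sketch of the news grouping idea using k-means clustering in plain Python. Everything in it is an assumption for illustration: each "story" is reduced to two made-up word counts (sports words, finance words), and the starting centroids are just the first two stories. I have no idea what Google News actually uses. The point is only that no labels are given and the groups come out of the data itself.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Put each point in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(
            range(len(centroids)),
            key=lambda i: squared_distance(point, centroids[i]),
        )
        clusters[nearest].append(point)
    return clusters


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, assign(points, centroids)


# Hypothetical stories: (sports word count, finance word count). No labels.
stories = [(9, 1), (8, 2), (10, 0), (1, 9), (2, 8), (0, 10)]

centroids, clusters = k_means(stories, centroids=stories[:2])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Even though the two starting centroids are both sports-heavy stories, after a couple of iterations the sports-heavy and finance-heavy stories end up in separate clusters without the algorithm ever being told those categories exist.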
Each of these two categories has its own set of machine learning algorithms.
|Supervised Learning||Unsupervised Learning|
Machine Learning Workflow
This is one of the most famous machine learning algorithm cheat sheets that I know about.