ChiPy Mentorship Week 2: High Level Overview of Machine Learning

So, I am coming to the end of my second week of teaching myself machine learning. So far I am trying to get the higher-level concepts down. I recently signed up for a Coursera course called Machine Learning Foundations: A Case Study Approach, offered by the University of Washington. The course instructors do a pretty good job of explaining the concepts. The downside is that they use a proprietary Python library called GraphLab by Dato, which is free for students but turns some of the machine learning techniques into a black box.

What I’ve learned so far

Machine learning falls into two main categories: supervised and unsupervised learning. Supervised learning is the task of learning from examples with known answers (labels) in order to make predictions about new data. It is fairly common in classification problems.

Example supervised learning problems are:

A credit card company deciding whether or not to accept an applicant based on certain characteristics.

Who was most likely to survive the sinking of the Titanic? https://www.kaggle.com/c/titanic

Estimating the worth of your house.
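To make the credit card example concrete, here is a minimal sketch of supervised classification in plain Python. It uses a 1-nearest-neighbor rule, which is just one simple technique chosen for illustration. The applicant data, the two features, and the names `training_data` and `predict` are all invented and do not come from the course.

```python
# Toy labeled data: (income in $1000s, existing debt in $1000s) -> decision.
# Every value here is invented for illustration.
training_data = [
    ((85, 5), "accept"),
    ((90, 10), "accept"),
    ((70, 8), "accept"),
    ((30, 25), "reject"),
    ((25, 30), "reject"),
    ((40, 35), "reject"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(applicant):
    """Return the label of the closest training example (1-nearest neighbor)."""
    _, nearest_label = min(
        training_data,
        key=lambda pair: squared_distance(pair[0], applicant),
    )
    return nearest_label


print(predict((80, 7)))    # a new applicant who resembles the accepted ones
print(predict((28, 27)))   # a new applicant who resembles the rejected ones
```

The important part is that the labels ("accept" and "reject") are known ahead of time, which is what makes this supervised.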

In unsupervised learning, the goal is to extract structure from data. We are basically asking the computer to learn something about data that has no known class labels.

Examples of Unsupervised Learning:

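Figuring out whether a person's Yelp review is positive or negative based on its text alone, without labeled examples (sentiment analysis; when star ratings are used as labels, this becomes a supervised problem) http://www.yelp.com/dataset_challenge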

Google News grouping similar types of news stories
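Grouping similar items without labels is usually called clustering. Below is a rough sketch of k-means on one-dimensional data in plain Python. The numbers (pretend article lengths), the starting centers, and the function name `kmeans_1d` are assumptions made up for this example; real news grouping would work on text features, not lengths.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the given starting centers (plain k-means)."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Invented article lengths in words; note that no labels are supplied.
lengths = [110, 120, 130, 900, 950, 1000]
centers, clusters = kmeans_1d(lengths, centers=[0, 500])
print(centers)
print(clusters)
```

Nothing tells the algorithm which group is "short" or "long"; it only discovers that the values fall into two clumps.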

Each of these two categories has its own families of machine learning algorithms.

Supervised Learning | Unsupervised Learning
Regression          | Clustering
Classification      | Dimension Reduction
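Of the four families in the table, regression is the one that predicts a number instead of a category, like the house price example earlier. Here is a small sketch of fitting a straight line by ordinary least squares in plain Python. The house sizes and prices are invented, and `fit_line` is a hypothetical helper name, not something from GraphLab.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented training data: house size in square feet -> price in $1000s.
sizes = [1000, 1500, 2000, 2500]
prices = [200, 300, 400, 500]

slope, intercept = fit_line(sizes, prices)
estimate = slope * 1800 + intercept  # estimated price for an 1800 sq ft house
print(round(estimate, 2))
```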

 

 

Machine Learning Workflow

[Image: machine learning algorithm cheat sheet]

This is one of the most famous machine learning algorithm cheat sheets that I know about.

 

 

ChiPy Mentorship Program Day 2: Setting Up a Python Dev Environment on Cloud9

What is Cloud9?

Cloud9 is a cloud service that lets you create an Ubuntu workspace with a snazzy code editor in the cloud. So if you want to quickly create a small app to show in your portfolio without paying for hosting space, Cloud9 is just for you. The free version of Cloud9 gives you 512 MB of memory and 1 GB of disk space, which is not a lot. Luckily for us, Anaconda has a mini version called Miniconda, which does not bundle the 300 or so Python packages that Anaconda does.

Step One: Create a Workspace


Name your workspace and choose the custom Ubuntu template.


Scroll down to the lower left corner and press the green "Create workspace" button.

 

Step Two: Download Miniconda

 


Right-click the link for the 64-bit Linux installer for the version of Python you want to work with and copy its address. Then download it from the workspace terminal with wget:

 


wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh

 

 

Step Three: Installation

Install it from the shell with this command:

bash Miniconda3-latest-Linux-x86_64.sh

Accept the terms of the license agreement when prompted, and the installation will finish.

Step Four: Install Python Packages

Install Python packages with conda:

conda install package-name

For example:

conda install numpy
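After installing a package, a quick way to confirm that Python can actually find it is to ask the import system. This is only a small sketch using the standard library; `is_installed` is a hypothetical helper name, and it checks top-level package names only.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate a top-level package with this name."""
    return importlib.util.find_spec(package_name) is not None


# "numpy" will show as missing until `conda install numpy` has been run;
# "json" ships with Python, so it should always be available.
for name in ["numpy", "json"]:
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```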

Note: to use the IPython notebook, run this command:

ipython notebook --no-browser --port=8080 --ip=0.0.0.0

 

Make sure you include the --no-browser flag; otherwise IPython will try to find a browser on the workspace to open the notebook in. To see your notebook, open a new browser tab and load the URL http://{workspace}-{username}.c9.io/ .
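If you keep forgetting which part of that URL goes where, the pattern can be written as a tiny Python function. The workspace and username values below are placeholders I made up, and `notebook_url` is just an illustrative name.

```python
def notebook_url(workspace, username):
    """Build the notebook address from the URL pattern described above."""
    return f"http://{workspace}-{username}.c9.io/"


# Placeholder values; substitute your own workspace name and Cloud9 username.
print(notebook_url("my-notebooks", "jdoe"))
```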

And that’s that.

 

ChiPy: Intro to Data Science Mentorship Week 1

So I met my mentor for the ChiPy (Chicago Python) Intro to Data Science mentorship earlier this week, and he is pretty awesome. His name is Alexander Flyax (@aflyax). Alexander has a degree in Neurology and has done two postdoc appointments. He also went through the ChiPy mentorship himself. During our first meeting he gave me a long list of resources to get through by the end of the 12 weeks. And I’m like…

But as a person who is currently unemployed and has no job prospects, all I can say to myself is …

Here is the game plan for my first week.

Tasks | Day
Setting up Data Science Environment; DataQuest exercises for 2.5 hours | Day 1
DataQuest exercises for 2.5 hours; R Bloggers 15 hour Data Science Course for 3 hours | Days 2-7

By the end of this week hopefully I will figure out a project to focus on throughout this journey.

ChiPy Mentorship Oct-Dec 2015

I know that I have not been on my blog for a very long time (8 months to be exact). But after graduating from my master's program and dealing with the soul-crushing task of job hunting and interviews, I finally have a bit of good news. I just got accepted to be a part of the Chicago Python (ChiPy) Mentorship Program. The program will be 12 weeks long, and I am going to be a part of the Introduction to Data Science track. I can’t wait for what the future holds. 🙂