Non-Programmer’s Guide to Scraping Data

So what if you are a non-programmer who wants to scrape data from a website and create visualizations or an analysis in Excel, Tableau, Gephi, or whatever tool you are using? You are probably asking yourself, “Do I really need to learn Python in order to get this data?” or “Which undergrad/grad student should I choose to do my bidding?” Well, I’ve got your back.

Microsoft Excel is one of the most powerful spreadsheet tools out there, and in my opinion nothing else compares. I will be using Microsoft Excel 2013 for this tutorial. Let’s go back to the American Kennel Club stats page.

Step 1: Select the table and copy

Step 2: Open Excel. In the top left-hand corner, click the clipboard icon with the word Paste underneath it. A drop-down menu will appear; click Paste Special.

Step 3: A pop-up menu will appear. From the options, click HTML.

Your data will be formatted accordingly.
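
None of this requires code, but if you are curious what turning a web table into rows and columns looks like in Python, here is a rough sketch. It is only an illustration, not what Excel actually runs: the TableParser class, the table_rows helper and the sample markup are all made up for this example, and it relies on nothing beyond Python’s built-in html.parser module.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # the row currently being built, or None
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_rows(html):
    """Return the rows of any tables in an HTML string as lists of strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample markup for illustration; not copied from the real stats page.
SAMPLE_HTML = """
<table>
  <tr><th>Breed</th><th>Rank</th></tr>
  <tr><td>Example Breed A</td><td>1</td></tr>
  <tr><td>Example Breed B</td><td>2</td></tr>
</table>
"""

for row in table_rows(SAMPLE_HTML):
    print(row)
```

Each printed list corresponds to one spreadsheet row, which is roughly the shape you get when you paste a table as HTML. Real pages are messier than this sample (nested tables, merged cells), so treat it as a starting point only.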

Side Note ***

If you go back to Step 2, you will see that you also have the option Refreshable Web Query. If your data is not static (sports scores, for example), you can make a web query that refreshes whenever you please.

So from Step 2, click Refreshable Web Query. A pop-up menu will appear. Copy and paste the web address of your site into the Address bar at the top of the pop-up menu, then click Go. You will see an arrow in a yellow box next to each table on the webpage. Scroll down the page and click the boxes next to any tables that you would like to import into Excel.

You will see your data formatted in Excel.

If you currently do not have Excel on your computer, don’t worry.

Google Chrome is also a viable option for scraping information from the web. Google Chrome has many web scraper extensions. I use the Google Chrome extension called Scraper, which can be found here: https://chrome.google.com/webstore/detail/scraper/mbigbapnjcgaffohmbkdlecaccepngjd  Using this tool, you can export the data into Google Sheets, while other scrapers allow you to export your data into a CSV file. It just depends on your preference and needs.
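
If you have never opened a CSV file in a text editor, it is just rows of text with commas between the values. The short Python sketch below shows the idea using the built-in csv module. The rows_to_csv helper and the sample rows are hypothetical, invented for this illustration; this is not how any particular Chrome extension does its export.

```python
import csv
import io


def rows_to_csv(rows):
    """Turn a list of rows (lists of values) into CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)  # adds quotes around values that contain commas
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows for illustration only.
sample_rows = [
    ["Breed", "Rank"],
    ["Example Breed A", "1"],
    ["Example Breed B", "2"],
]

print(rows_to_csv(sample_rows))
```

Saving that text in a file whose name ends in .csv gives you something Excel, Google Sheets or Tableau can open directly.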

Additional Resources:

Excel Refreshable Data Query

https://www.youtube.com/watch?v=q7SAqrUVHHw

https://www.youtube.com/watch?v=HFZhvrAib2w

http://www.techrepublic.com/article/pull-data-into-microsoft-excel-with-web-queries/

Google Chrome Web Scrapers

http://dataist.wordpress.com/2012/10/12/get-started-with-screenscraping-using-google-chromes-scraper-extension/

http://schoolofdata.org/handbook/recipes/scraper-extension-for-chrome/

 
