So what if you are a non-programmer who wants to scrape data from a website and create visualizations or an analysis in Excel, Tableau, Gephi, or whatever tool you are using? You are probably asking yourself, “Do I really need to learn Python in order to get this data?” or “Which undergrad/grad student should I choose to do my bidding?” Well...
Microsoft Excel is one of the most powerful spreadsheet tools out there, and in my opinion nothing else compares. I will be using Microsoft Excel 2013 for this tutorial. Let's go back to the American Kennel Club Stats page.
Step 1: Select the table and copy
Step 2: Open Excel -> Go to the top left-hand corner and click on the clipboard icon that says Paste underneath -> A drop-down menu will appear; click Paste Special
Step 3: A pop-up menu will appear. From the options, click HTML.
Your data will be formatted accordingly.
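If you are curious what is going on conceptually when the copied HTML turns into rows and columns, here is a minimal Python sketch. It is only an illustration and not how Excel itself works internally. It uses Python's built-in html.parser module, and the names TableParser, parse_table, and SAMPLE_HTML, along with the sample table contents, are made-up placeholders rather than real data from the stats page.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None when outside a <tr>
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return the table cells in `html` as a list of rows."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Placeholder table, invented for this example.
SAMPLE_HTML = """
<table>
  <tr><th>Breed</th><th>Rank</th></tr>
  <tr><td>Example Breed A</td><td>1</td></tr>
  <tr><td>Example Breed B</td><td>2</td></tr>
</table>
"""

for row in parse_table(SAMPLE_HTML):
    print(row)
```

Each printed list corresponds to one spreadsheet row, which is roughly the shape your data takes after the paste.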
Side Note ***
If you go back to Step 2, you will see that you also have the option Refreshable Web Query. If you have data that is not static (for example, sports scores), you can make a web query that refreshes whenever you please.
So from Step 2, click Refreshable Web Query. A pop-up menu will appear. Copy and paste the web address of your site into the Address bar at the top of the pop-up menu, then click Go. You will see an arrow in a yellow box next to each table on the webpage. Scroll down the page and click the boxes next to any tables that you would like to import into Excel.
You will see your data formatted in Excel.
If you do not currently have Excel on your computer, don’t worry. Because…
Google Chrome is also a viable option for scraping information from the web. Google Chrome has many web scraper extensions. I use the Google Chrome extension called Scraper (https://chrome.google.com/webstore/detail/scraper/mbigbapnjcgaffohmbkdlecaccepngjd). Using this tool, you can export the data into Google Sheets, while other scrapers allow you to export your data into a CSV file. It just depends on your preference and needs.
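In case you have not worked with CSV files before, here is a small Python sketch of what that export format looks like. It is not the code any particular extension uses. The function name rows_to_csv and the sample rows are hypothetical placeholders, and it relies only on Python's built-in csv and io modules.

```python
import csv
import io


def rows_to_csv(rows):
    """Turn a list of rows (lists of cell values) into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)  # quotes cells that contain commas
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder rows, invented for this example.
sample_rows = [
    ["Breed", "Rank"],
    ["Example Breed A", "1"],
    ["Example, With Comma", "2"],
]

print(rows_to_csv(sample_rows))
```

A CSV file is just plain text with one line per row and commas between cells, which is why nearly every tool (Excel, Tableau, Google Sheets) can open it.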
Excel Refreshable Data Query
Google Chrome Web Scrapers