Chris Vaccaro
Jun 14, 2018

What is Data Science, and why should you care?

The world is changing. All sorts of new technologies are on the horizon. Ever wonder how Amazon knows exactly what books you might like? How Pinterest figures out which ‘pins’ might interest you? How Apple Music or Spotify finds new songs that you absolutely fall in love with? How do these machines figure all this out? How are our devices becoming so intelligent? The answer is Data Science.

We now have the ability to track every click, swipe, ‘like,’ and repost that people make.

Within this data are enormous insights about consumer behavior and even general psychology. Companies have enormous amounts of data and can do all sorts of incredible things with it. It is no wonder Harvard Business Review called data scientist “the sexiest job of the 21st century.” Data Science can call presidential races and can tell you which movie is best for you based on your tastes and moods. Netflix grew from a simple DVD-by-mail service into a household name by applying algorithms that figure out which movies you’re most likely to enjoy.

Data Science can show you new music that you would like based on your tastes and even your mood. Mercedes-Benz, for instance, is experimenting with heart-rate sensors to determine your mood so it can decide what music to play for you. It can tell if you’re in a relaxed mood and just want some calming music, or if you’re in an upbeat mood and want something more energetic.

Artificial Intelligence, Machine learning… all possible through Data Science.

In the early 2000s, general manager Billy Beane fundamentally changed the way the entire game of baseball is played. With the help of Harvard graduate Paul DePodesta, the Oakland A’s stopped relying on anecdotal evidence and started relying on hard statistics to decide which players to acquire.

Before Billy Beane, Bill James, and Paul DePodesta, baseball relied on archaic criteria and anecdotes rather than hard statistics. Baseball players were often picked on criteria such as having the “good face” (a face that lets baseball scouts ‘just tell’ someone will be a good player). Ideas like “He can’t be good... his girlfriend is ugly,” or a player’s look, stance, or personal life were often thrown around.

By moving from an ‘intuitive’ approach to a more data-driven one, the Oakland A’s (one of the poorest teams in baseball at the time) were able to compete neck-and-neck with the Yankees, even though the Yankees’ payroll was roughly three times that of the A’s. The results of Beane’s experiment changed the entire game of baseball from the inside out.

This story was portrayed in Moneyball, the popular 2011 movie starring Brad Pitt and Jonah Hill. Before you go any further, check out these Moneyball clips, because they’re important for understanding the impact smart analysis can have on your results. I recommend watching the clips even if you’ve seen the movie.

Watch these short clips to get a sense of the power of Data Science.

(Note, there are spoilers, but they won’t ruin the movie if you want to watch the full film.)

So with results like these, it’s no surprise that companies are clamoring to find good Data Analysts.

So where can you learn this invaluable skill? Well, it’s probably one of the best times in history to learn a skill like this. With the rise of OpenCourseWare and MOOCs (Massive Open Online Courses), you can take classes from Ivy League professors from the comfort of your own kitchen. (Pants are optional!) Oh, and did I mention the best part? Most of them are free. You can literally take classes from Ivy League professors in your underwear, for free. It’s a great time to be alive. Most Ivy League colleges post an enormous number of classes to sites like edX and Coursera.

So back to Data Science.

You can take these classes for free, or you can earn a certificate for $441.90 (the price as of writing this article).

Data Science Certificate | Harvard Extension — Harvard Extension School

Harvard has great brand recognition and looks great on a resume, but there may be even better ways of learning this valuable skill. David Venturi from freeCodeCamp described how he dropped out of one of the best Computer Science programs in the country in order to create his own Data Science program using the best online courses. He wrote a series of articles in which he ranked every Intro to Data Science course on the internet, based on thousands of data points. David Venturi wins extra points for the amazing irony of using Data Science to rank the world’s best Data Science courses.

His pick for the #1 Data Science class? Kirill Eremenko’s Data Science A-Z™, a simple course that uses the free point-and-click software Tableau, which makes it great for beginners.

David says:

Kirill Eremenko’s Data Science A-Z™ on Udemy is the clear winner in terms of breadth and depth of coverage of the data science process of the 20+ courses that qualified. It has a 4.5-star weighted average rating over 3,071 reviews, which places it among the highest rated and most reviewed courses of the ones considered. It outlines the full process and provides real-life examples. At 21 hours of content, it is a good length. Reviewers love the instructor’s delivery and the organization of the content. The price varies depending on Udemy discounts, which are frequent, so you may be able to purchase access for as little as $10.

He also notes that one reviewer states:

Kirill is the best teacher I’ve found online. He uses real life examples and explains common problems so that you get a deeper understanding of the coursework. He also provides a lot of insight as to what it means to be a data scientist from working with insufficient data all the way to presenting your work to C-class management. I highly recommend this course for beginner students to intermediate data analysts!

While this course uses point-and-click software, some computer programming knowledge is unavoidable if you want to get further into Data Science. But not to worry, there are a number of amazing classes for those skills too. David covers those in detail in his other articles:

The best Data Science courses on the internet, ranked by your reviews https://medium.freecodecamp.org/the-best-data-science-courses-on-the-internet-ranked-by-your-reviews–6dc5b910ea40

An overview of every Data Visualization course on the internet https://medium.freecodecamp.org/an-overview-of-every-data-visualization-course-on-the-internet–9ccf24ea9c9b

Every single Machine Learning course on the internet, ranked by your reviews https://medium.freecodecamp.org/@davidventuri

If you want to learn Data Science, take a few of these statistics classes https://medium.freecodecamp.org/if-you-want-to-learn-data-science-take-a-few-of-these-statistics-classes–9bbabab098b9

If you want to learn Data Science, start with one of these programming classes https://medium.freecodecamp.org/if-you-want-to-learn-data-science-start-with-one-of-these-programming-classes-fb694ffe780c

Another recommended resource is the book Data Smart: Using Data Science to Transform Information into Insight, from Wiley Publishing. This book uses simple tools like Excel to teach readers data analysis, then gently eases them into learning code.

Excel is not typically used in Data Science, but author John Foreman explains his rationale in the book:

Spreadsheets are not the sexiest tools around. In fact, they’re the Wilford-Brimley-selling-Colonial-Penn of the analytics tool world. Completely unsexy. Sorry, Wilford.

But that’s the point. Spreadsheets stay out of the way. They allow you to see the data and to touch (or at least click on) the data. There’s a freedom there. In order to learn these techniques, you need something vanilla, something everyone understands, but nonetheless, something that will let you move fast and light as you learn. That’s a spreadsheet.

Other great resources are O’Reilly Media’s books on Data Science, most notably R for Data Science and R Graphics Cookbook: Practical Recipes for Visualizing Data.

Good luck, and happy analyzing!
