Some questions to ask yourself if you want to be a data scientist

Last October, the Harvard Business Review published an article called “Data Scientist: The Sexiest Job of the 21st Century.” I could virtually hear the rejoicing in the work hallways of analysts, mathematicians, statisticians, and computer scientists everywhere. At last, recognition!

While it is debatable whether the job really is sexy, or even hip, or will make you either, there’s no question that the rise of analytics and big data is putting these skills increasingly in demand. Is this the right job for you?

A good place to start understanding what it takes to be a data scientist is the INFORMS Analytics Certification website. The certification itself costs money, but the program information gives you an idea of the kinds of questions on the test, the sorts of case studies with which you should be comfortable, and the books and websites you can use for further learning.

More informally, here are some questions to answer about yourself and your knowledge to see whether this is a career or job you might consider. I’ve included a few technical questions to encourage you to learn more about some of the disciplines involved.

  • Do you suffer from math anxiety? Does solving equations, working with matrices, or making sense of tables or graphs scare you? If so, this probably is not the field for you.
  • Are you comfortable with statistics? Could you, in your spare time over the next month, complete the equivalent of a first, solid, mathematically sound statistics course? Would you get an A for your efforts? You’ll need statistics to understand the data and to give yourself sanity checks about the conclusions you are drawing.
  • Is Microsoft Excel or OpenOffice Calc your favorite tool in your office productivity suite? Doing real analytics and big data often goes well beyond what you can do in a spreadsheet, but if this kind of software terrifies you, data science might not be a good match.
  • Do you know how services like Netflix choose what movies you would like to watch? Make an educated guess and then go learn some of the techniques. I won’t give you a reference; go explore what you find on the net.
  • Do you understand the differences between descriptive, diagnostic, predictive, and prescriptive analytics? Where does optimization fall among these?
  • Do you like saying the word “stochastic”? Do you know what it means?
  • Who was Andrey Markov and what was his obsession with chains?
  • What criteria and analysis would you use to predict who will win the next World Series, Super Bowl, or World Cup?
  • In what situations would you use Hadoop, Hive, HBase, Pig, SPSS, R, or CPLEX?
  • How would you go about constructing your personal profile from all the public data about you on the web? The data could come from you (e.g., your Twitter feed) or be produced by others. Include your gender, your approximate age and income, the town in which you live, the high school to which you went, your hobbies, the name of your significant other, the number of children you have, your favorite color, your favorite sport, your best friend’s name, and the color of your hair. Does this scare you?
  • When can Twitter add to your insight about marketing campaigns and when does it just add unnecessary noise?

This is by no means an exhaustive list, but if these topics intrigue you, you have or are willing to get the technical background, and you know who Nate Silver is, you just might have a career in data science.