Predictive analytics, the basic idea

If you’ve ever bought an insurance policy, you have seen the output of predictive analytics.

If you’ve ever wondered how a restaurant knows how many “specials” to prep for, you’ve thought about predictive analytics, though in practice it was probably informal.

Analytics means munching on a whole lot of data, sorting it out, deciding what is important, and then understanding what it tells you. The result can give you a view of what happened in the past and give you a dashboard of the situation right now. This is usually what is called business analytics. If you are interested in using the information to try to guess what will happen in the future, that is predictive analytics.

Sounds simple, right? Ready to predict the weather or the stock market?

Predictive analytics is not about guaranteeing a certain result, but instead about giving you an idea of what is likely to happen, with a statement of how accurate the prediction will probably be.

What can get in the way of making sound predictions?

  • Bad data
  • Irrelevant data for what you are trying to model
  • More generally, too much or too little information, or information that is weighted in the wrong way
  • Random events
  • The wrong model for prediction
  • Misinterpretation of the results

You get the idea.

For the insurance example, information about your family health history, whether you smoke or drink, how much you drive daily and where, and similar factors all go into how much a life insurance policy will cost you. Remember that the insurance company is out to make money, so on average, across all its customers, the total that people pay for policies needs to exceed what the company pays out on them plus the usual business overhead. People who do this work of associating risks with financial consequences are called actuaries.
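
To make that arithmetic concrete, here is a minimal sketch of the break-even idea. The function names and every number in it are made up for illustration, and real actuarial pricing involves far more than this.

```python
def break_even_premium(claim_probability, payout, overhead_per_policy):
    """Premium at which expected income exactly covers expected cost.

    All three inputs are illustrative assumptions, not real insurance figures.
    """
    expected_payout = claim_probability * payout
    return expected_payout + overhead_per_policy


def expected_profit(premium, claim_probability, payout, overhead_per_policy, policies):
    """Average profit across a book of identical policies."""
    per_policy = premium - claim_probability * payout - overhead_per_policy
    return per_policy * policies


if __name__ == "__main__":
    # Hypothetical policy: 0.2% chance of a 100,000 payout, 50 overhead per policy.
    floor = break_even_premium(0.002, 100_000, 50)
    print("break-even premium:", round(floor, 2))
    print("profit at a premium of 300 over 10,000 policies:",
          round(expected_profit(300, 0.002, 100_000, 50, 10_000), 2))
```

Anything charged above the break-even premium is, on average, profit. The hard part is estimating the claim probability, which is where the predictive analytics comes in.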

How about when you go to a big box store and buy some electronic gear or an appliance? In my experience, when I check out I am always offered the chance to buy an extended warranty. If you buy it, you are betting that you will indeed need the warranty for repair or replacement, so it is worth the extra money. If you don’t, you are betting that the product is unlikely to break, and you accept that you will have to repair or replace it at your own expense if it does.

In my experience, AppleCare is worth the investment, but I digress.

The big box store is betting that enough people will buy the warranties to more than cover the cost of the claims from those who need them. Some buyers will certainly make claims, but if enough don’t, the warranties will be profitable.

A wrong guess by the store can be very expensive. The people who do the predictive analytics need to look at past warranty claims based on the type of product, manufacturer, geography, and perhaps time of year. There can be other factors as well, and they may be surprising. What those factors are may be highly valued corporate intelligence.
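
Here is a rough sketch of why a wrong guess hurts. It assumes, purely for illustration, that every warranty sells at one price and every claim costs the same average amount. The names and figures are hypothetical.

```python
def warranty_profit(units_sold, price, claim_rate, avg_claim_cost):
    """Warranty income minus the expected cost of claims."""
    income = units_sold * price
    payouts = units_sold * claim_rate * avg_claim_cost
    return income - payouts


def break_even_claim_rate(price, avg_claim_cost):
    """Claim rate above which the store loses money on warranties."""
    return price / avg_claim_cost


if __name__ == "__main__":
    # Hypothetical: 1,000 warranties at 40 each, 200 average cost per claim.
    print("break-even claim rate:", break_even_claim_rate(40, 200))
    for actual_rate in (0.05, 0.10, 0.20, 0.30):
        profit = warranty_profit(1_000, 40, actual_rate, 200)
        print(f"claim rate {actual_rate:.2f}: profit {profit:,.0f}")
```

With these made-up numbers, a store that priced for a 10% claim rate and then saw 30% would swing from a healthy profit to a loss on the same sales.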

What about that restaurant? What can affect how much food it decides to order or how much staff to have working on a given day?

  • Past success with menu items
  • Day of the week
  • Time of the year
  • Weather
  • Local seasonal population variations
  • Experience of available staff

A restaurant in New York City will have different considerations from a restaurant on Main Street in the college town in which I live in the Finger Lakes region of upstate New York. The above list is not exhaustive.

By looking at past trends and combining them with weather forecasts and the college schedule to know how many students are in town, the restaurant on Main Street can better determine how to maximize revenue and minimize expense. Spreadsheets and experience suffice for most restaurants in current practice, I believe.
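
If you wanted to go one small step beyond the spreadsheet, the idea might look something like the sketch below: find past days that resemble tomorrow and average what sold. The record layout, the data, and the `forecast_specials` function are all invented for this example.

```python
from statistics import mean

# Hypothetical history: (weekday, students_in_town, rainy, specials_sold)
history = [
    ("Fri", True, False, 42),
    ("Fri", True, True, 30),
    ("Fri", False, False, 25),
    ("Fri", True, False, 38),
    ("Sat", True, False, 50),
    ("Sat", False, True, 18),
]


def forecast_specials(history, weekday, students_in_town, rainy):
    """Average past sales on the most similar days, relaxing the match if needed."""
    # Try the strictest match first, then drop conditions one at a time.
    matchers = [
        lambda r: r[0] == weekday and r[1] == students_in_town and r[2] == rainy,
        lambda r: r[0] == weekday and r[1] == students_in_town,
        lambda r: r[0] == weekday,
        lambda r: True,
    ]
    for matches in matchers:
        similar = [r[3] for r in history if matches(r)]
        if similar:
            return mean(similar)
    raise ValueError("no history to forecast from")


if __name__ == "__main__":
    # A dry Friday with the students in town.
    print(forecast_specials(history, "Fri", True, False))
```

The order in which conditions are dropped is itself a judgment about what matters most, which is exactly the kind of decision an experienced owner makes without thinking about it.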

So predictive analytics as a practice means getting and then looking at the available data from the past and forecasts for the future, deciding what is important and how much so, and then producing a result that will help you succeed in what you are trying to accomplish.

You don’t just throw a lot of random data into some software and expect a perfect answer to pop out. While certain classes of problems are repeatable for many clients (e.g., restaurants), others will need subject matter experts to filter out the noise, decide what data to use and how much weight to give it, and then choose what mathematical techniques to invoke to get an answer.

It will probably require several iterations to get the model accurate. One test is to use historical data to see how well the model would have predicted what really happened. If the model did poorly on the past, there is no reason to think it will do better on the future.
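
That historical test is often called backtesting. Below is a minimal sketch of it, assuming a simple list of past sales figures and two deliberately naive prediction rules. The data and the function names are mine, chosen only to show the mechanics.

```python
from statistics import mean


def backtest(series, predict, holdout):
    """Mean absolute error of `predict` over the last `holdout` points.

    Each prediction sees only the data that came before the point it predicts.
    """
    if not 0 < holdout < len(series):
        raise ValueError("holdout must leave some history to learn from")
    errors = []
    for i in range(len(series) - holdout, len(series)):
        past, actual = series[:i], series[i]
        errors.append(abs(predict(past) - actual))
    return mean(errors)


def last_value(past):
    """Naive rule: tomorrow will look like today."""
    return past[-1]


def moving_average(past, window=3):
    """Naive rule: tomorrow will look like the recent average."""
    return mean(past[-window:])


if __name__ == "__main__":
    weekly_sales = [40, 42, 38, 45, 41, 39, 44, 43]  # made-up numbers
    for rule in (last_value, moving_average):
        error = backtest(weekly_sales, rule, holdout=3)
        print(f"{rule.__name__}: average miss of {error:.2f}")
```

The rule with the smaller average miss on the held-out data is the better candidate, though a good score on the past is necessary rather than sufficient.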


One Response to Predictive analytics, the basic idea

  1. Bill Plonk says:

    What is frustrating and sad is how seldom these techniques are used in clinical medicine. It’s likely that we could develop a reasonably robust predictive model, say, of who is likely to die in the next six months based on hospital admission data, but no one is willing to do it. First, because we all expect to be the exception rather than the rule, and such information might lead to reducing the aggressiveness of care and thus the profit of the institutions and physicians who provide it. Ignorance is bliss, and quite lucrative.
