Model Overfitting

In machine learning and predictive analytics, we typically want to find the simplest model that fits our training data just right (neither underfit nor overfit), so that we can expect the model to generalize well enough to predict future outcomes. The easiest way to illustrate this is with a regression chart. Assume we are using input X to predict outcome Y.
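As a rough illustration of the same idea in code, the sketch below fits polynomials of increasing degree to a small noisy sample. The "true" relationship (a sine curve), the noise level, the sample sizes, and the degrees 1, 3, and 12 are all assumptions picked for illustration, not something taken from the chart.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of an assumed 'true' relationship Y = sin(2*pi*X)."""
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

# A small training set makes overfitting easy to provoke.
x_train, y_train = make_data(15)
x_test, y_test = make_data(200)

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

# degree 1: likely underfit, degree 3: reasonable, degree 12: likely overfit
results = {}
for degree in (1, 3, 12):
    model = Polynomial.fit(x_train, y_train, degree, domain=[0.0, 1.0])
    results[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))

for degree, (train_err, test_err) in results.items():
    print(f"degree={degree:2d}  train MSE={train_err:.4f}  test MSE={test_err:.4f}")
```

Training error can only go down as the degree grows, because each higher-degree polynomial contains the lower-degree ones as special cases. The error on the held-out points is the number that tells us whether the extra flexibility helped or only memorized noise.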

With today’s ML techniques, we often introduce more sophisticated models and try to make them converge. But as we increase the complexity of the model to capture more signal, the model starts to pick up random effects as if they were useful patterns. In general, overfitting can be caused by any of the following: small and/or noisy training data, a model with too many predictors, or a very rich hypothesis space. Adding regularization terms to the model and/or finding a good place to stop the training therefore becomes crucial to prevent overfitting. See the vertical red dashed line in the chart below:
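Both remedies mentioned above can be sketched in a few lines. This is a minimal example under stated assumptions: ridge (L2) regularization stands in for "regularization terms", a simple patience rule stands in for "a good place to stop", and the function names ridge_fit and early_stopping_epoch, the synthetic data, and the validation losses are all made up for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam * I)^-1 X^T y.
    For simplicity the intercept column is penalized too."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch with the best validation loss, giving up once
    `patience` epochs have passed without an improvement."""
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break
    return best_epoch

# Hypothetical data: 15 noisy points, degree-9 polynomial features.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 15)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, 15)
X = np.vander(x, 10, increasing=True)

# A larger penalty shrinks the weights, which limits how wildly the curve can bend.
weight_norms = [float(np.linalg.norm(ridge_fit(X, y, lam))) for lam in (1e-6, 1e-2, 1.0)]

# Made-up validation losses: improving, then getting worse for two epochs.
stop_at = early_stopping_epoch([1.0, 0.6, 0.4, 0.45, 0.5, 0.3], patience=2)
print(weight_norms, stop_at)
```

The patience rule stops at the first sustained rise in validation loss, so it never sees the later value of 0.3 in the made-up list. That is the usual trade-off of stopping early: a larger patience costs more training time but is less likely to stop on a temporary bump.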

In the context of a predictive model (e.g., a recommender), we can observe that the model not only overfits the training data, it actually overfits the past training data! That is, it can lock onto patterns that held in the past but may not hold in the future.

This brings up a unique challenge in building a model for prediction. Should we use a data point or method that fits the current data well but does not predict the future? How do we evaluate the “predicting power” during model building?
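One common way to approach the second question, offered here only as a sketch and not necessarily the approach this series has in mind, is to hold out the most recent data instead of a random sample, so the model is always scored on events that happened after everything it was trained on. The temporal_split helper and the (timestamp, payload) record layout below are hypothetical.

```python
def temporal_split(records, cutoff):
    """Split (timestamp, payload) records into a 'past' set for training
    and a 'future' set for evaluation, based on a cutoff timestamp."""
    ordered = sorted(records, key=lambda r: r[0])
    past = [r for r in ordered if r[0] < cutoff]
    future = [r for r in ordered if r[0] >= cutoff]
    return past, future

# Toy events with integer timestamps, deliberately out of order.
events = [(5, "e"), (1, "a"), (4, "d"), (2, "b"), (3, "c")]
past, future = temporal_split(events, cutoff=4)
print(past)    # records strictly before the cutoff
print(future)  # records at or after the cutoff
```

With a random split, future events can leak into the training set, and the evaluation then rewards exactly the kind of fit to past data described above.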

To be continued…
