Welcome to NHL-ML-Draft

Blog -- Where Am I?: An Introduction to ML Draft
2019-05-12 00:00:00 -- meta, update

Happy Mother's Day!

The goal of this post is to introduce visitors to the project, and hopefully answer a few questions people may have.

The high-level goal of the project is: I want to use data about a prospective professional hockey player that's available before the NHL draft (for example, statistics from the prospect's season prior to the draft) to predict the outcome of that prospect's professional career. Since, at the present moment, I don't think data tells the whole story, I'd probably walk that back a bit and say: I want to provide the most insight available given the data.

To drill down, here's the basic approach this site uses to predict the pro career of a prospect heading into the draft:

- Gather pre-draft statistics for a bunch of draft picks from previous seasons.
- Gather NHL statistics for those players. For players who are still active, this requires some projection or extrapolation to estimate their career statistics from their to-date data.
- For each prospect, find players from previous draft years who have similar pre-draft data, and use these players to predict the prospect's pro career. This is where the machine learning happens.
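To give a feel for the "find similar players" step, here is a minimal sketch of one common way to do it: a k-nearest-neighbors average. This is an illustration only, not necessarily the model this site uses. The features (pre-draft points per game, age at the draft), the outcome (career NHL points per game), the function names, and every number in the data are made up for the example.

```python
import math
from statistics import mean, pstdev

# Hypothetical past draftees: ((pre-draft points per game, age at draft),
# career NHL points per game). All values are invented for illustration.
HISTORICAL = [
    ((1.60, 17.9), 0.85),
    ((1.55, 18.1), 0.75),
    ((0.70, 18.6), 0.10),
    ((0.65, 18.4), 0.05),
    ((1.10, 18.0), 0.40),
]


def zscore_params(rows):
    """Per-feature (mean, standard deviation) across the historical players."""
    columns = list(zip(*rows))
    # Fall back to 1.0 so a constant feature does not cause division by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def scale(row, params):
    """Put every feature on a comparable scale before measuring distance."""
    return [(x - mu) / sd for x, (mu, sd) in zip(row, params)]


def predict_career(prospect, historical, k=3):
    """Average the career outcome of the k most similar past draftees."""
    params = zscore_params([features for features, _ in historical])
    target = scale(prospect, params)
    ranked = sorted(
        historical,
        key=lambda item: math.dist(scale(item[0], params), target),
    )
    neighbors = ranked[:k]
    return mean(outcome for _, outcome in neighbors), neighbors


# A hypothetical prospect: 1.58 points per game, 18.0 years old at the draft.
prediction, neighbors = predict_career((1.58, 18.0), HISTORICAL, k=2)
print(round(prediction, 2), neighbors)
```

The scaling step matters because points per game and age live on very different ranges; without it, one feature would dominate the notion of "similar". A real version would also need the projected career numbers for active players mentioned above, which this sketch simply takes as given.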

For the time being, I'm focusing on forwards, because I think data (or at least the data I have access to) doesn't tell enough of the story for defensemen. Even for forwards, this project is still experimental, and the results shouldn't quite be regarded as predictions in their own right.

Prior to the 2019 predictions for forwards, I have been posting 'ex-post-facto' predictions for players who were drafted in previous years. Since we already have at least some sense of how those picks' pro careers have turned out, this lets us take a look at how well the model is working. When the 2019 predictions are released, I will continue to post these 'ex-post-facto' predictions.

For example, here are some 'ex-post-facto' predictions for forwards drafted in 2013. Note that we were unable to predict an outcome for every draft choice, due to lack of data or other technical reasons. But for those choices for which we had adequate data, we 'missed' only about 30% of the time. Of course, there are a lot of caveats here.
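To make the 'missed' figure concrete, here is a small sketch of how a miss rate like that could be computed. It assumes, purely for illustration, that each player gets a predicted outcome label and an actual outcome label, that a 'miss' means the two differ, and that players without a prediction are left out of the denominator. The labels and data below are invented and are not the 2013 results.

```python
def miss_rate(predicted, actual):
    """Share of scored players whose predicted label differs from the actual one.

    Players with no prediction (None) are excluded, mirroring the
    "adequate data" restriction described above.
    """
    scored = [(p, a) for p, a in zip(predicted, actual) if p is not None]
    if not scored:
        raise ValueError("no players with a prediction to score")
    misses = sum(1 for p, a in scored if p != a)
    return misses / len(scored)


# Hypothetical labels; None marks a player we could not predict.
predicted = ["regular", None, "bust", "regular", "bust"]
actual = ["regular", "bust", "bust", "bust", "bust"]

rate = miss_rate(predicted, actual)
print(f"missed {rate:.0%} of scored players")
```

Note that a number like this depends heavily on how the outcome labels are defined and on which players get excluded, which is one of the caveats.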

I suppose my hope is that this project ignites an interest in using analytics in drafting, and provides a fun source of speculation for fans. I'll keep trying as long as people are interested.

Feel free to reach out on Twitter for any specifics or details.