2022 NHL Draft Estimates (Model version 2022.1.0)
05.28.2022 : release, 2022
New data is posted for prospects that are eligible for the 2022 NHL Draft:

ML Model Results

xG Model Results
 Isolated impacts on shot odds for OHL draft-eligible players
 Isolated impacts on shot odds for QMJHL draft-eligible players
The "ML model" noted above is the one that's been the primary feature of this site for the past two or three years. The "xG (expected goals) model" is a new model; more on this later.
ML Model Details
As a reminder, cross-validation data is generated by essentially testing the model against the players used to train the model. Since we already know something about the outcome for these players, this allows us to judge the model's performance. For example, we can see what the model would have thought about drafting Brayden Point, or Matt Barzal (oops).
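To make the idea concrete, here's a minimal Python sketch of out-of-fold scoring, assuming a plain k-fold split: every player gets scored by a model that was fit without that player's fold. The toy bucket "model", the `rows` data, and all function names are hypothetical stand-ins, not the site's actual model or data.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_rate_model(rows):
    """Toy 'model' (hypothetical): historical hit rate per scoring bucket."""
    counts = {}
    for bucket, label in rows:
        hits, total = counts.get(bucket, (0, 0))
        counts[bucket] = (hits + label, total + 1)
    overall = sum(label for _, label in rows) / len(rows)
    rates = {b: hits / total for b, (hits, total) in counts.items()}
    # Fall back to the overall rate for buckets unseen in training.
    return lambda bucket: rates.get(bucket, overall)

def out_of_fold_predictions(rows, k=5, seed=0):
    """Score each row with a model trained only on the other folds."""
    preds = [None] * len(rows)
    for fold in k_fold_indices(len(rows), k, seed):
        held_out = set(fold)
        train = [r for i, r in enumerate(rows) if i not in held_out]
        model = fit_rate_model(train)
        for i in fold:
            preds[i] = model(rows[i][0])
    return preds

# Made-up (bucket, became_top_6) rows, purely for illustration.
rows = [("high", 1), ("high", 0), ("high", 1), ("low", 0), ("low", 0),
        ("low", 0), ("mid", 0), ("mid", 1), ("mid", 0), ("low", 0)]
preds = out_of_fold_predictions(rows, k=5)
```

Because the known outcome sits next to each held-out estimate, the two can be compared directly, which is what makes the "what would the model have thought?" question answerable.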
For this year, I've restricted the estimates to the CHL leagues (OHL, WHL, QMJHL). In years past, I've tried to consider other leagues, but I think CHL-only is the sweet spot for the time being. Considering more leagues has the obvious benefit of leading to estimates for more draft-eligibles, and the other obvious benefit of increasing the size of the data set. However, different leagues can be difficult to compare against each other. Scale factors capture some, but, in my opinion, not all of the inhomogeneity. Additionally, some leagues don't collect as much data as others, so when considering a many-leagues dataset, you're often limited to the league that provides the least data. A goal of this project has always been to favor "accurate something" over "inaccurate everything", and considering the three CHL leagues gets us closest to that goal.
The two metrics I use to evaluate ML model performance are the ROC curve and the PR curve. Here's the ROC curve for the ML model for forwards:
You could use your google-fu to determine if this is a good ROC curve or not, since it's a subjective matter, but from my POV:
 It's an improvement from years past.
 It's somewhat inflated by the fact that most draft choices do not end up as top-6 forwards in the NHL. It's analogous to the way you could have a "model" that classified every draft pick as a miss. This "model" would be about 90% accurate.
The area under the ROC curve (AUC score) can loosely be interpreted as the probability that the ML model will rank a randomly chosen positive instance higher than a randomly chosen negative one, so being able to claim that the model can do this 85% of the time sounds pretty good on the surface, but see above.
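Both points above can be checked on a toy example. This is a minimal Python sketch with made-up scores and labels (nothing here comes from the actual model): it computes AUC directly as the fraction of positive/negative pairs ranked correctly, and the accuracy of an "always miss" classifier.

```python
def pairwise_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs where the positive
    gets the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(1 for p, y in zip(preds, labels) if p == y) / len(labels)

# Made-up data: 1 hit among 10 picks, mirroring a ~10% base rate.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.40, 0.55, 0.30, 0.20, 0.10, 0.10, 0.05, 0.05, 0.02, 0.01]

auc = pairwise_auc(scores, labels)           # the hit outranks 8 of 9 misses
always_miss = [0] * len(labels)
baseline_acc = accuracy(always_miss, labels)  # 9 of 10 correct
```

Here the "always miss" classifier is 90% accurate without knowing anything, which is why headline numbers need to be read against the base rate.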
Here's the PR curve:
It's about a 20% improvement from last year, although other things have changed in that time.
 Loosely speaking, this curve concerns the model's precision, i.e., how often a player becomes a top-6 forward in the NHL given that we estimate they will be.
 It follows from the above that this will lead to a less rosy view of our model. This is somewhat intuitive: we know from experience that most players we think or hope will become top-6 forwards in the NHL ultimately don't make it.
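As a rough illustration of the precision idea in the bullets above, here's a small Python sketch that computes precision and recall at a chosen probability threshold. The scores, labels, and thresholds are made up; a PR curve is just these two numbers traced out over all thresholds.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when every score >= threshold is called a hit.
    Precision is None if nothing clears the threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall

# Made-up estimates and outcomes (1 = became a top-6 forward).
scores = [0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0]

loose = precision_recall_at(scores, labels, 0.35)   # flags 3 players, 2 are hits
strict = precision_recall_at(scores, labels, 0.55)  # flags 1 player, who is a hit
```

Raising the threshold improves precision here but misses one of the two real hits, which is the trade-off the curve shows.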
All things considered, these are decent results compared to where we've been in years past.
The results for the defenseman model are, as usual, not quite as good:
ROC curve:
PR curve:
These metrics for the defenseman model are about the same as last year. The figures are actually slightly worse, but this year's model is likely to be much more stable; last year's was a little "lucky" in some ways.
The defenseman model "lightly" considers defensive aptitude for labeling, which turns out to make things rather difficult. We could have gotten better metrics by focusing on offense only (as we do for forwards), but, in my opinion, this would be somewhat unintuitive for readers.
All in all, the situation mirrors reality (or at least pundit hubbub): drafting / estimating success probabilities for defensemen is tricky!
Expected Goals Model
This year, I also started working on a separate model that's unrelated to the ML model. This model is based on hockeyviz's expected goals (xG) model, but it's scaled down in some ways, and it's also more approximate since many of the quantities used as input to the model are less available for CHL data. Shot data is not available at all (to me) for the WHL, so xG models can be created only for OHL and QMJHL seasons.
The gist of this model is that it seeks to compute goal probabilities for shots, and then attribute different portions of those probabilities to different circumstances of the shot (e.g., who was on the ice for, who was on the ice against, shot location).
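One common way to get this kind of attribution is an additive log-odds (logistic) form, where each circumstance adds to or subtracts from the log-odds of a goal. The Python sketch below assumes that form purely for illustration; the baseline, the term names, and every coefficient are invented, and it is not a description of how hockeyviz's model or this site's model is actually built.

```python
import math

def sigmoid(z):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def goal_probability(baseline, contributions):
    """Goal probability when each circumstance adds to the log-odds."""
    return sigmoid(baseline + sum(contributions.values()))

def probability_without(baseline, contributions, name):
    """Same shot, with one circumstance's contribution removed."""
    rest = {k: v for k, v in contributions.items() if k != name}
    return goal_probability(baseline, rest)

# Invented numbers: log-odds terms for a single hypothetical shot.
baseline = -2.5
contributions = {
    "shot_location": 0.80,         # dangerous area
    "on_ice_for": 0.15,            # attacking skater helps
    "on_ice_against": -0.10,       # defending skater suppresses
}

p_full = goal_probability(baseline, contributions)
p_no_defender = probability_without(baseline, contributions, "on_ice_against")

# Each term acts as a multiplier on the odds of a goal.
odds_multipliers = {k: math.exp(v) for k, v in contributions.items()}
```

Under this assumed form, a multiplier above 1 raises the odds of a goal and one below 1 lowers them, which is one way to read an "isolated impact on odds".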
A lot can be done with the xG model and output data, but for now, we simply plot players' isolated impact on shot odds for vs. isolated impact on shot odds against. Since this output doesn't say a ton about shot rates or xG rates, the next step would be to compute these rates. Probably a task for next season.
It's also worth mentioning that, at best, the xG models will accurately evaluate players' performance as prospects; they don't directly predict NHL performance.
It would be possible to one day use xG model output as features/input in the ML model, but that would probably require WHL shot data, and several years of it.
Players of Interest
First of all, why "players of interest"? Why am I not ranking the top n choices? Simply put, all the ML model looks to answer is whether or not a player will be a top-6 forward or top-pairing (or so) defenseman. The underlying reason for this is that we need a big enough sample of something for the model to work, and we don't have a whole pile of generational players. Furthermore, a player having a high probability of becoming a top-6 forward or a top-pairing defenseman does not indicate they'll be a superstar, and a lower probability does not indicate they're (traditional) 3rd-liner material. Taking all that into consideration, the ML model in some ways paints with a broad brush, and one way to use its output is to find players who are way out of position.
The xG model is still a work in progress, and given a) the approximations used in the model, and b) the simple understanding that performance in junior doesn't guarantee performance in the NHL, we're in much the same position: results should be taken with a grain of salt, but unheralded prospects who look very good in the model may be worth a second look.
Fine-grained ranking, especially for those near the top, is still a task for real-life scouting and player evaluation, at least with regard to this project.
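The "way out of position" idea can be sketched as a simple gap: the model's probability minus the typical probability for players taken around the same slot. The Python below is only an illustration; the baseline table, the prospects, and the nearest-slot lookup are all made up and are not the site's data or method.

```python
def value_gaps(prospects, slot_baseline):
    """Return (name, model_prob - typical_prob_at_slot), largest gap first.

    prospects: list of (name, model_probability, expected_pick)
    slot_baseline: {pick_number: typical probability near that pick}
    """
    gaps = []
    for name, model_prob, expected_pick in prospects:
        # Use the tabulated slot closest to the expected pick.
        nearest = min(slot_baseline, key=lambda slot: abs(slot - expected_pick))
        gaps.append((name, model_prob - slot_baseline[nearest]))
    return sorted(gaps, key=lambda g: g[1], reverse=True)

# Made-up baseline probabilities by draft slot.
slot_baseline = {10: 0.45, 30: 0.20, 60: 0.08, 90: 0.05, 150: 0.02}

# Made-up prospects: (name, model probability, expected pick).
prospects = [
    ("Prospect A", 0.36, 84),
    ("Prospect B", 0.15, 25),
    ("Prospect C", 0.05, 150),
]

ranked = value_gaps(prospects, slot_baseline)
```

A large positive gap flags a player worth a second look; it says nothing about how good the player is in absolute terms.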
Ok, let's look at a few players:
Marcus Nguyen - With a top-6 probability estimate of 36%, he's extremely good value at where he's expected to be drafted (I'm not sure where that will be, really).
"Marcus Nguyen (https://t.co/okeJBjOBnn) is an interesting #2022NHLDraft-eligible. Some positives statistically, time to develop with a late bday. I don't think any of the major services had him ranked before @SmahtScouting put him at 84 a few days ago."
ML Draft (@ml_draft), May 26, 2022
John Babcock - Rated rather highly by the ML model, Babcock is a LHD who put up 23 pts in 57 games with the Kelowna Rockets this year. Looking at the stat ranks used in the ML model, Babcock appears to be a well-rounded, jack-of-all-trades type player.
HMs and Model Favorites:
Dean Loukus - An OHL overager who, according to the xG model, contributes significantly both offensively and defensively. There's no strict way to rank the xG model output we use since it's 2D, but it appears he's been one of the better 19-year-olds in the OHL this year. The fact that he has a positive +/- in a sea full of negative ones is also interesting. This was also his first season in the OHL despite being an overager.
Nolan Collins - Unranked by CSS at midterm, but picked up a NA CSS ranking at season's end (NA 153). Played in the U18 World Championship. Not much buzz at all about Collins, but scores very well (probably a bit too well) in the xG model. Does not score well in the ML model.
Niks Feneko - The QMJHL version of Collins in some ways. Looks very good in the xG model, but not highly thought of by the rankings (124 NA CSS ranking). With an August birthday, he'll still be 17 on draft day.
Cedric Guidon and Kirill Kudryavtsev - I tried to look for players who looked decent in both the ML model and the xG model, and landed on these two. According to the xG model output, these players both appear to contribute significant offense without sacrificing too much defense. Neither is a guaranteed top-6 or top-pair player, but both appear to have a significantly better probability than expected for their respective expected draft positions.
I'll cut it off there for now. There are quite a few prospects that are easy to be positive about.
Quick Retrospective, Who Did I Like In 2021?
It was tough to do much useful work last year due to pandemic-shortened or even pandemic-eliminated seasons, but I still came up with a couple of value picks. Here are a couple of players I thought were interesting:
 Riley Kidney - Kidney has an aggregate scout rank around 70, and while it's a little dubious to compare him to players drafted around 70th overall, we'll do it anyway. Looking at the cross-validation data, you'll see it's not common to find a player with a 15% chance to become a top-6 forward around 70th overall (though Jordan Weal is an exception). Kidney ranks high in most P60 stat ranks, and has Ryan Spooner among his comparables. The model doesn't consider playoff stats, but putting up 17 points in 9 playoff games seems notable.
2022 update: Kidney ended up being taken by the Canadiens around the 2/3 round turn (63rd overall). He finished 7th in QMJHL scoring this year, and looks to be playing in the AHL with Laval next season:
"The Canadiens have agreed to terms on a three-year, entry-level contract (2022-23 to 2024-25) with forward Riley Kidney. #GoHabsGo https://t.co/W9pdoEqGrV"
Canadiens Montréal (@CanadiensMTL), May 4, 2022
 Olivier Nadeau - Although a 12% chance of becoming a top-6 forward isn't very high, like Kidney, it's still significantly more (on average) than a typical 90th overall selection. Probability-wise, a comparable for Nadeau is Linden Vey, who was drafted 96th overall in 2009. Additionally, Nadeau ranks 1st (among players graded) in almost every stat related to assists. Seems like he may be worth more than a 4th-round pick.
2022 update: Nadeau was taken at the top of the 4th round in the 2021 draft (97th overall) by Buffalo. He led Shawinigan in scoring this season (78 pts in 65 games). Nadeau earned his entry-level contract this year.
What Else?
I'm never too sure what details are of interest to readers, or what questions might be out there, so if there's something specific you're interested in, please reach out on Twitter.