Disclaimer, Data Sources, Content Licensing
09.30.2019 : meta, data
Updated on February 7th, 2020
As of the 2020.0.0 release, all data used to generate models and the subsequent predictions is provided by the venerable Elite Prospects.
We reserve copyright on predictions and other content derived from this data but not, of course, on any data provided by Elite Prospects.
The information provided on this website does not, and is not intended to, constitute advice and should not be acted on as such. Instead, all information, content, and materials available on this site are for general informational or experimental purposes only.
Information on this website may not be the most up-to-date information. This website contains links to third-party websites. Such links are only for the convenience of the reader, user or browser; the NHL ML Draft does not recommend or endorse the contents of the third-party sites.
The remainder of the original post content is no longer relevant to nhlmldraft.net, but remains below for historical purposes and because I think the discussion is still of interest. I removed the creative content licensing statement and mark to avoid confusion.
**Sources of Data:**
Whether or not data is copyrightable is a bit of a gray area. The short version is that data itself is not copyrightable, but its presentation, the act of compilation, etc. probably are.
Many hockey data sites list their policy on usage of their content. Here are a few examples.
* Hockey Reference: Use of Data
* Hockeydb: Usage
These days, we also have popular, pre-fabricated licenses written by experts to govern use of content.
For example, QuantHockey uses the Creative Commons Attribution + NonCommercial license, which, as the name suggests, means you can use the data, must name QuantHockey as the source, and should not make any money by presenting data from this source. (The use of ads on QuantHockey then seems questionable, but I'm not sure enough of the details to say whether that's permitted by the license.)
Everything on Wikipedia is licensed under the Creative Commons Attribution - ShareAlike license. This is similar to the above, but without the commercial usage restriction, and with the added requirement that derivative works be shared under the same license. This site regards Wikipedia as a viable source of data for this reason. However, to be pedantic, we'd have to make sure all that data arrived there from sources with licenses/policies that permitted that.
Because we use Wikipedia for all pre-2019 per-player data, we get some biases in the data. For example, data about a 1st rounder is far more likely to be available on Wikipedia than data about a 7th rounder who never made the NHL. More on these biases in a later post.
For new data (e.g., current draft year), I mostly hand-pick data from EliteProspects. EliteProspects doesn't appear to publish a usage policy, so I use a common-sense approach: don't take too much, and don't directly compete (I don't post prospect stats here, for example).
EliteProspects' league leaders data are also used.
**Here is a summary of the data sources used to generate ml draft's content:**
* Wikipedia: 100% of pre-2018 per-player data. If it's missing from Wikipedia, I don't have it.
* EliteProspects: Current draft-year prospect data and league leaders data.