I completely disagree with your point about context of statistics. If you provide stats and use them for an argument, the onus is on YOU to provide the CORRECT context and interpretation of those stats, including when they can and cannot be applied. The onus is NOT on other people to rectify your misuse of the statistics.
Example: Hecarim had a 48-49% win rate in plat+ soloq forever, but it turns out people were building him wrong/using the wrong rune. Before people figured out Phase Rush made him broken, Phreak’s ENTIRE argument could have been used to make it seem like Hecarim was not a good pick for pro play in the jungle. That would have been categorically false, and “yes, the soloq winrate for Hecarim doesn’t matter because people are playing/building/runing him incorrectly” would have been an absolutely sound argument. Yes, the person making that argument then needs to show that “this alternative playstyle, build path, rune choice is wholly superior”.
Saying that soloq data is useful because of its large sample size is absolute garbage for the reason above: if everyone is playing a champ incorrectly, then the soloq winrates will reflect that incorrect playstyle/build/rune, NOT the true strength of the champion when played/built/rune’d correctly.
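Rough sketch of what I mean, in Python. Every number here is made up for illustration (the build shares and the per-build win rates are NOT real data), and `aggregate_win_rate` is just a helper name I picked. The point is only that the headline win rate is a games-weighted average over whatever people happen to be running:

```python
def aggregate_win_rate(builds):
    """Games-weighted average win rate.

    builds: list of (share_of_games, win_rate) pairs.
    """
    total_share = sum(share for share, _ in builds)
    return sum(share * win_rate for share, win_rate in builds) / total_share


# Hypothetical split: 90% of games on a weak setup, 10% on a strong one.
hypothetical_builds = [
    (0.90, 0.48),  # popular but suboptimal build/rune (assumed numbers)
    (0.10, 0.56),  # rare but much stronger build/rune (assumed numbers)
]

overall = aggregate_win_rate(hypothetical_builds)
print(f"headline win rate: {overall:.1%}")
```

With those invented inputs the headline number lands just under 49%, even though the minority setup is winning 56% of its games. A bigger sample only makes that average more precise. It does not tell you which setup it is an average of.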
In the same way, saying soloq and pro win rates for champs don’t matter is an absolutely correct statement because they include SO many more variables than just the champion’s power within a pro game using a specific playstyle/build/rune.
The analogy would be if an economist or epidemiologist made an argument like “people of a certain race are inherently more unhealthy because they have higher rates of heart disease in the US”, which is complete horseshit. A competent statistician would look into the hundreds of factors that could influence this and figure out that “people of this race are X times more likely to live in urban areas and on average have much lower access to affordable healthcare and healthy foods, explaining why the incidence of heart disease amongst them is higher than other races”.
Every competent researcher would discredit the person who made the blanket statement using one or two population-wide overarching stats as incompetent and as having an agenda and not using sound research methodologies. (Source: I TA’d probability theory and stats for economics and financial engineering majors at Princeton).
You can use soloq data for gauging strength in pro play IF and ONLY IF:
1. The vast majority of people are using the optimal core build and rune
2. The pick itself doesn’t depend on coordination to be played (e.g. fasting Senna souls) or doesn’t depend on chaos (e.g. most low elo-stomping assassins/fighters)
3. The strength of the champion DOES NOT come from coordinated team strategies (e.g. using multiple ults combined to blow enemy summoners 50s before an obj when your ults will come online faster than the enemy ults/summoner spells).
Phreak has a degree in economics but his use of stats makes it hard to believe. (I don’t actually think his degree is make-believe, btw. I’m just shocked that an economics major has presented this as public-facing analysis he believes is correct).
He incorrectly computes R squared using a categorical variable (tier) that is ordered but has no meaningful numeric spacing (as opposed to numerical data such as MMR), yet somehow thinks he can claim that pro play exists “to the right” of these categories?? If he used average MMR by division (Gold 4, Plat 2, Masters), or 200 LP increments, we would probably see that Lillia’s winrate as MMR increases is parabolic, not linear. But for some reason, he still uses a linear regression R squared value to claim Lillia is consistently weak??
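To show why a linear R squared is the wrong tool for a curved relationship, here is a small Python sketch. The win rates are invented to be U-shaped (they are NOT Lillia’s real numbers), the MMR buckets are hypothetical, and `linear_r_squared` is my own helper, not anything from Phreak’s video:

```python
def linear_r_squared(xs, ys):
    """R squared of a straight-line least-squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # For a one-variable fit with an intercept, R^2 is the squared correlation.
    return (sxy * sxy) / (sxx * syy)


# Made-up U-shaped win rates over seven hypothetical MMR buckets (0 = lowest).
mmr_bucket = [0, 1, 2, 3, 4, 5, 6]
win_rate = [0.515, 0.490, 0.475, 0.470, 0.475, 0.490, 0.515]

r2_all = linear_r_squared(mmr_bucket, win_rate)
r2_upper = linear_r_squared(mmr_bucket[3:], win_rate[3:])

print(f"all buckets: R^2 = {r2_all:.3f}")
print(f"upper half:  R^2 = {r2_upper:.3f}")
```

On this toy data the straight-line fit over all buckets gives an R squared of essentially zero, even though win rate obviously depends on MMR, while the upper half alone gives an R squared above 0.9. A low linear R squared says “a straight line fits badly”. It does not say “skill doesn’t matter for this champion”.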
The same thing is probably true for the Rumble jungle data Phreak shows at the end of his video. But because he uses data that is aggregated and categorized poorly, and thus uses linear regression incorrectly, Phreak incorrectly believes that the data supports his assertion that Rumble jungle is bad. Both Rumble and Lillia see a huge upswing between Diamond and Masters. That just means that people below Masters probably are not good enough to pilot those champions in the jungle. How on earth do you think that the skill difference between Plat and Diamond is the same as Diamond to Masters+? How on earth are you grouping Masters+ as one category when the difference between Masters and high Challenger is the same as the difference between Silver and Diamond (using LP as numerical basis)??? This all makes no sense.
If a student presented this video as work in intro stats at Princeton, I’m not sure it would even get a passing mark. Riot internally has a wealth of data that we cannot access via the developer APIs (it would be sick to run regressions of jungle matchups across MMRs to compare item builds, for example), yet this video is somehow making money as the forefront of analysis in League of Legends using, at best, high school statistics about which a teacher would say “the methods are technically not used correctly, but good effort because your calculations themselves are correct”.
I have no idea if Rumble is actually a good jungler or not. Phreak could be correct about him being better in a solo lane than in the jungle. But Phreak’s whole video can be boiled down to 1) there are strong soloq strategies that could be picked in competitive but are not yet because pro players don’t know everything, and 2) people bad at Rumble shouldn’t play Rumble. These are statements any League player could tell you.
The only point in this video that has merit is that there are many different builds and strategies that are found in soloq, and it is the responsibility of pro players, analysts, and coaches to research and test these builds and strategies (referring to his example of Rylai’s Seraphine). This is definitely true, as many OP and META-defining builds in the past have actually come not just from soloq, but from low-elo soloq (Blue Ezreal, AP Tryn, AP Yi, Shaco support, etc.).
Sorry for typos and poor formatting; I’m on mobile. But this video had me fuming.