Looking For Some Math/Excel Gods

Randomonium·12/23/2016, 10:36:26 PM·2 votes·666 views

So I've tracked the stats of every game I've played since January 1, 2015 in an Excel spreadsheet. I've separated every game by role and by whether I won or lost. To date I've tracked 2,300 games with over 100,000 cells of data. Obviously I've taken the first step of creating X-Y scatter plots for each variable to see if there are any obvious patterns between wins and losses, and to track my improvement in various stats, but now I'd like to take it to the next level.

Basically, what I'm looking for are statistical formulas or Excel formulas/macros that can determine the importance of a given stat (how that stat affects your chances of winning). Any help would be greatly appreciated, and I'd be happy to share my data and my results with anyone who can give me some advice.

Once I feel like I have some concrete evidence to support my hypotheses, I will be sharing my findings with the rest of the community, so please don't hate me for not sharing everything right now. However, I'm happy to answer any questions you have on what I've seen in the data so far.

6 Comments

ElysMustache·12/23/2016, 11:38:05 PM·1 vote

I might be able to help.

Here is an example of my work with League of Legends in Excel (my Season 6 ranked games): https://docs.google.com/spreadsheets/d/1R2bROU-o2eEgn7u93dvK-82vlDrbEHRtoPJtlXW91Mg/edit?usp=sharing

Z Statistic·12/24/2016, 12:05:48 AM·1 vote

You have now learned the very difficult lesson of statistics: throwing money, time, and effort at the question doesn't make it any better.

If you simply ripped the records from the League folder on a single computer, well, nothing of value lost. If you play on more than one computer and had to rip them from multiple hard drives, then I am sorry for the time you wasted. If, however, you didn't know any of this and actually went through the difficulty of gathering this data by hand, then you have my deepest condolences.

Your data is compromised by your sampling methodology. The most you will be able to say is that X does not cause Y.

If you're looking to find evidence that farming increases your winrate, or that minimizing deaths increases your winrate, or something extremely specific such as getting dragon after 20 minutes being beneficial whereas getting it before is detrimental, you will be sorely disappointed. If anything, this is basically just a confirmatory procedure for what we already know. You have no basis for discovering something new because, as I have described, your sampling methodology is not suitable for establishing causal relationships.

Honestly, I don't even know if you can read ANYTHING general from this data. You are not representative of the community's skill, and considering that MMR is hidden, I'm not sure if you can control for your evolution in skill.

Heteroscedasticity? I can deal with this. Non-normality? Easy peasy. Non-independence? Difficult, but doable.

I don't know how large your sample size is either, but considering that you are looking at an enormous number of variables and potential correlations, you would need a very strict significance threshold for each variable in order to reasonably conclude that any significant finding isn't just random chance.
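To put rough numbers on the many-variables problem, here is a minimal sketch of one standard fix, the Bonferroni correction. The count of 40 stats and the 0.05 threshold are made-up assumptions for illustration, not figures from the spreadsheet, and the family-wise formula assumes independent tests, which game stats almost certainly are not.

```python
def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    return alpha / num_tests


def familywise_error_rate(alpha_per_test: float, num_tests: int) -> float:
    """Chance of at least one false positive, assuming independent tests."""
    return 1.0 - (1.0 - alpha_per_test) ** num_tests


# Hypothetical setup: 40 tracked stats, each tested against win/loss.
num_stats = 40
uncorrected = familywise_error_rate(0.05, num_stats)
corrected_alpha = bonferroni_threshold(0.05, num_stats)
corrected = familywise_error_rate(corrected_alpha, num_stats)

print(f"Chance of a spurious hit at 0.05 per test: {uncorrected:.2f}")
print(f"Bonferroni per-test threshold: {corrected_alpha:.5f}")
print(f"Chance of a spurious hit after correction: {corrected:.3f}")
```

Bonferroni is conservative, so with correlated stats it will tend to throw away real effects too, but it shows why each individual test has to clear a much stricter bar.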

Off the top of my head: Confirmatory Factor Analysis or Principal Component Analysis (both are available in Excel iirc). But that basically means writing off any possibility of X causing Y and instead assuming that P causes X, which is associated with Y, where P is a latent/confounding variable. The weakness is that it stops being "ok, getting more gold makes you win more" and becomes "ok, teamwork and individual mechanical skill seem to help you win more." It's more robust but it's more nebulous, and honestly I have no idea what the predictive usefulness of any findings would be.
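For a feel of what the latent-variable idea looks like in practice, here is a rough sketch of the first principal component computed by power iteration in plain Python. It is only the PCA half (not a confirmatory factor analysis), and the gold/CS/kills numbers are invented for illustration.

```python
import math
from statistics import mean, pstdev


def standardize(column):
    """Rescale a column to mean 0 and standard deviation 1."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def correlation_matrix(columns):
    """Correlation matrix of already-standardized columns."""
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in columns]
            for ci in columns]


def first_component(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(vi * sum(m * x for m, x in zip(row, v))
                     for vi, row in zip(v, matrix))
    return eigenvalue, v


# Hypothetical per-game stats (gold in thousands, creep score, kills).
gold = [9.8, 12.1, 8.4, 14.0, 11.2, 7.9, 13.3, 10.5]
cs = [150, 190, 130, 230, 175, 120, 210, 160]
kills = [3, 6, 2, 9, 5, 1, 8, 4]

columns = [standardize(c) for c in (gold, cs, kills)]
eigenvalue, loadings = first_component(correlation_matrix(columns))
share = eigenvalue / len(columns)  # total variance of standardized data = number of columns

print("Loadings (gold, cs, kills):", [round(x, 3) for x in loadings])
print(f"Share of variance on the first component: {share:.2f}")
```

When the stats all move together like this, nearly all the variance lands on a single component, which is the nebulous "P" described above: it says the stats share a common driver, not which of them makes you win.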

So, yeah, if you're looking to see if Hydra helps you win more with certain champions, you're out of luck, buddy. If you're looking at simply discounting some myths by finding a non-correlation, you'll have to do some (read: a shit ton of) controls and transformations, but it's doable. If you're looking for some overarching design of constructs, you can find one for yourself but not for others, and it would be nebulous.
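As a starting point for the correlation check, here is a minimal sketch: the Pearson correlation between one stat and a 0/1 win flag (the point-biserial correlation), plus a shuffle-based p-value. The deaths and results below are invented, and none of the controls mentioned above (role, champion, change in skill over time) are applied, so treat it as a first look only.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation; with a 0/1 outcome this is the point-biserial r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, shuffles=2000, seed=0):
    """Share of label shuffles whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    labels = list(ys)
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(labels)
        # Small tolerance so exact ties are not lost to float rounding.
        if abs(pearson_r(xs, labels)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (shuffles + 1)


# Hypothetical data: deaths per game and whether the game was won (1) or lost (0).
deaths = [2, 3, 1, 4, 2, 7, 8, 6, 9, 5]
won = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

r = pearson_r(deaths, won)
p = permutation_p_value(deaths, won)
print(f"point-biserial r = {r:.3f}, permutation p = {p:.4f}")
```

An r near zero with a large p-value is the "non-correlation" case; a strong r like the one in this toy data still says nothing about cause.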

Ultimately, if there's any usefulness garnered from this data, it really only pertains to you specifically.