Power Ranking using Neural Network

J.E. · Post by **J.E.** » Wed Aug 06, 2014 9:18 am

I started playing around with neural networks a bit, using the Python library Neurolab.

I was a little surprised how easy this was to set up: in this case, the basic setup is identical to what I used for power rankings with regression.

Code: Select all

import numpy as np

X = np.zeros((number_of_games, number_of_teams))
y = np.zeros((number_of_games, 1))
# Fill X with 1s and -1s (1: home team, -1: away team)
# Fill y with 1s and 0s (1: home team won, 0: home team lost)
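
A minimal sketch of the fill step described in the comments above, assuming the season is available as a list of (home index, away index, home-won flag) tuples. The `games` list and the `build_design` helper are hypothetical names for illustration, not part of the original post:

```python
import numpy as np

def build_design(games, number_of_teams):
    """Build X (+1 home, -1 away) and y (1 if the home team won) from game tuples."""
    number_of_games = len(games)
    X = np.zeros((number_of_games, number_of_teams))
    y = np.zeros((number_of_games, 1))
    for row, (home, away, home_won) in enumerate(games):
        X[row, home] = 1    # home team column
        X[row, away] = -1   # away team column
        y[row, 0] = 1 if home_won else 0
    return X, y

# Hypothetical toy schedule with three teams (indices 0..2)
games = [(0, 1, True), (1, 2, False), (2, 0, True)]
X, y = build_design(games, number_of_teams=3)
```

Every row of X then sums to zero, since each game has exactly one home team and one away team.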

Then set up your NN. We want 30 input variables, each ranging from -1 to 1, and one output.

Code: Select all

import neurolab as nl
net = nl.net.newff([[-1, 1]] * 30, [1])  # 30 inputs in [-1, 1], one output, no hidden layer

Then train it, using

Code: Select all

net.train(X, y, epochs=500, show=100, goal=0.02)

To forecast future games you do

Code: Select all

X = np.zeros( (1, 30) )
X[0, column_of_hometeam] = 1
X[0, column_of_awayteam] = -1
out = net.sim(X)

..and you're done

This specific example is using a Feed Forward Multilayer Perceptron with no middle layer(s). I don't really see how a middle layer would help here.
With no middle layer, training the network is really fast (~4 seconds on a decent PC)

Here's the "power ranking" given by the NN

Code: Select all

╔═══════════════════════════════╦══════════╗
║             Team              ║ Exp Win% ║
╠═══════════════════════════════╬══════════╣
║ San Antonio Spurs _2014       ║ 77.4     ║
║ Oklahoma City Thunder _2014   ║ 74.0     ║
║ Los Angeles Clippers _2014    ║ 71.2     ║
║ Houston Rockets _2014         ║ 68.2     ║
║ Portland Trail Blazers _2014  ║ 67.7     ║
║ Indiana Pacers _2014          ║ 65.9     ║
║ Miami Heat _2014              ║ 65.0     ║
║ Golden State Warriors _2014   ║ 63.7     ║
║ Memphis Grizzlies _2014       ║ 62.9     ║
║ Dallas Mavericks _2014        ║ 61.3     ║
║ Phoenix Suns _2014            ║ 60.1     ║
║ Chicago Bulls _2014           ║ 56.7     ║
║ Toronto Raptors _2014         ║ 56.4     ║
║ Brooklyn Nets _2014           ║ 51.5     ║
║ Washington Wizards _2014      ║ 51.1     ║
║ Minnesota Timberwolves _2014  ║ 50.6     ║
║ Charlotte Bobcats _2014       ║ 50.3     ║
║ Denver Nuggets _2014          ║ 46.2     ║
║ Atlanta Hawks _2014           ║ 44.8     ║
║ New Orleans Pelicans _2014    ║ 43.8     ║
║ New York Knicks _2014         ║ 43.1     ║
║ Cleveland Cavaliers _2014     ║ 38.0     ║
║ Sacramento Kings _2014        ║ 36.6     ║
║ Los Angeles Lakers _2014      ║ 35.2     ║
║ Detroit Pistons _2014         ║ 33.5     ║
║ Utah Jazz _2014               ║ 32.8     ║
║ Boston Celtics _2014          ║ 28.4     ║
║ Orlando Magic _2014           ║ 25.7     ║
║ Philadelphia 76ers _2014      ║ 20.9     ║
║ Milwaukee Bucks _2014         ║ 17.1     ║
╚═══════════════════════════════╩══════════╝

As far as I can see, the West teams got a bit of a bump over their actual regular season Win%, and the East teams got bumped down. So, it seems the NN was able to incorporate strength of schedule.

So far so good.
The problem I could see NNs having is that they have no built-in regression to the mean like Ridge Regression does. Maybe there's a way to design them so that they do, but I don't know. That's not too big of a problem here, but it probably will be if you try to use an NN for NBA matchup data where many players have played few possessions.

mystic · Post by **mystic** » Wed Aug 06, 2014 11:26 am

Sounds interesting. Is there any literature available online to get a better understanding of what "Neural Network" means? I have seen the term mentioned before, but never had the opportunity to learn much about it. I would appreciate some helpful links, ideally at a level someone with a master's degree in physics can comprehend.

Btw, for comparison, here is my adjusted Win%, which adjusts each team's actual win% for SOS. The method first translates the win% into a net rating using Pythagorean expectation with an exponent of 14.5, then adds the SOS term. The adjusted net rating is then translated back into win%, again using Pythagorean expectation.
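
The three steps can be sketched roughly as follows. The post does not say how win% is mapped to a net rating, so this sketch assumes a league-average scoring level (`LEAGUE_AVG`, a made-up value of 106 points per 100 possessions) with offense and defense placed symmetrically around it. The function names are hypothetical as well; only the exponent of 14.5 comes from the post.

```python
LEAGUE_AVG = 106.0  # ASSUMED points per 100 possessions; not stated in the post
EXPONENT = 14.5     # Pythagorean exponent quoted in the post

def win_pct_to_net_rating(win_pct, avg=LEAGUE_AVG, k=EXPONENT):
    """Invert Pythagorean expectation, assuming PF = avg + m/2 and PA = avg - m/2."""
    ratio = (win_pct / (1.0 - win_pct)) ** (1.0 / k)  # points for / points against
    return 2.0 * avg * (ratio - 1.0) / (ratio + 1.0)

def net_rating_to_win_pct(net_rating, avg=LEAGUE_AVG, k=EXPONENT):
    """Pythagorean expectation: PF^k / (PF^k + PA^k)."""
    points_for = avg + net_rating / 2.0
    points_against = avg - net_rating / 2.0
    return points_for ** k / (points_for ** k + points_against ** k)

def adjusted_win_pct(win_pct, sos):
    """sos: schedule strength in net-rating points (positive = tougher schedule)."""
    return net_rating_to_win_pct(win_pct_to_net_rating(win_pct) + sos)

# Hypothetical example: a .600 team that faced a schedule 0.5 points tougher than average
example = adjusted_win_pct(0.600, sos=0.5)
```

With an SOS term of zero the round trip returns the original win%, which is a quick sanity check on the two conversions.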

Code: Select all

Team                           Adj. Win%
San Antonio Spurs              0.774
Oklahoma City Thunder          0.739
Los Angeles Clippers           0.712
Portland Trail Blazers         0.680
Houston Rockets                0.678
Indiana Pacers                 0.666
Miami Heat                     0.658
Golden State Warriors          0.640
Memphis Grizzlies              0.634
Dallas Mavericks               0.625
Phoenix Suns                   0.602
Toronto Raptors                0.573
Chicago Bulls                  0.569
Brooklyn Nets                  0.508
Washington Wizards             0.508
Minnesota Timberwolves         0.505
Charlotte Bobcats              0.496
Denver Nuggets                 0.465
Atlanta Hawks                  0.445
New Orleans Pelicans           0.435
New York Knicks                0.431
Cleveland Cavaliers            0.371
Los Angeles Lakers             0.368
Sacramento Kings               0.359
Detroit Pistons                0.328
Utah Jazz                      0.314
Boston Celtics                 0.277
Orlando Magic                  0.251
Philadelphia 76ers             0.226
Milwaukee Bucks                0.159

I ran a quick correlation analysis on your numbers and came up with R²=0.998 ...
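
The post does not say how the R² was computed. One common reading is the squared Pearson correlation between the two columns, which could be sketched like this (only the first five teams of each table are typed in here, matched by team, so the result will not reproduce the full-table figure):

```python
import numpy as np

# Spurs, Thunder, Clippers, Rockets, Trail Blazers, taken from the two tables above
nn_win_pct = np.array([77.4, 74.0, 71.2, 68.2, 67.7]) / 100.0
adj_win_pct = np.array([0.774, 0.739, 0.712, 0.678, 0.680])

r = np.corrcoef(nn_win_pct, adj_win_pct)[0, 1]  # Pearson correlation coefficient
r_squared = r ** 2
print(round(r_squared, 3))
```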

J.E. · Post by **J.E.** » Wed Aug 06, 2014 12:41 pm

mystic wrote:Sounds interesting. Is there any literature available online to get a better understanding of what "Neural Network" means? I have seen the term mentioned before, but never had the opportunity to learn much about it. I would appreciate some helpful links, ideally at a level someone with a master's degree in physics can comprehend.

This looks like a good explanation, although I only just skimmed it

There are obviously many ways to create team rankings that are all very similar. The goal should be to find the method that's most accurate. I doubt NN can outperform RidgeRegression but it's fun to try

mystic · Post by **mystic** » Wed Aug 06, 2014 1:22 pm

Thanks for the link!

Well, at first glance, the NN approach here seems faster overall than my approach, given that I first calculate the SOS by running OLS on pace-adjusted scoring margins. Thus, I need some additional steps before I get the adj. Win%. And the comparison shows that the differences are marginal (Blazers/Rockets as well as Lakers/Kings seemed to be the only teams switched). So, at least for something like an Adj. Win%, the NN approach makes some sense in terms of effort. Nonetheless, I know that adj. Win% is a worse predictor than values based on regression; but the combined adj. win% and regression results (as mentioned before in the other thread) showed better predictive power than either alone. So, from my perspective, it looks like the best way to get "my power ranking" would be to linearly combine the NN results and the RR results in some fashion. That's why I find this so interesting, and I will try to learn as much as needed about neural networks to be able to incorporate them usefully into my stuff ... so, thanks again for sharing your idea/insight.

P.S. Just found out that my Matlab version has a Neural Network toolbox (not quite sure whether that is true for all Matlab versions) ... might make that even easier for me ...

xkonk · Post by **xkonk** » Wed Aug 06, 2014 1:37 pm

I had a neural networks class way back and don't remember much, but I know that some are mathematically equivalent to regression. It seems reasonable that there could be a NN that incorporates regression to the mean, even if it might do it differently than regularized regression.

v-zero · Post by **v-zero** » Wed Aug 06, 2014 5:36 pm

So, question time: what functions are your inputs and outputs using? I assume the input is linear and the output is some sort of sigmoid? If the output is linear, this is mathematically equivalent to a linear regression; if it is a sigmoid, it will be, to one extent or another, equivalent to logistic regression.
Only if you have a hidden layer of nonlinear functions will it be particularly different from regression of one form or another.
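
The linear case can be checked numerically. The sketch below is not Neurolab code: it stands in for a network with no hidden layer and a linear output by training a single linear neuron with plain batch gradient descent on squared error, using a made-up four-team schedule in which the lower-indexed team always wins. It then compares the fitted values with ordinary least squares on the same inputs.

```python
import numpy as np

# Toy schedule: 4 teams, every ordered (home, away) pair once; lower index always wins
teams = 4
games = [(h, a) for h in range(teams) for a in range(teams) if h != a]
X = np.zeros((len(games), teams))
y = np.zeros(len(games))
for row, (home, away) in enumerate(games):
    X[row, home], X[row, away] = 1, -1
    y[row] = 1.0 if home < away else 0.0

A = np.hstack([X, np.ones((len(games), 1))])  # append a constant bias input

# "Network": one linear neuron trained by batch gradient descent on squared error
w = np.zeros(A.shape[1])
for _ in range(2000):
    w -= 0.5 * A.T @ (A @ w - y) / len(y)

# Ordinary least squares on the same inputs
w_ols, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.allclose(A @ w, A @ w_ols))
```

The fitted values agree because both procedures minimize the same squared-error objective; the training algorithm changes how the minimum is reached, not where it is.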

On the note of regularization....there are numerous methods, possibly the easiest of which is to put in dummy measurements for each variable, as is done in ridge regression.

l_davies93 · Post by **l_davies93** » Wed Aug 06, 2014 5:43 pm

v-zero wrote:the easiest of which is to put in dummy measurements for each variable, as is done in ridge regression.

Are you sure that's done in ridge regressions?

v-zero · Post by **v-zero** » Wed Aug 06, 2014 6:57 pm

l_davies93 wrote:Are you sure that's done in ridge regressions?

Ignoring the bayesian interpretation, or rather giving a 'classical' interpretation, yes.

l_davies93 · Post by **l_davies93** » Wed Aug 06, 2014 7:09 pm

v-zero wrote:
l_davies93 wrote:Are you sure that's done in ridge regressions?
Ignoring the bayesian interpretation, or rather giving a 'classical' interpretation, yes.

I never saw this in my studies, but maybe I didn't pay enough attention in class haha. From this forum:

"Mathematically speaking that is incorrect. The ridge regression refers to Tikhonov regularization, which is just the induction of the lambda as a constrain (Hoerl and Kennard, 1970) based on Tikhonov's work on stabilization of inverse problems (1943). The dummy measurement, you are speaking of, is not per se part of the ridge regression, but a prior distribution of the coefficients based on Bayesian probability (which is the basis of linear bayesian regression)."

It seems that you may be referring to a Bayesian ridge (i.e. one with a prior which I see is very popular on this forum).

v-zero · Post by **v-zero** » Wed Aug 06, 2014 7:22 pm

Ridge regression is merely a special case of Bayesian linear regression in which all priors are zero. I am familiar with the thread you quoted, in which mystic was speaking semantically. In terms of the mathematical operations, a ridge regression is equivalent to a weighted linear regression with suitable dummy measurements.

l_davies93 · Post by **l_davies93** » Wed Aug 06, 2014 8:07 pm

v-zero wrote:Ridge regression is merely a special case of Bayesian linear regression in which all priors are zero. I am familiar with the thread you quoted, in which mystic was speaking semantically. In terms of the mathematical operations, a ridge regression is equivalent to a weighted linear regression with suitable dummy measurements.

Oh I see what you mean, it can be thought of as a prior distribution (with a mean of zero). That makes sense. My lecturer never looked at it from this viewpoint, but it's interesting.

J.E. · Post by **J.E.** » Wed Aug 06, 2014 9:13 pm

v-zero wrote:So, question time: what functions are your inputs and outputs using? I assume the input is linear and the output is some sort of sigmoid? If the output is linear, this is mathematically equivalent to a linear regression; if it is a sigmoid, it will be, to one extent or another, equivalent to logistic regression.

Input is either -1 or 1. Outputs by the net appear, at first glance, to be linear.

On the note of regularization....there are numerous methods, possibly the easiest of which is to put in dummy measurements for each variable, as is done in ridge regression.

How exactly would you feed those measurements into the NN, though?

v-zero · Post by **v-zero** » Wed Aug 06, 2014 9:25 pm

If it is linear then it is equivalent to linear regression.

As for how to put in the dummy measurements - if it has a weight vector then just an x vector of zeros for each variable, with its column set to one, the corresponding y-value set to zero, and then a weight in the weight vector which acts as the value you vary in order to increase or decrease regularization. You then just feed those in along with the real measurements.
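
A minimal NumPy sketch of that equivalence, using random stand-in data rather than the NBA design matrix and leaving out any intercept. Scaling a dummy row by sqrt(lambda) has the same effect on the squared-error objective as giving an unscaled dummy row a weight of lambda, and for an integer lambda the same result comes from simply repeating each unweighted dummy row lambda times.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.choice([-1.0, 0.0, 1.0], size=(50, 5))  # stand-in design matrix
y = rng.random(50)                               # stand-in targets
lam = 10                                         # ridge penalty
p = X.shape[1]

# Closed-form ridge solution: (X'X + lam*I)^-1 X'y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Plain least squares on data augmented with one dummy row per variable:
# sqrt(lam) in that variable's column, zeros elsewhere, target of zero
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])
w_dummy, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

# For integer lam, lam unweighted copies of each dummy row do the same job
X_rep = np.vstack([X, np.tile(np.eye(p), (lam, 1))])
y_rep = np.concatenate([y, np.zeros(lam * p)])
w_rep, *_ = np.linalg.lstsq(X_rep, y_rep, rcond=None)

print(np.allclose(w_ridge, w_dummy), np.allclose(w_ridge, w_rep))
```

Because the augmented data is fed to an ordinary squared-error fit, the same rows can be passed to any model trained on squared error, which is what makes the trick usable with a linear network.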

J.E. · Post by **J.E.** » Thu Aug 07, 2014 11:44 am

Alright that was easy. Adding 10 rows of dummy data for each team (because Ridge Regression said lambda was 10) leads to

Code: Select all

╔═════════════════════════╦══════╗
║          Team           ║ Win% ║
╠═════════════════════════╬══════╣
║ San Antonio Spurs       ║ 75.2 ║
║ Oklahoma City Thunder   ║ 71.6 ║
║ Los Angeles Clippers    ║ 69.1 ║
║ Houston Rockets         ║ 66.5 ║
║ Portland Trail Blazers  ║ 66.3 ║
║ Indiana Pacers          ║ 64.9 ║
║ Miami Heat              ║ 64   ║
║ Golden State Warriors   ║ 63   ║
║ Memphis Grizzlies       ║ 62.1 ║
║ Dallas Mavericks        ║ 60.5 ║
║ Phoenix Suns            ║ 59.6 ║
║ Chicago Bulls           ║ 56.6 ║
║ Toronto Raptors         ║ 56.4 ║
║ Washington Wizards      ║ 51.8 ║
║ Brooklyn Nets           ║ 51.6 ║
║ Charlotte Bobcats       ║ 50.9 ║
║ Minnesota Timberwolves  ║ 50.6 ║
║ Denver Nuggets          ║ 46.6 ║
║ Atlanta Hawks           ║ 45.7 ║
║ New Orleans Pelicans    ║ 44.3 ║
║ New York Knicks         ║ 43.7 ║
║ Cleveland Cavaliers     ║ 38.9 ║
║ Sacramento Kings        ║ 37.5 ║
║ Los Angeles Lakers      ║ 35.8 ║
║ Detroit Pistons         ║ 34.5 ║
║ Utah Jazz               ║ 33.8 ║
║ Boston Celtics          ║ 29.9 ║
║ Orlando Magic           ║ 27.6 ║
║ Philadelphia 76ers      ║ 22.3 ║
║ Milwaukee Bucks         ║ 18.6 ║
╚═════════════════════════╩══════╝

Everyone's now a little closer to average

On one hand it's cool that we were able to re-create RidgeRegression with NNs, but on the other hand: we didn't really gain anything.

It's also not feasible to do this with player RAPM because it takes a much longer time than RR when the dataset is very large

Added a middle layer of 30 Neurons just for fun - didn't change much

Right now, I doubt NNs can give us better (prediction) performance than the tools we already have (for the problems we're currently trying to solve). I'd love to try them for something like SportVU ghost defenders, but I (obviously) don't have that data

v-zero · Post by **v-zero** » Thu Aug 07, 2014 1:09 pm

Is the hidden layer nonlinear? If not then it is pointless.
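
The reason a linear hidden layer adds nothing is that two stacked linear layers collapse into a single one. A short NumPy check with random stand-in weights (hypothetical values, not taken from the trained network):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.choice([-1.0, 0.0, 1.0], size=(8, 30))  # stand-in inputs for 30 teams

# Hypothetical weights: 30 inputs -> 30 linear hidden units -> 1 linear output
W1, b1 = rng.normal(size=(30, 30)), rng.normal(size=30)
W2, b2 = rng.normal(size=(30, 1)), rng.normal(size=1)

two_layer = (X @ W1 + b1) @ W2 + b2

# The same map written as a single linear layer
W, b = W1 @ W2, b1 @ W2 + b2
one_layer = X @ W + b

print(np.allclose(two_layer, one_layer))
```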

APBRmetrics
