the electoral vote outcome as of 12/20/16, image from Wikipedia
I'm not a fan of the hordes of "data scientists" running regressions and pretending they're doing "science." In fact, I'm skeptical that any field with "science" in its name is doing actual science, computer science included. But I want to defend some of the forecasts themselves and suggest a way of going forward.
(Also, while I wouldn't blame the pollsters, who have an increasingly hard job these days, one group I have no problem blaming is the political "scientists," who have all these theories about what candidates should and shouldn't do, and where advertising helps and where it doesn't. Trump did none of the things he was "supposed" to do and still won.)
Blame the forecasters?
I don't think there was an honest way to look at the polling, or really, most other publicly available data, and claim that Trump was actually more likely to win than Clinton. The truth is simply that unlikely events occasionally occur, and this was one of them.
While the forecasts all agreed that Clinton was the favorite, they assigned her different win probabilities. Sam Wang (whose forecast I repeatedly dismissed before the election) assigned something like a 99% chance to a Clinton victory. Nate Silver assigned something like a 2/3 chance to a Clinton victory. Does that mean that Nate is a better predictor?
Well, still not necessarily. Unless someone assigned a 100% probability to a Clinton win, we can't know for sure. Sam Wang could have been closer to the truth, but simply gotten unlucky. Moreover, people should be rewarded for predicting close to 0% or 100% because those predictions are much more informative. Nate Silver's prediction might have been well calibrated, but still quite useless.
Consider the following prediction. I can predict that for each of the next 10 elections, the candidates of the two major parties have roughly a 50-50 chance of winning. Since the Democrats and the Republicans each win roughly half the time, I'll probably be well calibrated, but my prediction will remain useless.
Count your log-loss
So, ought we throw our hands up in the air and trust everyone equally next time? No! Statistics and machine learning have ways of evaluating precisely these things. We can use something called a loss function (for reasons I won't go into here, I will use the "log-loss" function, but you can use others), where we assign penalties, or losses, for inaccurate predictions. Whoever accumulates the least loss over time can be thought of as the better predictor.
The binary version of the log-loss function works as follows:
L(y,p) = -(y log(p) + (1-y)log(1-p))
So let y=1 in the event where Trump wins and p be the probability assigned to that event. Using base-10 logarithms, someone assigning this event a probability of .01 will suffer a loss of -(1*log(.01) + (1-1)*log(1-.01)) = 2, whereas someone assigning it a probability of .33 will suffer a loss of approximately 0.5. Note that had Trump lost, the losses would have been approximately .004 and .2, respectively, rewarding the confident prediction.
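As a quick check of this arithmetic, here is a minimal Python sketch of the binary log-loss. It assumes base-10 logarithms, since that is what reproduces the numbers quoted here. The function name log_loss and the way it handles a 0% prediction are my own choices for illustration, not anything standard.

```python
import math


def log_loss(y, p):
    """Binary log-loss, L(y, p) = -(y*log(p) + (1-y)*log(1-p)).

    y is 1 if the event happened and 0 otherwise; p is the probability
    the forecaster assigned to the event. Base-10 logs are an assumption
    made to match the figures in the text.
    """
    # Probability the forecaster gave to what actually happened.
    q = p if y == 1 else 1.0 - p
    if q == 0.0:
        # Assigning 0% to something that then happens costs infinitely much.
        return math.inf
    return -math.log10(q)


# Trump wins (y = 1): the confident forecast is punished.
print(f"{log_loss(1, 0.01):.3f}")  # roughly 2
print(f"{log_loss(1, 0.33):.3f}")  # roughly 0.5

# Trump loses (y = 0): the confident forecast is rewarded.
print(f"{log_loss(0, 0.01):.3f}")  # roughly 0.004
print(f"{log_loss(0, 0.33):.3f}")  # roughly 0.2
```

Switching to natural logs would rescale every loss by the same constant factor, so it would not change who comes out ahead.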
So, according to this metric, Sam Wang gets penalized a lot more than Nate Silver for confidently predicting an outcome that didn't occur. If he keeps doing this over time, he will be discovered to be a bad predictor. Note that this function indeed assigns a loss of 0 for assigning a 100% probability to an event that occurs and an infinite loss for assigning 0% to an event that occurs. Big risks yield big rewards. Also note that my scheme of assigning a 50-50 chance to each future election will simply yield a loss of about .3 each time, which shouldn't be too hard to beat.
So, I suggest we start keeping track of the cumulative log-losses of the various people in this game to keep them honest.
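Here is a rough sketch of what such a scoreboard might look like in Python. Everything named here (log_loss, cumulative_log_loss, the dictionary layout) is hypothetical and just for illustration. The only data in it are the approximate probabilities mentioned above for a Trump win, plus my 50-50 scheme as a baseline, and it again assumes base-10 logs.

```python
import math


def log_loss(y, p):
    """Binary log-loss with base-10 logs (an assumption, see above)."""
    q = p if y == 1 else 1.0 - p
    return math.inf if q == 0.0 else -math.log10(q)


def cumulative_log_loss(forecasts, outcomes):
    """Sum each forecaster's loss over every event with a known outcome.

    forecasts: {forecaster: {event: probability assigned to the event}}
    outcomes:  {event: 1 if the event happened, 0 otherwise}
    """
    totals = {}
    for name, probs in forecasts.items():
        totals[name] = sum(
            log_loss(outcomes[event], p)
            for event, p in probs.items()
            if event in outcomes  # skip events that haven't been decided yet
        )
    return totals


# Approximate probabilities of a Trump win, as quoted in this post.
forecasts = {
    "Sam Wang": {"2016: Trump wins": 0.01},
    "Nate Silver": {"2016: Trump wins": 0.33},
    "coin flip": {"2016: Trump wins": 0.50},
}
outcomes = {"2016: Trump wins": 1}

totals = cumulative_log_loss(forecasts, outcomes)
# Lowest cumulative loss first.
for name, loss in sorted(totals.items(), key=lambda item: item[1]):
    print(f"{name}: {loss:.2f}")
```

With a single election in the table, the coin flip actually comes out ahead of both forecasters, which is exactly why the score only means something once it has accumulated over many predictions.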