Watt about the anti-information?

There is a new post on Watts Up With That (WUWT) by Cato Boffins Patrick J. Michaels and Paul C. “Chip” Knappenberger. The post is called “Anti-information in climate models”. It considers the two models evaluated by the first “National Assessment” of climate change impacts in the United States in the 21st century, published by the U.S. Global Change Research Program (USGCRP) in 2000. One is the Canadian Climate Model and the other the model from the Hadley Centre at the UK Met Office.

So, they test these models by comparing them with the ten-year running means of the observed temperature of the lower 48 states of the USA. Now, I don’t know these models particularly well, but I thought they were normally used to produce global surface temperature anomalies. I presume that they can also produce temperatures for the contiguous USA. They can’t have been silly enough to have compared global surface temperatures with temperatures in the US, can they?

Anyway, they go on to say

One standard method used to determine the utility of a model is to compare the “residuals”, or the differences between what is predicted and what is observed, to the original data. Specifically, if the variability of the residuals is less than that of the raw data, then the model has explained a portion of the behavior of the raw data and the model can continue to be tested and entertained.

A model can’t do worse than explaining nothing, right?

Not these models! The differences between their predictions and the observed temperatures were significantly greater (by a factor of two) than what one would get just applying random numbers.

Now, I found this a bit confusing for a couple of reasons. Firstly, what do they mean by the residuals having to be smaller than the raw data? The data are typically temperature anomalies, which are relative to some long-term mean. The values can go from being negative, to being close to zero, to being positive. That seems to suggest that the success of the model (as defined by them) depends on the year they’re comparing with. Also, I couldn’t find data for the USA, but if one looks at a comparison of climate models with global surface temperatures, the mean is quite a good fit. This is shown in the figure below. If the observations were smoothed over a longer period, it seems clear that the fit would be quite impressive. I can’t quite believe that the residuals they get are typically bigger than the observed values. Surely they’re comparing the ten-year running mean of the measured values with the ten-year running mean of the model values. Anything else would be crazy.

Comparison between observed global temperature anomalies with values from climate models. The mean of the climate models is the red line.
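
For concreteness, here is roughly what the test they describe involves. This is a minimal sketch of my own, not their code, with made-up anomalies standing in for the observations and the model output, so the numbers themselves mean nothing:

    import numpy as np

    rng = np.random.default_rng(0)

    def running_mean(x, window=10):
        # Simple trailing running mean over `window` years.
        return np.convolve(x, np.ones(window) / window, mode="valid")

    # Hypothetical annual temperature anomalies (degrees C). In practice these
    # would be the observed US anomalies and the model output for the same years.
    years = np.arange(1900, 2001)
    observed = 0.006 * (years - 1900) + rng.normal(0, 0.5, years.size)
    modelled = 0.006 * (years - 1900) + rng.normal(0, 0.5, years.size)

    obs_smooth = running_mean(observed)
    mod_smooth = running_mean(modelled)
    residuals = obs_smooth - mod_smooth

    # The test: if the variability of the residuals is smaller than the
    # variability of the (smoothed) observations, the model has explained
    # some of the behaviour of the data.
    print("variance of smoothed observations:", np.var(obs_smooth))
    print("variance of residuals:            ", np.var(residuals))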


So, what else do they say? Well, they claim that the model fit is twice as bad as what one would get by simply applying a random number generator. If one simply chose the temperature anomaly randomly, surely the fit would be awful. What I assume they mean is that you randomly perturb the known observed temperature anomaly values. Well, sure. I could make that a fantastic fit if I simply perturbed them by a tiny amount. Their “random number model” – that is supposedly better than a climate model – already “knows” the values of the measured temperature anomalies. The climate models do not. That’s the point of climate models.
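
To illustrate that point, a trivial sketch (again with made-up numbers): a “model” that already knows the observations and simply perturbs them slightly will always leave almost nothing in the residuals.

    import numpy as np

    rng = np.random.default_rng(0)
    observed = rng.normal(0, 0.5, 100)                # hypothetical anomalies
    perturbed = observed + rng.normal(0, 0.01, 100)   # tiny random perturbation

    print("variance of observations:", np.var(observed))              # about 0.25
    print("variance of residuals:   ", np.var(observed - perturbed))  # about 0.0001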

They then go on to say that it is twice as bad as a random number generator because the climate models were only “correct” 12.5% of the time, while a random number generator would be correct 25% of the time. This is based on an analogy in which the model is expected to predict the temperature 100 times but in which there are 4 possible choices for each temperature. A random choice would therefore be correct one time out of 4 (hence 25%). But there aren’t only 4 possible values for each temperature anomaly. Their analogy simply makes no sense at all! You can’t compare a climate model with a multiple choice test with 100 questions and 4 answers per question.
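
For what it’s worth, the 25% figure is just the expected score for random guessing on such a test, as a trivial simulation shows; the point, of course, is that a continuous temperature anomaly is not one of four options.

    import numpy as np

    rng = np.random.default_rng(0)
    answers = rng.integers(0, 4, size=100)   # 100 questions, 4 options each
    guesses = rng.integers(0, 4, size=100)   # purely random guesses
    print((answers == guesses).mean())       # roughly 0.25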

I really don’t know what else to say about this. Either the definition of the term Boffin used at the beginning of the WUWT post differs from what I thought it meant, or these are two people who are very bright but who are knowingly intending to mislead the people who read their post. It’s just an absurd post that really makes no sense at all.

This entry was posted in Anthony Watts, Climate change, Global warming, Watts Up With That. Bookmark the permalink.

12 Responses to Watt about the anti-information?

  1. It is telling that they do not include a figure with their results. That makes it more difficult to see what they did. Given that their aim was to get the worst possible result and that the climate models can only predict decadal variability and trends, this is what I expect they did. (Do not forget they used only US data, 2% of the globe, which has a lot more natural variability (noise) relative to decadal variability and the trend.)
    I would expect that they did not use the anomaly, so that the predictability of the trend is compensated by any bias error of the models over the US.

    “Surely they’re comparing the ten-year running mean of the measured values with the ten-year running mean of the model values. Anything else would be crazy.”

    I would expect that they computed the errors on annual mean temperatures, and would not even be surprised if they used daily data, just for the fun of it; that would add a lot more noise.

    If they are interested in contributing to our understanding of the climate system, I would suggest they write up clearly what they did and submit their paper to a scientific journal.

  2. Exactly. It definitely comes across as something in which the intention was to make the models seem as bad as they could possibly be.

  3. Nick says:

    This is just an anecdote from two old hands at the dirt machine. Where is the substance?

  4. Lars Karlsson says:

    They really don’t say anything about what they mean by ‘random numbers’ (and nobody at WUWT seems interested in asking). At least the random numbers must be within some range, but how do you choose that range without any information about the observed temperatures?

  5. Lars Karlsson says:

    This is what I think they are doing. Temperature variability on the decadal level in the US (or comparable areas) is dominated by ‘noise’, which can make year-to-year temperatures vary by as much as 1.5 degrees. So all you need to do is to find a decade where the noise produces a flat or downward trend. As ‘random numbers’ on average produce flat trends, the ‘random numbers’ are going to win against models for this decade.

  6. I did see that figure. You’re right that random numbers will produce a flat trend and so, over the full time range, can’t possibly produce a good fit. But, yes, if they have just considered a decade in which it is flat, then random numbers would do a good job. However, I can’t really believe that they’ve used a ten-year running mean to consider only ten years of data. Plus, their anecdote doesn’t make any sense. As you comment above, you need to define the scale of your random number generator, and hence could make it look good or bad depending on what you choose (see the sketch below).
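
    A quick sketch of that last point, with purely hypothetical numbers: the apparent skill of “random numbers” depends entirely on the spread you choose for them.

        import numpy as np

        rng = np.random.default_rng(1)
        observed = rng.normal(0, 0.5, 100)   # hypothetical anomalies
        for scale in (0.1, 0.5, 2.0):
            random_guess = rng.normal(0, scale, 100)
            rmse = np.sqrt(np.mean((observed - random_guess) ** 2))
            print(f"scale {scale}: RMSE {rmse:.2f}")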

  7. Exactly. No substance and no real explanation of what they’ve actually done – as Lars points out above.

    I found this comment on the WUWT post, by someone called John, particularly amusing

    This is another example of why Patrick Michaels is one of the most important of the analysts and thinkers among sceptics.

  8. Lars Karlsson says:

    I see that they used “ten-year running means of annual temperature” and not ten years of annual temperatures as I first thought.
    But they say absolutely nothing about the random numbers, so it is impossible to repeat their ‘experiment’.

  9. Lars Karlsson says:

    Here is a year-old dishonest piece by Watts on a similar theme: Climate models outperformed by random walks.

    Firstly, he omits to mention that the study in question showed that a global model (DePreSys) performed quite well. Secondly, there were a couple of local predictions by GCMs which were quite poor, but those were not reinitialised at the forecast origin. So it is not strange that they were at a disadvantage against a random walk which was reinitialised.

  10. Lars Karlsson says:

    On second thoughts, I don’t think that they actually did generate any random numbers.

    “Not these models! The differences between their predictions and the observed temperatures were significantly greater (by a factor of two) than what one would get just applying random numbers.”

    The variability of the continental US is a bit more than 2 degrees, so I assume that the errors of the models must have been ±2 degrees.

  11. Yes, that was my impression. They determined, somehow, that the climate models were only “correct” 12.5% of the time. They then used an analogy with a multiple choice test with 100 questions each with 4 possible answers to claim that a random number generator would be correct 25% of the time, hence twice as good as the climate models.

  12. Pingback: Another Week of Anthropocene Antics, May 26, 2013 – A Few Things Ill Considered

Comments are closed.