Empirical models and decadal forecasts

I noticed Anthony Watts and Judith Curry getting quite excited by a new paper from Emma Suckling and Leonard Smith called An Evaluation of Decadal Probability Forecasts from State-of-the-Art Climate Models. The paper was basically a comparison of an ensemble of dynamical climate models (GCMs) and empirical models. The basic conclusion was that empirical models are, statistically, better at decadal forecasts than dynamical climate models.

I was initially quite positively inclined towards this paper, but the more I’ve thought about it, the more critical I’ve become. Having said that, it’s always possible that I’ve missed some subtlety or misunderstood some aspect of the paper, so I’m happy to be corrected by anyone who knows more than me. The core figure is probably the one I show below. The left-hand panel shows decadal projections for the ensemble of dynamical climate models, while the right-hand panel is for the empirical models. Forecasts are launched every 5 years, and the two panels on each side simply split the forecasts, which would otherwise overlap if shown in a single figure.

Decadal forecasts from dynamical climate models (left-hand panel) and empirical models (right-hand panel) (credit : Suckling & Smith 2013).


If I understand the rest of the paper, it is essentially an analysis indicating that – when considering decadal forecasts – the empirical models outperform the dynamical climate models. So, what are my issues with the paper? As far as I understand it, dynamical climate models are extremely complicated. They represent the oceans, land, atmosphere and polar regions; they can consider both regional and global climate; and they can do more than simply model surface temperatures. All this study seems to have done is compare global surface temperatures from these two types of models. Unless I misunderstand something, the empirical models can do virtually nothing else. It’s not really a like-for-like comparison; they’re not considering two models that can do the same things. They’re comparing a very simple model that can do only one thing with one aspect of a very complicated model.

Another issue is that if I consider the right-hand panel in the figure above, it appears that if you were to overlay the top and bottom panels, there would be quite sharp discontinuities at a number of the 5-year launch points. There appears to be an element of this for the dynamical climate models too, but it does not appear quite as severe. This would seem to indicate that the empirical models would do a very poor job if used to forecast more than a decade. Additionally, most of the papers cited when discussing the empirical models were written in the 2000s. The comparison, however, starts in 1960. Given that one would expect these empirical models to have been developed based on past knowledge, it would be remarkable if they didn’t do a very good job of forecasting the period from 1960 to 2000. You might argue that the same is true for dynamical climate models, and there is, presumably, some merit to this suggestion. Dynamical climate models are, however, constrained by the laws of physics; empirical models, I believe, are not. Given that there are no such constraints on empirical models, it would be pretty amazing if people developing them in the 2000s did not ensure that they were particularly good at forecasting the period prior to 2000.

The paper concludes with a few interesting comments

It also calls into question the extent to which current simulation models successfully capture the physics required for realistic simulation of the Earth system and can thereby be expected to provide robust, reliable predictions (and, of course, to outperform empirical models) on longer time scales.

I find this a little bit of an odd statement. Unless I’m mistaken, empirical models have no physics and so they seem to be concluding that dynamical climate models may not have captured all the physics needed because they’re outperformed by models with no physics at all. It may well be that dynamical models do not have all the necessary physics, but it’s not clear why such a comparison is needed to know this. A comparison with actual observations would tell you this. Also, this study has only considered one aspect – surface temperatures – of dynamical climate models. Dynamical climate models are also used for more than just making forecasts. They’re being used to try and understand the climate and how it evolves, and also to consider different future emission pathways. Empirical models, I believe, can do none of this.

The paper then says

The blending (Broecker and Smith 2008) of simulation models and empirical models is likely to provide more skillful probability forecasts in climate services, for both policy and adaptation decisions.

This may well be true. There may well be policy decisions we might want to make based on decadal forecasts, and so empirical models may well play an important role here. So, I’m certainly not suggesting that empirical models have no role, just that I’m unclear as to the value of the kind of comparison done in this paper.

I have been rather critical of the paper so, again, if someone thinks I’ve misunderstood it, or missed something important, feel free to point it out through the comments. Something I haven’t touched on is how others have interpreted it. There is already some evidence that some interpret it as implying that empirical models are better than dynamical climate models. Given that dynamical climate models do much more than simply consider surface temperatures, this interpretation is – in my opinion at least – incorrect. I also find it interesting that these types of papers (i.e., statistical analyses of climate models) often seem to come from people associated with economics, rather than from climate modellers themselves (I commented on something similar a while ago). I appreciate that the authors of this paper have physical science backgrounds, but I’d be fascinated to know what climate modellers actually think of these papers. Do they find them useful and interesting, or do they – secretly – find it frustrating that some think that the way to assess climate models is through statistical analyses rather than through checking how well they satisfy the fundamental laws of physics? Anyway, there’s probably more that could be said, but I’ll stop there.

This entry was posted in Climate change, Global warming.

16 Responses to Empirical models and decadal forecasts

  1. I can’t tell (and I don’t have the paper text) if the GCMs are being initialised – I mean their oceans – or are just started from some simulation state. If they’re not being initialised, it’s hardly surprising they don’t track reality too well.

  2. It’s quite hard to tell from the paper. Here’s some of what it says

    Global-mean temperature is chosen for the analysis as simulation models are expected to perform better over larger spatial scales (Solomon et al. 2007). Even at the global scale, the raw simulation output is seen to differ from the observations both in terms of absolute values, as well as in dynamics. Three of the four models display a substantial model drift away from the observed global-mean temperature,

    ……. The fact that some of the models exhibit a substantial drift but not others reflects the fact that different models employ different initialization schemes (Keenlyside et al. 2005). ECHAM5 both assimilates anomalies and forecasts anomalies. Assimilating anomalies is intended to reduce model drift (Pierce et al. 2004); the remaining models are initialized from observed conditions.

    A standard practice for dealing with model drift is to apply an empirical (linear) ‘‘bias correction’’ to the simulation runs (Stockdale 1997; Jolliffe and Stephenson 2003). Such a procedure both assumes that the bias of a given model at a given lead time does not change in the future and is expected to break the connection between the underlying physical processes in the model and its forecast. Bias correction is often applied using the (sample) mean forecast error at each forecast lead time.

    I can’t really find anywhere else where the initialisation is discussed. I don’t know if the above helps.
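    As an aside, the lead-time-dependent bias correction described in the last quoted paragraph is simple enough to sketch. The arrays below are made-up stand-ins, not the paper’s data, but they illustrate the procedure of subtracting the sample mean forecast error at each lead time:

```python
import numpy as np

# Hypothetical hindcasts and observations, shaped (launches, lead times).
hindcasts = np.array([[14.1, 14.3, 14.6],
                      [14.2, 14.5, 14.9],
                      [14.4, 14.6, 15.1]])
observations = np.array([[14.0, 14.1, 14.2],
                         [14.1, 14.2, 14.3],
                         [14.2, 14.3, 14.4]])

# Sample mean forecast error at each lead time, averaged over launches.
bias = (hindcasts - observations).mean(axis=0)

# The correction assumes each lead time's bias is unchanged in the future.
corrected = hindcasts - bias
```

    By construction, the corrected hindcasts have zero mean error at each lead time over the calibration period; whether that carries over to future forecasts is exactly the assumption the quote flags.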

  3. OPatrick says:

    An aside, but you have to wonder what it was that Joshua wrote such that Judith has been deleting his comments, given the standard of comments she seems happy to let stand:

    “I am deleting comments that have no purpose other than to pick a fight with or insult another commenter. You have no idea how boring this is to other readers, and how it keeps others from commenting here.”

  4. OPatrick, yes I noticed that comment from Judith. It did seem somewhat ironic given the tone of many of the other comments.

  5. verytallguy says:

    Really just an advert for the unfailingly excellent “Science of Doom” who has a new series on the Milankovitch cycles, but just about on topic here:

    http://scienceofdoom.com/ “Ghosts of Climate Past” series. Read ‘em all, but notable on this topic

    Adams, Maslin & Thomas 1999 “Sudden climate transitions during the Quaternary”

    ..As we do not know how often decadal timescale changes occurred in the recent geological past, we are handicapped in trying to find mechanisms which might be used for forecasting future events. Even if we knew everything there was to know about past climate mechanisms, it is likely that we would still not be able to forecast such events confidently into the future. This is because the system will have been influenced by probabilistic processes…
    By disturbing the system, humans may simply be increasing the likelihood of sudden events which might have occurred anyway.

    my emphasis

  6. ebsuckling says:

    Indeed the results presented in our paper show that our empirical model was statistically better than the set of dynamical models we analysed for this particular decadal forecasting task. However, we do not make any claims about the relative merit of the two approaches to modelling; instead, we are advocating the use of empirical models as a benchmark to assess and track improvements of dynamical models through certain aspects of their performance.

    We chose to compare global (and land-based regional) surface temperatures as a simple illustration of our empirical model and our proposed evaluation methodology.

    The empirical model cannot “simulate” and can only be used where there is past data for calibration. Nevertheless, comparisons between simple and more complex models can be useful to gain insight about the information available from these models when they are used as predictive tools. Empirical models themselves can vary in complexity, from those that directly use the historical observations (such as climatology, regressions or analogues) to those that use them indirectly, exploiting known physical mechanisms to predict the regions or variables of interest (either as scalar quantities or spatial maps, for example).

    Of course it is also true that to gain an understanding of the climate system and the physical processes underlying it, dynamical models are extremely useful tools, and it is simply not possible to get the same kind of insight from empirical models.

    A crucial point I would make, though, is that while dynamical models are useful in terms of developing our scientific understanding, they are often also used (and with increasing pressure to do so) to provide information for decision support across a range of policy and business sectors. In this case, not only is it important to have a good understanding of the important processes and feedbacks, but also to know that the quantitative output from the models themselves has been representative of the past observations and is not likely to be misleading when used for decision support. This is the basis for our proposal of benchmarking criteria, so that model performance and model improvement can be tracked in a more rigorous way than by eyeballing the output.

    For this reason I think both dynamical and empirical models can play an important role. It is useful to make statistical evaluations, as well as consider how well dynamical models represent the physical processes within the system, if we are to make progress towards the goals of scientific understanding, as well as provide policy-relevant information.

    To respond to William:
    The GCM hindcasts were initialised to observations at each launch date.

  7. Emma, thanks for the comment. I agree with you about both playing a role. I certainly wasn’t suggesting otherwise.

    However, we do not make any claims about the relative merit of the two approaches to modelling; instead, we are advocating the use of empirical models as a benchmark to assess and track improvements of dynamical models through certain aspects of their performance.

    This is actually what I had taken from the paper until I read the comment about missing physics from dynamical models. That seemed like rather a stretch and did seem to be criticising the dynamical models based on the results from the empirical models. It’s this that rather turned me against the paper, but maybe that was a bit of pro-physics bias :-). Otherwise, I don’t disagree with what you’re suggesting above. Having said that, I don’t know if actual climate modellers would agree.

    You say

    This is the basis for our proposal of benchmarking criteria, so that model performance and model improvement can be tracked in a more rigorous way than by eyeballing the output.

    I don’t quite follow what you’re suggesting here. Surely one can compare, statistically, the results from dynamical models and observations without needing to use empirical models. I presume this is already done. Are you really suggesting that climate modellers determine the success of their models by eyeballing the fit between the models and the observations, or are you simply suggesting that dynamical models today should be benchmarked against empirical models for the next decade (i.e., a time period for which we don’t yet have observations)? If so, we would presumably have to hope we don’t have some kind of sudden event, as suggested by VTG above.
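    To be concrete about what I mean by a statistical comparison with observations: something as simple as a skill score against a climatology baseline needs no empirical forecast model at all. A toy version (all the numbers below are invented, purely for illustration) might be:

```python
import numpy as np

# Hypothetical annual global-mean temperature anomalies (degrees C).
observations   = np.array([0.10, 0.15, 0.12, 0.20, 0.25, 0.22, 0.30, 0.28, 0.35, 0.33])
model_forecast = np.array([0.12, 0.14, 0.15, 0.18, 0.27, 0.24, 0.28, 0.30, 0.33, 0.36])

# Baseline forecast: climatology, i.e. the mean of the observed record.
climatology = np.full_like(observations, observations.mean())

def mse(forecast, obs):
    return np.mean((forecast - obs) ** 2)

# Skill > 0 means the model beats the climatology baseline.
skill = 1.0 - mse(model_forecast, observations) / mse(climatology, observations)
```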

    There’s something that I didn’t mention in the post but that I would be interested in your views about. Although dynamical models are used for policy decisions, this is more in relation to timescales much greater than a decade (or at least that’s my understanding). If all we were concerned about was the next decade, then maybe empirical models would be better for policy, but it’s not the next decade that matters, it’s the next few decades, or even longer.

    Also, it’s my understanding that climate modellers are typically much more confident about how successful dynamical models are at predicting long-term trends, than they are about shorter-term trends. I believe short-term trends are much more sensitive to internal variability that will likely average out over longer timescales. So it would certainly be interesting to know if climate modellers feel that your comparison is over a suitable timescale.

    Something I had wondered was how the empirical model would compare to a simple physically motivated model. For example, one could assume a TCR (plus uncertainties) and some future emission pathway. Maybe one could also include some random natural variability. You’ve, I presume, only compared the empirical models to complex dynamical models. What about comparing it to much simpler physically motivated models?
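    To sketch the sort of thing I have in mind: take an assumed TCR, scale an assumed future forcing pathway by it, and add some random internal variability. The TCR value, forcing numbers and noise level below are all illustrative assumptions, not fitted quantities:

```python
import numpy as np

rng = np.random.default_rng(0)

TCR = 1.8    # assumed transient climate response (C per CO2 doubling)
F_2X = 3.7   # approximate radiative forcing for a doubling of CO2 (W m^-2)

# Hypothetical forcing pathway over the next decade (W m^-2 above present).
forcing = np.linspace(0.0, 0.4, 11)

# Forced response scaled by TCR, plus random "internal variability".
forced = TCR * forcing / F_2X
projection = forced + rng.normal(0.0, 0.1, forcing.size)
```

    One could then ask whether the empirical models beat even something this crude, rather than only comparing them against full GCMs.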

  8. Joshua says:

    Judith has taken, on a few occasions lately, to deleting my comments. It happens when I post a comment critical of what another commenter has said, and then the attacks against me start. I then criticize the comments attacking me, and more attacks continue. And then people complain to her that reading the long line of attacks against me and my criticism of the thinking evidenced in those attacks is boring.

    Judith chooses not to acknowledge that there is a difference between my comments, which are critiques of the logic of the other commenters, and the attacks against me. Instead, she seems to think that I am responsible for the stream of attacks against me – merely because I am critical of the thinking of others, for example when they express opinions that fail to value uncertainties. The whole dynamic is amusingly ironic.

    But even more, Judith has, on a number of occasions, left the attacks against me, deleted my responses, and then allowed follow-up attacks. It seems to me that her intent is to frustrate me by rigging the game. I believe that this is so because there are no discernible consistent criteria that she uses in her monitoring. She applies a double-standard, and I think that she is trying to frustrate my criticisms of her and “denizens,” so as to get me to stop posting criticisms. It won’t work.

  9. BBD says:


    I don’t know how you put up with this nonsense. I stopped commenting at Judith’s because she kept putting *me* in moderation for exactly the same reasons you give – critiquing the crazy. The crazy was allowed to stand. FTS, thought I, and left. Perhaps you should consider doing the same. It’s very obvious what kind of comments section JC wants, so let her have it.

  10. BBD says:

    Wotts says it for me:

    There’s something that I didn’t mention in the post but that I would be interested in your views about. Although dynamical models are used for policy decisions, this is more in relation to timescales much greater than a decade (or at least that’s my understanding). If all we were concerned about was the next decade, then maybe empirical models would be better for policy, but it’s not the next decade that matters, it’s the next few decades, or even longer.


  11. The other issue is that WUWT have done precisely what one would expect given the comment in the paper questioning whether dynamical models capture all the physics. I won’t add a link to the post, but the title is New paper: climate models short on ‘physics required for realistic simulation of the Earth system’. So, WUWT have interpreted this paper precisely as I expected them to, and it’s certainly my opinion that this interpretation is incorrect. Given that it seemed obvious, the moment I read that paragraph, that sites like WUWT would do exactly this, it’s rather surprising that the authors didn’t consider the implications of that comment when writing the paper. On the other hand, maybe this interpretation is what the authors were indeed implying, but that would seem to be rather stretching the merits of empirical modelling versus dynamical modelling.

  12. Ed Hawkins says:

    I agree that you do not need an empirical forecast model to demonstrate that the dynamical models may be missing some physical processes, but additional evidence for any hypothesis is surely welcome?

    Also, no-one is suggesting to replace dynamical models. The whole point of these particular simple empirical models is to examine how skillful the dynamical models are for the specific task of predicting the climate of the next decade. These experimental dynamical forecasts are very different from the normal long-term projections as they are initialised from observations in an attempt to predict the internal variability component (as well as the forced trend) of the climate for a few years. This is technically very challenging.

    So, the empirical models give the dynamical models a target to aim at, and to measure their improvement over time. For example, the latest generation of decadal predictions might be tested in the same way to see if they have improved. Also, I think there is potential to combine dynamical and empirical models in interesting ways to utilise the strengths of both approaches – and this is what we are planning to do….

    Finally, there are several papers discussing empirical models for making long-term projections – see papers by Judith Lean for example. And, the IPCC AR5 discusses this briefly in Chapter 11.


  13. Thanks, Ed. To be honest, as I said to Emma, my slightly negative tone came from the comment in the paper about “not capturing some physics” rather than any sense that empirical models have no value. As with any complex computational model, I’m sure that dynamical models can be improved, but questioning the physics in a dynamical model because an empirical model does a better job does seem a little bit of a stretch. Admittedly, one reason I had issue with that comment was because it was fairly obvious that sites like WUWT would do precisely what they’ve just done – conclude that this paper illustrates how dynamical models are missing some important physics. In a sense that may be true, but that could probably be determined by just comparing with actual observations. Also, one has to be a little careful about what one means by not capturing some physics. I doubt anyone thinks that they’re missing something truly fundamental. They may, I presume, not be capturing some physical process quite correctly (convection, for example), but that’s not quite the same as missing some physics.

    An interesting related issue, though, is whether or not authors should take care with what they say in papers because of how the blogosphere may choose to interpret something, or whether that should be completely ignored. In an ideal world, probably the latter, but we don’t really live in an ideal world.

    Thanks for the references. I’ll have a look at those.

  14. Joshua says:

    wotts –

    First, no matter how authors word their findings, folks like those at WUWT will distort what they say to advance their agenda.

    Second, I don’t know what difference it makes if folks like those at WUWT use what scientists say to advance their agenda. For the most part, anyone reading WUWT already has their mind made up w/r/t climate change. Watts and others can and will take what scientists say out of context, or distort what scientists say, or exploit what scientists say through selective reasoning. Does it really make any difference?

  15. Joshua, that is indeed a perfectly valid point and I would agree. All I would add is that it would seem sensible to not make it too easy 🙂

  16. BBD says:

    I try to avoid ‘+1’ comments, but that’s definitely a +1, Wotts.

Comments are closed.