What is the standard deviation of the global mean surface temperature, per year?
The issue of global warming begins by asking the question: Is the planet warming? The answer provided by the consensus of climate scientists is a resounding, "Yes." The cardinal statistic supporting this assertion is a rise in the "global mean/average surface temperature." For example:
- "Records from land stations and ships indicate that the global mean surface temperature warmed by between 1.0 and 1.7°F since 1850...Since the mid 1970s, the average surface temperature has warmed about 1°F." (http://epa.gov/climatechange/science/recenttc.html)
- "Over both the last 140 years and 100 years, the best estimate is that the global average surface temperature has increased by 0.6 ± 0.2°C." (http://www.ipcc.ch/ipccreports/tar/vol4/english/075.htm)
It stands to reason that any examination of the global warming problem has to begin here, with the "global mean/average surface temperature." We need to know how much this mean has risen, and if this rise is statistically significant.
The means as a probability distribution
Read more here: http://en.wikipedia.org/wiki/Normal_distribution
Suppose you average the age of everyone in your household. You have three people aged 3, 14, and 43. The average (also called the "mean") age is 20 (3+14+43=60, divided by 3). Suppose you average the age of everyone in the house next door. They have 5 people aged 18, 19, 20, 21, 22. The average age of their house is also 20. The same exact average in two households can describe two very different situations. The fact of the matter is, the mean has limits in its ability to represent a group with one single point. One has to put that single point in context of how spread out the data is; in statistics, the spread is called "variance." There is a huge amount of variance in your house, not so much in your neighbor's.
The smaller the variance, the more "truthful" the mean is in representing the population. You don't even have anyone in your household who is twenty years old; this mean does not "truthfully" represent your household. The average age of 20 is a more accurate representation of the population in your neighbor's house than in your house. That is because your neighbor's house has a smaller variance (a range of 4 years, from 18 to 22) than yours (a range of 40 years, from 3 to 43).
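To make the household comparison concrete, here is a minimal sketch in Python. The ages come from the example above; the function name `summarize` is just an illustrative choice, and the "spread" is reported two ways: as the range (oldest minus youngest) and as the population standard deviation.

```python
from statistics import mean, pstdev

# Ages taken from the household example in the text
your_house = [3, 14, 43]
neighbor_house = [18, 19, 20, 21, 22]

def summarize(ages):
    """Return (mean, range, population standard deviation) for a list of ages."""
    return mean(ages), max(ages) - min(ages), pstdev(ages)

for label, ages in [("your house", your_house), ("neighbor's house", neighbor_house)]:
    avg, spread, sd = summarize(ages)
    print(f"{label}: mean={avg}, range={spread}, SD={sd:.2f}")
```

Both houses report the same mean, while the two measures of spread differ by roughly a factor of ten between them.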
The mean, as a single point representing a population, is a probability distribution. It represents the middle of a distribution of data shaped like a bell curve. Statisticians divide each half of the bell curve into roughly three equal sections; each section is called a standard deviation (SD, also denoted by σ, lowercase sigma). So there are six standard deviations in a bell curve: three below the mean and three above it. When you go out one standard deviation from the mean in both directions, you get 68% of the data in a typical bell curve. You have a 68% chance that the "true" mean lies within one SD, a 95% chance that it lies within two SDs, and a 99.7% chance that it lies within three SDs.
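The 68 / 95 / 99.7 figures can be checked numerically. The sketch below assumes an ideal normal (bell curve) distribution and uses the standard identity that the fraction of data within k standard deviations of the mean is erf(k / √2); the helper name `fraction_within` is made up for this illustration.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```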
When you compare the means of two different bell curves, the question is: what are the chances the difference is caused by random or meaningless fluctuations? The higher the variance of the two distributions, the more uncertain the "true" mean is, and the more likely it is that the difference between the two uncertain means is random and not significant. The quickest way to eyeball the significance of any comparison is to see if the difference is more than one SD. If the change is larger than one SD, it is likely to be significant. If the change is smaller than one SD, the change is likely to be due to random fluctuations and errors in measurement.
Where's the mean? Where's the SD?
Now that we've gotten elementary statistics out of the way, let's look at that mean surface temperature again. The rise in the global mean surface temperature is oft cited and well known. But when you look at the graphs referenced by discussions of mean temperatures, you don't see any numbers for the means. You see a horizontal line labeled "0.00° C" and vertical bars going fractions of a degree below or over the zero. (For an example, look here.) One assumes that the bars represent the global mean temperatures, and they are going up. But upon closer examination, the bars represent not the global mean temperatures, but "global temperature anomalies," or departures from the Big Zero in the middle.
How do you evaluate the mean, if you don't know what the mean is or what its variance looks like? What is the absolute value of the global mean surface temperature, and what is its standard deviation? I was able to find this table showing the absolute mean temperatures per year from 1880 to 2007. In addition, this GISS site describes:
"For the global mean, the most trusted models produce a value of roughly 14 Celsius, i.e. 57.2 F, but it may easily be anywhere between 56 and 58 F and regionally, let alone locally, the situation is even worse."
So we can estimate the global mean to hover around 14° C. But what do these average temperatures represent without knowing their standard deviations?
Here is what I mean. A table that shows this:
1880: 13.88° C ± 10° C
2007: 14.73° C ± 10° C
reads very differently from a table that shows this:
1880: 13.88° C ± 0.25° C
2007: 14.73° C ± 0.25° C
Now given the fact that temperatures across the planet have a huge variance over the year, it is impossible that the standard deviation would be less than 1°C. (That would mean the global temperature distribution roughly ranges from 11° C to 17° C, which we all know is patently false.) What is the likely ballpark of the standard deviation?
The hottest record in Canada is 45° C (113° F) and the coldest record in Africa is -24° C (-11° F). It is not unreasonable to estimate that the bulk of the world's temperatures for the year fall roughly in this range. This type of range would give us 30° C below and 30° C above the mean of 14° C. Since each half of the bell curve spans three SDs, that gives a reasonable estimate that the SD should be in the ballpark of 10° C (30° C divided by 3). (If the spread is actually wider, the SD would be even larger.)
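The back-of-the-envelope estimate above can be written out as a short calculation. This is only a sketch of the reasoning in this post (half the assumed range divided by three SDs), not a measured statistic; the function names `sd_from_half_range` and `c_to_f` are made up for the illustration.

```python
def sd_from_half_range(half_range_c, sds_per_half=3):
    """Ballpark SD, assuming half the range spans three standard deviations."""
    return half_range_c / sds_per_half

def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

mean_c = 14.0        # estimated global mean from the text
half_range_c = 30.0  # assumed spread above and below the mean, from the text

sd = sd_from_half_range(half_range_c)
low, high = mean_c - 3 * sd, mean_c + 3 * sd

print(f"estimated SD: {sd} C")
print(f"implied range: {low} C to {high} C")
print(f"record check: 45 C = {c_to_f(45)} F, -24 C = {c_to_f(-24):.1f} F")
```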
If that is the case, why would a change from 13.88° C ± 10° C to 14.73° C ± 10° C be considered significant? The difference is well within the margin of error. Why would climate scientists make such a big to-do about this fraction of a degree increase, in the context of the huge global variance? Why can't we find the exact SD? Surely it has been calculated. (You can google it until the cows come home, but you won't find that SD.)
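Here is the one-SD eyeball check from earlier in this post, applied to the two hypothetical tables. It is a sketch of that rule of thumb only, not a formal significance test; the SD values of 10 and 0.25 are the illustrative figures used above, and `exceeds_one_sd` is a made-up helper name.

```python
def exceeds_one_sd(mean_a, mean_b, sd):
    """Rule of thumb from the text: is the change between two means larger than one SD?"""
    return abs(mean_b - mean_a) > sd

mean_1880 = 13.88  # degrees C, from the table cited in the text
mean_2007 = 14.73  # degrees C, from the table cited in the text

for sd in (10.0, 0.25):  # the two hypothetical SDs discussed above
    verdict = "larger than" if exceeds_one_sd(mean_1880, mean_2007, sd) else "smaller than"
    print(f"SD = {sd} C: the change of {mean_2007 - mean_1880:.2f} C is {verdict} one SD")
```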
Who is the Big Zero?
It turns out that there is no such thing as the "global mean surface temperature" or its SD as statistical entities. How then, do they know the mean is rising, if it doesn't exist? What are they comparing the "anomalies" to? Who is this Big Zero in all the graphs and data tables?
I was naive and ignorant enough to assume that climate scientists take temperature readings from weather stations all over the earth for the period of a year, and average them into one temperature. Then they compared that average from year to year to observe a rising trend in this mean temperature. Not.
This is what really happens. Climate scientists take readings from weather stations and feed the data into a computer model that adjusts for all sorts of variables, including number of wet days, cloud cover, sunshine, diurnal temperature range, etc. Using computer modeling, they divide the world into 5x5 grids, fill the boxes with known data, and interpolate the unknown data. They run the model for a while and come up with a single mean for a 30-year period, usually 1961 - 1990. This mean is called a climatology. The GISS site discusses the Elusive Absolute SATs (Surface Air Temperatures):
Q. If SATs cannot be measured, how are SAT maps created? A. This can only be done with the help of computer models, the same models that are used to create the daily weather forecasts. We may start out the model with the few observed data that are available and fill in the rest with guesses (also called extrapolations) and then let the model run long enough so that the initial guesses no longer matter, but not too long in order to avoid that the inaccuracies of the model become relevant. This may be done starting from conditions from many years, so that the average (called a 'climatology') hopefully represents a typical map for the particular month or day of the year.
There are differing methods and time periods used for modeling climatologies, resulting in different climatologies. Climate scientists pick the climatology most appropriate for their purpose and compare their annual mean temperatures (also calculated by computer models) to it. The climatology is the absolute standard by which all other temperature calculations are measured; it is the Big Zero. All annual temperatures are evaluated in terms of either hotter or colder than the climatology. The climatology itself does not have a standard deviation, because it is not a straightforward average, but an "adjusted" figure, a result of a very educated computer guess. Scientists estimate the margin of error of these climatologies to be exceptionally small. The Climatic Research Unit (CRU) explains here.
How accurate are the hemispheric and global averages?
Annual values are approximately accurate to +/- 0.05°C (two standard errors) for the period since 1951. They are about four times as uncertain during the 1850s, with the accuracy improving gradually between 1860 and 1950 except for temporary deteriorations during data-sparse, wartime intervals. Estimating accuracy is a far from a trivial task as the individual grid-boxes are not independent of each other and the accuracy of each grid-box time series varies through time (although the variance adjustment has reduced this influence to a large extent). The issue is discussed extensively by Folland et al. (2001a,b) and Jones et al. (1997). Both Folland et al. (2001a,b) references extend discussion to the estimate of accuracy of trends in the global and hemispheric series, including the additional uncertainties related to homogeneity corrections.
Why do climate scientists use climatologies instead of straight-up means? The same CRU site elaborates.
Why are the temperatures expressed as anomalies from 1961-90?
Stations on land are at different elevations, and different countries estimate average monthly temperatures using different methods and formulae. To avoid biases that could result from these problems, monthly average temperatures are reduced to anomalies from the period with best coverage (1961-90). For stations to be used, an estimate of the base period average must be calculated. Because many stations do not have complete records for the 1961-90 period several methods have been developed to estimate 1961-90 averages from neighbouring records or using other sources of data. Over the oceans, where observations are generally made from mobile platforms, it is impossible to assemble long series of actual temperatures for fixed points. However it is possible to interpolate historical data to create spatially complete reference climatologies (averages for 1961-90) so that individual observations can be compared with a local normal for the given day of the year.
Why do they talk about the mean surface temperature, if the mean doesn't exist? The best I can make of it, the mean is a theoretical estimate (climatology + the anomaly) that is assumed to be a close correlate of global anomaly trends. If the anomalies go up, it is undisputed that the mean has also gone up.
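The anomaly idea in the CRU passage can be sketched in a few lines of Python. Everything in this example is hypothetical: the two stations, their temperatures, and the function name `anomalies` are invented for illustration, and the real CRU processing (gridding, interpolation, variance adjustment) is far more involved than this.

```python
from statistics import mean

# Hypothetical annual mean temperatures (degrees C) for two made-up stations
# at very different elevations. These numbers are NOT real data.
stations = {
    "valley": {1961: 15.0, 1975: 15.2, 1990: 15.4, 2007: 15.9},
    "mountain": {1961: 2.0, 1975: 2.2, 1990: 2.4, 2007: 2.9},
}

BASE_START, BASE_END = 1961, 1990  # base period named in the CRU quote

def anomalies(series):
    """Subtract the station's own base-period average from each year's value."""
    base = mean([t for year, t in series.items() if BASE_START <= year <= BASE_END])
    return {year: round(t - base, 2) for year, t in series.items()}

for name, series in stations.items():
    print(name, anomalies(series))

# A combined figure is then built from the anomalies, not from the absolute temperatures.
combined_2007 = mean([anomalies(series)[2007] for series in stations.values()])
print("combined 2007 anomaly:", combined_2007)
```

In this toy case the two stations differ by 13° C in absolute terms, yet both show the same departure from their own 1961-90 average, which is why a station's elevation drops out of the anomaly.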
So is the rise in global mean surface temperature significant?
It depends. Not on objective data, mathematical rigor, and the scientific method, but on personal values. Do you trust climate scientists and their computer modeling or not? It is a funny way to do science, because when it comes down to it, it is a matter of belief. Do you believe they have done a good job in "adjusting" all the variables in their computer models? Do you believe they have both the intellectual competence and the professional integrity to have factored in all the relevant variables accurately to get the "truth"?
If you are comparing computer-adjusted data with computer-adjusted data, how do you know if the difference between them is significant? You have to trust the person doing the adjusting.
Please don't get me wrong. I am not casting aspersions on climate scientists at all. I am describing an inherent subjectivity in all endeavors entirely dependent on computer modeling. What comes out is simply a function of what goes in. You program the computer to spit out whatever number you want. And the decision of what goes in is ultimately subjective. In other sciences, methodology interacts with reality, and you get the results you get whether you like it or not. Computers do not interact with the real world. The computer model does not get feedback from reality, only from the programmer. There is no way to cut out the subjective input and values of the programmer from the process.
The only check and balance that exists in a field composed entirely of computer modeling is the community of scientists and their subjective approval of one's programming. It is no wonder that "consensus" is used so much to describe climate change.
Links for further reading:
Steve McIntyre's Climate Audit
Mathematician who "audits" climate modeling
20 Questions Statisticians Should Ask About Climate Change (pdf)
by Edward J. Wegman, statistician, George Mason University
William M. Briggs, Statistician: Blogs on Global Warming
Letter to Call for Review of the IPCC
by Vincent Gray, climate scientist and former IPCC Reviewer