Saturday, November 22, 2008

Will the Real CO2 Please Stand Up?

Question: Can carbon dioxide levels from any *one* source be considered representative of the levels of the entire planet?

Ice core vs. Surface measurements

While I was investigating whether the planet was warming, I had always assumed CO2 levels were unquestionably rising. After all, we've all seen graphs like this:

(IPCC: Ice core proxy levels followed by direct measurements)

Since some "adjusting" is done to correlate CO2 amounts in air bubbles to the amount thought to be in the atmosphere at that time, ice core measurements are not as good as the real thing, but are thought to be valid proxies for direct measurements.

Then last year (2007), a German researcher named Ernst Beck published another graph, made from direct CO2 measurements, that looks like this:

(CO2 levels are the red line.)

He showed a peak back in the 1820s near 400 ppm, which throws the entire temperature-CO2 correlation out of whack. Not only that, he accused authors of conventional graphs of cherry-picking data that suited their ideological agenda, of "falsifying the history of CO2." Both Beck and the journal that published the study (Energy and Environment) were immediately attacked by global warming proponents as, to put it politely, unworthy of publishing. Keeling, whose work is the cornerstone of the IPCC graph, calls Energy and Environment a "forum for laundering pseudo-science."

Name-calling aside, is there any validity to Beck's data on CO2? He compiled 90,000 chemical measurements of CO2 from "180 technical papers published between 1812 and 1961."
"The compilation of data was selective. Nearly all of the air sample measurements that I used were originally obtained from rural areas or the periphery of towns, under comparable conditions of a height of approx. 2 m above ground at a site distant from potential industrial or military contamination...

...Discounting such unsatisfactory data [because of deficiencies in certain methods], in every decade since 1857 we can still identify several measurement series that contain hundreds of precise, continuous data."
His critics, including Keeling, claim that these measurements are neither here nor there. They have too much variability and do not represent the true "background" level of CO2. The only reliable source of this true, "background" CO2 for that time period is in air bubbles trapped in Antarctic ice. Everything else is just irrelevant noise. Note that they are not disputing the accuracy of the data. They are saying anything with that much variance is unacceptable.

It seems to me that 90,000 readings in 180 published papers should not be so easily dismissed. If nothing else, they show that measured CO2 levels had a huge amount of variance. Yet none of this variance is taken into consideration because all but one source of CO2 measurements (ice core proxies) are categorically rejected.

Now Beck admits that not all CO2 data is equal. He himself threw out data he felt was not representative because of faulty methodology. But who decides what is faulty? Who decides what is representative of the CO2 level in our atmosphere? How does one decide that one measurement (ice core) is representative, and the other (chemical readings near the surface) is not? Who gets to define "background level"? Of all the CO2 measurements out there, is the *one* source selected to represent the planet a matter of consensus as well? A vote by a panel of judges, like a beauty pageant? Is this how science is conducted now?

In science, one has to have a serious and evidence-supported justification for ignoring data. Whenever empirical data is rejected, it is a red flag. Without judging the rationales given for purposely excluding data, both the IPCC and Beck are waving it. Of course, the IPCC rejected a hell of a lot more data (90,000 direct measurements), so one could say their red flag is overwhelmingly larger than Beck's. The more data you reject, the better your reason for rejection must be.

Drive your data, change your definition

Speaking of red flags, whenever a graph changes its definitions mid-point, alarms should sound loud and strong. This is especially true if the methodology changes at a pivotal point in the graph. For example, Jaworowski, a vocal critic of ice core proxies, highlights this change in this graph.

(From: Jaworowski, Z., 2007, CO2: The greatest scientific scandal of our time, EIR Science)

Notice how after they change the definition of CO2 levels from ice core readings to actual measurements from CO2 stations, the curve rises exponentially. Yeah, that should turn on the ambulance sirens in any scientist's head. You can make the "trend" go in any direction you want simply by changing to a different set of data. It could be a defensible change. It could also be sleight of hand.

I understand using proxies because CO2 measuring stations did not exist back then. But why use proxies when there were direct CO2 measurements during the same time period? Wouldn't direct CO2 measurements back then be more comparable to direct CO2 measurements taken today than proxies? Do they have a *really* good reason for rejecting all that data?

What is "background" CO2 anyway?

Keeling, quoting his father's pioneering work on "background" CO2, explains:
"Measurements of the concentration of atmospheric carbon dioxide extend over a period of more than a hundred years. It is characteristic of all the published data that the concentration is not constant even for locations well removed from local sources or acceptors of carbon dioxide. Recent extensive measurements over Scandinavia, reported currently in Tellus, emphasize this variability: observations vary from 280 to 380 parts per million of air. These measurements are in sharp contrast to those obtained in the present study. The total variations at desert and mountain stations near the Pacific coast of North America, 309 to 320 parts per million is nearly an order of magnitude less than for the Scandinavian data. The author is inclined to believe that this small variation is characteristic of a large portion of the earth's atmosphere, since it is relatively easier to explain the large variations in the Scandinavian data as being a result of local or regional factors than to explain in that way the uniformity over more than a thousand miles of latitude and a span of nearly a year, which has been observed near the Pacific coast."
In other words, "background" CO2 is whichever source has the least variance in CO2. Why? Because the author is "inclined to believe" the smallest variation is representative of the earth's atmosphere. His definition of "background" is not based on actual atmospheric measurements showing it has very little variance. No, it is because it makes sense to him that the background shouldn't vary all that much.
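To put numbers on the "nearly an order of magnitude" remark in the quoted passage, here is a minimal sketch that only compares the two ranges Keeling cites. The variable and function names are my own, and a range is a crude stand-in for variance, so treat this as arithmetic on the quote, not as an analysis of the underlying data.

```python
# Ranges quoted in the Keeling passage, in parts per million (ppm).
scandinavia_ppm = (280, 380)
pacific_coast_ppm = (309, 320)

def spread(low_high):
    """Return the width of a (low, high) range."""
    low, high = low_high
    return high - low

scandinavia_spread = spread(scandinavia_ppm)
pacific_spread = spread(pacific_coast_ppm)
ratio = scandinavia_spread / pacific_spread

print(f"Scandinavia spread:   {scandinavia_spread} ppm")
print(f"Pacific coast spread: {pacific_spread} ppm")
print(f"Ratio: {ratio:.1f}x")
```

The Scandinavian range is 100 ppm wide and the Pacific coast range is 11 ppm wide, a ratio of roughly 9 to 1.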

Keeling continues to say:
"The concept of the atmospheric background has been backed up by millions of measurements made by a community of hundreds of researchers."
But he has no references for these millions of measurements (though he references other assertions in his critique). So I can't independently verify what he means by that. If the "background" definition has empirical support, this empirical data should be foremost in his argument. As it stands, it sounds like atmospheric background is a concept, widely accepted to be sure, but not very well defended. And in science, accepted and defended are two different things.

Incidentally, there are only five major CO2 measuring stations (atmospheric baseline observatories). Most of the data for mean monthly or annual CO2 levels come from the station on an active volcano (Earth's largest) called Mauna Loa, which last erupted in 1984. I assume climatologists have taken into account volcanic gases (one of which is CO2) as a potential confounder, and that this has nothing to do with the much higher readings of CO2 since they started taking direct measurements there.


So they rejected a huge amount of empirical data with a lot of variance for a proxy that has very little variance, barely climbing for centuries. Then they attached actual measurements, and CO2 levels leap. How much of it is an artifact of data exclusion and definition change?

I don't know the answer. But I shouldn't have had to ask the question.

Thursday, November 20, 2008

A Stand-Up Statistician

Who knew statisticians could be humorous?

I've just become a fan of William M. Briggs, statistician blogger. From global warming to current events, Dr. Briggs turns a painfully dull subject (for the rest of us) into entertaining lessons on the limits and pitfalls of statistics.

Take this for example.

September 6, 2008: Do not smooth time series, you hockey puck!

Dr. Briggs comments:

"The various black lines are the actual data! The red-line is a 10-year running mean smoother! I will call the black data the real data, and I will call the smoothed data the fictional data. Mann used a “low pass filter” different than the running mean to produce his fictional data, but a smoother is a smoother and what I’m about to say changes not one whit depending on what smoother you use.

Now I’m going to tell you the great truth of time series analysis. Ready? Unless the data is measured with error, you never, ever, for no reason, under no threat, SMOOTH the series! And if for some bizarre reason you do smooth it, you absolutely on pain of death do NOT use the smoothed series as input for other analyses! If the data is measured with error, you might attempt to model it (which means smooth it) in an attempt to estimate the measurement error, but even in these rare cases you have to have an outside (the learned word is “exogenous”) estimate of that error, that is, one not based on your current data."

When we normal people holler about this, we look like presumptuous malcontents. It's so much funnier when a PhD statistician says it.

This ties into something that's been bugging me. In my search for the "elusive standard deviation" (of the global mean surface temperature), I kept coming across well-meaning folk telling me the standard deviation of the global mean temperature is a fraction of a degree Celsius, usually around 0.25° C. They say there isn't that much variability when you compare the means across the years. It is always going to hover very closely near the climatology.

Then it dawns on me they are treating the means themselves as raw measurements, like readings from a thermometer. Instead of seeing the means as statistical artifacts with a huge amount of uncertainty, they get a clean slate as absolute numbers with no error attached to them. If you take averages of averages of averages, you are going to end up with nice, tidy numbers with no variance at all. Yes, you can serially average, but each step of the series has to propagate the error from all previous averages. If you smooth the time series, over and over again, without propagating the error, you are going to end up with "fictional data" that has almost no variance and is all but certain.
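Here is a small sketch of what repeated averaging does to the spread of a series. The data are synthetic noise I made up (centered on 14 with an SD of 10), the `running_mean` helper is my own, and the window of 10 is arbitrary; none of this is anyone's actual temperature pipeline.

```python
import random
import statistics

def running_mean(values, window):
    """Average each consecutive run of `window` values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

random.seed(0)  # fixed seed so the run is repeatable

# Synthetic "readings": pure noise around 14 with an SD of 10 (made-up numbers).
raw = [random.gauss(14.0, 10.0) for _ in range(1000)]

once = running_mean(raw, 10)    # smoothed one time
twice = running_mean(once, 10)  # the smoothed series smoothed again

raw_sd = statistics.pstdev(raw)
once_sd = statistics.pstdev(once)
twice_sd = statistics.pstdev(twice)

print(f"SD of raw readings:   {raw_sd:.2f}")
print(f"SD after one pass:    {once_sd:.2f}")
print(f"SD after two passes:  {twice_sd:.2f}")
```

Each pass shrinks the spread of the series, even though nothing about the underlying readings has become more certain. The smaller number describes the smoothed series, not the raw readings that went into it.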

I don't know about you, but I'd rather have "an uncertain truth" than a likely fiction.

Here are some of my favorite Briggs blogs.

November 12, 2008: Arcsine Climate Forecast
October 31, 2008: Breaking the Law of Averages: Probability and Statistics in Plain English
October 12, 2008: Peer Review Not Perfect: Shocking Finding

I'll close with my favorite quotation on statistics:

"There are three kinds of lies: lies, damn lies, and statistics."
-- Benjamin Disraeli, author, British statesman (1804-1881)

Wednesday, November 19, 2008

The Elusive Standard Deviation

What is the standard deviation of the global mean surface temperature, per year?

The issue of global warming begins by asking the question: Is the planet warming? The answer provided by the consensus of climate scientists is a resounding, "Yes." The cardinal statistic supporting this assertion is a rise in the "global mean/average surface temperature." For example:
  • "Records from land stations and ships indicate that the global mean surface temperature warmed by between 1.0 and 1.7°F since 1850...Since the mid 1970s, the average surface temperature has warmed about 1°F."
It stands to reason that any examination of the global warming problem has to begin here, with the "global mean/average surface temperature." We need to know how much this mean has risen, and if this rise is statistically significant.

The means as a probability distribution

Suppose you average the age of everyone in your household. You have three people aged 3, 14, and 43. The average (also called the "mean") age is 20 (3+14+43=60, divided by 3). Suppose you average the age of everyone in the house next door. They have 5 people aged 18, 19, 20, 21, 22. The average age of their house is also 20. The same exact average in two households can describe two very different situations. The fact of the matter is, the mean has limits in its ability to represent a group with one single point. One has to put that single point in context of how spread out the data is; in statistics, the spread is called "variance." There is a huge amount of variance in your house, not so much in your neighbor's.

The smaller the variance, the more "truthful" the mean is in representing the population. You don't even have anyone in your household who is twenty years old; this mean does not "truthfully" represent your household. The average age of 20 is a more accurate representation of the population in your neighbor's house than in your house. That is because your neighbor's house has a smaller spread (a range of 4 years, from 18 to 22) than yours (a range of 40 years, from 3 to 43).
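The two households can be checked in a few lines. This is a minimal sketch using Python's standard `statistics` module; the `summarize` helper is my own name, and I use the population SD (`pstdev`) because each list is the whole household, not a sample.

```python
import statistics

your_house = [3, 14, 43]
neighbors_house = [18, 19, 20, 21, 22]

def summarize(ages):
    """Return (mean, range, population SD) for a list of ages."""
    mean_age = statistics.mean(ages)
    age_range = max(ages) - min(ages)
    sd = statistics.pstdev(ages)
    return mean_age, age_range, sd

for name, ages in (("Your house", your_house),
                   ("Neighbor's house", neighbors_house)):
    mean_age, age_range, sd = summarize(ages)
    print(f"{name}: mean {mean_age:.1f}, range {age_range}, SD {sd:.1f}")
```

Both means come out to 20, but the ranges are 40 and 4 years, and the SDs are about 16.9 and 1.4 years.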

The mean, as a single point representing a population, has to be read against a probability distribution. It represents the middle of a distribution of data shaped like a bell curve. Statisticians divide each half of the bell curve into roughly three equal sections; each section is called a standard deviation (SD, also denoted by σ, lowercase sigma). So there are six standard deviations in a bell curve: three before the mean, and three after the mean. When you go out one standard deviation from the mean, both under and over, you get about 68% of the data in a typical bell curve. You have a 68% chance that the "true" mean lies within one SD, a 95% chance that it lies within two SDs, and a 99.7% chance that it lies within three SDs.
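The 68 / 95 / 99.7 figures can be reproduced directly. This sketch assumes an ideal normal (bell curve) distribution and uses the standard identity that the share of such a distribution within k standard deviations of the mean is erf(k / √2). The function name is my own.

```python
import math

def share_within(k):
    """Fraction of a normal distribution within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k) * 100:.1f}%")
```

This prints roughly 68.3%, 95.4%, and 99.7%. Real data are rarely a perfect bell curve, so these are reference values, not guarantees.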

When you compare the means of two different bell curves, the question is: what are the chances the difference is caused by random or meaningless fluctuations? The higher the variance of the two distributions, the more uncertain the "true" mean is, and the more likely the differences between the two uncertain means are random and not significant. The quickest way to eyeball the significance of any comparison is to see if the difference is more than one SD. If the change is larger than one SD, it is likely to be significant. If the change is smaller than one SD, the change is likely to be due to random fluctuations and errors in measurement.

Where's the mean? Where's the SD?

Now that we've gotten elementary statistics out of the way, let's look at that mean surface temperature again. The rise in the global mean surface temperature is oft cited and well known. But when you look at the graphs referenced by discussions of mean temperatures, you don't see any numbers for the means. You see a horizontal line labeled "0.00° C" and vertical bars going fractions of a degree below or over the zero. (For an example, look here.) One assumes that the bars represent the global mean temperatures, and they are going up. But upon closer examination, the bars represent not the global mean temperatures, but "global temperature anomalies," or departures from the Big Zero in the middle.

How do you evaluate the mean, if you don't know what the mean is or what its variance looks like? What is the absolute value of the global mean surface temperature, and what is its standard deviation? I was able to find this table showing the absolute mean temperatures per year from 1880 to 2007. In addition, this GISS site describes:
"For the global mean, the most trusted models produce a value of roughly 14 Celsius, i.e. 57.2 F, but it may easily be anywhere between 56 and 58 F and regionally, let alone locally, the situation is even worse."
So we can estimate the global mean to hover around 14° C. But what do these average temperatures represent without knowing their standard deviations?

Here is what I mean. A table that shows this:
1880: 13.88° C ± 10° C
2007: 14.73° C ± 10° C

reads very differently from a table that shows this:
1880: 13.88° C ± 0.25° C
2007: 14.73° C ± 0.25° C

Now given the fact that temperatures across the planet have a huge variance over the year, it is impossible that the standard deviation would be less than 1°C. (That would mean the global temperature distribution roughly ranges from 11° C to 17° C, which we all know is patently false.) What is the likely ballpark of the standard deviation?

The hottest record in Canada is 45° C (113° F) and the coldest record in Africa is -24° C (-11° F). It is not unreasonable to estimate the bulk of the world's temperatures for the year falls roughly in this range. This type of range would give us 30° C below and 30° C above the mean of 14° C, which gives a reasonable estimate that the SD should be in the ballpark of 10° C. (If the spread is actually wider, the SD would be even larger.)
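The ballpark above is a range-based guess: if a bell curve spans about six SDs end to end, then SD is roughly the range divided by six. Here is that arithmetic as a sketch. The function name and the default of six are my own framing of this rule of thumb, not a standard library routine.

```python
def sd_from_range(low, high, sds_spanned=6):
    """Rough SD guess: assume the range covers `sds_spanned` standard deviations."""
    return (high - low) / sds_spanned

# 30 degrees C below and above a mean of 14 degrees C
print(sd_from_range(14 - 30, 14 + 30))   # 10.0

# The two record temperatures quoted above, -24 and 45 degrees C
print(sd_from_range(-24, 45))            # 11.5
```

Either way the guess lands near 10° C, which is the figure used in the comparison that follows.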

If that is the case, why would a change from 13.88° C ± 10° C to 14.73° C ± 10° C be considered significant? The difference is well within the margin of error. Why would climate scientists make such a big to-do about this fraction of a degree increase, in context of the huge global variance? And why can't we find the exact SD? Surely it has been calculated. (You can google it until the cows come home, but you won't find that SD.)
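To make the two tables concrete, this sketch expresses the 1880-to-2007 change as a multiple of each candidate SD, using the one-SD eyeball rule from earlier in this post. The function name is mine, and both SDs are the ballpark figures discussed here, not published values.

```python
def change_in_sds(before, after, sd):
    """Express a change as a multiple of the standard deviation."""
    return (after - before) / sd

mean_1880 = 13.88
mean_2007 = 14.73

for sd in (10.0, 0.25):
    multiple = change_in_sds(mean_1880, mean_2007, sd)
    verdict = "more than" if abs(multiple) > 1 else "less than"
    print(f"SD {sd:5.2f}: change is {multiple:.3f} SDs ({verdict} one SD)")
```

The same 0.85° C rise is about 0.085 SDs if the SD is 10° C, and about 3.4 SDs if the SD is 0.25° C. Which SD applies is the whole question.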

Who is the Big Zero?

It turns out that there is no such thing as the "global mean surface temperature" or its SD as statistical entities. How then, do they know the mean is rising, if it doesn't exist? What are they comparing the "anomalies" to? Who is this Big Zero in all the graphs and data tables?

I was naive and ignorant enough to assume that climate scientists take temperature readings from weather stations all over the earth for the period of a year, and average them into one temperature. Then they compared that average from year to year to observe a rising trend in this mean temperature. Not.

This is what really happens. Climate scientists take readings from weather stations and feed the data into a computer model that adjusts for all sorts of variables, including number of wet days, cloud cover, sunshine, diurnal temperature range, etc. Using computer modeling, they divide the world into 5x5 grids, fill the boxes with known data, and interpolate unknown data. They run the model for a while and come up with a single mean for a 30-year period, usually 1961 - 1990. This mean is called a climatology. The GISS site discusses the Elusive Absolute SATs (Surface Air Temperatures):
Q. If SATs cannot be measured, how are SAT maps created ? A. This can only be done with the help of computer models, the same models that are used to create the daily weather forecasts. We may start out the model with the few observed data that are available and fill in the rest with guesses (also called extrapolations) and then let the model run long enough so that the initial guesses no longer matter, but not too long in order to avoid that the inaccuracies of the model become relevant. This may be done starting from conditions from many years, so that the average (called a 'climatology') hopefully represents a typical map for the particular month or day of the year.
There are differing methods and time periods used for modeling climatologies, resulting in different climatologies. Climate scientists pick the climatology most appropriate for their purpose and compare their annual mean temperatures (also calculated by computer models) to it. The climatology is the absolute standard by which all other temperature calculations are measured; it is the Big Zero. All annual temperatures are evaluated in terms of either hotter or colder than the climatology. The climatology itself does not have a standard deviation, because it is not a straightforward average, but an "adjusted" figure, a result of a very educated computer guess. Scientists estimate the margin of error of these climatologies to be exceptionally small. Climate Research Unit (CRU) explains here.
How accurate are the hemispheric and global averages?
Annual values are approximately accurate to +/- 0.05°C (two standard errors) for the period since 1951. They are about four times as uncertain during the 1850s, with the accuracy improving gradually between 1860 and 1950 except for temporary deteriorations during data-sparse, wartime intervals. Estimating accuracy is a far from a trivial task as the individual grid-boxes are not independent of each other and the accuracy of each grid-box time series varies through time (although the variance adjustment has reduced this influence to a large extent). The issue is discussed extensively by Folland et al. (2001a,b) and Jones et al. (1997). Both Folland et al. (2001a,b) references extend discussion to the estimate of accuracy of trends in the global and hemispheric series, including the additional uncertainties related to homogeneity corrections.
Why do climate scientists use climatologies, instead of straight-up means? The same CRU site elaborates.
Why are the temperatures expressed as anomalies from 1961-90?
Stations on land are at different elevations, and different countries estimate average monthly temperatures using different methods and formulae. To avoid biases that could result from these problems, monthly average temperatures are reduced to anomalies from the period with best coverage (1961-90). For stations to be used, an estimate of the base period average must be calculated. Because many stations do not have complete records for the 1961-90 period several methods have been developed to estimate 1961-90 averages from neighbouring records or using other sources of data. Over the oceans, where observations are generally made from mobile platforms, it is impossible to assemble long series of actual temperatures for fixed points. However it is possible to interpolate historical data to create spatially complete reference climatologies (averages for 1961-90) so that individual observations can be compared with a local normal for the given day of the year.
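The anomaly idea in the CRU passage can be sketched with two invented stations. Everything here is hypothetical: the station names, the temperatures, and the three-year base period standing in for 1961-90. It only illustrates subtracting each station's own base-period average, not CRU's actual procedure.

```python
import statistics

def anomalies(series, base_years):
    """Subtract the station's own base-period mean from every year."""
    base = statistics.mean(series[year] for year in base_years)
    return {year: temp - base for year, temp in series.items()}

# Made-up annual means in degrees C for two stations at different elevations.
valley = {1961: 15.0, 1962: 15.2, 1963: 14.8, 2007: 15.6}
mountain = {1961: 5.0, 1962: 5.2, 1963: 4.8, 2007: 5.6}

base_years = (1961, 1962, 1963)  # short stand-in for the 1961-90 base period

valley_anom = anomalies(valley, base_years)
mountain_anom = anomalies(mountain, base_years)

for year in sorted(valley):
    print(f"{year}: valley {valley_anom[year]:+.2f}, "
          f"mountain {mountain_anom[year]:+.2f}")
```

The two stations differ by 10° C in absolute terms, yet their anomalies match year for year. That is the appeal of anomalies, and also why the absolute temperature drops out of the picture.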
Why do they talk about the mean surface temperature, if the mean doesn't exist? The best I can make of it, the mean is a theoretical estimate (climatology + the anomaly) that is assumed to be a close correlate of global anomaly trends. If the anomalies go up, it is undisputed that the mean has also gone up.

So is the rise in global mean surface temperature significant?

It depends. Not on objective data, mathematical rigor, and the scientific method, but on personal values. Do you trust climate scientists and their computer modeling or not? It is a funny way to do science, because when it comes down to it, it is a matter of belief. Do you believe they have done a good job in "adjusting" all the variables in their computer models? Do you believe they have both the intellectual competence and the professional integrity to have factored in all the relevant variables accurately to get the "truth"?

If you are comparing computer-adjusted data with computer-adjusted data, how do you know if the difference between them is significant? You have to trust the person doing the adjusting.

Please don't get me wrong. I am not casting aspersions on climate scientists at all. I am describing an inherent subjectivity in all endeavors entirely dependent on computer modeling. What comes out is simply a function of what goes in. You can program the computer to spit out whatever number you want. And the decision of what goes in is ultimately subjective. In other sciences, methodology interacts with reality, and you get the results you get whether you like it or not. Computers do not interact with the real world. The computer model does not get feedback from reality, only from the programmer. There is no way to cut out the subjective input and values of the programmer from the process.

The only check and balance that exists in a field composed entirely of computer modeling is the community of scientists and their subjective approval of one's programming. It is no wonder that "consensus" is used so much to describe climate change.

Links for further reading:

Steve McIntyre's Climate Audit
Mathematician who "audits" climate modeling
20 Questions Statisticians Should Ask About Climate Change (pdf)
by Edward J. Wegman, statistician, George Mason University
William M. Briggs, Statistician: Blogs on Global Warming
Letter to Call for Review of the IPCC
by Vincent Gray, climate scientist and former IPCC Reviewer