Human-consensus hurricane forecasting
“Through computer advances, model forecasts very likely will continue to improve, assuming we remember one fundamental problem with tropical cyclone forecasting: maximizing observations.”
“Numerical prediction of tropical cyclone tracks has improved tremendously since the early models of the 1950s and 1960s. Ironically, today’s reliance on model guidance has possibly led to the decline in skill of subjective tropical cyclone forecasts. It is hard to imagine that landfall forecasts in the 1970s were about as good as they are today and watch/warning areas were smaller. Back then forecasters relied very much on subjective forecast techniques. Today they rely heavily on model forecasts. Revitalizing and improving subjective analysis and forecast skills without inhibiting numerical model advances could provide significant improvements in track forecasts.”
— Dr Steve Lyons, “Hurricane Forecasting Considerations”
This week we’re going to be adding some significant enhancements to the Stormpulse website. One of these is the ability to create an account (free and painless, we promise!), which you’ll need to do if you want to participate in our forecasting system. Then, the next time there’s an active storm (post-Dean), you’re going to tell us where you think the storm is going to go by filling out a slick little form on the Stormpulse home page (just below the map) that asks you where the storm will be in 12, 24, 36, 48, 72, and 120 hours, as well as what the intensity (maximum wind speed) of the storm will be at those points in time. Then we’re going to take everyone’s forecasts and aggregate them in order to see if we can accurately forecast the movement and strengthening (or weakening) of tropical cyclones.
Wait a minute! Aren’t there all kinds of forecasting models out there already?
Yes. But this will be, as far as we know, the world’s first human-consensus hurricane-forecasting model.
Do you guys think you are some kind of experts?
No. In fact, that’s the point of the system—to de-emphasize the individual experts and discover the collective expert (and more importantly, the collective expert’s forecasts).
Why might this work?
The idea to do this hit us on the head in July of 2006. Since then, we’ve had it mostly under wraps, sharing the idea with close acquaintances and friends while gathering insight wherever we could find it. And all of our research pushes us toward the conclusion that this just might work.
For example, in June of this year (2007), we attended the Governor’s Hurricane Conference in Ft. Lauderdale, Florida. While there, we attended a few classes on tropical meteorology. In those classes, our suspicions about the very small world of professional hurricane forecasting were affirmed, insofar as it has several characteristics that make it ripe for disruption by a more democratic process:
- It is a world currently dominated by a few experts. Dr Gray of Colorado State University, Dr Steve Lyons of The Weather Channel, the Hurricane Research Team at NASA, the forecasters at the National Hurricane Center . . . all of these folks are considered experts, and rightly so. They are. But if the research behind the wisdom of crowds theory is correct, it would stand to reason that none of these experts in isolation will consistently perform better than a diverse, independent group. 
- It contains subjectivity, for better or for worse. For better, that subjectivity represents valuable intuition and insight—”I can feel it in my bones!” For worse, that subjectivity represents the unavoidable flaws of human judgment.
- It contains traces of bureaucracy. Don’t get me wrong: this is not a criticism of the National Hurricane Center or any other group of professionals whose great challenge it is to produce accurate, timely, and responsible forecasts. Nevertheless, any group of professionals with some order, structure, and authority will contain some amount of bureaucracy that could hinder its performance. Is that going too far? I don’t think so. All we’re saying is that the existing system necessarily contains rules and politics that make it imperfect, and we point out that imperfection to underscore the opportunity to improve.
- Satellite data and images are underutilized in existing computer models. Current computer models have a limited ability to digest data gathered via satellite. This is unfortunate since satellite images are the best views we have of what’s going on inside a storm. Having tens, hundreds, or even thousands of human interpreters of satellite data provide their input into a human-consensus model should boost the accuracy of a resulting, synthesized forecast.
- Existing models are weak in predicting storm intensity and size. While track guidance has improved greatly due to advances in computing power, intensity predictions have not seen the same increase in accuracy. Trying something completely different, calling on an army of human forecasters instead of depending so heavily on computers, could prove to be a breakthrough in this area. Even if what emerges is a complete failure at providing track guidance, what a human-consensus model provides in forecasting intensity, storm size, and storm surge could prove beneficial for years to come over computer-only calculations.
- Computer-consensus models have performed well. A 2006 report from the NHC showed that the consensus models GUNA and CONU provide the best track guidance. At the conference, a forecaster from the National Hurricane Center told the audience that “for some reason it would seem that the [statistical/dynamic] models have offsetting biases in them that cancel each other out when you average them together.” Our thoughts exactly, which brings us to the next and most subtle point:
- You are a forecasting model. If the previous point is true, why stop at making a consensus out of only the computer models? Why not attempt to aggregate and synthesize all of the available models, computer and human, to produce one unified forecast?
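The unified consensus described above can be sketched very simply: treat every source, computer model or human forecaster, as one equally weighted member and average their predicted positions. All names and coordinates below are hypothetical, purely for illustration.

```python
def consensus_position(forecasts):
    """Average a list of (lat, lon) forecasts into one consensus point."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    lats = [lat for lat, lon in forecasts]
    lons = [lon for lat, lon in forecasts]
    return (sum(lats) / len(lats), sum(lons) / len(lons))

# Two computer models and two human forecasters, each predicting where
# the storm's center will be 24 hours from now (degrees N, degrees E):
members = [
    (25.1, -80.0),  # model A (hypothetical)
    (25.5, -80.6),  # model B (hypothetical)
    (25.3, -80.2),  # human forecaster 1 (hypothetical)
    (25.7, -80.4),  # human forecaster 2 (hypothetical)
]

lat, lon = consensus_position(members)
print(round(lat, 2), round(lon, 2))  # 25.4 -80.3
```

If the members’ biases really do offset one another, as the NHC forecaster suggested, the averaged point should sit closer to the truth than most of the individual members.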
Won’t all of the novices or intentional misanthropes spoil the system?
No. We are going to keep track of users’ performance and weight the credibility of their forecasts accordingly. So, if Bob is consistently off by 500 miles at 12 hours out, we’re going to take his forecasts with a grain of salt in our final equation. On the other hand, if someone, no matter what his position in the world of weather, continually proves to be an accurate forecaster, we are going to weight his forecast more heavily.
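One simple way to realize this weighting, assuming we track each participant’s mean track error at a given lead time, is to weight each forecast by the inverse of that error, so a forecaster who is typically off by 500 miles counts far less than one who is typically off by 50. The names and numbers here are hypothetical, and a production system would be more elaborate.

```python
def weighted_consensus(forecasts, mean_errors):
    """Weight each (lat, lon) forecast by 1 / mean historical error (miles)."""
    weights = {name: 1.0 / mean_errors[name] for name in forecasts}
    total = sum(weights.values())
    lat = sum(weights[n] * forecasts[n][0] for n in forecasts) / total
    lon = sum(weights[n] * forecasts[n][1] for n in forecasts) / total
    return lat, lon

forecasts = {"bob": (27.0, -82.0), "alice": (25.0, -80.0)}
mean_errors = {"bob": 500.0, "alice": 50.0}  # mean 12-hour error, miles

lat, lon = weighted_consensus(forecasts, mean_errors)
# Alice's forecast dominates: the consensus lands near (25.2, -80.2)
```

Inverse-error weighting is just one choice; the same structure accommodates any scheme that maps a track record to a weight.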
What about privacy?
If you choose to participate, whether or not your name shows up anywhere on our site or is ever shared outside of Stormpulse, Inc. will be up to you. For those who don’t mind their identity being attached to their performance, we are planning to publish rankings that show where you stand against the rest of the participants.
Where can this go?
Near the end of 2008 there is going to be a re-opening of the Joint Hurricane Testbed, a government program wherein the National Hurricane Center carefully considers suggestions as to how it can improve its forecasting process. If our system works in any measure, submitting it there is one possible outcome.
Are you serious? / That won’t work. / Wow, that’s cool!
Yes. / OK. / Thanks, we’ll see.
It’s noteworthy that the National Hurricane Center already embraces this truth insofar as forecasters work on rotation. There is also a rule that a forecast may not differ drastically from its predecessor. While on the one hand this may suppress a moment of brilliance (and that’s where there’s an opportunity to improve), it avoids public outcry and fear over flip-flopping forecasts and is in effect a mechanism that bends toward consensus.