Five Questions For... Michael Story

Jul 27, 2022

Michael Story is Director of the Swift Centre for Applied Forecasting, where he is working to make forecasting useful to both policy-makers and ordinary people. Previously, he worked as a Managing Director at Good Judgment after qualifying as a Superforecaster in the original Good Judgment Project. He has thought a lot about how to make forecasting better, and I thought it would be interesting to get his views on the current state of forecasting, what mistakes forecasters are likely to make, why companies haven’t been as keen on internal prediction markets, and more!

He is also currently looking for experienced forecasters to help explore ways in which forecasters can help decision makers understand the "why" of forecasters' judgments and help them make decisions. If you’re an experienced forecaster and want to take part, you can read more here.

1) Do you think that there are any common mistakes that excellent forecasters make in the way that they view the world? Similarly, are there any sorts of question on which you think people who don’t forecast at all are likely to give more insightful answers than superforecasters?

Forecasters are generally pretty good at assessing the world accurately, but if there were a reason for them to be consistently slightly off in their forecasts, I think it might emerge from being a community where people mostly talk among themselves. Your life experience as an excellent forecaster will shape the way you see the world and might lead to slight distortions.

It’s not that there’s a particular question that they would be worse at answering, more that there’s a risk that people in forecasting might create a mental model of what other people are like that is skewed by the fact that they’re so overexposed to people who are also forecasters. So while I’m not confident that there are definitely particular areas where forecasters are likely to perform worse than other people, the type of questions I believe they would be more likely to do poorly on are questions that rely on assumptions about lives that are quite different to the type of lives that forecasters lead.

2) How do you use forecasting in your personal life? Are there any questions you would be unwilling to forecast (for example, the chance a relationship fails), or are you willing to forecast pretty much anything?

Once you start thinking like a forecaster, it’s hard not to think about things in a probabilistic way. Lots of areas of life require implicit forecasts, and it’s incredibly useful to think like a forecaster when you’re deciding whether to take a new job, move countries, or start a new relationship, all of which I have done in the last few years. You need to think about whether the chance of some outcome is above X%, and get a rough sense of the impact of each outcome, and make your expected value calculation. That’s not always easy to do, though: confronting uncomfortable possibilities can make forecasting emotionally taxing and some potential outcomes are very difficult to think about objectively without becoming emotionally compromised. At times like that it’s very wise to seek out your forecaster pals for advice.
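As a rough illustration of the expected value calculation described above, here is a minimal Python sketch. The `Outcome` class, the two function names, and all of the numbers are hypothetical and chosen only for illustration; the "value" scale is arbitrary and personal, and nothing here is a method the interviewee describes using.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    label: str
    probability: float  # subjective probability between 0 and 1
    value: float        # impact on an arbitrary personal scale


def expected_value(outcomes):
    """Probability-weighted sum of the outcome values."""
    total_p = sum(o.probability for o in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("outcome probabilities should sum to 1")
    return sum(o.probability * o.value for o in outcomes)


def break_even_probability(value_if_good, value_if_bad, value_of_status_quo):
    """Chance of the good outcome (the 'X%') above which the new option
    beats staying put, in a simple two-outcome decision."""
    if value_if_good <= value_if_bad:
        raise ValueError("the good outcome should be worth more than the bad one")
    return (value_of_status_quo - value_if_bad) / (value_if_good - value_if_bad)


# Hypothetical decision: take a new job or stay in the current one.
new_job = [
    Outcome("goes well", 0.6, 10.0),
    Outcome("goes badly", 0.4, -5.0),
]
ev_new_job = expected_value(new_job)
threshold = break_even_probability(10.0, -5.0, 1.0)

print(f"Expected value of the new job: {ev_new_job:.2f}")
print(f"Worth taking if P(goes well) is above {threshold:.0%}")
```

The break-even form is often the more useful one in practice: you only need to judge whether your probability is above or below the threshold, not pin it down exactly.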

I also think making explicit forecasts about personal lives can help young people realise their potential. Lots of talented young people are underconfident, and that seems like something we should try and fix for their and everyone else’s benefit. Prompting people to think in a more probabilistic way about their options can help jog people’s ideas about what’s possible for them. For people who are a bit older and can recognise talent, one of the best things you can do is identify talented younger people who are underconfident and raise their ambitions, which you can do fairly cheaply, and certainly more easily than you can raise talent.

3) There’s some interesting research by Nuño Sempere, Misha Yagudin and Eli Lifland that claims that internal prediction markets have failed to gain traction in companies, and that companies that have trialled prediction markets often ended up abandoning them. Why do you think this is? Do you expect more companies to adopt internal prediction markets in the future?

I also worked on projects trying to establish internal forecasting tournaments inside companies and there are a few reasons why the projects I’ve had access to struggled to take off.

Firstly, it’s hard to generalise from online forecasting projects to what happens inside a big company. Sites like Good Judgment Open or Metaculus and their users are quite unusual in that they host large collections of people who like forecasting and do it as a hobby. This is not true of a typical large corporation! If we look back at the first question about what forecasters are likely to get wrong, my answer there might apply here: someone who loves forecasting may overestimate how popular forecasting tournaments are likely to be inside large companies. Forecasters tend to be evangelists - they think it’s obvious that forecasting is beneficial and that companies would benefit from doing more of it. But often people working in large corporations just don’t find forecasting that fun, so engagement is always a problem. That’s not an issue if the organisation is huge enough to coincidentally employ a quorum of forecasters, but that is a limiting factor.

The second reason forecasting platforms inside firms often fail to get going is that they are very expensive to run. And again, many forecasters find it hard to see how costly it is to run those systems and underestimate how hard it is in coordination and bandwidth terms for those companies to do. You can make arguments that companies are underestimating the benefit of this information, but they may not see it that way for long enough to be willing to tolerate the costs.

Thirdly, some firms that set up forecast platforms don’t seem as committed as they might be: perhaps they want to want to forecast, rather than actually value it. For some firms, it’s a bit like going to the gym or eating healthily. Everyone knows that they should be exercising, but it’s difficult to actually go ahead with doing it. I’ve often seen projects get going with kickoff meetings and lots of participation, but when the researchers who set up the internal market leave, there’s no real external psychological motivation to persevere with it and engagement drops off. In practice, the researcher has been acting like a personal trainer and adding motivation into the system, but when the researcher goes away and is no longer motivating the forecasters, things tend to fall apart.

I also think we need to do a lot more work on better adapting forecasting practices to an organisational setting. A lot of what we know about how to do forecasting comes from research on government forecasting competitions, but those competitions are designed to discriminate between forecasters and teams, not to produce the information that is most valuable under the sort of constraints that firms deal with: small numbers of people, limited time and bandwidth. If you wanted to set up a forecasting competition within a company, you would need to optimise for efficiently finding out true information rather than figuring out who is the best forecaster in the tournament.

Once you move beyond the government tournament model, a lot of unknowns open up. For example, a lot of things that we care about might not have clear resolution criteria, or can’t be worded unambiguously: they’re fuzzy questions. This doesn’t matter if you’re running a competition where the goal is to find a winning forecaster - you can just choose the questions which do have strict resolution criteria and are unambiguous. But most of the time the things companies really care about are fuzzy and they are willing to trade off way more rigour than most forecasters are comfortable with. Let’s imagine a firm that is interested in opening a new plant in a country and is worried about political risk. They want to forecast whether the country will be politically stable for the next 20 years, and that’s quite a fuzzy question which is hard to resolve cleanly.

So, what would you do? A prediction market or other forecast platform might try to figure out proxy measures for something like this but it would be hard, and maybe the most useful output would emerge by sourcing qualitative information from forecasters. Maybe soliciting base rates for different types of risk incidents and other information about potential problems in the target country might be more useful than the actual numerical forecast. If that works, are you now doing forecasting, running a prediction platform, or doing something else which we haven’t studied as carefully and which might lead to the introduction of error? It’s almost like the lamp post problem - we search in the light of forecast research because it’s a known set of problems and solutions, but that might represent the easiest set of research findings to produce rather than techniques for producing the most useful information.

4) Suppose I come to you with a forecasting question on a topic that you know next to nothing about - what is your process for researching the question, and is there anything particularly unique that you do that you think other (super)forecasters could learn from?

There is a standard ‘set’ of techniques most forecasters use when encountering something new. They’d first look at the base rate of similar incidents in the past, giving them the “outside view” starting estimate of the probabilities involved. Then they’d look at the specific mechanisms by which an event of this type might occur, considering the time and event scale carefully. They’d review a bunch of different sources to adjust their estimate and finally conduct a “pre-mortem”: answering the question “how am I most likely to have been wrong?”.
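The first two steps above, a base rate followed by attention to the time scale, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the event counts are hypothetical, and the horizon conversion assumes each period is independent with the same probability, which real events often violate.

```python
def base_rate(event_count, period_count):
    """Historical frequency: events observed per period (e.g. per year)."""
    if period_count <= 0:
        raise ValueError("period_count must be positive")
    return event_count / period_count


def probability_over_horizon(per_period_probability, periods):
    """Chance of at least one event across `periods`, ASSUMING each period
    is independent and has the same per-period probability."""
    return 1.0 - (1.0 - per_period_probability) ** periods


# Hypothetical reference class: 4 comparable incidents in the last 40 years.
annual = base_rate(4, 40)
within_three_years = probability_over_horizon(annual, 3)

print(f"Outside-view annual rate: {annual:.1%}")
print(f"Chance of at least one event in 3 years: {within_three_years:.1%}")
```

The resulting number is only the outside-view starting point; the later steps (mechanisms, sources, pre-mortem) are where it would be adjusted.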

These are the likely common answers, and they are a useful starting point, but there’s a lot more going on among forecasters than people deploying this process. Many people have their own individual methods and practices which work with the grain of their brains, even down to the level of intuition, but because those things are specific to people, they wouldn’t turn up in a big survey. Having said that, ‘classic’ forecasting techniques are classic because they work.

The most important thing to think about when forecasting is calibration. You might start out using a number of different techniques or even ‘trusting your gut’ and intuition, but if you don’t measure your accuracy and see what causes it to improve, you’re really not going to get better.
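One common way to measure accuracy and calibration is to keep a record of your stated probabilities alongside what actually happened, then compute a Brier score and a simple calibration table. The sketch below assumes that approach; the function names and the forecast history are hypothetical, and it is not presented as the scoring used by any particular platform.

```python
from collections import defaultdict


def brier_score(forecasts):
    """Mean squared difference between stated probabilities and 0/1 outcomes.
    Lower is better; always saying 50% scores 0.25."""
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)


def calibration_table(forecasts, n_bins=5):
    """Group forecasts into probability bins and compare the average stated
    probability with how often the event actually happened in each bin."""
    bins = defaultdict(list)
    for p, outcome in forecasts:
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[index].append((p, outcome))
    table = []
    for index in sorted(bins):
        items = bins[index]
        mean_p = sum(p for p, _ in items) / len(items)
        frequency = sum(outcome for _, outcome in items) / len(items)
        table.append((mean_p, frequency, len(items)))
    return table


# Hypothetical track record: (stated probability, 1 if it happened else 0).
history = [(0.9, 1), (0.9, 1), (0.8, 0), (0.3, 0), (0.2, 0), (0.1, 1)]

print(f"Brier score: {brier_score(history):.3f}")
for mean_p, frequency, count in calibration_table(history):
    print(f"said about {mean_p:.0%}, happened {frequency:.0%} of the time (n={count})")
```

A well-calibrated forecaster's "said" and "happened" columns roughly match once each bin holds enough forecasts; six data points, as here, are far too few to conclude anything.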

I do believe there is a neurodiversity component to this as well. In my experience, one of the reasons that forecasters can get things right is that they tend to be a little less influenced by social information. If you’re not very swayed by social trends or the prevailing views, then you have a big advantage in being able to dispassionately apply some of the standard techniques outlined above. I think this accounts at least in part for things appearing a bit more ‘obvious in retrospect’, when they’re stripped of the social and emotional context.

There probably aren’t any techniques that I use that I’ve never heard of anyone else using. One area where I’ve been able to get better results than I otherwise would have done is in figuring out where there might be weaknesses in the crowd prediction. I’m good at identifying where other people have gone completely wrong. For instance, I might be able to develop a mental model about whether the crowd is likely to overestimate or underestimate the chance that some country will start a war. You can quantify the group-think there: if I think people’s mental models lead them to see the country as 10% more likely to be belligerent than it really is, I can adjust from there.
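The adjustment in that last example can be written down explicitly. The sketch below assumes the "10%" is a relative overestimate (the crowd's number is 1.10 times what it should be); the interview does not say whether a relative or a percentage-point shift is meant, so that reading, the function name, and the numbers are all assumptions made for illustration.

```python
def debias_crowd(crowd_probability, relative_inflation):
    """Undo a suspected proportional bias in a crowd forecast.

    relative_inflation = 0.10 means 'I think the crowd's number is 10%
    higher than it should be'; a negative value means the crowd is too low.
    """
    if not 0.0 <= crowd_probability <= 1.0:
        raise ValueError("crowd_probability must be between 0 and 1")
    if relative_inflation <= -1.0:
        raise ValueError("relative_inflation must be greater than -1")
    adjusted = crowd_probability / (1.0 + relative_inflation)
    return min(max(adjusted, 0.0), 1.0)  # keep the result a valid probability


# Hypothetical: the crowd says 22% for a war starting, and I suspect
# they are inflating the country's belligerence by about 10%.
my_forecast = debias_crowd(0.22, 0.10)
print(f"Crowd: 22%, adjusted: {my_forecast:.0%}")
```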

5) What is the best argument you’ve heard for forecasting being overrated or not particularly useful, and why do you disagree with it (if you do)?

Forecasting might be overrated. Nearly all forecasters are paid more by their day jobs to do something other than forecasting. The market message is “don’t forecast”! Forecasting websites also don’t exactly get a huge amount of traffic, so it isn’t like huge numbers of people are relying on these forecasts to make important decisions yet. If this was immediately valuable to people, they would be looking at it all the time, and they’re not.

In fact the aspect of forecasting the media pays most attention to is coverage of forecasters themselves: look at this unusual person who likes predicting things, rather than what the actual forecasts say.

Our goal at the Swift Centre is to change this. It might be an uphill slog but I believe there are things we can do: we can get better at matching our forecasting to the topics people are interested in, and we can present ourselves better. One thing that makes a big difference is having a good UX for forecasting sites, way more than people are comfortable acknowledging. We need to get better at explaining ourselves. Probabilities on their own are not that useful for decision-making, but probabilities accompanied by more information might be a game-changer.

During the pandemic, Dominic Cummings said some of the most useful stuff that he received and circulated in the British government was not forecasting, it was qualitative information explaining the general model of what’s going on, which enabled decision-makers to think more clearly about their options for action and the likely consequences. If you’re worried about a new disease outbreak, you don’t just want a percentage probability estimate about future case numbers, you want an explanation of how the virus is likely to spread, what you can do about it, how you can prevent it. Not the best estimate for how many COVID cases there will be in a month, but why forecasters believe there will be X COVID cases in a month. Getting this contextual and decision-relevant information right is my full-time focus at the Swift Centre and I think it’s where we have the best chance of making forecasting as useful as it can be.

Samstack
