Notes on forecasting strategy
[Note: You need to be familiar with forecasting for this to be of any interest. I may have made some mistakes given my mediocre mathematical ability and because I wrote this very quickly.]
Imagine that there are a few people forecasting on a question, and that the question has three possible ways of resolving. To make things vivid, the question could be “Who will win the NYC Mayoral Election?”. Option A is “Zohran Mamdani”, Option B is “Andrew Cuomo”, and Option C is “Curtis Sliwa (or any other candidate)”. We look at the distribution of forecasts, and there are currently three people forecasting on the question: one has the username ‘dontmesswiththeZohan97’, and gives a 100% probability that Zohran will win the election. The second is an avid Cuomo fan who puts 100% on Cuomo. The third, who just hates the other two candidates, does the same for Mr. Sliwa (or any other candidate).
Imagine looking at this market, and deciding what you should do. But here’s a twist: we’re in a parallel universe where you haven’t been following the election at all, and you have no idea who will win, or even who the candidates are. What should you do?
A simple answer is that if you give a probability of 1/3 to each option, then under the standard Brier scoring rule for questions with multiple possible answers you’ll do much better than the median forecaster. Let’s say that Curtis pulls off a huge upset and wins the election. The Zohran fan gets a Brier score of 2 (the worst possible score), the Cuomo fan also gets a Brier score of 2, and the ‘I hate the other candidates’ guy gets a perfect score of 0. You get a score of 0.67, beating the median score across the four forecasters (1.33) by a margin comfortable enough to make you a Superforecaster if you’re able to find more questions with bad forecasters1, without even knowing anything about the question you’re answering.
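To make the arithmetic concrete, here is a minimal Python sketch of the scoring. The names and numbers are just the ones from the example above:

```python
# Multi-option Brier score: the sum of squared differences between the
# forecast vector and the outcome vector (1 for the winner, 0 elsewhere).
def brier(probs, winner_index):
    return sum((p - (1 if i == winner_index else 0)) ** 2
               for i, p in enumerate(probs))

# Options: [Mamdani, Cuomo, Sliwa/other]
forecasts = {
    "Zohran fan":        [1.0, 0.0, 0.0],
    "Cuomo fan":         [0.0, 1.0, 0.0],
    "Sliwa voter":       [0.0, 0.0, 1.0],
    "you (uniform 1/3)": [1 / 3, 1 / 3, 1 / 3],
}

winner = 2  # suppose Sliwa (or another candidate) pulls off the upset
scores = {name: brier(p, winner) for name, p in forecasts.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
# Zohran fan: 2.000, Cuomo fan: 2.000, Sliwa voter: 0.000, you: 0.667

ranked = sorted(scores.values())                      # [0.0, 0.667, 2.0, 2.0]
print("median score:", (ranked[1] + ranked[2]) / 2)   # -> 1.333...
```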
This example is very contrived. You probably won’t find a question where each forecaster is putting 100% on a different option. But I think this gets at something that bleeds more into how people become Superforecasters than you might realise.
For one thing, my impression is that you can get a very good score by just copying the crowd forecast directly. This might sound odd: if you copy the crowd forecast, wouldn’t you do exactly as well as the median forecaster? No, you wouldn’t, because the aggregate forecast does much better than the median forecaster. Taking our example above, you’ve given a forecast exactly in line with the mean of all the other forecasts, yet trounced the median forecaster’s score.
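Just to spell out the aggregation step from the example (a two-line check, nothing more):

```python
# The element-wise mean of the three partisan forecasts is exactly the
# uniform 1/3 forecast, so "copying the crowd" and "knowing nothing" coincide.
partisans = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
crowd_mean = [sum(col) / len(partisans) for col in zip(*partisans)]
print(crowd_mean)   # -> [0.333..., 0.333..., 0.333...]
```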
As far as I know, it’s not particularly uncommon for groups of forecasters to find questions where the crowd appears to be unusually stupid (or slow to update), so they can clean up without doing as much work or research as many would expect top forecasters to do. Question selection is much more important than many people realise in achieving Superforecaster status.
Another point I would make is that proficient forecasters often go about forecasting very differently from how people who are only vaguely interested in it might imagine. Ezra Karger, who did exceptionally well in the Astral Codex Ten forecasting tournament two years in a row, wrote about the strategy he used to win the tournament:
I began by collecting data from Manifold Markets for these questions. I then compared those forecasts to the forecasts of superforecasters in the blind data, subset to those who had given forecasts on the S&P500 and Bitcoin questions that were reasonably consistent with the efficiency of markets; I subset to those who forecasted between 30% and 80% for the probability that the S&P500 and Bitcoin would increase during 2023, which were the only reasonable predictions by the time blind mode ended in mid-January. I then used my own judgment to tweak forecasts where I strongly disagreed with the prediction markets and the superforecasters (for example, I was more than 15 percentage points away from the average of Manifold Markets and the efficient-market-believing superforecasters on questions 17, 19, 21, 30, 34, and 50). I paid especially close attention to questions where late-breaking news made the superforecasters’ forecasts less relevant (and I downweighted their forecasts on those questions accordingly).
This is not a process of ‘figure out the base rate, do a load of research into the topic to work out how much you should deviate from that base rate, and maybe integrate the views of some other people who have done similar research to smooth out the noise’. The bulk of the process is figuring out the best way to make use of other people’s forecasts, and only then do you use your own judgment to make adjustments. And even then, there’s no real need for that last step; I’m fairly certain the other forecasts alone are sufficient to do very well indeed.
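To show the shape of that workflow, here is a rough Python sketch. Everything in it — the data structures, the sanity-check flag standing in for the S&P500/Bitcoin filter, the 50/50 blend, and the use of the 15-point figure as an override threshold — is my own stylised reconstruction of the passage above, not Ezra’s actual code or any real API:

```python
# A stylised reconstruction of the aggregation-first workflow described above.
# All names and data here are hypothetical; nothing calls a real API.

def blended_forecast(market_prob, super_forecasts, my_view=None,
                     override_threshold=0.15):
    """Blend a market price with filtered superforecaster forecasts,
    then apply personal judgment only where the disagreement is large."""
    # 1. Keep only superforecasters whose answers to the market-linked
    #    questions looked sane (standing in for the S&P500/Bitcoin filter).
    credible = [f["probability"] for f in super_forecasts
                if f["passes_market_sanity_check"]]
    if not credible:                       # fall back to the whole pool
        credible = [f["probability"] for f in super_forecasts]

    # 2. Average the market price with the credible forecasters' mean.
    blended = (market_prob + sum(credible) / len(credible)) / 2

    # 3. Only lean on your own judgment when you strongly disagree.
    if my_view is not None and abs(my_view - blended) > override_threshold:
        return (my_view + blended) / 2     # or however you choose to adjust
    return blended

# Hypothetical inputs for a single question.
superforecasters = [
    {"probability": 0.62, "passes_market_sanity_check": True},
    {"probability": 0.55, "passes_market_sanity_check": True},
    {"probability": 0.90, "passes_market_sanity_check": False},  # dropped
]
print(blended_forecast(market_prob=0.70, super_forecasts=superforecasters))
# -> 0.6425 (market 0.70 averaged with the credible mean of 0.585)
```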
I used to work with Ezra, and I remember a time we were playing a game with other colleagues that involved finding something you had in common with the person you were paired with, within 60 seconds (e.g. ‘Joe and I both lived in New Zealand for the same 6 months in 2018!’). Everyone would then rank each pair by how unique their commonality was, and a scoring system would determine who had found the most interesting commonality in those 60 seconds.
While we were doing the ranking, I asked the group whether we were allowed to be strategic in our rankings. My thought was that I could give myself (and my partner) the #1 spot, give the #2 spot to the team I thought was most likely to lose, and give last place to the team I thought was most likely to win, but I was a little worried about people frowning upon this sort of strategising. Ezra immediately retorted something along the lines of ‘Sam, don’t give other people the obvious winning idea!’. My memory is that Ezra had used roughly this strategy, as did I, but nobody else did. I’m not sure, but I don’t think it was a coincidence that Ezra and I were the only Superforecasters in the room - this sort of thinking is a big part of what earns you Superforecaster status.
I got into forecasting during the pandemic, when I was doing my master’s degree and had a lot of time on my hands. Getting Superforecaster status began to feel like a video game I could beat, using strategies akin to learning the shortcuts on a Mario Kart track. If you’ve read Superforecasting, this may come as something of a surprise, and I think it’s true that there are many Superforecasters who do not strategise a huge amount. But talking to Superforecasters, it becomes apparent that many of them are keenly aware of how to choose the right questions to forecast on and of how closely to follow the median forecast, and they generally know a lot about the forecasting meta (for example, did you know that only the most recent 40% of forecasts go into the median forecast displayed on Good Judgment Open, and that a clever forecaster can use that to his or her advantage?).
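I don’t know exactly how the site implements that rule, so treat the following as a sketch of the rule as stated above, with the rounding and tie-breaking being my own guess:

```python
# A sketch of the rule described above: the displayed crowd number is the
# median of only the most recent 40% of forecasts. The exact rounding and
# tie-breaking on Good Judgment Open may differ; this just illustrates why
# freshly submitted forecasts carry all the weight.
def displayed_median(forecasts_in_time_order, recency_fraction=0.4):
    k = max(1, int(len(forecasts_in_time_order) * recency_fraction))
    recent = sorted(forecasts_in_time_order[-k:])   # newest k forecasts only
    mid = len(recent) // 2
    if len(recent) % 2:
        return recent[mid]
    return (recent[mid - 1] + recent[mid]) / 2

# Ten stale forecasts at 20% followed by three fresh ones at 60%: the
# displayed median jumps to 60%, even though most forecasters said 20%.
print(displayed_median([0.2] * 10 + [0.6] * 3))     # -> 0.6
```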
The Superforecasters I have met are generally much better than most people at the ‘classic’ style of forecasting that you probably read about in Superforecasting. They’re able to use base rates, they’re capable researchers, and they integrate that research into their forecasts competently. But forecasting strategy also plays a role in the performance of many proficient forecasters, and not many people talk about it.
In fact, my quick calculation suggests that if you were able to consistently find these sorts of opportunities, you wouldn’t just be a decent Superforecaster; you may well have the greatest forecasting performance of all time! A quick look through the Samotsvety track records shows that the difference between you and the crowd would be far, far greater than the difference between the best Samotsvety forecaster and the crowd.


