Astral Codex Ten - Mantic Monday 3/14/22
Ukraine WarcastingChanges in Ukraine prediction markets since my last post February 28:
If you like getting your news in this format, subscribe to the Metaculus Alert bot for more (and thanks to ACX Grants winner Nikos Bosse for creating it!) Numbers 1 and 7 are impressive changes! (it’s interesting how similarly they’ve evolved, even though they’re superficially about different things and the questions were on different prediction markets). Early in the war, prediction markets didn’t like Ukraine’s odds; now they’re much more sanguine. Let’s look at the exact course: This is almost monotonically decreasing. Every day it’s lower than the day before. How suspicious should we be of this? If there were a stock that decreased every day for twenty days, we’d be surprised that investors were constantly overestimating it. At some point on day 10, someone should think “looks like this keeps declining, maybe I should short it”, and that would halt its decline. In efficient markets, there should never be predictable patterns! So what’s going on here? My guess is it’s a technical issue with Metaculus. Suppose that at the beginning of the war, people thought there was an 80% chance of occupation. Lots of people predicted 80%. Then events immediately showed the real probability was more like 10%. Each day a couple more people showed up and predicted 10%, which gradually moved the average of all predictions (old and new) down. You can see a description of their updating function here - it seems slightly savvier than the toy version I just described, but not savvy enough to avoid the problem entirely. But Polymarket has the same problem: It shouldn’t be able to have technical issues like Metaculus, so what’s up? One possibility is that, by a crazy coincidence, every day some new independent event happened that thwarted Russia and made Ukraine’s chances look better. Twenty dice rolls in a row came up natural 20s for Ukraine. Seems unlikely. Another possibility is that forecasters started out thinking that Russia was strong, in fact Russia was weak, and every day we’ve gathered slightly more evidence for that underlying reality. I’m having trouble figuring out of if this makes sense. You’d still think that after ten straight days of that, people should say “probably tomorrow we’ll get even more evidence of the same underlying reality, might as well update today”. A third possibility is that forecasters are biased against updating. A perfect Bayesian, seeing the failures of the Russian advance over the first few days, would have immediately updated to something like correct beliefs. But the forecasters here were too conservative and didn’t do that. A fourth possibility is that forecasters biased towards updating too much. Ukrainian propaganda is so good that every extra day you’re exposed to it, you become more and more convinced that Ukraine is winning. I find this pattern really suspicious but I’m not sure exactly how to think about it. Nuclear WarcastingA friend recently invited me to their bolthole in the empty part of Northern California. Their argument was: as long as the US and Russia are staring menacingly at each other, there’s a (slight) risk of nuclear war. Maybe we should get out of cities now, and beat the traffic jam when the s#!t finally hits the fan. I declined their generous offer, but I’ve been wondering whether I made the right call. What exactly is the risk of nuclear war these next few months? Enter Samotsvety Forecasts. This is a team of some of the best superforecasters in the world. They won the CSET-Foretell forecasting competition by an absolutely obscene margin, “around twice as good as the next-best team in terms of the relative Brier score”. If the point of forecasting tournaments is to figure out who you can trust, the science has spoken, and the answer is “these guys”. As a service to the community, they came up with a formal forecast for the risk of near-term nuclear war:
You can read more about their methodology and reasoning on the post on Effective Altruism Forum, but I found this table helpful: Along with reassuring me I made the right choice not to run and hide, this is a new landmark in translating forecasting results to the real world. The whole stack of technologies came together: tournaments to determine who the best predictors are, methods for aggregating probabilities, and a real-world question that lots of people care about. Thanks to Samotsvety and their friends for making this happen! Forecasters Vs. ExpertsAlso from the EA Forum this month: Comparing Top Forecasters And Domain Experts, by Arb Consulting (the team also includes one of the Samotsvety members who worked on the nuclear risk estimate). Everyone always tells the story of how Tetlock’s superforecasters beat CIA experts. Is it true? Arb finds that it’s more complicated:
If I’m understanding this right, the average forecaster did worse than the average expert, but Tetlock had the bright idea to use clever aggregation methods for his superforecasters, and the CIA didn’t use clever aggregation methods for their experts. The CIA did try a prediction market, which in theory and under ideal conditions should work at least as well as any other aggregation method, but under real conditions (it was low-volume and poorly-designed) it did not. They go on to review thirteen other studies in a variety of domains (keep in mind that different fields may have different definitions of “expert” and require different levels of expertise to master). Overall there was no clear picture. Eyeballing the results, it looks like forecasters often do a bit better than experts, but with lots of caveats and possible exculpatory factors. Sometimes the results seemed a little silly: in one, forecasters won because the experts didn’t bother to update their forecasts often enough as things changed; in another, “1st place went to one of the very few public-health professionals who was also a skilled Hypermind forecaster.” They conclude:
So, kind of weak conclusion, but you can probably believe some vague thing like “forecasters seem around as good as experts in some cases”. Also, keep in mind that in real life almost no one ever tries to aggregate experts in any meaningful way. Real-life comparisons tend to be more like “aggregated forecasters vs. this one expert I heard about one time on the news”. I’d go with the forecasters in a situation like this - but again, the studies are too weak to be sure! Shorts1: Taosumer reviews my Prediction Market Cube and asks why I don’t have “decentralized” on there as a desideratum. My answer: decentralization is great, but for me it cashes out in “ease of use” - specifically, it’s easy to use it because the government hasn’t shut it down or banned you. Or as “real money” - the reason Manifold isn’t real-money is because they’re centralized and therefore vulnerable and therefore need to obey laws. Or as “easy to create market” - the reason Kalshi doesn’t let you create markets is partly because it’s centralized and therefore vulnerable and therefore needs to limit markets to things regulators like. I agree that, because of those second order effects, decentralization is crucial and needs to be pursued more, and I agree that it’s a tragedy that [whatever happened to Augur] happened to Augur. 2: More people make Ukraine predictions: Maxim Lott, Richard Hanania (again), Samo Burja (again), EHarding (possibly trolling?), Robin Hanson (sort of) 3: Last month we talked about some problems with the Metaculus leaderboard. An alert reader told me about their alternative Points Per Question leaderboard, which is pretty good - although I think different kinds of questions give different average amounts of points so it’s still not perfect. 4: Also last month, I suggested Manifold Markets have a loan feature to help boost investment in long-term markets. They’ve since added this feature: your first $M20 will automatically be a zero-interest loan. 5: Related: I’m testing Manifold as a knowledge-generation device. If you want to help, go bet in the market about how I’ll rank interventions in an upcoming updated version of the Biodeterminists’ Guide To Pregnancy. 6: Reality Cards is a new site that combines the legal hassles of prediction markets with the technical hassles of NFTs. You bid to “rent” the NFT representing a certain outcome, your rent goes into a pot, and then when the event happens the pot goes to whoever held the relevant NFT. I’m not math-y enough to figure out whether this is a proper scoring rule or not, but it sure does sound unnecessarily complicated. I imagine everyone involved will be multimillionaires within a week. 7: In case a prediction market using NFTs isn’t enough for you, this article suggests that OpenDAO is working on a prediction market about NFTs. It claims they should be done by January, but I can’t find it. You’re a free subscriber to Astral Codex Ten. For the full experience, become a paid subscriber. |
Older messages
Open Thread 215
Sunday, March 13, 2022
...
Progesterone Megadoses Might Be A Cheap Zulresso Substitute
Thursday, March 10, 2022
...
Advice For Unwoke Academic?
Thursday, March 10, 2022
...
Zounds! It's Zulresso and Zuranolone!
Tuesday, March 8, 2022
How excited should we be about the latest class of antidepressants?
Open Thread 214
Monday, March 7, 2022
...
You Might Also Like
The best 200+ early Black Friday deals
Thursday, November 28, 2024
For your post-dinner scroll View in browser Ad The Recommendation Ad Gobble gobble deals deals Levi's Premium Wedgie Straight Fit Women's Jeans, Apple iPad Air (M2), and Saucony Guide 17. NYT
Friday Briefing: Day 2 of Lebanon’s cease-fire
Thursday, November 28, 2024
Plus, the gangs stealing hiring exams in India. View in browser|nytimes.com Ad Morning Briefing: Asia Pacific Edition November 29, 2024 Author Headshot By Gaya Gupta Good morning. We're covering
Silicon Aristotle
Thursday, November 28, 2024
Who Can Claim Aristotle? // Private Chefs For Silicon Valley's Elite Silicon Aristotle By Caroline Crampton • 28 Nov 2024 View in browser View in browser Who Can Claim Aristotle? Edith Hall | Aeon
How the Pilgrims differed from the Puritans
Thursday, November 28, 2024
+ how to avoid awkwardness at Thanksgiving table
♻️ Gratitude & Joy flow in a cycle
Thursday, November 28, 2024
Fun stuff for you to click on curated with joy by CreativeMornings HQ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
How “Y.O.L.O. Joe” Can Beat The Lame Duck
Thursday, November 28, 2024
Here is what Democrats could actually achieve in the months before Trump takes office. Need a productive political topic to discuss at the Thanksgiving table? Want to impart key facts as you pass the
Trump Cabinet Bomb Threats, Ancient Sandwiches, and a Popsicle Caper
Thursday, November 28, 2024
Several of President-elect Donald Trump's Cabinet nominees and administration appointees faced bomb threats and "swatting" attacks on Tuesday and Wednesday. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
First-ever UEFI bootkit for Linux in the works, experts say [Thu Nov 28 2024]
Thursday, November 28, 2024
Hi The Register Subscriber | Log in The Register Daily Headlines 28 November 2024 KITTY LOOKS AT SCREEN AI GENERATED First-ever UEFI bootkit for Linux in the works, experts say Bootkitty doesn't
On My Mind: Fig Ornaments and Striped Bath Mats
Thursday, November 28, 2024
Plus: Eensy-weensy, teeny-tiny gifts. The Strategist Every product is independently selected by editors. If you buy something through our links, New York may earn an affiliate commission. November 27,
What It’s Like to Be on Trump’s Enemies List
Wednesday, November 27, 2024
Columns and commentary on news, politics, business, and technology from the Intelligencer team. Intelligencer power What It's Like to Be on Trump's Enemies List “Revenge does take time.” Photo-