Selection Bias Is A Fact Of Life, Not An Excuse For Rejecting Internet Surveys
Sometimes people do amateur research through online surveys. Then they find interesting things. Then their commenters say it doesn’t count, because “selection bias!” This has been happening to Aella for years, but people try it sometimes on me too.

I think these people are operating off some model where amateur surveys necessarily have selection bias, because they only capture the survey-maker’s Twitter followers, or blog readers, or some other weird snapshot of the Internet-using public. But real studies by professional scientists don’t have selection bias, because . . . sorry, I don’t know how their model would end this sentence.

The real studies by professional scientists usually use Psych 101 students at the professional scientists’ university. Or sometimes they will put up a flyer on a bulletin board in town, saying “Earn $10 By Participating In A Study!” in which case their population will be selected for people who want $10 (poor people, bored people, etc). Sometimes the scientists will get really into cross-cultural research, and retest their hypothesis on various primitive tribes - in which case their population will be selected for the primitive tribes that don’t murder scientists who try to study them. As far as I know, nobody in history has ever done a psychology study on a truly representative sample of the world population.

This is fine. Why?

Selection bias is disastrous if you’re trying to do something like a poll or census. That is, if you want to know “What percent of Americans own smartphones?” then any selection at all limits your result. The percent of Psych 101 undergrads who own smartphones is different from the percent of poor people who want $10 who own smartphones, and both are different from the percent of Americans who own smartphones. The same is true about “how many people oppose abortion?” or “what percent of people are color blind?” or anything else trying to find out how common something is in the population.
The only good ways to do this are a) use a giant government dataset that literally includes everyone, b) hire a polling company like Gallup which has tried really hard to get a panel that includes the exact right number of Hispanic people and elderly people and homeless people and every other demographic, c) do a lot of statistical adjustments and pray.

Selection bias is fine-ish if you’re trying to do something like test a correlation. Does eating bananas make people smarter because something something potassium? Get a bunch of Psych 101 undergrads, test their IQs, and ask them how many bananas they eat per day. If you find that people who eat more bananas have higher IQ, then fine, that’s a finding. If you’re right about the mechanism (something something potassium), then probably it should generalize to groups other than Psych 101 undergrads. It might not! But it’s okay to publish a paper saying “Study Finds Eating Bananas Raises IQ” with a little asterisk at the bottom saying “like every study ever done, we only tested this in a specific population rather than everyone in the world, and for all we know maybe it isn’t true in other populations, whatever.”

If there’s some reason why Psych 101 undergrads are a particularly bad population to test this in, and any other population is better, then you should use a different population. Otherwise, choose your poison.

Sometimes a correlation will genuinely fail to generalize out of sample. Suppose you find that, in a population of Psych 101 undergrads at a good college, family income is unrelated to obesity. This makes sense; they’re all probably pretty well-off, and they all probably eat at the same college cafeteria. But generalize to the entire US population, and poor people will be more obese, because they can’t afford healthy food / don’t have time to exercise / possible genetic correlations.
And then generalize further to the entire world population, and poor people will be thinner, because some of them can’t afford food and are literally starving. And then generalize further to the entire world population over all of human history, and it stops holding again, because most people are cavemen who eat grubs and use shells for money, and having more shells doesn’t make it any easier to find grubs.

More often, we’re a little nervous about this but we cross our fingers and hope it works. Antidepressants have never been tested in the population of people named Melinda Hauptmann-Brown. If you’re a depressed person named Melinda Hauptmann-Brown, you will have to trust that the same antidepressants that work on people who aren’t named Melinda Hauptmann-Brown also work on you. Luckily the mechanism of antidepressants (something something serotonin, or maybe not) seems like the kind of thing that should work regardless of what your name is, so this is a good bet. But it’s still a bet.

Selection bias is fatal for polls, but only sometimes a problem for correlations. In real life, worrying about selection bias for correlations looks like thinking really hard about the mechanism, formulating hypotheses about how you expect something to generalize to particular out-of-sample populations, sometimes trying to test those hypotheses, but accepting that you can never test all of them and will have to take a lot of things on priors. It doesn’t look like saying “This is an Internet survey, so it has selection bias, unlike real-life studies, which are fine.” Come on!
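If you want to poke at the polls-versus-correlations distinction yourself, here is a toy simulation. It is only a sketch under made-up assumptions: the `online` trait, the ownership rates, the "IQ points per banana" effect, and the `interaction` knob are all hypothetical numbers chosen for illustration, not estimates of anything real. People who are more `online` are more likely to answer the survey and more likely to own a smartphone, while the banana effect either ignores `online` (`interaction=0`) or depends on it (`interaction=4`).

```python
import random


def simulate(n=200_000, interaction=0.0, seed=0):
    """Return (population, sample) lists of (owns_phone, bananas, iq) tuples.

    Every number here is invented for illustration. `online` is a
    hypothetical trait that drives both survey participation and
    smartphone ownership.
    """
    rng = random.Random(seed)
    population, sample = [], []
    for _ in range(n):
        online = rng.random()  # 0 = never online, 1 = always online
        owns_phone = rng.random() < 0.3 + 0.6 * online  # ownership rises with `online`
        bananas = rng.gauss(2.0, 1.0)  # toy "bananas per day" score (can go negative)
        effect = 2.0 + interaction * online  # assumed IQ points per banana
        iq = 100.0 + effect * bananas + rng.gauss(0.0, 10.0)
        person = (owns_phone, bananas, iq)
        population.append(person)
        if rng.random() < online:  # more-online people answer the survey more often
            sample.append(person)
    return population, sample


def prevalence(people):
    """Fraction of people who own a smartphone (the 'poll' question)."""
    return sum(p[0] for p in people) / len(people)


def slope(people):
    """Ordinary least-squares slope of IQ on bananas (the 'correlation' question)."""
    n = len(people)
    mean_x = sum(p[1] for p in people) / n
    mean_y = sum(p[2] for p in people) / n
    cov = sum((p[1] - mean_x) * (p[2] - mean_y) for p in people)
    var = sum((p[1] - mean_x) ** 2 for p in people)
    return cov / var


if __name__ == "__main__":
    for interaction in (0.0, 4.0):
        pop, smp = simulate(interaction=interaction)
        print(f"interaction={interaction}")
        print(f"  ownership: population {prevalence(pop):.2f}, survey {prevalence(smp):.2f}")
        print(f"  IQ per banana: population {slope(pop):.2f}, survey {slope(smp):.2f}")
```

Under these assumptions the survey should overstate smartphone ownership (about 70% versus 60% in expectation) no matter what, because the thing that selects people into the survey also drives ownership. The banana slope should come out about the same in the survey and the population when `interaction=0`, since the mechanism doesn't care how online you are. Set `interaction=4` and the survey slope drifts away from the population slope, which is the "think really hard about the mechanism" part.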