Selection Bias Is A Fact Of Life, Not An Excuse For Rejecting Internet Surveys
Sometimes people do amateur research through online surveys. Then they find interesting things. Then their commenters say it doesn’t count, because “selection bias!” This has been happening to Aella for years, but people try it sometimes on me too.

I think these people are operating off some model where amateur surveys necessarily have selection bias, because they only capture the survey-maker’s Twitter followers, or blog readers, or some other weird snapshot of the Internet-using public. But real studies by professional scientists don’t have selection bias, because . . . sorry, I don’t know how their model would end this sentence.

The real studies by professional scientists usually use Psych 101 students at the professional scientists’ university. Or sometimes they will put up a flyer on a bulletin board in town, saying “Earn $10 By Participating In A Study!” in which case their population will be selected for people who want $10 (poor people, bored people, etc). Sometimes the scientists will get really into cross-cultural research, and retest their hypothesis on various primitive tribes - in which case their population will be selected for the primitive tribes that don’t murder scientists who try to study them.

As far as I know, nobody in history has ever done a psychology study on a truly representative sample of the world population. This is fine. Why?

Selection bias is disastrous if you’re trying to do something like a poll or census. That is, if you want to know “What percent of Americans own smartphones?” then any selection at all limits your result. The percent of Psych 101 undergrads who own smartphones is different from the percent of poor people who want $10 who own smartphones, and both are different from the percent of Americans who own smartphones. The same is true about “how many people oppose abortion?” or “what percent of people are color blind?” or anything else trying to find out how common something is in the population. The only good ways to do this are a) use a giant government dataset that literally includes everyone, b) hire a polling company like Gallup which has tried really hard to get a panel that includes the exact right number of Hispanic people and elderly people and homeless people and every other demographic, c) do a lot of statistical adjustments and pray.

Selection bias is fine-ish if you’re trying to do something like test a correlation. Does eating bananas make people smarter because something something potassium? Get a bunch of Psych 101 undergrads, test their IQs, and ask them how many bananas they eat per day. If you find that people who eat more bananas have higher IQ, then fine, that’s a finding. If you’re right about the mechanism (something something potassium), then probably it should generalize to groups other than Psych 101 undergrads. It might not! But it’s okay to publish a paper saying “Study Finds Eating Bananas Raises IQ” with a little asterisk at the bottom saying “like every study ever done, we only tested this in a specific population rather than everyone in the world, and for all we know maybe it isn’t true in other populations, whatever.” If there’s some reason why Psych 101 undergrads are a particularly bad population to test this in, and any other population is better, then you should use a different population. Otherwise, choose your poison.

Sometimes a correlation will genuinely fail to generalize out of sample. Suppose you find that, in a population of Psych 101 undergrads at a good college, family income is unrelated to obesity. This makes sense; they’re all probably pretty well-off, and they all probably eat at the same college cafeteria. But generalize to the entire US population, and poor people will be more obese, because they can’t afford healthy food / don’t have time to exercise / possible genetic correlations. And then generalize further to the entire world population, and poor people will be thinner, because some of them can’t afford food and are literally starving. And then generalize further to the entire world population over all of human history, and it stops holding again, because most people are cavemen who eat grubs and use shells for money, and having more shells doesn’t make it any easier to find grubs.

More often, we’re a little nervous about this but we cross our fingers and hope it works. Antidepressants have never been tested in the population of people named Melinda Hauptmann-Brown. If you’re a depressed person named Melinda Hauptmann-Brown, you will have to trust that the same antidepressants that work on people who aren’t named Melinda Hauptmann-Brown also work on you. Luckily the mechanism of antidepressants (something something serotonin, or maybe not) seems like the kind of thing that should work regardless of what your name is, so this is a good bet. But it’s still a bet.

Selection bias is fatal for polls, but only sometimes a problem for correlations.

In real life, worrying about selection bias for correlations looks like thinking really hard about the mechanism, formulating hypotheses about how you expect something to generalize to particular out-of-sample populations, sometimes trying to test those hypotheses, but accepting that you can never test all of them and will have to take a lot of things on priors. It doesn’t look like saying “This is an Internet survey, so it has selection bias, unlike real-life studies, which are fine.” Come on!
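
For anyone who wants to poke at the poll-versus-correlation distinction, here is a toy simulation. It is a sketch under invented assumptions, not evidence about the real world: every number, every variable name ("online", "opinion", "weight"), and every mechanism in it is made up for illustration. It assumes that being very online makes you more likely to hold some opinion, that the banana-to-IQ mechanism is the same for everyone, and that income only affects weight for people below average income. Under those assumptions, a survey of very online people should badly misestimate how common the opinion is, should recover roughly the same banana-IQ correlation as the full population, and a sample of well-off college students should miss the income-weight correlation entirely.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def make_population(n, rng):
    """Build a fake population. All mechanisms here are invented assumptions."""
    people = []
    for _ in range(n):
        online = rng.gauss(0, 1)   # how Internet-y someone is (drives survey selection)
        income = rng.gauss(0, 1)   # standardized family income
        bananas = rng.gauss(0, 1)  # standardized banana consumption
        people.append({
            "online": online,
            "income": income,
            "bananas": bananas,
            # Poll-style trait: assumed to be more common among very online people.
            "opinion": online + rng.gauss(0, 1) > 0,
            # Assumed mechanism that works the same way in everyone.
            "iq": 100 + 5 * bananas + rng.gauss(0, 15),
            # Assumed floor effect: income matters only below average income.
            "weight": -2 * min(income, 0) + rng.gauss(0, 1),
        })
    return people


def share(group, key):
    """Fraction of the group for which the boolean trait `key` is true."""
    return sum(p[key] for p in group) / len(group)


def corr(group, x, y):
    """Correlation between fields x and y within the group."""
    return pearson([p[x] for p in group], [p[y] for p in group])


def simulate(n=50_000, seed=0):
    rng = random.Random(seed)
    everyone = make_population(n, rng)
    survey = [p for p in everyone if p["online"] > 1]   # blog readers only
    college = [p for p in everyone if p["income"] > 1]  # well-off students only
    return {
        "opinion_share_everyone": share(everyone, "opinion"),
        "opinion_share_survey": share(survey, "opinion"),
        "banana_iq_corr_everyone": corr(everyone, "bananas", "iq"),
        "banana_iq_corr_survey": corr(survey, "bananas", "iq"),
        "income_weight_corr_everyone": corr(everyone, "income", "weight"),
        "income_weight_corr_college": corr(college, "income", "weight"),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

The banana correlation only survives selection here because the simulation was written so that being online has nothing to do with bananas or IQ. That is the "thinking really hard about the mechanism" step, done by fiat. Change the made-up mechanism so that it depends on "online", and the survey correlation can drift away from the population one too.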