Selection Bias Is A Fact Of Life, Not An Excuse For Rejecting Internet Surveys
Sometimes people do amateur research through online surveys. Then they find interesting things. Then their commenters say it doesn’t count, because “selection bias!” This has been happening to Aella for years, but people try it sometimes on me too.

I think these people are operating off some model where amateur surveys necessarily have selection bias, because they only capture the survey-maker’s Twitter followers, or blog readers, or some other weird snapshot of the Internet-using public. But real studies by professional scientists don’t have selection bias, because . . . sorry, I don’t know how their model would end this sentence.

The real studies by professional scientists usually use Psych 101 students at the professional scientists’ university. Or sometimes they will put up a flyer on a bulletin board in town, saying “Earn $10 By Participating In A Study!” in which case their population will be selected for people who want $10 (poor people, bored people, etc). Sometimes the scientists will get really into cross-cultural research, and retest their hypothesis on various primitive tribes - in which case their population will be selected for the primitive tribes that don’t murder scientists who try to study them. As far as I know, nobody in history has ever done a psychology study on a truly representative sample of the world population.

This is fine. Why?

Selection bias is disastrous if you’re trying to do something like a poll or census. That is, if you want to know “What percent of Americans own smartphones?” then any selection at all limits your result. The percent of Psych 101 undergrads who own smartphones is different from the percent of poor people who want $10 who own smartphones, and both are different from the percent of Americans who own smartphones. The same is true about “how many people oppose abortion?” or “what percent of people are color blind?” or anything else trying to find out how common something is in the population.
The only good ways to do this are a) use a giant government dataset that literally includes everyone, b) hire a polling company like Gallup which has tried really hard to get a panel that includes the exact right number of Hispanic people and elderly people and homeless people and every other demographic, or c) do a lot of statistical adjustments and pray.

Selection bias is fine-ish if you’re trying to do something like test a correlation. Does eating bananas make people smarter because something something potassium? Get a bunch of Psych 101 undergrads, test their IQs, and ask them how many bananas they eat per day. If you find that people who eat more bananas have higher IQ, then fine, that’s a finding. If you’re right about the mechanism (something something potassium), then probably it should generalize to groups other than Psych 101 undergrads. It might not! But it’s okay to publish a paper saying “Study Finds Eating Bananas Raises IQ” with a little asterisk at the bottom saying “like every study ever done, we only tested this in a specific population rather than everyone in the world, and for all we know maybe it isn’t true in other populations, whatever.” If there’s some reason why Psych 101 undergrads are a particularly bad population to test this in, and any other population is better, then you should use a different population. Otherwise, choose your poison.

Sometimes a correlation will genuinely fail to generalize out of sample. Suppose you find that, in a population of Psych 101 undergrads at a good college, family income is unrelated to obesity. This makes sense; they’re all probably pretty well-off, and they all probably eat at the same college cafeteria. But generalize to the entire US population, and poor people will be more obese, because they can’t afford healthy food / don’t have time to exercise / possible genetic correlations.
And then generalize further to the entire world population, and poor people will be thinner, because some of them can’t afford food and are literally starving. And then generalize further to the entire world population over all of human history, and it stops holding again, because most people are cavemen who eat grubs and use shells for money, and having more shells doesn’t make it any easier to find grubs.

More often, we’re a little nervous about this but we cross our fingers and hope it works. Antidepressants have never been tested in the population of people named Melinda Hauptmann-Brown. If you’re a depressed person named Melinda Hauptmann-Brown, you will have to trust that the same antidepressants that work on people who aren’t named Melinda Hauptmann-Brown also work on you. Luckily the mechanism of antidepressants (something something serotonin, or maybe not) seems like the kind of thing that should work regardless of what your name is, so this is a good bet. But it’s still a bet.

Selection bias is fatal for polls, but only sometimes a problem for correlations. In real life, worrying about selection bias for correlations looks like thinking really hard about the mechanism, formulating hypotheses about how you expect something to generalize to particular out-of-sample populations, sometimes trying to test those hypotheses, but accepting that you can never test all of them and will have to take a lot of things on priors. It doesn’t look like saying “This is an Internet survey, so it has selection bias, unlike real-life studies, which are fine.” Come on!
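
If you want to see the poll-versus-correlation distinction in miniature, here is a toy simulation. It is only a sketch under made-up assumptions, not anything from a real dataset: the "online" trait, the 0.9 and 0.5 smartphone probabilities, the 2-IQ-points-per-banana effect, and the function names are all invented for illustration. The survey only reaches very online people. Smartphone ownership depends on how online you are; the banana effect, by assumption, does not.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def summarize(people):
    """Return (share owning a smartphone, IQ points per daily banana)."""
    phone_share = sum(p["phone"] for p in people) / len(people)
    slope = ols_slope([p["bananas"] for p in people],
                      [p["iq"] for p in people])
    return phone_share, slope


def simulate(n=50_000, seed=0):
    """Compare a whole toy population with a self-selected online sample."""
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        # Hypothetical trait: how online this person is.
        online = rng.gauss(0, 1)
        # Assumption: phone ownership depends on the selection trait.
        phone = rng.random() < (0.9 if online > 0 else 0.5)
        # Assumption: the banana effect is the same for everyone.
        bananas = rng.uniform(0, 5)
        iq = 100 + 2.0 * bananas + rng.gauss(0, 15)
        population.append(
            {"online": online, "phone": phone, "bananas": bananas, "iq": iq}
        )

    # The Internet survey only reaches the very online.
    sample = [p for p in population if p["online"] > 1.0]

    pop_share, pop_slope = summarize(population)
    sample_share, sample_slope = summarize(sample)
    return {
        "population_phone_share": pop_share,
        "sample_phone_share": sample_share,
        "population_slope": pop_slope,
        "sample_slope": sample_slope,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

By construction, the survey's smartphone share should land near 0.9 against roughly 0.7 for the whole population, so the poll is badly off, while the two banana slopes should come out close to each other. Change the `iq` line so the banana effect depends on `online` and the slope stops generalizing too, which is the income-and-obesity situation above.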