Highlights From The Comments On Motivated Reasoning And Reinforcement Learning
I. Comments From People Who Actually Know What They’re Talking About
Gabriel writes:
That comment by Fox is:
I agree that if we were perfect Bayesian reasoners, the knowledge that there was now a 5% chance of there being a lion would propagate throughout all brain regions and they could condition on this immediately. And yet a few days ago, I (on a diet) visited some friends who sometimes leave delicious brownies on their counter. I worried that if I saw the brownies, I would eat them, so I tried not to look at the counter. But part of me felt bad that I was passing up the opportunity to eat delicious brownies, so my split-second reaction as I walked through their kitchen was to compromise by looking towards the edge of their counter to check for brownies, but to deliberately exclude from my vision the part of the counter where the brownies were most likely to be. This makes me think that the parts of my brain doing active inference are not quite perfect Bayesians making perfect updates.
But check out the rest of the comment subthread for some pushback against and clarification of this model.
II. Arguments That The Long-Term Rewards Of Spotting The Lion Outweigh The Short-Term Drawbacks
Here are three comments that I think say about the same thing from different angles.
Phil:
KJZ:
Mike:
I feel like we can thought-experiment our way out of this. Suppose I invest in Bitcoin, then check its price every day. There is a little up arrow or down arrow next to some number and percent. Some days it’s a green up arrow and I feel good and smart and rich. Other days it’s a red down arrow and I feel bad and dumb and poor. None of this ever gets confirmed by any kind of ground truth, because I am HODLing and will never sell my Bitcoins until I retire.
So how come I don’t start hallucinating that the arrow is green and points up? Every time I’ve “taken the action” of seeing a green upward-pointing arrow, I’ve felt better; every time I’ve taken the opposite action, I’ve felt worse! You can no longer appeal to “the ultimate reinforcement is whether you got mauled by a lion or not”, because I’ve never sold my Bitcoin and gotten any form of reinforcement more final than checking the arrow (if you want, imagine that I get hit by a truck at age 64 and never sell the Bitcoin).
I don’t want to say “epistemics are protected from reinforcement learning” is the only way out of this. It could be that the visual cortex gets reinforced at the level of broad principles, and any change that caused you to flip the direction and color of the arrow would have to change really fundamental things that would make your vision worse in other ways. But it doesn’t seem like “ultimate reinforcement” is what’s preventing this from happening, since there is none.
Also, behavioral reinforcement learning is nowhere near this good. You might think that the short-term reward of eating brownies wouldn’t change behavior because the real reward we should be considering is the reward of being healthy and looking good. But this works very inconsistently, as opposed to the “see lions as lions” thing which works all the time.
III. Am I Ignoring The Many Practical Reasons For People To Have Motivated Reasoning?
I commented:
XPYM asks:
This answers the “why” question but not the “how” question. If you wonder why animals can see, the answer is “it’s useful for spotting food and predators and stuff”. If you wonder how animals can see, the answer is a giant ophthalmology textbook and lots of stuff about rods and cones.
One of the ideas that’s had the biggest effect on me recently is thinking about how small the genome is and how poorly it connects to the brain. It’s all nice and well to say “high status leaders are powerful, so people should evolve a tendency to suck up to them”. But in order to do that, you need some specific thing that happens in the genome - an adenine switched to a guanine, or something - to give people a desire to suck up to high-status leaders. Some change in the conformation of a protein has to change the wiring of the brain in some way such that people feel like sucking up to high-status leaders is a good idea. This isn’t impossible - evolution has managed weirder things - but it’s so, so hard. Humans have like 20,000 genes. Each one codes for a protein. Most of those proteins do really basic things like determine how flexible the membrane of a kidney cell should be. You can’t just have the “how you behave towards high status leaders” protein shift into the “suck up to them” conformation, that’s not how proteins work!
You should penalize theories really heavily for every piece of information that has to travel from the genome to the brain. It certainly should be true that people try to spin things in self-serving ways: this is Trivers’ theory of self-deception and consciousness as public relations agent. But that requires communicating an entire new philosophy of information processing from genome to brain. Unless you could do it with reinforcement learning, which you’ve already got.
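To put a rough number on "how small the genome is", here is a back-of-envelope sketch. The figures are assumptions, not anything from the comment thread: roughly 3.1 billion base pairs in a haploid human genome, 2 bits per base (four possible letters), and a commonly cited order-of-magnitude guess of 10^14 synapses in the brain. The function names are made up for illustration.

```python
# Back-of-envelope only: every constant here is an assumed, rough figure.
BASE_PAIRS = 3.1e9      # approximate haploid human genome length (assumption)
BITS_PER_BASE = 2       # four possible bases, so log2(4) = 2 bits each
SYNAPSES = 1e14         # rough order-of-magnitude estimate (assumption)


def genome_megabytes(base_pairs=BASE_PAIRS, bits_per_base=BITS_PER_BASE):
    """Raw, uncompressed information capacity of the genome in megabytes."""
    return base_pairs * bits_per_base / 8 / 1e6


def synapses_per_genome_bit(synapses=SYNAPSES,
                            base_pairs=BASE_PAIRS,
                            bits_per_base=BITS_PER_BASE):
    """How many synapses there are for each raw bit of genome."""
    return synapses / (base_pairs * bits_per_base)


if __name__ == "__main__":
    print(f"Genome capacity: about {genome_megabytes():.0f} MB")
    print(f"Synapses per genome bit: about {synapses_per_genome_bit():.0f}")
```

Under these assumptions the whole genome is well under a gigabyte, and there are thousands of synapses for every bit of it, most of which is busy with things like kidney cell membranes. That is the sense in which any theory that needs a lot of specific wiring instructions to travel from genome to brain should get penalized.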
My take on the motivated-reasoning-as-misapplied-reinforcement-learning theory is something like “we always knew people had to be doing self-deception somehow, I was previously puzzled by how this got implemented, but it turns out it’s a trivial corollary of this other much more fundamental program”.
IV. Miscellaneous