Highlights From The Comments On Motivated Reasoning And Reinforcement Learning
I. Comments From People Who Actually Know What They’re Talking About
Gabriel writes:
That comment by Fox is:
I agree that if we were perfect Bayesian reasoners, the knowledge that there was now a 5% chance of there being a lion would propagate throughout all brain regions and they could condition on this immediately. And yet a few days ago, I (on a diet) visited some friends who sometimes leave delicious brownies on their counter. I worried that if I saw the brownies, I would eat them, so I tried not to look at the counter. But part of me felt bad that I was passing up the opportunity to eat delicious brownies, so my split-second reaction as I walked through their kitchen was to compromise by looking towards the edge of their counter to check for brownies, but to deliberately exclude from my vision the part of the counter where the brownies were most likely to be. This makes me think that the parts of my brain doing active inference are not quite perfect Bayesians making perfect updates.
But check out the rest of the comment subthread for some pushback against and clarification of this model.
II. Arguments That The Long-Term Rewards Of Spotting The Lion Outweigh The Short-Term Drawbacks
Here are three comments that I think say about the same thing from different angles.
Phil:
KJZ:
Mike:
I feel like we can thought-experiment our way out of this. Suppose I invest in Bitcoin, then check its price every day. There is a little up arrow or down arrow next to some number and percent. Some days it’s a green up arrow and I feel good and smart and rich. Other days it’s a red down arrow and I feel bad and dumb and poor. None of this ever gets confirmed by any kind of ground truth, because I am HODLing and will never sell my Bitcoins until I retire.
So how come I don’t start hallucinating that the arrow is green and points up? Every time I’ve “taken the action” of seeing a green upward-pointing arrow, I’ve felt better; every time I’ve taken the opposite action, I’ve felt worse! You can no longer appeal to “the ultimate reinforcement is whether you got mauled by a lion or not”, because I’ve never sold my Bitcoin and gotten any form of reinforcement more final than checking the arrow (if you want, imagine that I get hit by a truck at age 64 and never sell the Bitcoin).
I don’t want to say “epistemics are protected from reinforcement learning” is the only way out of this. It could be that the visual cortex gets reinforced at the level of broad principles, and any change that caused you to flip the direction and color of the arrow would have to change really fundamental things that would make your vision worse in other ways. But it doesn’t seem like “ultimate reinforcement” is what’s preventing this from happening, since there is none.
Also, behavioral reinforcement learning is nowhere near this good. You might think that the short-term reward of eating brownies wouldn’t change behavior because the real reward we should be considering is the reward of being healthy and looking good. But this works very inconsistently, as opposed to the “see lions as lions” thing which works all the time.
III. Am I Ignoring The Many Practical Reasons For People To Have Motivated Reasoning?
I commented:
XPYM asks:
This answers the “why” question but not the “how” question. If you wonder why animals can see, the answer is “it’s useful for spotting food and predators and stuff”. If you wonder how animals can see, the answer is a giant ophthalmology textbook and lots of stuff about rods and cones.
One of the ideas that’s had the biggest effect on me recently is thinking about how small the genome is and how poorly it connects to the brain. It’s all nice and well to say “high status leaders are powerful, so people should evolve a tendency to suck up to them”. But in order to do that, you need some specific thing that happens in the genome - an adenine switched to a guanine, or something - to give people a desire to suck up to high-status leaders. Some change in the conformation of a protein has to change the wiring of the brain in some way such that people feel like sucking up to high-status leaders is a good idea. This isn’t impossible - evolution has managed weirder things - but it’s so, so hard.
Humans have like 20,000 genes. Each one codes for a protein. Most of those proteins do really basic things like determine how flexible the membrane of a kidney cell should be. You can’t just have the “how you behave towards high status leaders” protein shift into the “suck up to them” conformation, that’s not how proteins work! You should penalize theories really heavily for every piece of information that has to travel from the genome to the brain.
It certainly should be true that people try to spin things in self-serving ways: this is Trivers’ theory of self-deception and consciousness as public relations agent. But that requires communicating an entire new philosophy of information processing from genome to brain. Unless you could do it with reinforcement learning, which you’ve already got.
My take on the motivated-reasoning-as-misapplied-reinforcement-learning theory is something like “we always knew people had to be doing self-deception somehow, I was previously puzzled by how this got implemented, but it turns out it’s a trivial corollary of this other much more fundamental program”.
IV. Miscellaneous