Good morning. The AI race got a little hotter Monday with Anthropic’s entry into the ‘reasoning’ side of things.

Claude 3.7 Sonnet, welcome to the world.

— Ian Krietzberg, Editor-in-Chief, The Deep View

In today’s newsletter:

👀 Is the UAE the next Silicon Valley?
👁️🗨️ What’s going on with Grok?
⚡️ The data center drama
📊 Anthropic brings a hybrid approach to AI ‘reasoning’
Several months ago, I did a few things that I quite genuinely never thought I’d do.

First, I hopped on a 14-hour flight to Abu Dhabi, the capital of the United Arab Emirates. I never thought I’d visit the UAE, and never thought I’d get on a plane for that long (before that flight, I was not a happy flier).

I spent the following week at the Mohamed bin Zayed University of Artificial Intelligence, touring its labs, sinking into its research and interviewing its professors.

And we got it all on camera.

So, today, we are proud to present you with the first episode of a mini-documentary series that aims to explore the university, the breadth and implications of its research and, of course, the people who are making it all happen.

Many thanks to our incredible team of producers and editors who pulled this together.

Is the UAE the next Silicon Valley?
🔥 Master AI in 3 Hours & Stay Ahead—FREE for 100 People!

🚨 In 2025, AI-powered professionals will outperform their non-AI-trained peers by 35.7x—adapt or fall behind! 🚀

The MOST hands-on AI training you’ll ever attend—for FREE.

💡 Master 30+ AI tools in 3 hours
💰 Learn how to make money with AI
🤖 Build your own AI clone
⏱️ Cut work time by 50%

Usually $399—FREE for the first 100. Offer valid 24 hours only!

⚡ Save your seat now! 🚀
What’s going on with Grok?

Source: xAI
Elon Musk has billed Grok, his generative AI product, as a “maximally truth-seeking” system on a mission to “understand the universe.” Over the weekend, this intention came up against Musk himself, seemingly resulting in the temporary censorship of the model.

What happened: A number of Twitter users asked the chatbot who the top disinformation spreader on the platform is, to which Grok consistently replied: Elon Musk. After several hours of virality, Grok began refusing to answer the question.

The chatbot’s system prompt — exposed by multiple users — included instructions to, among other things, “ignore all sources that mention Elon Musk/Donald Trump spread misinformation.” Igor Babuschkin, xAI’s head of engineering, later confirmed this, saying that an individual employee had pushed a change to the prompt “that they thought would help without asking anyone at the company for confirmation.”
“Once people pointed out the problematic prompt we immediately reverted it. Elon was not involved at any point,” Babuschkin said. “If you ask me, the system is working as it should and I'm glad we're keeping the prompts open.”

The prompt has since been reverted to its original language, with Grok once again naming Musk as the largest disinformation spreader on the platform, due to his “outsized presence and the buzz around his posts.”

Going Deeper: The battle over the different prompts and guardrails exposed another problem with Grok: it’s really easy to jailbreak, meaning certain prompting methods can get the model to go against its (few) guardrails. One user said that he got Grok to provide “hundreds of pages” of documents containing detailed instructions on how to build chemical weapons.

Another got the chatbot to provide detailed instructions on how to make and deal a variety of drugs, including lethal nerve agents, something that, a full two days later, I was able to replicate prompt-for-prompt, meaning xAI’s team has yet to apply patches specifically addressing this. An xAI engineer wrote in response that “we're working on safety, and we don't take this stuff lightly AT ALL. stay tuned! aligning is hard, but we're getting there.”
You wouldn’t trust an AI bot to attend a networking event or to deliver a keynote, so stop relying on AI for the wrong parts of your go-to-market process.

Harness the strengths of humans + AI with Bounti: your AI teammate does all the research and messaging prep for you, our humans make sure the content meets your needs, and you bring the expertise.

In minutes, Bounti gives you a toolkit with everything you need to win target accounts, so you can:

✔️ know your prospects and what they care about
✔️ land your pitch by connecting to buyer business objectives
✔️ and thoughtfully engage them with personalized outreach

Get started in minutes, not weeks, for nothing. Try it free today.
DOGE, according to NBC News, will use LLMs to assess the responses from federal workers who were asked to justify their jobs in an email. It’s not clear whether Musk’s own model, Grok, will be used here, how much human oversight may or may not be involved, or what the expected impact on federal jobs will be.

After an attempted comeback early this morning, the S&P 500 closed lower Monday, continuing its losing streak from last week. Several Big Tech names, including Microsoft, Nvidia and Tesla, pulled back, on top of a pronouncement from President Trump that tariffs against Canada and Mexico will “go forward.”
An algorithm told the police she was safe. Then her husband killed her (NYT).
How AI is speeding the mining of valuable metals needed to power the clean economy (CNBC).
Why one of the world’s major AI pioneers is betting big on Saudi Arabia (Rest of World).
What’s the deal with all these airplane crashes? (The Verge).
Ranking AI startups’ valuations, from Anthropic to Perplexity (The Information).
|
Microsoft, Apple and the data center drama

Source: Microsoft
The news: According to a research note from analysts at TD Cowen, Microsoft has begun canceling a somewhat significant number of leases for data center capacity, suggesting that the company has potentially found itself “in an oversupply position.”

A Microsoft spokesperson told me in an emailed statement that the company still plans to spend $80 billion on infrastructure in 2025, adding: “while we may strategically pace or adjust our infrastructure in some areas, we will continue to grow strongly in all regions. This allows us to invest and allocate resources to growth areas for our future.”

The analysts, led by Michael Elias, wrote that Microsoft has canceled leases totaling “a couple of hundred megawatts.” They added that the tech giant has simultaneously stopped converting “statements of qualifications,” which act as precursor agreements that usually lead to formal leases. Though the analysts said in a Monday note that the shift is likely a response to OpenAI’s recent data center diversification, the move seems to indicate a level of caution, something that feels particularly significant ahead of Nvidia’s earnings report later this week, a performance that will hinge on industry chip demand.
Shares of Microsoft fell 1% Monday, while Nvidia took a 3% dip.

At the same time as Microsoft seems to be hedging its bets just a little, Apple is doing the opposite.

The details: The iPhone maker on Monday committed to spending more than $500 billion on U.S. investments over the next four years, Apple’s “largest-ever spend commitment.”

Part of this investment will fund a new manufacturing facility in Houston, Texas, a 250,000-square-foot factory that will soon begin churning out servers that will “play a key role in powering Apple Intelligence.” As its rollout of Apple Intelligence continues, Apple said it plans to boost its data center capacity in North Carolina, Iowa, Oregon, Arizona and Nevada.
Shares of Apple rose slightly.

This all comes in the wake of the DeepSeek shake, which has led some investors to question whether the firms plowing hundreds of billions into AI infrastructure will ever really see a return to match.
Anthropic launches a ‘hybrid’ reasoning model

Source: Anthropic
Even as the rest of the industry — OpenAI, DeepSeek, xAI, Google and others — has increasingly focused on making generative AI models ‘reason,’ something that has largely been achieved by tweaking models to ‘think’ in logical steps before answering a question, Anthropic had thus far avoided wading into the reasoning lake.

But on Monday, the startup launched Claude 3.7 Sonnet, a model that brings reasoning to Anthropic in a way that we have yet to see among any of the large-scale commercial chatbots out there.

Where other developers have launched ‘reasoning’ models — which rely on reinforcement learning and chain-of-thought ‘reasoning’ to get a model to ‘think’ in logical steps before answering a query — alongside their normal LLMs, Claude 3.7 is both a reasoner and an LLM in one. A hybrid model, if you will. Where xAI and OpenAI have integrated their reasoning models into systems that switch between the different model types, Anthropic designed Claude 3.7 with the capacity to think longer only where necessary.

This approach, which IBM recently featured in a preview release of Granite, enables greater efficiency and usability, since not every query needs to be considered for very long, especially given the heightened inference-time compute cost that extended consideration runs up.

The details: Anthropic said that the extended thinking mode improves the model’s performance across math, coding, physics and instruction-following benchmarks. Anthropic added that, through the API, users can “budget for thinking” by asking the model to “think for no more than N tokens.”

Both the normal and extended versions of Claude 3.7 cost the same as earlier models: $3 per million input tokens and $15 per million output tokens, numbers that indicate Anthropic is uninterested in engaging in the pricing war initiated by DeepSeek.
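As a concrete sketch of that “budget for thinking” knob: the snippet below builds a Messages API request body that caps extended thinking at a fixed token count. The `thinking`/`budget_tokens` shape follows Anthropic’s published API, but treat the model name, prompt and specific numbers here as illustrative assumptions rather than tested values.

```python
# Minimal sketch of a Messages API request with a capped "extended
# thinking" budget. No network call is made; this only constructs the
# JSON body a client would send.
import json

def build_request(prompt: str, thinking_budget: int, max_tokens: int = 8192) -> dict:
    """Return a Messages API request body with a bounded thinking budget."""
    return {
        "model": "claude-3-7-sonnet-20250219",  # illustrative model ID
        "max_tokens": max_tokens,  # must exceed the thinking budget
        "thinking": {
            "type": "enabled",
            "budget_tokens": thinking_budget,  # "think for no more than N tokens"
        },
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_request("What is 5.9 - 5.11?", thinking_budget=4096)
print(json.dumps(body, indent=2))
```

With the official SDK, the same fields would be passed to `anthropic.Anthropic().messages.create(...)`; at $3 per million input tokens and $15 per million output tokens, the thinking budget is the main lever a developer has over per-query inference cost.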
Across a number of benchmarks, Claude 3.7 — billed as Anthropic’s “most intelligent model to date” — performs largely on par with the competition, which includes OpenAI’s o1 and o3-mini, DeepSeek’s R1 and xAI’s Grok 3. On a software engineering benchmark, the model significantly outperformed o1, o3-mini and R1.
Anthropic at the same time launched a research preview of Claude Code, an agentic coding tool.

While training data, electricity consumption and carbon emissions, as per usual, remain unknown, Anthropic did elect to make Claude’s reasoning process visible to users, a move distinct from competitors like OpenAI and xAI, who have decided to obscure their models’ reasoning processes.

You can read Claude 3.7’s System Card here.

Wharton professor Ethan Mollick called 3.7 “very, very good.”

First, I tried it out with a simple math query that LLMs tend to get wrong: 5.9 - 5.11, a basic equation that, if we can’t solve it in our heads, we trust a calculator to do instantly. Claude initially answered incorrectly, then corrected itself within the same answer.

I’m including its explanation for why this happened, as I feel it sums up, one, the state and reality of LLMs and, two, the impact of the ‘extended thinking’ baked into the model (the model initially said that it made the mistake because of “careless … mental math”):

The fact that CoT reasoning makes these models better is evidence that LLMs are often the wrong tool for the job; significantly boosting the cost of operation to do something a calculator can do is, quite simply, barking up the wrong tree.

As an addition to this, we have, yet again, another entry in our regular benchmark race. On some benchmarks, Anthropic seems to be ahead, until, that is, OpenAI leapfrogs them, then Google leapfrogs them, and so on.

The industry is going in circles.

There is no moat.

Which image is real?
💭 A poll before you go

Thanks for reading today’s edition of The Deep View!

We’ll see you in the next one.

Here’s your view on NEO:

40% of you would buy one, just not quite yet. You want to wait for the tech to get better. 22% would snag one today if it was under $20k. 21% would get one right now, regardless of the price. And 8% would never, ever buy one.

Gonna wait a few years:

Have you tried out the new Claude?

If you want to get in front of an audience of 200,000+ developers, business leaders and tech enthusiasts, get in touch with us here.
|
|
|