The Spyglass Dispatch is a newsletter sent on weekdays featuring links and commentary on timely topics found around the web.
This one is long. Blame Google, with their empty-the-cannons AI launches yesterday. And in general, everyone is clearly rushing to get stuff shipped before year end. That includes me, with my notebook of things to read and note... Which is to say, "if I had more time, I would have written a shorter letter."
I Google...
🤖 Google Unveils A.I. Agent That Can Use Websites on Its Own – 'Mariner' is Google's first true agent in a similar vein to the one recently unveiled by Anthropic. But actually, it sounds more like the still-just-rumored OpenAI 'Operator' in that it's browser-based (as opposed to an app that can also use other apps, like Anthropic's). This makes sense given that Google fully controls Chrome – wait a minute – that may be a bit of an issue, optically, if nothing else. I'm guessing Google won't be fully baking this into Chrome as a result of their antitrust case and proposed remedy, but instead will keep it as an extension (noting that other AI companies, such as OpenAI, have extensions). Regardless, the framing reminds me of early self-driving cars – that is, a person still needs to be in the passenger seat to do some of the tasks. In fact, you apparently can't do anything else in that browser window while Mariner is at work. This guidance should help with the trust element here – especially since it's apparently taking screenshots and uploading them to Gemini in the cloud to analyze in real time, sound familiar? – but it also sounds a bit tedious. [NYT]
🤓 How Google's Project Astra Gives an AI Agent Eyes – Another agent Google showed off alongside the Gemini 2.0 unveil. But whereas 'Mariner' is confined to a web browser, 'Astra' wants to be free, out in the real world. This is not the first time Google has shown this project off, of course. But the world and technology seem more ready for it, perhaps. Again, you'll recall Microsoft's Recall (privacy debacle aside) mixed with Apple's Visual Intelligence mixed with Google's own longstanding work on Lens. While it all works on a smartphone, this is clearly technology meant for Smart/AR Glasses. Google doesn't officially make those – yet.
Speaking of Visual Intelligence, it's the one big part of yesterday's iOS 18.2 roll-out I failed to mention – that's because it's iPhone-only, so I actually haven't been playing around with it in the betas unlike the other features (because I had the beta installed on an iPad). But it's arguably the most interesting AI Apple has released yet. Not just because of their own implementation, which is very Google Lens-y (and uses Google). But more so because it deeply integrates ChatGPT as well right on the main screen, which is... bold for Apple! [Axios]
🕵 Google Rolls Out Faster Gemini AI Model to Power Agents – Are you sensing the theme yet? Agents. Agents. Agents. Agents. The new "2.0" Gemini model is almost an afterthought in story after story. Even though these agents are only in limited testing right now – and, apparently, quite buggy and slow – Google clearly wants to flood the zone here and stake the early mindshare lead in the space, even though many have been talking about agents for much of the past year. Beyond the two mentioned above, 'Deep Research' is a tool that Gemini Advanced (read: paying) users can try right now. I used it to write a book report about myself and it was good. A solid B+. (Marks off for getting a few small details wrong, but I can see why, they're also wrong on the web.) It pulled a couple tidbits that surprised me, in a good way. Another agent, 'Jules', is focused on coding help, while there's also one for helping to play videogames. From the outside, it feels like OpenAI "won" the initial LLM race (even if the benchmarks go back and forth) and also the consumer/developer market thanks to ChatGPT (even if others, like Anthropic's Claude, routinely get better marks). The race for agents is on – and presumably will be more fragmented, with different agents used for different purposes – as Google's own product roadmap can attest! [Bloomberg 🔒]
📸 Gemini 2.0 Flash: An Outstanding Multi-Modal LLM with a Sci-Fi Streaming Mode – Oh look, a story that's actually about the new model itself. Even just this first 'Flash' release is promising 2x speed improvements over Google's previous top-of-the-line model, Gemini 1.5 'Pro'. But they seem most impressed with the streaming API for the new model. "This lets you open up a two-way stream to the model sending audio and video to it and getting text and audio back in real time." This does look pretty killer... [Simon Willison's Weblog]
I Think...
🗣️ Microsoft AI Chief Mustafa Suleyman: Conversational AI is the Next Web Browser – Moving on from Google, but also sort of staying with Google. While Suleyman is sort of a master of giving highly polished, diplomatic answers to nearly every question, Nilay Patel does a nice job pushing back at points to extract a few nuggets. Namely, on Microsoft's widely reported, increasingly contentious relationship with OpenAI, Suleyman does state very directly that while they'll now compete on current generation models, Microsoft will take a backseat to let OpenAI do their frontier work. Now, perhaps that's because OpenAI is already well down that path with their next model, and once it's "current gen", Microsoft will compete there as well – a concept also interesting given the notion of the "AI Wall" with regard to LLM pre-training. Microsoft may be happy to cede that race given that the race may be over! I did find his thoughts on the difference between working at Google vs. Microsoft (noting his current bias, of course) insightful. One takeaway: he clearly views Google as more reactive to the market than Microsoft is. Lastly, his thoughts on conversational AI being "the next browser; this is the next search engine" might help Google in that pesky antitrust case! [Verge]
🦾 Microsoft’s New Sales Pitch for AI: Spend Less Money on Humans – Sticking with Microsoft, it's interesting, but not surprising, that Microsoft (and others) is getting more explicit on this pitch to sell their AI products. It's one thing to try to talk through a demo of what an AI tool can do, which may or may not work, but giving companies a bottom-line savings number will always work. But the key is that the technology has to work, of course. It's early, but seems promising in certain areas right now. The move to agents should only make it more so.
The other key in this pitch: it can't be about firing current workers (I mean, at some level, surely this will happen, but that's bad optically both for the companies and Microsoft, of course) but rather about the headcount you won't need to add going forward, that you otherwise would. Undoubtedly some will still take issue with this since AI is still, in a sense, taking jobs. But these are more theoretical jobs; it's impossible to know if they would have ever been open. And, of course, the high-level notion remains that freeing up such jobs, alongside the AI tools, will make current employees far more productive. [Information 🔒]
🍪 Apple Is Working on AI Chip With Broadcom – The notion that Apple would be using their M-series chips in data centers to help power their AI products always seemed a bit odd and temporary. While powerful (and with their own 'Neural Engine' segment), they obviously weren't built for that purpose, and aren't architected like, say, NVIDIA's GPUs are. As such, Apple working with Amazon's new 'Trainium 2' chips – because they don't want to use NVIDIA's chips for whatever reason – makes sense. But what makes even more sense is for Apple to build their own chips fully tailored for AI. That's apparently what they're now doing with the help of Broadcom. The report is a little evasive in terms of Broadcom's actual role – are they just helping with the linking/networking aspect of the chip, their specialty, or helping more generally, as they do with Google's TPUs right now? (It feels like the former.) Also, it sounds like the sprint to get these chips out the door and into servers (notably for inference, not training, it sounds like) may (once again) delay/end any sort of "Extreme" variant of the M-series chips. Sort of wild how seemingly everyone is working to break NVIDIA's dominance of the market. But hardly a surprise given their position at the moment! Can they fully stay ahead? [Information 🔒]
💰 Mark Zuckerberg’s Meta Donates $1 Million to Trump’s Inaugural Fund – Look, I was fine with the dinner. That seems prudent, if nothing else. This... is a bit much. Quite literally. This isn't something Meta has done in the past. This isn't something most companies do. There's quite obviously no other point than to try to curry favor with the new administration. It's so blatant that maybe they just figured no one would care because no speculation is required here? It is what it is. I'm reminded of a scene from the film Clear and Present Danger (as I seemingly always am on this topic!) when Jack Ryan gives the President some advice: "I would go in the other direction. If a reporter asks if you and Hardin were friends, I'd say, 'no, we were good friends' – if they asked if you're good friends, I'd say, 'no, no, we're lifelong friends' – I would give them no place to go. Nothing to report. No story." We're a long way from Zuckerberg aiming to be "neutral" in politics. That was August. This time I'll quote Taylor Swift – how's that for range in one blurb? – "August slipped away. Like a bottle of wine." [WSJ 🔒]
I Link...
- Google also unveiled their new TPU, 'Trillium', which they say has 4x the training performance of its predecessor chip and was used "100%" to train Gemini 2.0. They also have apparently chained over 100,000 of them together, which seems to be the key number for such clusters these days... [VentureBeat]
- It seems like SiriusXM is in a bit of a rough spot, business-wise. What might this mean for Howard Stern, one wonders – he's arguably worth even more to them now, but with his contract coming up, can they afford him? [THR]
- Alongside other improvements, Yelp's integration with Apple Maps is getting more features, such as requesting a quote. This makes sense since Yelp is even more vital to the product now with Foursquare shutting down their guides... [9to5Mac]
- Sounds like Meet the Parents 4 is a go just in time for the 25th anniversary of the original. Incidentally co-produced by Red Hour Films, where I once worked a few lives ago. [THR]
- Where are we in the AI boom cycle? Exxon Mobil is now working on natural gas power plants specifically for data centers. These would be bespoke plants not connected to the broader grid, which would ease and speed development. [NYT]
- Apple's 2024 App Store Awards are led by Kino, the professional video app, as the overall iPhone app of the year. [MacRumors]
- Phew, a new streaming service, at last. CNBC+ is coming in Q1 2025... Joking aside, it feels like this one could make some sense as it's a good on-in-the-background channel in many an office. Also, given the spin-off from Comcast... [Variety]
- As everyone is well aware by now, M4-powered MacBook Airs are coming soon. What's strange is that code in the latest macOS update confirmed their existence ahead of an actual announcement (early next year). [MacRumors]
- GM's shuttering of Cruise won't just cause them to take a financial hit. Honda, Walmart, SoftBank, and Microsoft (amongst others) now have to write off hundreds of millions of dollars in investments as well. Microsoft alone is doing an $800M write-down. The ouches keep coming. [TechCrunch]
- One non-AI feature of iOS 18.2 that has some people – notably, musicians – excited? Layered recording within the Voice Memos app... "I honestly think this is going to change the way that we make music forever. I really do," says Michael Bublé in the most Michael Bublé way imaginable... [9to5Mac]
I Quote...
"The founding motto has been developing AI with eyes, ears and a voice, helping you in the real or the digital world."
-- Greg Wayne, a researcher at DeepMind, giving the high-level mandate, which Google is clearly now shipping (or at least previewing) on all cylinders. The "real world" element in particular is interesting because Google, alongside Apple (and maybe Samsung), has perhaps a unique advantage thanks to their billion-plus devices in pockets around the world. With the initial wave of LLM training perhaps slowing, that real-world data is undoubtedly going to be crucial for continued advancements. Which is to say: Meta, Microsoft, OpenAI, Anthropic, Amazon, and all the rest might still find themselves at a disadvantage in this new world due to smartphone dominance, once again. Unless some other devices can work – and ship – at scale. Glasses. AI Pins. Etc. (Or, I suppose, if regulators force Apple and Google to open their devices more...)