Tedium - When Robots Aggregate 🤖

Breaking down an AI-aggregation battle.

Hunting for the end of the long tail • June 20, 2024

When Robots Aggregate

The mess between Forbes and Perplexity AI highlights how soulless and extractive aggregation can be in the wrong hands. It’s the wrong direction for LLMs.

Having thought about it, I feel my stance on AI is perhaps not as negative as some of my peers. I think it’s overhyped, sure, but I’m not in full doomer mode. In a narrow context, AI has real benefits.

The problem is, the people implementing and making the technology have let the hype outpace the potential benefits of the technology, while implementing it in ways that clearly undermine underlying systems—and norms. (I hear the CEO of Zoom has 50 meetings today.)

Adobe’s a great example. The company exists to serve creative people, but has recently focused on products apparently designed not for creativity (where I saw some potential early on) but for optimizing the role of creative output so humans are less necessary. AI has been sold as a tool for optimization, for extracting value, when it should be sold as a tool to build upon our creativity, and the average user is very quickly figuring this out.

Recently, a debate has cropped up in journalistic circles that highlights the way tech firms push large language models to replace our creativity, rather than build upon it. Last month, Perplexity AI launched a feature called Pages that promised to be “your new tool for easily transforming research into visually stunning, comprehensive content.”

Essentially, it is a tool for writing miniature book reports about what you’ve been researching. It is a robot that makes the SEO-traffic-generating pages itself. Which sounds like the AI replacing the work of actual people, and extracting their value in the process.

This was bad enough, but then an editor at Forbes noticed that, a mere week after this feature launched, the company had taken a long, paywalled report about former Google CEO Eric Schmidt’s interest in drones and essentially rewritten it, with the credit buried in a way that would discourage people from clicking further. And that piece got a lot of traffic, little of which, if any, went to Forbes.

Debates about hiding sourcing in aggregation aren’t new. As I wrote about five years ago, modern-day journalistic gadfly Michael Wolff upset fellow journalists with his aggregation site Newser, which made little effort to link its sources and ultimately extracted real reporting resources to support its own traffic.

In many ways, Perplexity has invented an automated version of circa-2010 Newser. And Forbes was pissed. Perplexity CEO Aravind Srinivas tried to defend what the company did, writing, “It has rough edges, and we are improving it with more feedback.”

But Forbes isn’t having it, sending a formal legal threat to the company this week.

Now to be clear, aggregation was always something of an ethical black hole. Many of your favorite writers started with aggregation, this writer included. Blogging, at its heart, is taking information you found elsewhere (through other links, via experiences, and so on) and aggregating it for your readership. Sites like Gawker gained most of their traffic through aggregation, but they did so with style.

Journalists who aggregate—at least the ones who care about their jobs—tend to know the limits to what they can take, as well as the role that their individual perspective brings to the content. (Some might point out that Forbes spent the 2010s focused less on its own quality work than on similar race-to-the-bottom endeavors, but the report taken by Perplexity’s book-report engine was very much not that.)

Aggregation can be done well, and creatively, in a way that benefits both the developer of the aggregation and the reporting. I’d like to think my old site ShortFormBlog, with its prominent “source” links, was additive aggregation at heart.

But what Perplexity AI is doing is not that, and its “rough edges” defense highlights that point neatly. It is essentially taking information from other sources, re-reporting the information using AI, minimizing outside links, siphoning the traffic, and blunting any criticism of what it has built by calling the problems “rough edges.” It’s not additive. It’s not like the bots are doing interviews, commenting on the report, or highlighting unexpected perspectives. They are literally reporting the thing, no more, no less.

Despite using large language models built to generate content, it doesn’t make anything. In fact, it ignores the guardrails of the internet: per Wired, the tool doesn’t even respect robots.txt.

Admittedly, the whims of the algorithm stymie human writers, too. As friend of Tedium Jason Koebler wrote at 404 Media, the rush to cover whatever Elon Musk is doing shows human journalists’ tendency to lean on low-hanging-fruit traffic built on the thinnest of ideas. This stuff gets traffic without a lot of effort. As Koebler wrote:

This is not to denigrate the journalists who are writing these articles. Most of them are working in large companies where their bosses’ bosses’ bosses are trying to squeeze the remaining blood out of a programmatic advertising traffic stone that has been near dry for years and threatens to evaporate entirely as AI answers replace normal search results and social media further frays. Most of their bosses’ bosses’ bosses probably hope to replace these human beings who are writing Elon Musk tweets a thing articles with an AI generating Elon Musk tweets a thing articles.

Stuff like Perplexity’s Pages is how we get to the bosses’ bosses’ dream of AI-generated Elon Musk tweets a thing articles. But more importantly, it’s also how we turn the fruit of months-long reporting endeavors into computer-generated articles that produce the information with none of the flair (not even its own flair) and none of the work.

If we’re going to embrace what large language models do, we should have higher standards or choose not to use them at all.

Human-Generated Links

Jay Hoffman’s excellent The History Of The Web has a great piece on the rise of SEO and the trademark controversy that nearly put the common term in a single person’s ownership.

If, for some reason, you are still unconvinced of the brilliance of Weird Al Yankovic as a satirist and parodist, I present “[This Song's Just] Six Words Long,” which has a chorus that’s seven words long. That’s a direct reference to its source material—George Harrison’s cover of “Got My Mind Set On You,” a song with six words in the title, but seven words in the chorus.

Harry McCracken, a fellow fan of 1994 tech, published an excellent piece about the multimedia CD-ROM’s coming-out party. Encarta was such a trip.

I am starting a quiet campaign to get Threads to support following people in the fediverse. Who’s going to join me?


Copyright © 2015-2024 Tedium, all rights reserved.
