Tedium - When Robots Aggregate 🤖

Breaking down an AI-aggregation battle.

Hunting for the end of the long tail • June 20, 2024

When Robots Aggregate

The mess between Forbes and Perplexity AI highlights how soulless and extractive aggregation can be in the wrong hands. It’s the wrong direction for LLMs.

Having thought about it, I feel my stance on AI is perhaps not as negative as some of my peers. I think it’s overhyped, sure, but I’m not in full doomer mode. In a narrow context, AI has real benefits.

The problem is, the people implementing and making the technology have let the hype outpace the potential benefits of the technology, while implementing it in ways that clearly undermine underlying systems—and norms. (I hear the CEO of Zoom has 50 meetings today.)

Adobe’s a great example. The company exists to serve creative people, but has recently focused on products apparently designed not for creativity (where I saw some potential early on) but for optimizing the role of creative output so humans are less necessary. AI has been sold as a tool for optimization, for extracting value, when it should be sold as a tool to build upon our creativity, and the average user is very quickly figuring this out.

Recently, a debate has cropped up in journalistic circles that highlights the way tech firms push large language models to replace our creativity, rather than build upon it. Last month, Perplexity AI launched a feature called Pages that promised to be “your new tool for easily transforming research into visually stunning, comprehensive content.”

Essentially, it is a tool for writing miniature book reports about what you’ve been researching. It is a robot that makes the SEO-traffic-generating pages itself. Which sounds like the AI replacing the work of actual people, and extracting their value in the process.

This was bad enough, but then an editor at Forbes noticed that, a mere week after this feature launched, Perplexity had taken a long, paywalled report about former Google CEO Eric Schmidt’s interest in drones and essentially rewritten it, with the credit buried in a way that would discourage people from clicking further. And that piece got a lot of traffic, little of which, if any, went to Forbes.

Debates about hiding sourcing in aggregation aren’t new. As I wrote about five years ago, modern-day journalistic gadfly Michael Wolff upset fellow journalists with his aggregation site Newser, which made little effort to link its sources and ultimately extracted real reporting resources to support its own traffic.

In many ways, Perplexity has invented an automated version of circa-2010 Newser. And Forbes was pissed. Perplexity CEO Aravind Srinivas tried to defend what the company did, writing, “It has rough edges, and we are improving it with more feedback.”

But Forbes isn’t having it, sending a formal legal threat to the company this week.

Now to be clear, aggregation was always something of an ethical black hole. Many of your favorite writers started with aggregation, this writer included. Blogging, at its heart, is taking information you found elsewhere (through other links, via experiences, and so on) and aggregating it for your readership. Sites like Gawker gained most of their traffic through aggregation, but they did so with style.

Journalists who aggregate, at least the ones who care about their jobs, tend to know the limits of what they can take, as well as the value their individual perspective brings to the content. (Some might point out that Forbes spent the 2010s focused less on its own quality work than on similar race-to-the-bottom endeavors, but the report taken by Perplexity’s book-report engine was very much not that.)

Aggregation can be done well, and creatively, in a way that benefits both the aggregator and the original reporting. I’d like to think my old site ShortFormBlog, with its prominent “source” links, was additive aggregation at heart.

But what Perplexity AI is doing is not that, and its “rough edges” defense highlights that point neatly. It is essentially taking information from other sources, re-reporting the information using AI, minimizing outside links, siphoning the traffic, and brushing off any criticism of what it has built as “rough edges.” It’s not additive. It’s not like the bots are doing interviews, commenting on the report, or highlighting unexpected perspectives. They are literally reporting the thing, no more, no less.

Despite using large language model tools built to generate content, it doesn’t make anything. In fact, per Wired, it skirts the guardrails of the internet: the tool doesn’t even respect robots.txt.
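For anyone wondering what respecting robots.txt looks like in practice, here is a minimal sketch using Python’s standard-library urllib.robotparser. To be clear, this is an illustration and not how Perplexity (or any real crawler) is built: the robots.txt rules, the bot names ExampleAIBot and SomeOtherCrawler, and the publisher.example URLs are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary publisher.
# The rules and bot names are assumptions for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so nothing is fetched over the network.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://publisher.example/reports/drones"
    for agent in ("ExampleAIBot", "SomeOtherCrawler"):
        verdict = "may fetch" if can_fetch(agent, url) else "must skip"
        print(f"{agent} {verdict} {url}")
```

A well-behaved crawler runs a check like this before every request and walks away when the answer is no. Nothing technically enforces it, which is the whole point: robots.txt is a norm that only works if the bots choose to honor it.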

Admittedly, the whims of the algorithm stymie human writers, too. As friend of Tedium Jason Koebler wrote at 404 Media, our tendency to get caught up in whatever Elon Musk is doing shows human journalists’ tendency to lean on low-hanging-fruit traffic built on the thinnest of ideas. This stuff gets traffic without a lot of effort. As Koebler wrote:

This is not to denigrate the journalists who are writing these articles. Most of them are working in large companies where their bosses’ bosses’ bosses are trying to squeeze the remaining blood out of a programmatic advertising traffic stone that has been near dry for years and threatens to evaporate entirely as AI answers replace normal search results and social media further frays. Most of their bosses’ bosses’ bosses probably hope to replace these human beings who are writing Elon Musk tweets a thing articles with an AI generating Elon Musk tweets a thing articles.

Stuff like Perplexity’s Pages is how we get to the bosses’ bosses’ dream of AI-generated Elon Musk tweets a thing articles. But more importantly, it’s also how we turn the fruit of months-long reporting endeavors into computer-generated articles that produce the information with none of the flair (not even its own flair) and none of the work.

If we’re going to embrace what large language models do, we should have higher standards or choose not to use them at all.

Human-Generated Links

Jay Hoffman’s excellent The History Of The Web has a great piece on the rise of SEO and the trademark controversy that nearly put the common term in a single person’s ownership.

If, for some reason, you are still unconvinced of the brilliance of Weird Al Yankovic as a satirist and parodist, I present “[This Song's Just] Six Words Long,” which has a chorus that’s seven words long. That’s a direct reference to its source material—George Harrison’s cover of “Got My Mind Set On You,” a song with six words in the title, but seven words in the chorus.

Harry McCracken, a fellow fan of 1994 tech, published an excellent piece about the multimedia CD-ROM’s coming-out party. Encarta was such a trip.

I am starting a quiet campaign to get Threads to support following people in the fediverse. Who’s going to join me?

Copyright © 2015-2024 Tedium, all rights reserved.