Tedium - Spooky Database Relations 👻

The most boring Halloween story ever.

Hunting for the end of the long tail • October 27, 2023

Today in Tedium: Love them or hate them, relational databases are one of the most fundamental building blocks of modern computing. Organizing data in tables and accessing it later on? While it often gets more complex than that—we’ve, after all, figured out numerous ways to access data over the years—databases work largely like this, all these years later. They are boring, but they make lots of money. After all, there’s a reason Larry Ellison is the fourth-richest person in the world. When you mention databases, you do not think about Halloween. But yet … it turns out that one of the most famous examples of a database bug is named after Halloween. Today’s Tedium explains why. — Ernie @ Tedium

If you think about it, a pumpkin patch is kind of like a relational database. (Marius Ciocirlan/Unsplash)

The boring database problem that showed itself on Halloween

It was normal day in a dark, spooky cubicle in 1976 when a group of database engineers working at the IBM Almaden Research Center discovered a problem with the database they were attempting to test.

On the surface, the query was fairly straightforward: They set up a query in the IBM’s designed to increase the salary for every employee making less than $25,000 by 10 percent. On the surface, the database change seemed to work, and reported no errors. But like a shadowy figure dressed up like a skeleton, appearances were deceiving.

See, it turns out that the query worked too well. It didn’t just update the salaries by 10 percent—it kept looping through the database, continually increasing every modest salary by 10 percent until everyone made at least $25,000 a year. Essentially, the database was not going through once, as intended, but was pushing the change to the front of the queue, ensuring it would keep hitting it again and again until it was no longer valid in the system.

In the real world, a mistake of this nature would be a massive windfall for employees—but a nightmare for executives. Nonetheless, the date the mistake was discovered immortalized the database problem as the Halloween problem.

In a retrospective joint interview dating to 1995, Mike Blasgen, one of the systems managers who worked at IBM Almaden back in the 1970s, stated that it was one of many phantoms that were discovered during this era:

It’s interesting because all these odd-ball things had names: there were phantoms, and there were other things, and those had to do with names that were somehow representative of what you were observing, right? So the phantom was because it was something that was sort of there, but not there; the name was descriptive. And this was called the Halloween problem not because it surprises you, or it’s spooky, or trick-or-treat or anything; this is because it happened to be discovered on Halloween day. But I think most people think it’s the other; I think most people think it’s called Halloween because it’s so surprising. But it’s not.

Don Chamberlin, who was part of the team that uncovered the problem, noted that the term took on a life of its own.

“It’s famous in the industry—everybody knows the Halloween problem,” he recalled.

Over the years, this problem, thanks in part to how early it happened in the history of technology and where it happened, evolved from a silly example into a classic discussion point of SQL database books. It’s a great way to touch on a whole lot of problems with querying databases thoughtlessly.

And in many ways, the nature of the Halloween problem is relatively easy to understand for a database bug, which is why it sometimes shows itself in mainstream popular culture—albeit not under its proper name. For example, the movie Office Space is built around a failed attempt at white-collar crime that failed because it ran into a problem very similar to the Halloween problem. (To explain it I’m going to have to spoil the movie, but you’ve had 25 years to watch it.)

Late in the film, the three main characters—Peter Gibbons, Michael Bolton, and Samir Nagheenanajar—decide to exact revenge on their company Initech by infecting the company’s accounting system with a virus that removes fractions of pennies from the company’s systems and puts it into a bank account. It’s intended to work slowly over a long period, but because of a bug in the code, the virus ends up taking more than $300,000 in the span of a single weekend—a classic example of overquerying if you’ve ever seen one.

Suddenly, what was intended to be a passive, nearly invisible database query had instead hit the system hundreds of millions of times in just a few days.

While Bolton (David Herman) suggests he might have messed up a decimal point, a query error in a handwritten virus almost feels more likely, knowing what we know about the Halloween problem.

“I always do that. I always mess up some mundane detail,” he says.

(Good thing the stapler enthusiast Milton set the office on fire, ensuring all records of the gaffe were lost in the blaze.)

“It happened to be on a Friday, and we said, ‘Listen, we are not going to be able to solve this problem this afternoon. Let’s just give it a name. We’ll call it the Halloween Problem and we’ll work on it next week.’ And it turns out it has been called that ever since.”

— Don Chamberlin, one of the key figures involved in the Halloween problem saga, on how the term got its name. “It has absolutely nothing to do with Halloween,” he said in the 2001 oral history. In other words, they literally named it after Halloween because it was inconvenient to fix at the time they had discovered it.

Don’t let the candy corn fool you! This is still an article about troubling flaws in relational databases. I’m putting this here to remind you of what a Halloween-themed article should actually be about. (sambeawesome/Pixabay)

Why the Halloween problem is kind of a headache, even if it’s not particularly scary

So now that we’ve laid out that this is the most boring Halloween-related thing to happen, ever, let’s talk about why it’s a problem for database management, in layperson terms.

Essentially, because you cannot just straight-query an SQL database when trying to change a number of fields, lest you risk changing data you don’t want to change, you have to attack it using another method. The problem is, these methods are generally more complex than the obvious one, and that means they can either be slower or harder to implement.

In a 2013 piece on the website SQLPerformance.com, author Paul White explained the many nuances of this problem, as well as the challenges it creates from a performance standpoint to query an SQL database without changing data accidentally.

With regards to the UPDATE query, the approach used in the classic 1976 example, there are a couple of ways to do this, per White:

  • Query the database using a read-only search, then change only the records that are affected, doing an analysis on the backend. “Implementing these three phases literally in a database engine would produce correct results, but performance might not be very good,” White writes. “The intermediate results at each stage will require system memory, reducing the number of queries the system can execute concurrently. ”

  • Only do updates one row at a time, to ensure that changes are isolated from one another. While more performant, this process is more complex, per White: “The challenge for the query optimizer is to find an iterative (row by row) execution plan that satisfies the UPDATE semantics required by the SQL standard, while retaining the performance and concurrency benefits of pipelined execution.”

(If you want to get into the weeds of this concept, I recommend White’s four-part series, as it explains the nuances of this better than I can. Warning though: It may go over your head.)

Perhaps it’s for this reason that attempts have been made over the years to mitigate the problem. In particular, Microsoft has a number of patent filings to its name that exist specifically to mitigate the Halloween problem. (Which makes sense, as Microsoft sells an SQL server tool.) The first, granted in 2000, notes that the goal is to simply avoid the problem outright:

There exists a need for a more efficient approach to avoiding the Halloween problem. In particular, there exists a need for a query optimizer that is capable of interleaving record-at-a-time pipelining and set-at-a-time pipelining in order to achieve more efficient performance by permitting as many efficient update plans as possible while still maintaining the correct semantics of search and then update.

Over the years, Microsoft has developed a variety of solutions to take on this issue. In 2008, the company’s Craig Freedman, who is responsible for some of the patents credited to Microsoft, noted that SQL Server has to separate processes to ensure that changes of this nature occur with a minimum of damage, a concept called “Halloween protection.”

“Logically speaking, SQL Server must execute the read cursor and write cursor of an update plan in two separate steps or phases,” he wrote. “To put it another way, the actual update of rows must not affect the selection of which rows to update.”

To put it all another way, the Halloween problem is essentially the result of a common database architecture working contrary to the way you’d logically expect it to, creating numerous extra hoops along the way.

Talk about cursed.

So, it took us nearly nine years, but we finally found an example of Halloween culture that is truly scary—because of how dull and boring it is. It is devoid of any true cultural relevance to the holiday.

The best you could hope for is that maybe a black cat walked into the IBM Almaden Research Center on that fateful day in 1976 and gave the database query a fateful stroke of bad luck, but something tells me the vibe of the building is less Hocus Pocus, more Severance.

When I told my wife Cat about this story, she told me that a small part of her died as I explained it, that the idea that something as interesting as Halloween could be associated with something so inherently dull and boring as a database querying error just feels antithetical to the spirit of this costume-friendly hobby. Simply put, you cannot dress up a logic gap in database design to be much more friendly than it already is.

How you feel, knowing that you just read an entire article about relational databases. (Brian Wilkins/Flickr)

That may be the real trick. Whether or not it was the true goal, the Halloween problem, by not actually being about the holiday for which it is named, is far scarier than any man with a knife in a reconstituted William Shatner mask.

It is what happens when hope and terror give way to true drudgery. It is truly something to fear when something borrows an association but absolutely none of the spirit that creates that thing.

It’s enough to make you want to drop a table.

--

Find this one an interesting read? Share it with a pal!

Share this post:

follow on Twitter | privacy policy | advertise with us

Copyright © 2015-2023 Tedium, all rights reserved.

Disclosure: From time to time, we may use affiliate links in our content—but only when it makes sense. Promise.

unsubscribe from this list | view email in browser | sent with Email Octopus

Older messages

Permanent Unicorn 💻

Friday, October 27, 2023

Apple needs to recreate the magic of the M1 MacBook Air every year. Here's a version for your browser. Hunting for the end of the long tail • October 26, 2023 Permanent Unicorn The reason Apple

Enemies At The Gates 🚪

Wednesday, October 25, 2023

Gatekeepers, too often, hold back our potential. Let's fix that. Here's a version for your browser. Hunting for the end of the long tail • October 25, 2023 Enemies At The Gates We're at an

Yes, He Did That 🥸

Thursday, October 19, 2023

Jaleel White was just 12 when he invented Steve Urkel. 12! Here's a version for your browser. Hunting for the end of the long tail • October 14, 2023 Editor's note: So, I'm on vacation for

The Post Of Revelations 🪕

Monday, October 9, 2023

A single post from Sufjan Stevens reshapes his new album. Here's a version for your browser. Hunting for the end of the long tail • October 09, 2023 The Post Of Revelations Sufjan Stevens shares a

Tune-In 📺

Saturday, October 7, 2023

An interview with the mind behind the infamous Turn-On. Here's a version for your browser. Hunting for the end of the long tail • October 06, 2023 Today in Tedium: More than 50 years ago, in the

You Might Also Like

Corporate Casserole 🥘

Monday, November 25, 2024

How marketing and lobbying inspired Thanksgiving traditions. Here's a version for your browser. Hunting for the end of the long tail • November 24, 2024 Hey all, Ernie here with a classic

WP Weekly 221 - Bluesky - WP Assets on CDN, Limit Font Subsets, ACF Pro Now

Monday, November 25, 2024

Read on Website WP Weekly 221 / Bluesky Have you joined Bluesky, like many other WordPress users, a new place for an online social presence? Also in this issue: CrawlWP, Asset Management Framework,

🤳🏻 We Need More High-End Small Phones — Linux Terminal Setup Tips

Sunday, November 24, 2024

Also: Why I Switched From Google Maps to Apple Maps, and More! How-To Geek Logo November 24, 2024 Did You Know Medieval moats didn't just protect castles from invaders approaching over land, but

JSK Daily for Nov 24, 2024

Sunday, November 24, 2024

JSK Daily for Nov 24, 2024 View this email in your browser A community curated daily e-mail of JavaScript news JavaScript Certification Black Friday Offer – Up to 54% Off! Certificates.dev, the trusted

OpenAI's turbulent early years - Sync #494

Sunday, November 24, 2024

Plus: Anthropic and xAI raise billions of dollars; can a fluffy robot replace a living pet; Chinese reasoning model DeepSeek R1; robot-dog runs full marathon; a $12000 surgery to change eye colour ͏ ͏

Daily Coding Problem: Problem #1618 [Easy]

Sunday, November 24, 2024

Daily Coding Problem Good morning! Here's your coding interview problem for today. This problem was asked by Zillow. Let's define a "sevenish" number to be one which is either a power

PD#602 How Netflix Built Self-Healing System to Survive Concurrency Bug

Sunday, November 24, 2024

CPUs were dying, the bug was temporarily un-fixable, and they had no viable path forward ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌

RD#602 What are React Portals?

Sunday, November 24, 2024

A powerful feature that allows rendering components outside their parent component's DOM hierarchy ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌

C#533 What's new in C# 13

Sunday, November 24, 2024

Params collections support, a new Lock type and others ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌

⚙️ Smaller but deeper: Writer’s secret weapon to better AI

Sunday, November 24, 2024

November 24, 2024 | Read Online Ian Krietzberg Good morning. I sat down recently with Waseem Alshikh, the co-founder and CTO of enterprise AI firm Writer. Writer recently made waves with the release of