[Last Week in AWS Extras]: An AWS Database Safari

 

In today's newsletter, I take you on a journey through the various database offerings AWS has, and give my thoughts about each of them. There are a lot! You may complain that I didn't hit your favorite, because anything is a database if you hold it wrong. That's okay; I'm sure some unscrupulous database vendor will be along shortly to dry your tears.

 

As always, should you wish to link someone else to this post, you can forward it, or else view it on the web.

 

 

Have you heard about ChaosSearch, the fully managed log analytics platform that leverages your Amazon S3 as a data store? According to the CTO at Armor, a global cybersecurity company with more than 1,000 customers in 42 countries, “ChaosSearch is a critical piece of our infrastructure for processing tens of terabytes per day of our customers’ log data.” And at Hubspot, the Engineering Lead said “We are able to process and analyze 10's of terabytes a day of Cloudflare log data to identify and fend off DDoS attacks on behalf of our customers at a fraction of the cost of our previous self-hosted ELK Stack.” So take it from me, or take it from them - either way, take a look at ChaosSearch today! Sponsored

 

 

An AWS Database Safari

When we talk about vendor lock-in, one of the most common stories we see is one of databases. The database you pick to hold your data is something you're going to be using for a good long while; migrations are painful, expensive, time-consuming, and, in some cases, barely possible.

 

Amazon themselves ran on top of Oracle for a long time. Despite extreme incentive to get off of it as fast as possible, it took them years and inventing their own database engine to do it successfully.

 

If you take a look at AWS's offerings for databases, you’ll see ... a lot of options. Let's ignore their RDS databases, of which there are many. (There are five database engines to choose from, and their Aurora variants speak two of those engines’ languages fluently. Then we add in two more for Aurora's Serverless options, bringing this thing we're handwaving away to nine database options already.) Why does AWS have so many database options?

 

The simple answer is that different databases support different use cases. After all, "every AWS product is for somebody, and no AWS product is for everybody."

 

Picking a database is a "one-way door," as Amazonians like to call it. It's painful and annoying to migrate databases even between different versions of the same engine. When you pick a database, you're making a commitment—whether you know it or not.

 

Let's start simple with...

Amazon Redshift

Redshift is an opinionated version of PostgreSQL that's designed for data warehouse projects. To wit, it's imagined that this is used for relational workloads that are likely to hit petabyte scale. Don't let the pricing fool you; you're not going to run just one of these suckers.

Amazon Athena

More cost-effective is Athena. This uses a variant of the Presto engine for running SQL queries against data that lives in S3. It's way, way, way cheaper to store data inside of S3 than it is to shove it onto disks attached to relational data stores, and this even works well for ad-hoc querying, too.

 

The challenge is that query performance and response latency are nondeterministic, meaning you may not want to have this hooked up to anything interactive in a web form. As Redshift begins to speak to S3 more effectively, the lines between the previous two database options blur.

SimpleDB

You might think SimpleDB has been discontinued. You'd be wrong. It's not in the AWS console because it never has been. Until a couple of weeks ago, there was an AWS job posting for the SimpleDB team in Chennai. SimpleDB is still there, and AWS gives every appearance that it's not going anywhere. That said, unless you're already using it, you probably don't want to start. The best guidance is to instead consider...

DynamoDB

DynamoDB is an interesting take on a NoSQL database. If you know what your queries are going to look like in advance, it's hard to beat. It offers a key-value store but can also masquerade as a document store (more about those in a bit).
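To make "know your queries in advance" concrete, here's a toy sketch in plain Python. It is not the DynamoDB API; `ToyTable`, the `pk`/`sk` names, and the `CUSTOMER#`/`ORDER#` key scheme are all hypothetical, picked only for illustration. The point is the shape: questions you planned for map to a cheap lookup by key, and questions you didn't plan for turn into a walk over everything.

```python
from collections import defaultdict


class ToyTable:
    """Hypothetical in-memory stand-in for a table keyed by partition key + sort key."""

    def __init__(self):
        # partition key -> {sort key: item}
        self._partitions = defaultdict(dict)

    def put(self, pk, sk, item):
        self._partitions[pk][sk] = item

    def query(self, pk, sk_prefix=""):
        # The planned-for path: jump straight to one partition, then filter
        # its items by sort-key prefix, returned in sort-key order.
        rows = self._partitions.get(pk, {})
        return [rows[sk] for sk in sorted(rows) if sk.startswith(sk_prefix)]

    def scan(self, predicate):
        # The unplanned path: look at every item in every partition.
        return [
            item
            for rows in self._partitions.values()
            for item in rows.values()
            if predicate(item)
        ]


table = ToyTable()
table.put("CUSTOMER#42", "ORDER#2020-06-01", {"total": 30})
table.put("CUSTOMER#42", "ORDER#2020-06-15", {"total": 12})
table.put("CUSTOMER#42", "PROFILE", {"name": "Pat"})
table.put("CUSTOMER#7", "ORDER#2020-06-02", {"total": 99})

# "All orders for customer 42" was designed into the keys, so it's one lookup.
orders = table.query("CUSTOMER#42", sk_prefix="ORDER#")

# "All orders over 20, for anyone" was not, so it reads the whole table.
big = table.scan(lambda item: item.get("total", 0) > 20)
```

If most of your questions look like the second one, that's a hint that a key-value store may not be the right fit.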

 

It's inexpensive when configured properly, its responsiveness is impressive, and you have no compute infrastructure to manage. But you do need to make your peace with the unfortunate fact that since it is proprietary, anything you're using in DynamoDB is unlikely to work anywhere else. A migration off of AWS therefore means you're also rearchitecting your DynamoDB data stores—and the applications that interface with them.

Amazon Neptune

Neptune is a managed graph database, which is an accurate answer that tells you absolutely nothing. Graph databases are great at returning results that highlight relationships. "This user is friends with the following users" is the canonical example, because basically anything else requires 80 pages to explain. My position is, and steadfastly remains, this: "If you need a graph database, you almost certainly know it, and Neptune is on the table; otherwise, move along."
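For the canonical example, a rough sketch of the kind of question a graph database answers well. This is plain Python over an adjacency map, not Gremlin or SPARQL (the query languages Neptune speaks); the `befriend` and `friends_of_friends` helpers and the user names are made up for illustration.

```python
from collections import defaultdict

# user -> set of direct friends (an undirected "friends with" edge)
friends = defaultdict(set)


def befriend(a, b):
    """Record a mutual friendship between two users."""
    friends[a].add(b)
    friends[b].add(a)


def friends_of_friends(user):
    """People exactly two hops away: friends of my friends I don't already know."""
    direct = friends.get(user, set())
    candidates = set()
    for friend in direct:
        candidates |= friends[friend]
    return candidates - direct - {user}


befriend("alice", "bob")
befriend("bob", "carol")
befriend("carol", "dave")

suggestions = friends_of_friends("alice")
```

Two hops is easy. The case for a real graph engine shows up when the hop count grows or the edges come in many types, which is roughly where the 80 pages begin.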

Amazon QLDB

QLDB, or Quantum Ledger Database, is a database engine that arose from the question "What if we needed a blockchain but without all of the hype-driven nonsense that makes blockchains ridiculous for most use cases?"

 

In other words, if you need a ledger-style database but can trust a central authority, QLDB is for you. If you can't trust a central authority, Amazon Managed Blockchain might be a better answer. But let's face it: At that point, you're almost certainly past trusting a cloud provider, aren't you?
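The "ledger, minus the blockchain" idea fits in a few lines. What follows is a toy illustrating the general hash-chained journal technique, not QLDB's implementation or API; `ToyLedger`, `entry_hash`, and the all-zero genesis value are names and choices assumed purely for the sketch. Each entry's hash covers the previous entry's hash, so quietly rewriting history breaks the chain.

```python
import hashlib
import json


def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class ToyLedger:
    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []  # list of (payload, recorded_hash), append-only by convention

    def append(self, payload):
        prev = self.entries[-1][1] if self.entries else self.GENESIS
        self.entries.append((payload, entry_hash(prev, payload)))

    def verify(self):
        # Recompute every hash from the start; any edited payload shows up
        # as a mismatch against the hash recorded at write time.
        prev = self.GENESIS
        for payload, recorded in self.entries:
            if entry_hash(prev, payload) != recorded:
                return False
            prev = recorded
        return True


ledger = ToyLedger()
ledger.append({"account": "A", "delta": 100})
ledger.append({"account": "A", "delta": -40})
intact = ledger.verify()

# Tamper with history while keeping the old recorded hash.
ledger.entries[0] = ({"account": "A", "delta": 1000000}, ledger.entries[0][1])
tampered_ok = ledger.verify()
```

Note what this does and doesn't buy you: whoever holds the ledger can still rewrite it and recompute every hash, which is exactly why the central authority has to be trusted.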

Amazon Timestream

While we're delving into the realm of fantasy, let's look at Timestream, their take on a time series database. This isn't an objectively nutty thing to want; my criticism of it comes from the fact that it was announced at re:Invent (AWS’s own version of Cloud Next) 2018, and over a year-and-a-half later, it has yet to enter either public preview or general availability. Time series databases (like InfluxDB) are great at displaying data over time. Metrics, logs, events—and frequently at incredibly high volume. This is a big deal not just in application monitoring but also in the world of IoT. Devices in the field reporting vast quantities of data back to a central point will often look for something in the time series space.
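For a sense of the bread-and-butter operation in this space, here's a minimal sketch of downsampling: rolling raw readings up into fixed time windows. It's plain Python with invented sample data, and `downsample` is a hypothetical helper, not anything from Timestream or InfluxDB.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) readings into fixed-size time windows.

    Timestamps are assumed to be integer seconds; each window is labeled
    by its start time.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - (timestamp % window_seconds)
        buckets[bucket_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical sensor readings: (seconds since start, temperature)
readings = [(0, 10.0), (30, 20.0), (60, 40.0), (90, 60.0), (125, 5.0)]

per_minute = downsample(readings, window_seconds=60)
```

A dictionary in memory handles five points fine. The reason dedicated engines exist is that the same rollup has to keep up with a fleet of devices writing constantly.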

Amazon ElastiCache

ElastiCache has two variants (Redis and Memcached), both of which serve as an in-memory database. This means incredibly quick response times are available, since the query never has to touch the disk. This is thus generally used for keeping session data around.
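The session-data pattern is simple enough to sketch. This is a toy in-process store, not the Redis or Memcached protocol; `ToySessionStore`, its `ttl_seconds` parameter, and the injectable `clock` are assumptions made for illustration (the fake clock just lets the example expire a session without sleeping).

```python
import time


class ToySessionStore:
    """Hypothetical in-memory session store with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # session id -> (expires_at, session dict)

    def set(self, session_id, session):
        self._data[session_id] = (self._clock() + self._ttl, session)

    def get(self, session_id):
        entry = self._data.get(session_id)
        if entry is None:
            return None
        expires_at, session = entry
        if self._clock() >= expires_at:
            # Expired: drop it lazily on read.
            del self._data[session_id]
            return None
        return session


now = [0.0]  # fake clock so the example is deterministic
store = ToySessionStore(ttl_seconds=30, clock=lambda: now[0])

store.set("abc123", {"user": "pat"})
fresh = store.get("abc123")   # still within the 30-second TTL

now[0] = 31.0
stale = store.get("abc123")   # past the TTL, so the session is gone
```

Everything here lives in one dictionary in one process, which is the whole speed story and the whole durability story at once.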

 

A very common use case is having multiple web servers behind a load balancer, but sharing the session data so that users don't have to log in again every time the load balancer gives them to a different server. The risk, of course, is that since the data isn't persisted to disk, you're one power outage away from data loss. Speaking of data loss...

Amazon DocumentDB (with MongoDB compatibility)

MongoDB is a storied database that achieves some great things. Unfortunately, it also likes to emphasize performance at the potential cost of data integrity and historically has done so by burying some very important caveats deep within their documentation.

 

That said, what makes DocumentDB interesting to me is that AWS does no real marketing of the service past talking about how compatible with MongoDB 3.6 it is. My takeaway from that message is this: "If you want to run MongoDB in an AWS environment, consider this." Based upon MongoDB's community interactions, I don't want to run it at all, so I don't spend much time paying attention to DocumentDB, either. If you're less judgmental of MongoDB than I am, this is worth a gander.

Amazon Keyspaces (for Apache Cassandra)

Similarly, the AWS documentation for Keyspaces falls far short. It spends most of its energy talking about how compatible with Cassandra it is—albeit in ways that sound suspiciously like DynamoDB.

 

It's worth noting that one of the Dynamo paper authors later went on to build Cassandra. If I wanted to use a managed service but still have a theoretical database exodus strategy I could fall back to, I'd consider this as my first stop on the path.

Amazon Route 53

Lastly, we come to my favorite database: Route 53.

 

You might argue that DNS isn't a database; I would argue that an eventually consistent world-spanning key-value store that (in this case) offers a 100% uptime SLA is awfully hard to see as anything other than a database. I will accept that I'm in the minority opinion (for now!), but I will highlight that an awful lot of what people are misusing S3 for could just as easily be done well with Route 53.

 

This concludes my survey through AWS's database offerings. Unfortunately, this will rapidly go out of date; there are multiple job postings on AWS's site looking for people to work on "unreleased database products," so it's pretty clear that I may have to revisit this topic after re:Invent (AWS’s own version of Cloud Next) unless I want to watch it age badly.

 

 

Definitive Guide to AWS EKS Security - Download eBook

When using Amazon’s Elastic Kubernetes Service (EKS), you must understand which pieces of the security management role fall on you. Use this 42-page eBook from StackRox to learn about EKS cluster security, including the standard controls and best practices for minimizing the risk around cluster workloads, as well as specific requirements for securing an EKS cluster and its associated infrastructure. Sponsored

 
 
 
Corey

I’m Corey Quinn

I help companies address their horrifying AWS bills by both reducing the dollars spent and helping them understand what they’re paying for.

 
 
The Cloud

Screaming in the Cloud & AWS Morning Brief

In addition to this newsletter, I host two podcasts: Screaming in the Cloud, about the business of cloud computing, featuring me talking to folks who are good at things; and AWS Morning Brief, a show about exclusively AWS with my snark at full-tilt.

 
 

Sponsor an Issue

Reach over 19,000 discerning engineers, managers, and enthusiasts who actually care about the state of Amazon's cloud ecosystems.

 




 

Duckbill Group

1728 Ocean Ave #307, San Francisco, CA 94112

 
                                                           
