The Five Important Trends in Data, and the One Megatrend Powering Them All

Jul 30, 2020 05:00 pm

Yesterday, Dremio hosted the Subsurface Conference, the first conference on cloud data lakes. More than 5000 people registered, and more than 2500 attended. If one had doubts that cloud data lakes are a strategic area for many in the data ecosystem, those figures should quash them.

I delivered a presentation at the end of the day that I’ll share here. Entitled 5 Data Trends You Should Know, the presentation covers the major trends we observe in the data world. Here’s a quick narrative of the talk.

There is a mega-trend underpinning the changes in data design philosophy and tooling: the rise of the data engineer. Data engineers are the people who move, shape, and transform data from the source to the tools that extract insight. We believe data engineers are the change agents in a decade-long process that will revolutionize data.

Data systems used to be purchased by IT. But in the last 20 years, individual departments started to purchase their own data systems. Each team uses its data systems to develop proprietary data products: analyses, dashboards, machine learning systems, even new product features.

Data systems rely on data from other teams. So all of these teams share data. And just like that, the company has built a data mesh: a network of producers and consumers of data who share data via standard APIs or open-source formats like Apache Arrow & Parquet. When the data is stored in the cloud, we call it a cloud data lake.

Data engineers stand on the shoulders of 70 years of software development experience and take many of the learnings from that discipline. One example is developing a data engineering lifecycle. This is our current understanding of a typical data engineering software development lifecycle.

There are six steps:

  1. Ingesting data from the systems that produce it and writing it into open formats in the cloud
  2. Planning the software to build
  3. Querying data using a compute engine which runs across the cloud data lake
  4. Modeling the data to ensure there is one centralized definition of every metric with an owner, a lineage, and a status
  5. Developing the data product, which could be analyses, BI reports, machine learning models, or production features
  6. Monitoring and testing the data to ensure data consistency & integrity over time
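To make step 4 concrete, here is a minimal sketch of a metric registry in which each metric is defined exactly once and carries an owner, a lineage, and a status. The class names, the metric name, the table name, and the SQL are all hypothetical and chosen for illustration; this is not how any particular modeling tool works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """One centralized definition of a metric (all field values are illustrative)."""
    name: str
    owner: str       # team accountable for the definition
    status: str      # e.g. "draft" or "certified" (hypothetical statuses)
    lineage: tuple   # upstream tables the metric is computed from
    sql: str         # the single agreed-upon definition


class MetricRegistry:
    """Holds metric definitions and rejects a second definition of the same name."""

    def __init__(self):
        self._metrics = {}

    def register(self, metric):
        if metric.name in self._metrics:
            raise ValueError(f"metric {metric.name!r} is already defined")
        self._metrics[metric.name] = metric

    def get(self, name):
        return self._metrics[name]


registry = MetricRegistry()
registry.register(MetricDefinition(
    name="active_users",
    owner="growth-team",
    status="certified",
    lineage=("events.logins",),
    sql="SELECT COUNT(DISTINCT user_id) FROM events.logins",
))
```

Any team that needs the metric calls `registry.get("active_users")` rather than rewriting the query, and an attempt to register a competing definition under the same name raises an error.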

As the profession of data engineering matures, engineers need new tools to help them with each step in the process. The five trends that we are observing within the data world are the rise of those tools at each step. Here are those 5:

  1. New data pipelines that use modern computer languages to create reusable abstractions for data processing, to monitor data pipelines, and to visualize the flow of data, the DAG (directed acyclic graph). Innovators here are Dagster, Airflow, and Prefect.
  2. Compute engines query data in the cloud without having to move it. They leverage the separation of data and compute to accelerate queries, enable secure and compliant access, and future-proof the infrastructure against advances in tools and use cases that haven’t been built yet. Innovators are Dremio and Databricks.
  3. Data modeling curates a data catalog for all the metrics within a company. When metrics are modeled, they are defined once, accurately, and everyone uses that definition. Innovators are Transform Data and Looker (with LookML).
  4. Data products are analyses, experiments, reports, and machine learning models/products built on data. Innovators in this category include Preset, Streamlit, and Tecton among others.
  5. Data quality tools monitor data streams, identify anomalies, and create testing harnesses to ensure data is always accurate. Data quality innovators include MonteCarlo, SodaData, Great Expectations, and Data Gravity.
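As a rough illustration of trends 1 and 5, the sketch below models a pipeline as a DAG of tasks, runs them in dependency order, and includes a simple quality check as one of the tasks. It uses only Python's standard library (`graphlib`, available from Python 3.9) rather than the API of Dagster, Airflow, or Prefect, and the task names and sample rows are invented for the example.

```python
from graphlib import TopologicalSorter


def ingest():
    # Hypothetical source rows; a real pipeline would read from a source system.
    return [
        {"user_id": 1, "amount": 20.0},
        {"user_id": 2, "amount": None},   # anomaly: missing amount
        {"user_id": 3, "amount": 5.5},
    ]


def check_quality(rows):
    # Separate rows that pass a basic completeness check from those that fail.
    clean = [r for r in rows if r["amount"] is not None]
    rejected = [r for r in rows if r["amount"] is None]
    return {"clean": clean, "rejected": rejected}


def total_revenue(checked):
    # Downstream data product: computed only from rows that passed the check.
    return sum(r["amount"] for r in checked["clean"])


# Each task maps to (function, names of the tasks it depends on).
TASKS = {
    "ingest": (ingest, []),
    "check_quality": (check_quality, ["ingest"]),
    "total_revenue": (total_revenue, ["check_quality"]),
}


def run_pipeline(tasks):
    """Run every task after its dependencies and return (order, results)."""
    graph = {name: set(deps) for name, (_, deps) in tasks.items()}
    results, order = {}, []
    for name in TopologicalSorter(graph).static_order():
        func, deps = tasks[name]
        results[name] = func(*(results[d] for d in deps))
        order.append(name)
    return order, results


order, results = run_pipeline(TASKS)
```

Because the dependencies are declared as data, the same structure can be used to visualize the flow or to rerun only the tasks downstream of a failure, which is the kind of abstraction the orchestration tools provide at scale.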

All the tools need to be synthesized to achieve the vision of a modern data mesh, and data engineers will pioneer that change.


Copyright © 2020 Tomasz Tunguz, All rights reserved.
You signed up to receive Ex Post Facto blog posts by submitting your email on tomtunguz.com
