The Analytics Engineering Podcast (dbt Labs, Inc.)

Explore all episodes of The Analytics Engineering Podcast

Dive into the complete list of episodes of The Analytics Engineering Podcast. Each episode is cataloged with a detailed description, making it easy to search for and explore specific topics. Follow every episode of your favorite podcast and never miss relevant content.

Date | Title | Duration
22 Apr 2022 | Automating Away Your Work w/ Configuration-as-Code (w/ Sarah Krasnik) | 00:43:52

Most recently leading a data engineering team at Perpay, Sarah has built and managed data platforms end to end by working closely with internal engineering, product, and operational teams. She recently left her role to pursue a wide variety of endeavors, including writing on her Substack (https://sarahsnewsletter.substack.com/).

In this conversation with Tristan and Julia, Sarah dives into how configuration-as-code can automate away data work, why you might want to consider adding a data lake to your architecture, and how those looking to build a self-serve data culture can look to self-serve frozen yogurt shops for inspiration.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

09 Jun 2024 | The rapid experimentation of AI agents (w/ Yohei Nakajima) | 00:45:55

Yohei Nakajima is an investor by day and coder by night. In particular, one of his projects, an AI agent framework called BabyAGI that creates a plan-execute loop, got a ton of attention in the past year.

The truth is that AI agents are an extremely experimental space, and depending on how strict you want to be with your definition, there aren't a lot of production use cases today. 

Yohei discusses the current state of AI agents and where they might take us. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

21 Apr 2023 | It's 2023, and Privacy Is Now Fun! (w/ Ian Coe of Tonic.ai + Abhishek Bhowmick of Samooha) | 00:47:39

Advances in ML have transformed data privacy from a regulatory necessity into an opportunity to improve the work of data people.

Synthetic data for modeling + testing is one example of a hard thing that's now easy - and in this conversation with Tristan and Julia, Ian + Abhishek cover many other ways that privacy can actually be a skill that propels your work forward, rather than a mere legal best practice.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

25 Feb 2022 | Ashley Sherwood (AE @ Hubspot): Permissionless Innovation for Data Teams | 00:45:33

Ashley is a Principal Analytics Engineer at Hubspot, and has helped lead their implementation of dbt.

Ashley makes unique connections in her writing and work. On her Substack, "syntax error at or near ❤️," Ashley might be found comparing growing companies to butterflies, or going deep on how to accommodate sensitive people in the workplace.

In this conversation with Tristan & Julia, Ashley dives into the nuts and bolts of her trajectory pushing data innovation forward at Hubspot.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

08 Dec 2023 | Data Mesh Architecture at Large Enterprises (w/ Moritz Heimpel and Ben Flusberg) | 00:46:07

Moritz Heimpel from Siemens and Ben Flusberg from Cox Automotive have very similar jobs. They both act as stewards of the data strategies at large, complex companies.

In this episode, we get into what it’s like to collaborate with data at scale. Ben and Moritz share their experiences adopting a data mesh architecture and what that looks like at their organizations.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

11 Mar 2022 | One Database to Rule All Workloads? With Jon "Natty" Natkins of dbt Labs | 00:36:24

Will the dream of a mythical database to handle all workloads (transactional + analytical) ever become a reality, or does it violate the laws of physics?

This question sparked a hearty debate internally at dbt Labs, and Jon "Natty" Natkins joins Julia here to continue the conversation.

Natty knows databases, and this episode will take you on a historical romp through the rise and fall of Hadoop, the transition to cloud data warehouses, and what's waiting for us next in database-land.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

10 Mar 2023 | dbt Labs + Transform Join Forces on Metrics (w/ Nick Handel + Drew Banin) | 00:43:09

Nick Handel, as co-founder at Transform, helped develop the popular open source metrics framework MetricFlow. Drew Banin, a co-founder at dbt Labs, helped build the initial version of the dbt Semantic Layer, which launched last year.  

Transform was acquired in February by dbt Labs, and in this conversation with Tristan, they talk through their collective plans for the future of the dbt Semantic Layer.

For full show notes and to read 7+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

15 Jul 2022 | Data Activation Everywhere (w/ Julie Beynon of Clearbit) | 00:43:30

As Head of Analytics at Clearbit, Julie serves as a data team of one in a 200+ person company (wow!).

In this conversation with Tristan and Julia, Julie dives into how she's helped Clearbit implement data activation throughout the business, and realize the glorious dream of self-serve analytics.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

06 Jan 2023 | What Does Apache Arrow Unlock for Analytics? (w/ Wes McKinney) | 00:47:08

Wes McKinney is the creator of pandas, co-creator of Apache Arrow, and now Co-founder/CTO at Voltron Data.

In this conversation with Tristan and Julia, Wes takes us on a tour of the underlying guts, from hardware to data formats, of the data ecosystem.

What innovations, down to the hardware level, will stack to lead to significantly better performance for analytics workloads in the coming years?

To dig deeper on the Apache Arrow ecosystem, check out replays from their recent conference at https://thedatathread.com.

For full show notes and to read 7+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

29 Jul 2022 | Katie Bauer: Data Scientists Are Not Pizza | 00:43:25

Katie was a founding member of Reddit's data science team. Currently, as Twitter’s Data Science Manager, she leads the company’s infrastructure data science and analytics organization.

In this conversation with Tristan and Julia, Katie explores how, as a manager, to help data people (especially those new to the field!) do their best work.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

18 Nov 2021 | David Jayatillake: Should Great Data People Become Managers or Not? | 00:41:25

David is Sr. Director of Data at Lyst, and as leader of their analytics + data science teams he has followed the evolution of data roles closely over the past decade.

David spends a lot of time thinking about career progression + data team structure, and in this conversation with Tristan + Julia they dive into the classic individual contributor vs manager conundrum, migrating between warehouses, and reactive vs proactive data workflows.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. 

The Analytics Engineering Podcast is sponsored by dbt Labs.

25 Mar 2022 | The Bundling vs Unbundling Debate w/ Tristan, Benn Stancil and David Jayatillake | 00:43:29

A debate has erupted on data Twitter and data Substack - should the modern data stack remain unbundled, or should it consolidate?

In this conversation, Benn Stancil (Mode), David Jayatillake (Avora) and our host Tristan Handy try to make some sense of this debate, and play with various future scenarios for the modern data stack. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

24 Feb 2023 | What Can Generative AI Do for Data People? (W/ Sarah Nagy + Chris Aberger) | 00:48:29

Sarah and Chris are both at the forefront of bringing the promise of gen AI to our actual work as data people—which is a unique challenge!  Precise truth is critical for business questions in a way that it’s not for a consumer search query.

Sarah Nagy is the CEO of Seek AI, a startup that aims to use natural language processing to change how professionals work with data.

Chris Aberger currently leads Numbers Station AI, a startup focused on data-intensive workflow automation.

In this conversation with Tristan and Julia, they dive into what this future might actually look like, and tangibly what we can expect from gen AI in the short/medium term.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

11 Aug 2023 | Ramp's $8 Billion Data Strategy (W/ Ian Macomber and Ryan Delgado) | 00:49:19

Ian Macomber, head of analytics engineering and data science at Ramp and formerly the VP of analytics and data engineering at Drizly, and Ryan Delgado, a staff software engineer at Ramp, have played pivotal roles in establishing Ramp's data team from the ground up and are spearheading the development of their comprehensive roadmap.

In this conversation with Tristan and Julia, Ian and Ryan share insights on how Ramp's data team transformed unstructured data from contracts into valuable insights to enable faster decision-making. The $8 billion company values speed and empowers teams to build, ship, and measure products quickly. Ian and Ryan also talked about their approach to adopting new tech and elevating data as an equal player alongside product engineering and design.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

04 Nov 2022 | How Does Data Drive Growth in Practice? (w/ Abhi Sivasailam) | 00:49:53

Abhi is a growth and data leader, and an excellent Twitter follow. Most recently, he was Head of Growth and Analytics at Flexport, where he helped the company to grow 10x over the past 3 years. Previously, Abhi led growth and data teams at Keap, Hustle, and Honeybook.

In this conversation with Tristan and Julia, Abhi explains his methodology for setting up a new growth data organization, and how you might be falling victim to the dreaded "arbitrary uniqueness" bug.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

10 Dec 2021 | [COALESCE] Down With "Data Science" w/ Emilie Schario of Amplify Partners | 00:45:55

Your company has one definition for revenue across the organization, one definition of the customer, and one definition of sign-up. For people whose jobs are so defined by ensuring we’re aligned, we can’t seem to standardize on one definition for the Data Scientist.

In this talk, Emilie Schario (Data Strategist-in-Residence at Amplify Partners and longtime dbt community member) proposes we lobby against the title Data Scientist, instead choosing some variation of the Core Four Data Roles: Data Analyst, Analytics Engineer, Data Engineer, and Machine Learning Engineer.

Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com.

The Analytics Engineering Podcast is brought to you by dbt Labs.

25 Feb 2024 | The End of the Modern Data Stack (w/ Benn Stancil, Mode) | 00:45:46

Benn Stancil, cofounder and CTO at Mode, returns to The Analytics Engineering Podcast to discuss the evolution of the term "modern data stack" and its value today. Tristan wrote about this idea for The Analytics Engineering Roundup in "Is the Modern Data Stack Still a Useful Idea?"

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

12 Aug 2021 | Meet Co-Host Julia Schottenstein | 00:32:22

In this episode, we're going to do something a little different, and turn the spotlight on co-host Julia Schottenstein.

In this conversation with Tristan, you'll get to know Julia a bit—from her early childhood ambitions of becoming a "computer tycoon" (adorable!), to working in venture at NEA and now as a Product Manager at dbt Labs.

They also dive into Julia's opinions on key trends shaping the future of the data industry (the phrase oligopoly makes an appearance).

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

23 Sep 2021 | Brittany Bennett: Training the Next Generation of ‘Data for Good’ Practitioners @ Sunrise Movement | 00:38:34

Brittany Bennett is Data Director at Sunrise Movement, the youth climate movement that numbers tens of thousands of members throughout every US state. 

Given how quickly our industry moves, developing junior data talent is hard, but Brittany’s team at Sunrise makes it look easy. And that’s no accident—because Sunrise hires for mission alignment rather than technical background, they dedicate significant resources to training + mentorship.

In this conversation, Tristan, Julia & Brittany dive deep into the opportunity of developing junior data practitioners.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

22 Sep 2024 | Creating value from GenAI in the enterprise (w/ Nisha Paliwal) | 00:45:20

Nisha Paliwal, who leads enterprise data tech at Capital One, joins Tristan to discuss building a strong data culture in the world of AI. She is the co-author of the book Secrets of AI Value Creation.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

09 Dec 2021 | [COALESCE] The Modern Data Experience w/ Benn Stancil of Mode | 00:30:29

In this talk, former podcast guest Benn Stancil walks through what he believes the next evolution of the modern data stack should look like - and more importantly, how those who use it should experience it.

Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com.

The Analytics Engineering Podcast is brought to you by dbt Labs.

26 May 2024 | Funnel analytics and AI models for event sequences (w/ Misha Panko) | 00:44:09

Misha Panko has worked in data for a long time, including on high performance data teams at Uber and Google. Today, Misha is the co-founder and CEO of Motif Analytics, a product focused on helping growth and ops teams understand their event data.

In this episode, Tristan and Misha nerd out about the state of the art in computational neuroscience, where Misha got his PhD. They then go deep into event stream data and how it differs from classical fact and dimension data, and why it needs different analytical tools.

Make sure to check out the back half of the episode, where they dive into AI and how Motif is applying breakthroughs in language modeling to train foundation models of event sequences—check out his team’s blog post on their work.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

20 Oct 2023 | Career Growth in Data Roles (w/ Hubspot's Kasey Mazza at Coalesce 2023) | 00:29:17

In this conversation with Tristan recorded at Coalesce 2023, Kasey Mazza, an analytics engineering manager on the RevOps team at HubSpot, discusses the roles of data analysts and analytics engineers, the importance of building internal data communities, and the evolving landscape of data teams. 

Watch Kasey’s Coalesce 2023 presentation, "The career growth software development lifecycle."

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

08 Apr 2022 | The Hard Problems™️ of Data Observability w/ Kevin Hu of Metaplane | 00:43:10

As a PhD candidate at MIT, Kevin (and friends) published Sherlock, a data type detection engine (a surprisingly bedeviling problem) for data cleaning + data discovery.

Now as co-founder and CEO of Metaplane, a data observability startup, Kevin applies these same automated data discovery methods to help data teams keep their data healthy.

In this conversation with Tristan & Julia, Kevin wins the coveted award for “most crystal-clear explanations of complex technical concepts through physics analogy.”  

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

06 Oct 2023 | Operationalizing Your Warehouse, Streaming Analytics, and Cereal (W/ Arjun Narayan of Materialize and Nathan Bean of General Mills) | 00:42:23

It turns out data plays a big role in getting cereal manufactured and delivered so you can enjoy your Cheerios reliably for breakfast. We talk with Arjun Narayan, CEO of Materialize, a company building an operational warehouse, and Nathan Bean, a data leader at General Mills responsible for all of the company's manufacturing analytics and insights. 

We discuss Materialize’s founding story, how streaming technology has matured, and how exactly companies are leveraging their warehouse to operationalize their business—in this case, at one of the largest consumer product companies in the United States. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

17 Jun 2022 | Making Sense of the Last 2 Years in Data | 00:47:13

Matt Bornstein and Jennifer Li (and their co-author Martin Casado) of a16z have compiled arguably the most nuanced diagram of the data ecosystem ever made. 

They recently refreshed their classic 2020 post, "Emerging Architectures for Modern Data Infrastructure" and in this conversation, Tristan attempts to pin down: what does all of this innovation in tooling mean for data people + the work we're capable of doing? When will the glorious future come to our laptops?

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

17 Dec 2021 | Tristan in the Hot Seat | 00:38:49

In this very special episode, we’ll be turning the spotlight on co-host Tristan Handy, the CEO & Co-founder of dbt Labs.

In this AMA with Julia, you’ll get to know more about Tristan as a human, as a writer, and as the CEO of dbt Labs helping to push the analytics engineering practice forward. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. 

08 Sep 2024 | Developer productivity on GitHub Copilot (w/ Eirini Kalliamvakou) | 00:53:59

Dr. Eirini Kalliamvakou is a senior researcher at GitHub Next. Eirini has built a career on studying software engineers, how to measure their productivity, how developer experience impacts productivity, and more.

Recently, Eirini has been working on quantifying the impacts of GitHub Copilot. Does it actually help software engineers be more productive? Tristan and Eirini explore how to quantify developer productivity in the first place and, finally, arrive at whether or not Copilot makes a difference. In the search for real business value, this research is a real bellwether of things to come.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

Join data practitioners and data leaders this October in Las Vegas at Coalesce, the analytics engineering conference hosted by dbt Labs. Register now at coalesce.getdbt.com. Listeners of this show can use the code podcast20 for a 20% discount.

15 Jul 2021 | Venkat Venkataramani: The Future is Real-time | 00:45:20

Step with Venkat into a world where data is always fresh, queries run in 1ms, and analytics engineers build web-scale, real-time data apps.

As Engineering Director at Facebook, Venkat helped build the RocksDB real-time database that powered growth to 5 billion queries per second(!)—and now with his colleagues at Rockset, he's bringing that real-time database infrastructure to the rest of us.

In this conversation, Tristan, Julia and Venkat explore the fundamental technological advances that are empowering analytics engineers to enter the real-time future.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

06 May 2022 | What’s The Role Of AI in BI? | 00:44:43

Amit Prakash is Co-founder and CTO at ThoughtSpot. He has a deep background in search, having previously led the AdSense engineering team at Google and served on the early Bing team at Microsoft.

In this conversation with Tristan and Julia, Amit gets real about the promise of AI in data: which applications are being widely used today, and which are still a few years out?

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

16 Dec 2022 | Minimum Viable Experimentation | 00:45:37

Product experimentation is full of potholes for companies of any size, given the number of pieces (tooling, culture, process, persistence) that need to come together to be successful.

Vijaye Raji (currently Statsig, formerly Facebook + Microsoft) and Sean Taylor (currently Motif Analytics, formerly Facebook + Lyft) have navigated these failure modes, and are here to help you (hopefully) do the same.

This convo with Tristan + Julia is light on tooling + heavy on process: how to watch out for spillover effects in experiments, avoiding bias, how to run an experiment review, and why experiment throughput is a better indicator of success than individual experiment results.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

26 Aug 2021 | Erik Bernhardsson: The Missing Tool in the Data Team’s Toolbox | 00:42:07

Erik Bernhardsson spent six years at Spotify, where he contributed to the first version of the music recommendation system. After a stint as CTO at Better.com, he’s now working on building new infrastructure tooling for data teams.

In this wide-ranging conversation with Tristan & Julia, Erik dives into the nuts and bolts of Spotify’s recommendation algorithm, (paradoxically) why you should rarely need to use ML, and the fundamental infrastructure challenges that drag down the productivity of data teams.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

08 Dec 2024 | Making data movement as reliable as electricity (w/ Taylor Brown) | 00:46:40

Fivetran recently passed $300 million ARR and has over 7,000 customers globally. Taylor Brown, the cofounder and COO of Fivetran, joins the show to talk about Fivetran’s moat, the impact of AI on the data ingestion space, and open table formats and catalogs. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

18 Nov 2022 | Why You'll Need Data Contracts (w/ Chad Sanderson + Prukalpa) | 00:48:42

WARNING: This episode contains detailed discussion of data contracts.

The modern data stack introduces challenges in terms of collaboration between data producers and consumers. How might we solve them to ultimately build trust in data quality?

Chad Sanderson leads the data platform team at Convoy, a late-stage series-E freight technology startup. He manages everything from instrumentation and data ingestion to ETL, in addition to the metrics layer, experimentation software and ML. 

Prukalpa Sankar is a co-founder of Atlan, where she develops products that enable improved collaboration between diverse users like businesses, analysts, and engineers, creating higher efficiency and agility in data projects. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

03 Nov 2024 | The data jobs to be done (w/ Erik Bernhardsson) | 00:42:55

Erik Bernhardsson, the CEO and co-founder of Modal Labs, joins Tristan to talk about Gen AI, the lack of GPUs, the future of cloud computing, and egress fees. They also discuss whether the job title of data engineer is something we should want more or less of in the future. Erik’s not afraid of a spicy take, so this is a fun one. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

07 Apr 2024 | The 2024 Machine Learning, AI & Data Landscape (w/ Matt Turck) | 00:36:22

Matt Turck has been publishing his ecosystem map since 2012. It was first called the Big Data Landscape. Now it’s the Machine Learning, AI & Data (MAD) Landscape.

The 2024 MAD Landscape includes 2,011(!) logos, which Matt attributes first to a data infrastructure cycle and now to an ML/AI cycle. As Matt writes, “Those two waves are intimately related. A core idea of the MAD Landscape every year has been to show the symbiotic relationship between data infrastructure, analytics/BI, ML/AI, and applications.”

Matt and Tristan discuss themes in Matt's post: generative AI’s impact on data analytics, the modern AI stack compared to the modern data stack, and Databricks vs. Snowflake (plus Microsoft Fabric).

For full show notes and to read 7+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

03 Nov 2023 | Navigating AI Complexity (w/ Jonathan Frankle) | 00:46:20

Jonathan Frankle is the Chief Scientist at MosaicML, which was recently bought by Databricks for $1.3 billion. 

MosaicML helps customers train generative AI models on their data. Lots of companies are excited about gen AI, and the hope is that their company data and information will be what sets them apart from the competition. 

In this conversation with Tristan and Julia, Jonathan discusses a potential future where you can train specialized, purpose-built models, the future of MosaicML inside of Databricks, and the importance of responsible AI practices.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

26 Jan 2025 | Building a data team from the beginning (w/ Daniel Avancini) | 00:50:12

Daniel Avancini is the chief data officer and co-founder of Indicium—a fast-growing data consultancy started in Brazil. 

There are a lot of data consultancies around the world, and a lot of them do great work. What has been so fascinating about Indicium’s journey is their HR model. Rather than primarily hiring experienced professionals, they decided to go hard on training. They built a talent pipeline with courses and an internal onboarding process that takes new employees from zero to 60 over a few months.

The result has been phenomenal and Indicium delivers great client outcomes, but most importantly, they're building skills for hundreds of brand new data professionals.

Data is a hard field to break into because fundamentally you can't do the real thing unless you have access to data. So any company that is investing in building scalable hiring and training processes for analytical talent is one to be excited about.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

07 Dec 2021 | [COALESCE] You Don’t Need Another Database W/ Reynold Xin of Databricks and Drew Banin of dbt Labs | 00:30:09

Reynold Xin is a technical co-founder and Chief Architect at Databricks. He’s also a co-creator and the top contributor to the Apache Spark project.

In this casual conversation with Drew Banin, co-founder and Chief Product Officer at dbt Labs, the two will be discussing the data infrastructure trends they find most interesting.

Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com.

The Analytics Engineering Podcast is brought to you by dbt Labs.

22 Sep 2023 | Roche’s Data Transformation Journey (w/ Yannick Misteli) | 00:40:05

Yannick Misteli is the head of engineering for the go-to-market domain at Roche, a $250 billion multinational pharmaceutical and diagnostics company. 

Roche was an early supporter of dbt Cloud, and Yannick helped move his team of 120+ engineers to a modern data stack. He always finds a way to push the boundaries to make a large company founded in 1896 incredibly modern and innovative. We wanted to know more about the "how" of the work—the people, process, and technology. 

Read more about Roche's data journey here: https://docs.getdbt.com/blog/dbt-squared

17 Nov 2024 | Data as an assembly line (w/ Cedric Chin) | 00:51:11

Cedric Chin runs Commoncog—a publication about accelerating business expertise. He joins Tristan to talk about the analytics development lifecycle, how organizations value (or misvalue) data, and why “data teams are not some IT helpdesk to be ignored.”  

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

22 Dec 2024 | The intersection of UI, exploratory data analysis, and SQL (w/ Hamilton Ulmer) | 00:50:37

Hamilton Ulmer is working at the intersection of UI, Exploratory Data Analysis, and SQL at MotherDuck, and he's built a long career in EDA. Hamilton and Tristan dive deep into the history of exploratory data analysis. Even if you spend most of your time below the frontend layer of the stack, it is important to understand the trends in both the practice of data visualization and the technologies that underlie that practice.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

06 Oct 2024 | The current state of the AI ecosystem (w/ Julia Schottenstein) | 00:45:44

Former co-host Julia Schottenstein returns to the show to go deep into the world of LLMs. Julia joined LangChain as an early employee, in Tristan’s words, to “Basically solve all of the problems that aren't specifically in product and engineering.” LangChain has become one of the primary frameworks, if not the primary framework, for developing applications using large language models. There are over a million developers using LangChain today, building everything from prototypes to production AI applications.

24 Mar 2023 | Cloud Warehouse Cost Optimization (w/ Niall Woodward + Brad Culberson) | 00:45:54

Brad Culberson is a Principal Architect in the Field CTO’s office at Snowflake.

Niall Woodward is a co-founder of SELECT, a startup providing optimization and spend management software for Snowflake customers.

In this conversation with Tristan and Julia, Brad and Niall discuss all things cost optimization: cloud vs on-prem, measuring ROI, and tactical ways to get more out of your budget.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

01 Jul 2022 | The Personal Data Warehouse (w/ Jordan Tigani of MotherDuck) | 00:51:39

Jordan Tigani is an expert in large-scale data processing, having spent a decade+ in the development and growth of BigQuery, and later SingleStore.

Today, Jordan and his team at MotherDuck are in the early days of working on commercial applications for the open source DuckDB OLAP database.

In this conversation with Tristan and Julia, Jordan dives into the origin story of BigQuery, why he thinks we should do away with the concept of working in files, and how truly performant “data apps” will require bringing data to an end user’s machine (rather than requiring them to query a warehouse directly).

12 Jul 2023 | The Arc of Data Innovation (w/ Bob Muglia, former CEO of Snowflake) | 00:47:59

Bob Muglia likely needs no introduction. The former CEO of Snowflake led the company during its early, transformational years after a long career at Microsoft and Juniper. 

Bob recently released the book The Datapreneurs about the arc of innovation in the data industry, starting with the first relational databases all the way to the present craze of LLMs and beyond.

In this conversation with Tristan and Julia, Bob shares insights into the future of data engineering and its potential business impact while offering a glimpse into his professional journey. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

07 Dec 2021 | [COALESCE] How big is this wave? Ft. Martin Casado of a16z | 00:44:40

The modern data stack is the third generation of data analysis products to come to prominence since the 1990s. The prior waves, data warehouse appliances and then Hadoop, were both big steps forward but ultimately failed to live up to their initial promise.

Is the modern data stack just another iteration in a long string of “trendy technologies” in data, waves that crash upon the shore but ultimately recede? Or is it somehow more permanent?

Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com.

The Analytics Engineering Podcast is brought to you by dbt Labs.

07 Oct 2021 | Seth Rosen: On Becoming a Full-stack Data Analyst | 00:38:51

Seth Rosen has broken data Twitter many times, and in his early-fatherhood sleep deprivation developed a wonderful Twitter persona as the battle-tested data analyst.

IRL, though, Seth is a serious data practitioner, and as Founder at the data consultancy HashPath he has helped dozens of companies get into the modern data stack + build public-facing data apps. Now, as the founder of TopCoat, he’s empowering analysts to build + publish those same public-facing data apps.

In this episode, Tristan, Julia & Seth graciously dive into spicy debates around data mesh + “dashboard factories”, and explore a future where data analysts become full-stack application developers.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

02 Dec 2022 | The Data Generalist's Vision Quest (LIVE w/ Stephen Bailey) | 00:26:37

The first LIVE IRL episode!  

Stephen Bailey, data engineer at Whatnot and writer of an incredibly entertaining data substack, joins Tristan for a follow-up conversation to Stephen’s Coalesce talk, “Excel at nothing: how to be an effective generalist.”

You can read Stephen’s writing at https://stkbailey.substack.com/.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. 

The Analytics Engineering Podcast is sponsored by dbt Labs.

25 Aug 2023 | Bring Your Own Data to LLMs (W/ Jerry Liu of LlamaIndex) | 00:42:53

Jerry Liu is the CEO and co-founder of LlamaIndex. LlamaIndex is an open-source framework that helps people prep their data for use with large language models in a process called retrieval augmented generation. LLMs are great decision engines, but in order for them to be useful for organizations, they need additional knowledge and context, and Jerry discusses how companies are bringing their data to tailor LLMs for their needs.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

09 Dec 2021 | [COALESCE] Data Analytics In A Snowflake World ft. Christian Kleinerman | 00:24:04

Where does Snowflake go from here? What meta trends and technologies play into that vision? How does that impact the world of data analytics?

Christian and Tristan have no shortage of opinions or ideas. This is your chance to hear some of them, live and unfiltered.

Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com.

The Analytics Engineering Podcast is brought to you by dbt Labs.

09 Dec 2021 | [COALESCE] Peeking Into the Future of Data Analytics w/ Julia | 00:45:06

How is the data landscape evolving? Which trends should you pay attention to, and which should you ignore?

In this panel, Julia Schottenstein (our fearless co-host and dbt Labs product manager) catches up with Sarah Catanzaro, Jennifer Li and Astasia Myers to dive into the trends playing out in our work.

Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com.

The Analytics Engineering Podcast is brought to you by dbt Labs.

07 Dec 2021 | [COALESCE] Scaling Knowledge > Scaling Bodies: Why dbt Labs is making the bet on a data literate organization (ft. Erica Louie of dbt Labs!) | 00:26:23

What is it like to build a data team for a company in the data space?

This talk is centered around how dbt Labs is building their data team. We will cover how our team is structured, how we operate and interact with the greater organization, and how we set expectations and responsibilities that are helping us become a self-service organization.

Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com.

The Analytics Engineering Podcast is brought to you by dbt Labs.

12 May 2024 | From Moneyball to Gen AI | 00:37:37

Eric Avidon is a journalist at TechTarget who's interviewed Tristan a few times, and now Tristan gets to flip the script and interview Eric. Eric is a veteran journalist who has covered everything from finance to the Boston Red Sox, but now he spends a lot of time with vendors in the data space and has a broad view of what's going on. Eric and Tristan discuss AI and analytics and how mature these features really are today, data quality and its importance, the AI strategies of Snowflake and Databricks, and a lot more. Plus, partway through you can hear Tristan reacting to a mild earthquake that hit the East Coast.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

29 Jul 2021 | Brian Amadio: The Practice of Experimentation @ Stitch Fix | 00:38:48

Brian Amadio is a Data Platform Engineer at Stitch Fix, where experimentation underpins everything they do across merchandising, planning, forecasting, operations and more. 

In this conversation with Tristan, Julia, and Brian, you’ll get into the weeds of executing multi-armed bandit experiments and learn how you can perform experiments even with limited data.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

08 Sep 2023 | The State of Databases Today (w/ Andy Pavlo) | 00:48:28

Andy Pavlo is a professor of databaseology (he says it's a made-up word) at Carnegie Mellon and currently on leave to build his own company—OtterTune, which uses AI to figure out the settings to get the best performance out of databases. He is one of the preeminent minds on databases and a die-hard relational database maximalist. We talk about the state of databases today, why there are so many specialized databases (and if we need so many), why tuning databases is so hard but important, and how the database landscape will evolve.

02 Dec 2021 | DeVaris Brown: Bringing Streaming Data to Analysts | 00:49:35

As a product leader at companies like Heroku and Zendesk, DeVaris specialized in building infrastructure-grade products. Currently, as the CEO of Meroxa, he enables teams to build real-time data infrastructure with the same ease as we now take for granted in batch.

In this romp of an episode, Tristan, Julia and DeVaris flow from his experience in tech mentorship, into the nuts and bolts of Change Data Capture (CDC), and how streaming data infrastructure can help data teams provide better end user experiences.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. 

The Analytics Engineering Podcast is sponsored by dbt Labs.

01 Jul 2021 | Robert Chang: Building the Minerva Metrics Store @ Airbnb | 00:37:30

Robert Chang is a product manager for the data platform at Airbnb, where he helped build and roll out Minerva, Airbnb's internal metrics store. They use Minerva to track over 12,000(!) metrics and 4,000(!) dimensions with consistency across the organization.

In this conversation with Tristan and Julia, Robert dives into why they built it, what it took to get it done—and crucially, what you should do if your company doesn't have the resources to build your own internal metrics store.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

17 Nov 2023 | Let's Talk About Data Vault (w/ Brandon Taylor and Michael Olschimke) | 00:44:04

If Data Vault is a new term for you, it’s a data modeling design pattern. We’re joined by Brandon Taylor, a senior data architect at Guild, and Michael Olschimke, who is the CEO of Scalefree, the consulting firm whose co-founder Dan Linstedt is credited as the designer of the Data Vault architecture.

In this conversation with Tristan and Julia, Michael and Brandon explore the Data Vault approach among data warehouse design methodologies. They discuss Data Vault’s adoption in Europe, its alignment with data mesh architecture, and the ongoing debate over Data Vault vs. Kimball methods. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

10 Feb 2023 | 3rd Party Data, Demystified | 00:45:26

Auren Hoffman currently serves as the CEO and Chief Historian at SafeGraph, a data-as-a-service company he founded, which provides primarily location data. 

In this conversation with Tristan and Julia, Auren shares how truly few companies are making use of 3rd-party datasets today, how opening up more datasets to public research could help us solve big problems, and a fun fact about Abraham Lincoln's (!) work in the industry. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. 

The Analytics Engineering Podcast is sponsored by dbt Labs.

20 May 2022 | "To Move, or Not to Move" (Data). That is the Question. | 00:40:21

Justin Borgman is the co-founder, Chairman and CEO of Starburst, and has spent almost a decade in senior executive roles building new businesses in the data warehousing and analytics space.

In this conversation with Tristan and Julia, Justin dives into the nuts and bolts of Trino, the open source distributed query engine, and explores how teams are adopting a data mesh architecture without making a mess. 

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

10 Mar 2024 | AI's Impact in the World of Structured Data Analytics (w/ Juan Sequeda, data.world) | 00:48:18

Juan Sequeda is a principal data scientist and head of the AI Lab at data.world, and is also the co-host of the fantastic data podcast Catalog and Cocktails. 

This episode tackles semantics, the semantic web, Juan’s research on how raw text-to-SQL performs versus text-to-semantic layer, and where we both believe AI will make an impact in the world of structured data analytics.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

09 Sep 2021 | Caitlin Colgrove (CTO @ Hex): Notebooks for the Rest of Us | 00:40:00

Caitlin Colgrove is Co-founder & CTO at Hex, a data workspace that allows teams to collaborate in both SQL and Python to publish interactive data apps.

In this conversation, Tristan, Julia and Caitlin dive into the possibilities that real-time collaborative notebooks unlock for data teams — what if our collaboration style looked more like Google Docs than a Git workflow?

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

21 Apr 2024 | Being Pro-Human in the AI Era | 00:50:09

Barry McCardel is the co-founder and CEO of Hex. Hex is an analytics tool that's structured around a notebook experience, but as you'll hear in the episode, goes well beyond the traditional notebook.

We're big fans of Hex at dbt Labs, and use it for a bunch of our internal data work. In this episode, Barry and Tristan discuss notebooks and data analysis, before zooming out to discuss the hype cycle of data science, how AI is different, the experience of building AI products, and how AI will impact data practitioners.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

28 Jul 2023 | dbt Labs on dbt (w/ Daniel Le) | 00:30:40

Daniel Le is the CFO at dbt Labs where he has built multiple teams. He is also the former head of FP&A and operations at Zoom, and he helped scale FP&A as the former finance director at Okta. 

In this conversation with Julia, Daniel shares his view as CFO on the challenges SaaS companies face and the importance of finance teams creating a holistic view of their business. Daniel gives advice to data leaders about how they can automate business processes with dbt Cloud and use self-service analytics to automate revenue recognition, generate consistent headcount analytics, and more to impact their organization. Read more about Daniel’s story here.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

03 Jun 2022 | Building an Open Source Company (w/ Aaron Katz of ClickHouse) | 00:38:44

ClickHouse, the lightning-fast open source OLAP database, was initially released in 2016 as an open source project out of Yandex, the Russian search giant.

In 2021, Aaron Katz helped form a group to spin it out of Yandex as an independent company, dedicated to the development + commercialization of the open source project.

In this conversation with Tristan and Julia, Aaron gets into why he believes open source, independent software companies are the future. And of course, this conversation wouldn't be complete without a riff on the classic "one database to rule all workloads" thread.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

20 Oct 2024 | Coalesce 2024 edition: What’s next for data teams? (w/ Scott Breitenother) | 00:44:23

Scott Breitenother, founder of data consultancy Brooklyn Data Co., joins Tristan at Coalesce 2024 in Las Vegas to discuss the early days of dbt, the evolution of data teams, and what's next for the dbt community.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

27 Jan 2023 | A Romp Through Database History (w/ Postgres co-creator Mike Stonebraker + Andy Palmer) | 00:47:44

Mike Stonebraker is a veritable database pioneer and a Turing Award recipient. In addition to teaching at MIT, he is a serial entrepreneur and co-creator of Postgres.

Andy Palmer is a veteran business leader who serves as the CEO of Tamr, a company he co-founded with Mike. Through his seed fund Koa Labs, Andy has helped found and/or fund numerous innovative companies in diverse sectors, including health care, technology, and the life sciences. 

In this conversation with Tristan and Julia, Mike and Andy take us through the evolution of database technology over 5+ decades. They share unique insights into relational databases, the switch from row-based to columnar databases, and some of the patterns of database adoption they see repeated over time.

For full show notes and to read 7+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

04 Nov 2021 | Julien Le Dem: Why Data Lineage Matters | 00:48:42

Julien has a unique history of building open frameworks that make data platforms interoperable. He’s contributed in various ways to Apache Arrow, Apache Iceberg, Apache Parquet, and Marquez, and is currently leading OpenLineage, an open framework for data lineage collection and analysis.

In this episode, Tristan & Julia dive into how open source projects grow to become standards, and why data lineage in particular is in need of an open standard.

They also cover some of the compelling use cases for this data lineage metadata, and where you might be able to deploy it in your work.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

12 Jan 2025 | Data engineering at Snowflake (w/ Rahul Jain) | 00:44:04

An inside look at the data work happening at a company making some of the most advanced technologies in the industry. Rahul Jain, data engineering manager at Snowflake, joins Tristan to discuss Iceberg, streaming, and all things Snowflake.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

07 Apr 2023 | Julia, Pedram Navid + Taylor Murphy Recap Data Council | 00:42:03

Julia just got back from Data Council in Austin, a conference organized by Pete Soderling, where lots of startups share what they're building, data practitioners go to learn in hands-on workshops, and of course investors go to spot the next big trend.

In this episode, Taylor Murphy (Head of Product & Data at Meltano) + Pedram Navid (Founder, West Marin Data) join Julia to recap the conference and have a bit of fun. They talked streaming, how the MDS is growing up, new SQL variants, and, of course, AI.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

24 Mar 2024 | How the Media Covers Gen AI (w/ Matthew Lynley, Supervised) | 00:48:15

Matthew Lynley is a bit of a hybrid. He's been a long-time journalist covering enterprise tech, currently in his fantastic AI and data newsletter Supervised, and he's also been a hands-on data practitioner. 

Matthew has covered the analytics tech stack, but this time Tristan turns the tables to get Matthew’s perspective on the rise of Gen AI as a topic in the popular press, what's going on in the space today, and where AI is headed.

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

The Analytics Engineering Podcast is sponsored by dbt Labs.

21 Oct 2021 | Benn Stancil: Friday Night (Data) Fights | 00:48:40

Benn is Chief Analytics Officer and a Co-founder at Mode Analytics, but you may know him from his Substack newsletter (benn.substack.com), where each Friday he dives into a semi-controversial topic (recent examples: “Is BI Dead?” and “BI is Dead”). 

In this episode, Benn, Tristan & Julia finally hash out some of these debates IRL: what *is* the modern data stack, why is the metrics layer important, and what’s the point of all of this?

For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com

The Analytics Engineering Podcast is sponsored by dbt Labs.

Deepen your understanding of The Analytics Engineering Podcast with My Podcast Data

At My Podcast Data, we strive to provide in-depth analysis grounded in concrete data. Whether you are an avid listener, a podcast creator, or an advertiser, the detailed statistics and analyses we offer can help you better understand the performance and trends of The Analytics Engineering Podcast. From episode frequency to shared links to RSS feed health, our goal is to give you the knowledge you need to stay up to date. Explore more shows and discover the data that drives the podcast industry forward.
© My Podcast Data