Pinned · Published in Towards Data Science · 5 Non-Obvious Ways to Make Data Engineers Love Working for You · And how to become a better data leader in the process. · Jun 28, 2021
Data Observability: Reliability In The AI Era · For GenAI, data observability must prioritize resolution, pipeline efficiency, and streaming/vector infrastructures. · Dec 8, 2023
Putting Data Lineage In Context · There are different flavors of lineage, do you have the right tool for the job? · Jan 27, 2023
Why Data Cleaning is Failing Your ML Models — And What To Do About It · For ML model accuracy, data cleaning alone is insufficient. Messy data environments produce sloppy data science. Here’s why. · Oct 11, 2022
How To Make Data Anomaly Resolution Less Cartoonish · Fixing broken data doesn’t have to be a game of whack-a-mole. Here’s how to speed up your data incident resolution. · Sep 8, 2022
Data Observability First, Data Catalog Second. Here’s Why. · You can’t realize the full value of a data catalog without data observability. Here’s why. · Aug 11, 2022
Published in CodeX · Building An External Data Product Is Different. Trust Me. (but read this anyway) · Developing an external data product is different, and let’s face it harder, than serving internal customers. We dive into 5 key… · Jun 7, 2022
Building Spark Lineage For Data Lakes · Spark lineage has been a blindspot for the data engineering industry so we set off to engineer a solution. Here’s how we did it. · Jun 1, 2022
Published in Towards Data Science · What is a Data Reliability Engineer — And Do You Need One? · As data roles continue to specialize, a new role has emerged: the data reliability engineer. But what is it and does your team even need… · Mar 31, 2022
Published in Towards Data Science · Data Observability vs. Data Testing: Everything You Need to Know · You already test your data. Do you need data observability, too? · Feb 12, 2022