Data Version Control: Why Git Isn’t Enough and What You Should Use Instead.

At a data meetup I attended recently, I overheard two people having a more-animated-than-expected conversation about data version control. Each had strong feelings about how best to track and manage changes to data files over time. One point they both agreed upon was that Git is a very useful tool, but maybe not optimal for…

Bridging the Gap Between Data Engineers and Data Scientists.

Not too long ago, I was working with an investment advisor who needed bond reference data. Her firm had recently brought on a bunch of new data scientists (a.k.a. quants in capital markets), and they were struggling to get their existing data engineering team in sync with their new colleagues. A common challenge for organizations…

The Cost of Bad Data: How Errors Ripple Through the Business.

I recently provided corporate actions data to an asset manager who had missed a voluntary event due to bad data. In this case, the bad data caused a monetary loss. Depending on your industry or the systems relying on the data, the impact of bad data can be much worse. So let’s explore how…

Why Data Quality Nightmares Happen – and How to Stop Them.

As you might imagine, much of the work I’m doing nowadays is with companies trying to leverage AI. Each company and use case is different, but a common issue is data quality. So let’s take a look at why data quality issues arise, and how to prevent them… Every data team has faced the frustration…

Data Pipeline Scaling: The Trade-Offs Between Speed and Reliability.

It’s a challenge humans have been struggling to overcome since the dawn of time: building a solution that is fast, but also reliable. In Formula 1 auto racing, teams strive to build the fastest car possible. However, if their car never makes it to the finish line due to technical failures, it doesn’t matter how…

The Hidden Struggles of Data Teams: Why Everyone Thinks It’s “Just Cleaning Data”

I recently had lunch with the founder of a business analytics company whom I mentor. Over some very tasty interior Mexican food, he vented about how his clients do not fully understand how hard it is to get data right every day. If I got a dollar for every time I heard this, I’d be having…

The Metadata Imperative: Feeding Agentic AI the Inputs It Actually Needs.

Agentic AI systems don’t just passively receive commands. They plan, reason, and act autonomously to accomplish goals. But even the most capable autonomous agent is only as smart as the data scaffolding around it. The difference between brittle execution and adaptive autonomy? Complete, accurate, and structured metadata. To operationalize agentic AI, particularly in commercial environments…

8 Best Practices to Optimize Your Data Workflows for Accuracy and Efficiency.

There are so many new (cool) things happening with data nowadays that it’s sometimes easy to overlook the basics. So I thought it would be helpful to look at some “best practices” for setting up your data workflows. Effective data workflows are essential for maintaining data quality, promoting collaboration, and driving reliable insights across an organization.

Top 5 Challenges Facing Data Engineers & Scientists.

I recently spoke with the head of data and analytics at an investment management firm about how 2025 was going so far for him and his peers in the industry. Maybe not surprisingly, he said it’s been challenging. Setting aside geopolitics and market conditions, he said that data engineers and data scientists throughout the industry are…

XtremeData and DIH Solutions work together to help firms onboard, clean, and understand their data

nLite metadata generator now available to capital markets participants via DIH. Schaumburg, Illinois, and Austin, Texas, 24 June 2025: XtremeData, a provider of next-generation advanced data profiling and quality solutions, today announced a collaboration with Data In Harmony (DIH), a global financial data provider, to bring its flagship product, nLite, to institutional market participants. nLite helps…
