Data Products in a decentralized world

Photo by Markus Spiske on Unsplash

What the heck is a Data Product?

Why do I keep capitalizing Data Products and not data contracts? Perhaps it’s because I can see data contracts taking many forms, but I want to convey that Data Products are not just software products that happen to be data. Of course they are that as well, but they are also something more. A Data Product should meet certain standards of maturity in documentation, freshness, support, governance, monitoring, and so on. In short, Data Products should meet the same standards as other quality software products. Data consumers should be treated like proper customers, not as an ad hoc use case.
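To make those standards a little more concrete, here is a minimal sketch of how a Data Product could describe itself and report which standards it does not yet meet. Everything in it is an assumption for illustration: the `DataProduct` class, its field names, the `maturity_gaps` method, and the `orders_daily` example are hypothetical, not part of any particular tool or of our actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class DataProduct:
    """Hypothetical descriptor; every field name is an illustrative assumption."""
    name: str
    owner: str                 # team accountable for supporting consumers
    docs_url: str              # where consumers find the documentation
    freshness_sla: timedelta   # maximum acceptable age of the data
    last_updated: datetime     # when the data was last refreshed
    has_monitoring: bool       # whether failures raise an alert
    governance_reviewed: bool  # whether access and retention were reviewed

    def maturity_gaps(self, now: datetime) -> list[str]:
        """Return the standards this product currently fails to meet."""
        gaps = []
        if not self.docs_url:
            gaps.append("documentation")
        if now - self.last_updated > self.freshness_sla:
            gaps.append("freshness")
        if not self.owner:
            gaps.append("support")
        if not self.governance_reviewed:
            gaps.append("governance")
        if not self.has_monitoring:
            gaps.append("monitoring")
        return gaps


# Example: an undocumented product whose data is 30 hours old against a 24 hour SLA.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
orders = DataProduct(
    name="orders_daily",
    owner="commerce-team",
    docs_url="",
    freshness_sla=timedelta(hours=24),
    last_updated=datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc),
    has_monitoring=True,
    governance_reviewed=True,
)
print(orders.maturity_gaps(now))  # ['documentation', 'freshness']
```

The point of a check like this is not the specific fields but that the standards are explicit and testable, the same way we would gate the release of any other software product.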

  • Data Product owner — to help articulate the business value and drivers
  • Software engineer — to instrument the source system
  • Data engineer — to build the data pipeline
  • Machine learning engineer — to optimize the algorithm for the runtime environment
  • Data scientist — to build an algorithm on the data
  • UI Engineer or Data Analyst — to build visualizations
Photo by Ricardo Gomez Angel on Unsplash

Decentralized Data Products

We value decentralized decision making and the autonomy of each domain to implement the best solution. Each Data Product should exist wholly within its particular domain and be built by the group that owns that domain. The data pipeline, services, and visualizations that make up that Data Product should be cohesive and have a single owner. So how do we go about building these products without duplicating code and resources across our organization?



Andres March


bringing science and data together