Data Mesh is a new way of thinking about data based on a distributed data management architecture. The idea is to connect data owners, data producers, and data consumers to make data more accessible and available to business users. Data Mesh seems to be a promising data architecture, but how feasible is it in practice? In an insightful session at Worldwide AI Webinar, Patrick Klingler, the CDO Hub Lead at Mercedes-Benz, answered this question for us.
The whole session is now available to watch on demand on our website and YouTube channel, along with the rest of the conference!
Key Takeaways
The current situation at enterprises prevents data sharing
According to Patrick Klingler, in the past there was a strict separation between business and IT units. Even within IT, development teams and operations teams worked independently of each other, despite belonging to the same homogeneous organization.
Today, most enterprises have product teams with cross-functional, end-to-end skills and responsibilities. Teams are also moving from data warehouses to data lakes, driven by the growing importance of semi-structured and unstructured data and by the fact that data lakes can accelerate data sharing.
However, a few problems still stand in the way of scaling data sharing in enterprises:
- Lack of incentives for high-quality data provisioning, as data providers often fail to see the purpose of data sharing
- Decoupling of data providers and consumers
- Central data teams that become bottlenecks: they have hands-on experience with data but little domain knowledge, which limits their ability to process and analyze it
Data Mesh architecture is a new paradigm for enterprise data sharing
With Data Mesh, data are treated as products and everything is built on a self-service data infrastructure.
Should enterprises reach this paradigm, Patrick Klingler believes the benefits would be significant:
- Data provided in greater quantity and at higher quality
- Easier data consumption
- No central bottleneck, which enables better scalability
4 strategic pillars to transition towards Data Mesh
Vertical business architecture
Many businesses today still run monolithic systems with long backlogs. Patrick Klingler recommended splitting them into a verticalized business landscape of digital products. Becoming a domain-oriented product organization with end-to-end skills should be the goal of any company seeking substantial change.
Data product thinking
Patrick regarded this as the core of Data Mesh. Data product thinking means considering data as a type of digital product rather than a side effect.
Establish a data infrastructure platform
Self-service platforms that provide storage, computing, API gateways, and a data catalog, among other capabilities, are highly recommended.
Global standards and governance
Patrick Klingler suggested that business owners establish global governance and standards to make data more interoperable.
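To make this concrete, here is a minimal sketch of what a global standard might enforce in practice: a self-service platform could validate every data product's metadata against a common contract before publishing it. The field names and rules below are hypothetical illustrations, not taken from the talk.

```python
# Hypothetical global metadata standard: every data product must declare
# these fields so consumers across domains can interpret it uniformly.
REQUIRED_FIELDS = {"owner", "domain", "schema_version", "update_frequency"}


def validate_metadata(metadata: dict) -> list:
    """Return a list of violations; an empty list means the product is compliant."""
    missing = REQUIRED_FIELDS - metadata.keys()
    violations = [f"missing required field: {name}" for name in sorted(missing)]
    # An example interoperability rule: schema versions follow one shared convention.
    if "schema_version" in metadata and not str(metadata["schema_version"]).startswith("v"):
        violations.append("schema_version must start with 'v'")
    return violations
```

A platform-level check like this is what turns "global standards" from a policy document into something enforced automatically at publish time.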
A data product is an operationalized data asset
Patrick Klingler defined a data product as “a data asset which is optimized for consumption”. This means a data product should:
- be tailored to consumers’ demands
- consist of a set of data, metadata, and code for processing
- have a set of attached control policies
- be provisioned and managed by data product teams
- have life cycle management
- be registered on a data marketplace
He also detailed two approaches to identifying data products within organizations: a top-down approach and a bottom-up approach.
Clarifying Data Mesh myths
There are a few misconceptions about Data Mesh that Patrick Klingler has observed and would like to clarify.
He stated that Data Mesh is not a decentralized, uncontrolled data and platform architecture; not a technology that can simply be installed or outsourced; and not a one-time investment to give consumers technical access to raw data.
Rather, Data Mesh is decentralized data product responsibility based on common standards, and a cultural change in how data is treated that puts consumers’ demands at the center.