Nextdata is emerging from stealth mode today with the launch of a new toolset that’s designed to help organizations decentralize analytical data at large scale and finally realize the concept of the “data mesh” and its promise of revolutionizing how information is consumed.
Nextdata founder and Chief Executive Zhamak Dehghani (pictured), who coined the term data mesh, announced the company today at the Supercloud2 event held by SiliconANGLE and its video studio theCUBE. The idea of a data mesh has emerged as one of the hottest topics in the data analytics world.
The concept grew from the realization that the way companies manage their data is wholly inadequate for the way that information is used. Over the last 20 years, enterprises have slowly but surely begun decentralizing their organizations and granting more authority to the individuals who handle their products and customers. However, the data that these people use to inform their decision-making has remained centralized, stuck in a traditional data warehouse architecture.
In a blog post on Medium, Dehghani says the problem with this approach of managing data for analytics, artificial intelligence and machine learning is that it was conceived “in a paradigm where data was treated like oil: as a precious resource to be extracted, pumped via pipelines to a storage facility, processed and consumed.” As a result, organizations face a series of challenges: fragile data pipelines that break, fragmented processes, proliferating silos and a lack of timely access to data.
“Data producers spend their time on low-value processes — mostly moving data between systems — without ever fully considering how data can and should be used,” Dehghani said. “Merely accessing and understanding the data in centralized lakes requires specialized expertise.”
Dehghani believes that this inefficient data architecture is holding back data innovation. She explains that at a societal level, the need to collect and control data creates an “imbalance of power” that seriously limits AI innovation to those few stakeholders that have the largest piles of data. The result is either disparate and disconnected data sources, or monopolized, stale data collections that help no one, she said.
The data mesh is an entirely new concept for handling data that aims to address these challenges. A simple way of looking at it is that a data mesh vests ownership of data in the people who create it. The creators are tasked with ensuring the quality and relevance of the data, and they’re responsible for sharing it with others across their organization who may want to use it. The data mesh applies an organization-wide set of definitions and governance standards to keep data consistent, while a metadata layer makes it searchable so others can find what they need.
According to Dehghani, a data mesh must have eight key attributes: The data must be discoverable, understandable, addressable, secure, interoperable, trustworthy, natively accessible and valuable on its own.
“The data mesh is about sharing data responsibly across boundaries,” Dehghani said during theCUBE interview. “Those boundaries include organizational boundaries, cloud technology boundaries and trust boundaries.”
The concept is clear enough, but until now, most enterprises have lacked the tooling required to get started in creating a data mesh, and this is where Dehghani and Nextdata want to help. The company’s flagship offering is the Nextdata OS, a “data mesh-native” toolset for decentralizing data at scale. At the core is the “data product container,” which is a new unit of data value that’s designed to be shared responsibly and used at scale.
It’s somewhat similar to the idea of software containers, which bundle code so that it can run on any computing platform. The data product container, by contrast, is filled with information, along with everything needed to make it usable, including the transformations, guarantees and policies that affect it.
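To make the analogy concrete, here is a minimal Python sketch of what a self-describing data product container might look like. It is purely illustrative: the class name `DataProductContainer`, its fields and its `read` method are assumptions made up for this example, not Nextdata's actual format or API. The sketch bundles the data with the transformations applied to it and the guarantees every served row must meet.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict[str, object]


@dataclass
class DataProductContainer:
    """Hypothetical bundle of data plus what is needed to use it."""

    name: str
    owner: str  # the team that produces the data also owns it
    rows: list[Row]
    # Transformations applied, in order, before a row is served.
    transformations: list[Callable[[Row], Row]] = field(default_factory=list)
    # Guarantees: named checks that every served row must satisfy.
    guarantees: dict[str, Callable[[Row], bool]] = field(default_factory=dict)

    def read(self) -> list[Row]:
        """Return transformed rows, failing loudly if a guarantee is broken."""
        served = []
        for row in self.rows:
            for transform in self.transformations:
                row = transform(row)
            for label, check in self.guarantees.items():
                if not check(row):
                    raise ValueError(f"{self.name}: guarantee '{label}' violated")
            served.append(row)
        return served


# Illustrative data product owned by a hypothetical sales team.
orders = DataProductContainer(
    name="orders",
    owner="sales-team",
    rows=[{"id": 1, "amount_cents": 1250}, {"id": 2, "amount_cents": 800}],
    transformations=[lambda r: {**r, "amount": r["amount_cents"] / 100}],
    guarantees={"non_negative_amount": lambda r: r["amount"] >= 0},
)

print(orders.read())
```

The design choice worth noting is that consumers only ever call `read`, so the transformations and guarantees travel with the data instead of living in a separate pipeline.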
Dehghani explained that Nextdata is the result of everything she has learned from attempting to implement a data mesh from scratch at a number of different organizations. “A lot of organizations I’ve worked with want decentralized data, so they really embrace the idea of decentralized data ownership,” she said. “But they also want interconnectivity through standard APIs, discoverability and governance. So we’re trying to find the common denominator that solves these problems and enables the developer experience for data sharing.”
This is why the other components of Nextdata’s architecture are all about making these data product containers useful. Dehghani said the company has developed analytical data product application programming interfaces that make data discoverable and interoperable. Nextdata’s APIs are an open standard that allows anyone to access its data products.
The startup has also created embedded computational policies, or data governance policies embedded into each data product container as code, to ensure that each one complies with a common set of rules and requirements. Finally, Nextdata introduces a set of dynamic data product discovery tools to ensure that users can find the data they need, understand it, trust it and explore its suitability for whatever task they have in mind.
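As a rough illustration of "policy as code" and metadata-driven discovery, the Python sketch below expresses one governance rule as an ordinary function and searches a small catalog by keyword. Everything in it is hypothetical: the `DataProduct` class, the `pii_must_be_masked` rule and the `discover` helper are invented for this example and do not reflect Nextdata's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical metadata record describing one data product."""

    name: str
    description: str
    # Column name -> classification, e.g. "pii" or "public".
    columns: dict[str, str]
    masked_columns: set[str] = field(default_factory=set)


def pii_must_be_masked(product: DataProduct) -> list[str]:
    """Policy: every column classified as PII must be masked."""
    return [
        f"{product.name}.{col} is PII but not masked"
        for col, classification in product.columns.items()
        if classification == "pii" and col not in product.masked_columns
    ]


# The common rule set that every data product is checked against.
POLICIES = [pii_must_be_masked]


def violations(product: DataProduct) -> list[str]:
    """Run all policies against one product and collect the failures."""
    return [msg for policy in POLICIES for msg in policy(product)]


def discover(catalog: list[DataProduct], keyword: str) -> list[str]:
    """Return names of products whose name or description mentions keyword."""
    keyword = keyword.lower()
    return [
        p.name
        for p in catalog
        if keyword in p.name.lower() or keyword in p.description.lower()
    ]


catalog = [
    DataProduct(
        name="customers",
        description="Customer profiles for marketing analytics",
        columns={"email": "pii", "segment": "public"},
    ),
    DataProduct(
        name="orders",
        description="Daily order totals per customer segment",
        columns={"segment": "public", "total": "public"},
    ),
]

for product in catalog:
    print(product.name, violations(product))
print(discover(catalog, "segment"))
```

Because the rule is plain code that ships alongside the product's metadata, the same check can run wherever the data product lives, which is the property the embedded-policy idea relies on.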
“We already have a very disparate and disconnected set of technologies that are very useful for when we thought about data and processing as a centralized problem,” Dehghani said. “But when you think about data as a decentralized problem, the cost of integrating these technologies in a cohesive developer experience is what’s missing. So we’re standardizing and codifying this idea of a data product container that encapsulates data computation APIs to get to it in a technology-agnostic way, or in an open way.”
Nextdata’s aim is to become the go-to platform for peer-to-peer data product value exchange. Using Nextdata OS, data developers will be able to create, connect, share and manage self-contained data products that can easily be discovered by the people who need to use them.
Dehghani envisages that Nextdata OS will sit on top of existing data technologies such as Snowflake, Databricks and the cloud. Companies have already invested millions of dollars in those platforms and aren’t going to want to throw that work away, so Nextdata OS aims to layer a cohesive, integrated experience over them in which the data product is a first-class primitive. “We’re trying to codify and create a new developer experience based on that,” she added.