Thoughts from Gartner Data and Analytics Summit
The annual gathering of data and analytics gurus is always an occasion for elaborating buzzwords either coined or propagated by Gartner analysts. This year, the trendy subjects for data management are Data Fabric, Data Mesh, Lakehouse (in terms of architecture), Active Metadata Management, Data Observability and Augmented FinOps (in terms of capabilities).
Metadata Is Everywhere
No matter what architecture or architectures an enterprise chooses (for many organizations, it is not an either-or choice but a hybrid approach), data management in general and data governance in particular start with metadata. Metadata is everywhere.
The challenge is to ingest metadata intelligently and automatically from a wide range of technologies. Also, as Mark Beyer, Distinguished Vice President Analyst at Gartner, repeatedly points out, “code is metadata.” Can you scan the source code? If you run a mainframe, you want COBOL, JCL, IMS and PL/I covered. For AS/400 customers, RPG is critical. As many new applications are developed in Python and Java, support for those programming languages is imperative.
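To make “code is metadata” concrete, here is a minimal sketch of scanning Python source for technical metadata. It relies only on Python's standard `ast` module; the function name `scan_python_source` and the choice of what to collect (imports, function names, SQL-looking string literals) are assumptions made for illustration and do not describe how any particular product scans code.

```python
import ast

# Hypothetical helper: extracts a few pieces of technical metadata
# from Python source text without executing it.
SQL_PREFIXES = ("SELECT", "INSERT", "UPDATE", "DELETE")

def scan_python_source(source: str) -> dict:
    tree = ast.parse(source)
    imports, functions, sql_literals = set(), [], []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import pandas as pd" -> "pandas"
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # "from os import path" -> "os"
            imports.add(node.module)
        elif isinstance(node, ast.FunctionDef):
            functions.append(node.name)
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            # Crude heuristic: string literals that start like a SQL statement.
            if node.value.strip().upper().startswith(SQL_PREFIXES):
                sql_literals.append(node.value)
    return {
        "imports": sorted(imports),
        "functions": sorted(functions),
        "sql_literals": sql_literals,
    }
```

A real scanner would go much further (resolving table names inside the SQL, following calls across files, handling other languages), but even this small pass shows that source code carries information about which data an application touches.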
Connective Tissue
The next important step is connectivity, or “connective tissue,” as the Gartner analysts call it metaphorically. By definition, metadata of all types (business, technical, operational, and social) is woven into a data fabric.
Data Mesh establishes data domains with data products. However, you will need intra-domain connectivity to facilitate the creation and maintenance of data products. What is more, you also need inter-system connectivity, or an infrastructure as a service, to enable the consumption of data products.
Interconnectivity is a prerequisite to a Lakehouse implementation, since you need connectivity between different types of data (structured, semi-structured, and unstructured) and diverse workloads and technologies (SQL, analytics, data science, and machine learning). Such connectivity is an integral part of the metadata and governance layer of a Lakehouse. It is also vital in a hybrid cloud environment, which is commonplace in many large enterprises.
Metadata Management
Orion Governance’s Enterprise Information Intelligence Graph (EIIG) is a Metadata Management platform that offers fully automated and robust intra- and inter-system connectivity to support Data Fabric, Data Mesh and Lakehouse architectures. It can automatically ingest metadata from more than 60 technologies and intelligently build a self-defined data fabric from the bottom up.
EIIG supports data mesh deployment by offering capabilities such as capturing and maintaining domain knowledge and business semantics as plain sets of business terms and definitions, conceptual business models, or full-scale domain ontologies with logical relations, rules, and restrictions between business concepts. EIIG is able to do so by leveraging a supervised ML approach.
Enterprises can also leverage EIIG to assure the success of their Lakehouse solution with end-to-end data lineage no matter what technologies they incorporate in their implementation.
EIIG is not only a platform that supports the needs of all types of data management architectures; it also offers the aforementioned data management capabilities in one capacity or another.
Map of Data Landscape
With active metadata, EIIG creates a living and breathing map of your data landscape. You don’t just see how data is connected. You also get actual “traffic” information, or insight into the data, in near real-time. For example, EIIG records active (operational) metadata, including data quality information, as data moves from its origin to its destination through various systems. This augmented feature enables data citizens to identify and use data in the best way possible and promotes data trust.
EIIG offers data observability that starts with the capability of parsing the minutest details of metadata. By scanning the source code, EIIG is able to get the DNA of a dataset. Data transparency based on this level of granularity is the basis of a comprehensive data observability regime. In addition to comprehensiveness, EIIG also provides accurate and timely visibility.
Identify and Solve Problems Faster
To continue the map analogy: if you can visualize data flow with all necessary details in near real-time, you can identify issues and their root causes quickly. With intelligent alerts such as violations of SLAs, you can be more effective in troubleshooting problems. Combined with features like real-time impact analysis, EIIG’s data observability capability empowers enterprises to be more proactive in preventing problems from occurring in the first place.
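Impact analysis on a lineage map can be pictured as a graph traversal. The sketch below is a generic breadth-first walk over a hand-written lineage dictionary; the function name `downstream_impact` and the asset names are hypothetical, and this is not a description of EIIG's internals.

```python
from collections import deque

def downstream_impact(lineage: dict[str, list[str]], start: str) -> list[str]:
    """Return every asset reachable downstream of `start`, nearest first.

    `lineage` maps an asset to the assets that consume it directly.
    """
    seen = {start}
    impacted = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for consumer in lineage.get(node, []):
            if consumer not in seen:
                seen.add(consumer)
                impacted.append(consumer)
                queue.append(consumer)
    return impacted

# Hypothetical lineage: a source table feeds an ETL job, which feeds
# a warehouse dimension used by two BI reports.
lineage = {
    "crm.customers": ["etl.load_customers"],
    "etl.load_customers": ["dw.dim_customer"],
    "dw.dim_customer": ["bi.churn_report", "bi.sales_dashboard"],
}
```

Calling `downstream_impact(lineage, "crm.customers")` lists the ETL job, the warehouse dimension, and both reports, which is the set of assets a team would want to check before changing the source table.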
FinOps is closely linked with data observability and active metadata management. After all, cost optimization is one of the key outcomes that data management and data governance aim to achieve. EIIG can help enterprises observe cost and financial impact related to data.
In the case of cloud migration, for example, EIIG observes and automatically identifies duplicated data assets such as tables, ETL jobs, or BI reports so that enterprise customers can minimize data bloat. By doing this during the migration readiness assessment process, customers can both accelerate the actual migration and save cost.
With visibility into the value and popularity of data, organizations can more accurately decide which workloads to migrate first. With insight from this continuous observing and monitoring of data, CFOs can better predict the cost of cloud migration and budget it with the confidence that it will not overrun.
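One simple way to picture duplicate detection is to fingerprint each table's schema and group tables that share a fingerprint. The following is a minimal sketch under that assumption; the function names and sample tables are hypothetical, and matching on schema alone only yields candidates that would still need content or lineage checks before anything is retired.

```python
import hashlib
from collections import defaultdict

def schema_fingerprint(columns):
    """Hash a table schema, ignoring column order and letter case."""
    normalized = sorted((name.lower(), dtype.lower()) for name, dtype in columns)
    payload = "|".join(f"{name}:{dtype}" for name, dtype in normalized)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def find_duplicate_candidates(tables):
    """Group table names whose schemas produce the same fingerprint."""
    groups = defaultdict(list)
    for table_name, columns in tables.items():
        groups[schema_fingerprint(columns)].append(table_name)
    return [sorted(names) for names in groups.values() if len(names) > 1]

# Hypothetical catalog: table name -> list of (column name, data type).
tables = {
    "sales.orders": [("id", "INT"), ("amount", "DECIMAL")],
    "staging.orders_copy": [("AMOUNT", "decimal"), ("ID", "int")],
    "hr.employees": [("id", "INT"), ("name", "VARCHAR")],
}
```

Here `find_duplicate_candidates(tables)` groups `sales.orders` with `staging.orders_copy`, flagging the pair for review during a migration readiness assessment.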
Despite some wild claims, no vendor out there can fulfill all data management needs with one single solution. However, with EIIG, enterprise customers will have a foundational platform on which to build other capabilities and eventually reach their data management nirvana.
About the Author: Niu Bai, Ph.D. is the Head of Global Business Development at Orion Governance, Inc. Connect with Niu on LinkedIn.