Data Catalogs enable data-driven enterprises
The amount of data is constantly growing, as is the number of formats in which the data is available. Searching for and retrieving data thus becomes a real challenge. Most employees know the experience of looking for a market study or a sales report and not finding it where it is supposed to be in the company's information systems. Valuable time is then lost tracking down the right person to ask for help.
For any company that wants to use Big Data and engage in data analytics, it is essential to make datasets available in a transparent manner throughout the company. A data catalog empowers users to easily find, access and use data.
What is a Data Catalog?
The idea of a data catalog seems intuitive. It can be compared to a library that makes the desired data sets available to all users. In a broader sense, a data catalog can be seen as a platform that matches data supply and demand.
"A Data Catalog is an integrated platform for data curation, bringing data supply and demand together. It offers users functions to register data; retrieve and use data; and assess and analyze data. A Data Catalog, therefore, should provide a data inventory (for data supply) and features for data discovery (for data demand) as key components. Additional features should support data governance, data assessment, and data analytics, alongside with appropriate features for catalog administration and data collaboration."
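The two key components named in this definition, a data inventory for supply and data discovery for demand, can be illustrated with a minimal in-memory sketch. It is only an illustration under stated assumptions, not the reference model or any vendor's API: the class names DatasetEntry and DataCatalog, the metadata fields, and the example file paths are all hypothetical, and a real catalog would add governance, assessment, analytics, administration, and collaboration features on top.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Metadata for one registered dataset (field names are hypothetical)."""
    name: str
    owner: str
    location: str
    description: str = ""
    tags: set[str] = field(default_factory=set)


class DataCatalog:
    """Toy catalog: an inventory (data supply) plus keyword discovery (data demand)."""

    def __init__(self) -> None:
        self._inventory: dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        """Data supply: add a dataset to the inventory, rejecting duplicate names."""
        if entry.name in self._inventory:
            raise ValueError(f"dataset already registered: {entry.name}")
        self._inventory[entry.name] = entry

    def search(self, keyword: str) -> list[DatasetEntry]:
        """Data demand: return entries whose name, description, or tags contain the keyword."""
        needle = keyword.lower()
        return [
            entry
            for entry in self._inventory.values()
            if needle in entry.name.lower()
            or needle in entry.description.lower()
            or any(needle in tag.lower() for tag in entry.tags)
        ]


# Hypothetical example data: two datasets registered by different departments.
catalog = DataCatalog()
catalog.register(DatasetEntry(
    name="sales_report_2023",
    owner="sales",
    location="/data/sales/2023.csv",
    description="Quarterly sales figures",
    tags={"sales", "report"},
))
catalog.register(DatasetEntry(
    name="market_study_emea",
    owner="marketing",
    location="/data/marketing/emea_study.pdf",
    description="Market study for the EMEA region",
    tags={"market", "study"},
))

print([entry.name for entry in catalog.search("market")])  # ['market_study_emea']
```

Keeping registration and search behind one interface mirrors the idea of a single platform that brings supply and demand together: whoever publishes a dataset describes it once, and anyone looking for it can find it without knowing who owns it.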
How to implement a data catalog
In reality, however, it is challenging for companies to define what scope their data catalog should have and which users it should serve. Uncertainty also arises because the market for data catalog solutions is still quite dynamic: multiple vendors offer a growing number of tools that vary in scope and functionality.
CC CDQ Data Catalog Reference Model
Data catalogs have been a focus of research at the Competence Center Corporate Data Quality (CC CDQ), resulting in the CC CDQ Data Catalog Reference Model and a market study. These results help companies evaluate the data catalog solutions available on the market and select the one best suited to their specific requirements.
A total of 29 data catalog solutions were identified, 15 of which are covered by the detailed market analysis in this report. We found that a wide range of tools is marketed under the term "data catalog", covering different functional areas and user requirements.
The complete study is available free of charge to members of the CC CDQ on the CC Wiki and can be purchased from the Fraunhofer-Bookshop.