
Data Stewardship

Data Stewardship is the management of an organization’s data to provide business users with easily accessible, high-quality and relevant data.

1. Context

Data Stewards are the people in charge of representing business stakeholders’ interests while ensuring that enterprise data is of high quality and used effectively. They are usually subject-matter experts in their respective fields, and they understand their environments and the elements that compose them. Companies typically organize their data into data domains, and each Data Steward specializes in one or more of these domains, which makes them responsible and accountable for these data perimeters and in charge of animating them.

They play an important role in Data Governance, as they are the operational staff in charge of applying and developing the rules and processes designed and announced by the Data Governance team. As experts in their field and in their data domains, they know how to translate high-level guidelines and principles into practices that fit their domain.

They are also responsible and accountable for the level of documentation of data in their data domains. They understand the datasets, their content and meaning, their potential uses, and the benefits that analysis could bring to the business. They are therefore the people best suited to ensure that data is well documented, traced and understood by other entities in the organization.

Data Quality must also be one of their top priorities, shared with the local data actors in their data domains. One of their roles is to be the bridge between business users and technical data experts, so that data in their domain reaches a sufficient level of quality. Ensuring that data created or provided by their data domain meets good Data Quality standards must be one of their daily goals.

Data Stewards

Great Data Stewards are people with a broad range of skills who move seamlessly between business and technical topics. They must have strong business know-how to be the bridge between business, operations and data teams. They should also have a strong technical background, as they need to understand how their domain’s data is used across technologies and platforms. An understanding of analytics is also very important: they can be considered guardians of data usage, so they must understand the various ways data is used. Finally, strong interpersonal skills are important for them to build relationships and weave the web of people around their data usage.

2. Missions & roles of Data Stewards

Data Stewards have various roles and missions in an organization. Among them, ensuring Data Quality, Data Documentation and Data Governance are the most important. Each breaks down into multiple daily activities that make Data Stewards real actors of data in their respective data domains.

Ensuring Data Quality is a very complex task that can be led by Data Stewards. Data cleansing, data profiling and root cause analysis are technical means of achieving high-quality data. However, it is very important that Data Stewards have a close link with the local Data Engineers responsible for the data pipelines. Getting these two roles to work together is essential for a successful Data Quality program.
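
As an illustration of data profiling, here is a minimal Python sketch that computes a few basic quality metrics for each column of a small dataset. The sample records, the field names and the profile_column helper are all assumptions made up for this example; a real Data Quality program would typically rely on a dedicated tool and on rules agreed with the Data Engineers.

```python
from collections import Counter

# Hypothetical sample records; the field names are assumptions for illustration.
customers = [
    {"id": 1, "email": "a@example.com", "country": "FR"},
    {"id": 2, "email": None, "country": "FR"},
    {"id": 3, "email": "c@example.com", "country": ""},
    {"id": 3, "email": "c@example.com", "country": "DE"},
]


def profile_column(rows, column):
    """Return basic profiling metrics for one column of a list of dict records."""
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing values.
    present = [v for v in values if v is not None and v != ""]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "completeness": len(present) / len(values) if values else 0.0,
        "distinct": len(counts),
        # Values that appear more than once in the column.
        "duplicates": sorted(v for v, n in counts.items() if n > 1),
    }


profile = {col: profile_column(customers, col) for col in ("id", "email", "country")}

for col, metrics in profile.items():
    print(col, metrics)
```

A repeated value in a column that is supposed to be unique, such as id here, is the kind of signal a Data Steward could take to the Data Engineers as a starting point for root cause analysis.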

Documenting domain data is a key task for Data Stewards, and it also opens the door to collaboration with other domains. Maintaining a business glossary that incorporates user feedback, documenting data dictionaries and data definitions, and contributing to the Data Catalog are all very important tasks for Data Stewards who want to maintain high standards for the data in their data domain.
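
To make the idea of a data dictionary more concrete, the sketch below models dictionary entries in Python and lists the columns of a dataset that still lack documentation. The DictionaryEntry structure, the column names and the steward identifier are hypothetical; actual Data Catalog tools define their own models.

```python
from dataclasses import dataclass


@dataclass
class DictionaryEntry:
    column: str
    definition: str
    owner: str  # the Data Steward accountable for this column (hypothetical identifier)


# Hypothetical data dictionary for a "customers" dataset.
data_dictionary = {
    "customer_id": DictionaryEntry("customer_id", "Unique customer identifier", "sales-steward"),
    "country": DictionaryEntry("country", "", "sales-steward"),
}


def undocumented_columns(dataset_columns, dictionary):
    """List columns that have no dictionary entry or an empty definition."""
    return [
        col
        for col in dataset_columns
        if col not in dictionary or not dictionary[col].definition.strip()
    ]


gaps = undocumented_columns(["customer_id", "country", "signup_date"], data_dictionary)
print(gaps)
```

Running such a check regularly is one possible way for a Data Steward to track the level of documentation in a data domain.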

Enforcing operational Data Governance must be a daily concern for Data Stewards in their data domains. Data is shared, reused, transformed, visualized, analyzed, and so on. Having a clear overview of how data is used within an organization is the responsibility of the respective Data Stewards. Ensuring data security and protection, along with the application of rules and policies, must have their full attention.

Data Community

Beyond all these strict and demanding expectations, Data Stewards are also expected to animate their own data domains and to identify ways to improve the usage of data. They truly are interface points between domains and between business and technical teams, and in that way they contribute to a data community.

This post is licensed under CC BY 4.0 by the author.