We are thinking about restructuring our data team by splitting it into Data Automation Engineers (primarily focused on pipeline development) and Data Custodians (responsible for business data transformations, data quality, and data governance). Has anyone designed their data team this way? If you have clearly defined roles and responsibilities, please share.

Director of Data, 7 months ago
What would be the business value of doing this (rhetorical question)? I think the two groups would need a bidirectional communication path to prevent breakage in flow, process, or compliance. Also, from an employee engagement and retention standpoint, it feels like that split would create a factory-worker environment. I know this is not what you asked for; just food for thought.
VP of Data in Banking, 7 months ago

This is exactly where my head is at with this proposal. I want the data team to be full stack: everything from understanding business needs and developing processes to governing the data, including data quality. So the solution would be to split based on how we structure our data lake: Source -> Raw -> Transformed -> Curated. One team owns the data in and from the source systems and lands it in Raw, covering all engineering and governance. The second team is responsible for the data from Transformed onward and its uses across the enterprise, ensuring that it too is well governed and has good data quality.
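
A minimal sketch of how that zone-based split could be written down, so that ownership and the hand-off point between the two teams are explicit. The team names `ingestion_team` and `consumption_team` are hypothetical placeholders, not names from this thread, and the split at Raw/Transformed follows the description above.

```python
from enum import Enum
from typing import List, Tuple


class Zone(Enum):
    # Data lake zones in flow order: Source -> Raw -> Transformed -> Curated
    SOURCE = "source"
    RAW = "raw"
    TRANSFORMED = "transformed"
    CURATED = "curated"


# Hypothetical team names; each team owns engineering AND governance for its zones.
OWNERSHIP = {
    Zone.SOURCE: "ingestion_team",
    Zone.RAW: "ingestion_team",
    Zone.TRANSFORMED: "consumption_team",
    Zone.CURATED: "consumption_team",
}


def owner_of(zone: Zone) -> str:
    """Return the team accountable for a zone."""
    return OWNERSHIP[zone]


def handoff_points() -> List[Tuple[Zone, Zone]]:
    """Return adjacent zone pairs where ownership changes between teams."""
    zones = list(Zone)  # Enum members iterate in definition order
    return [(a, b) for a, b in zip(zones, zones[1:]) if OWNERSHIP[a] != OWNERSHIP[b]]
```

With this mapping, `handoff_points()` yields a single boundary, Raw to Transformed, which is where the two teams would need an agreed contract on schema and data quality.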

Director of Data Platforms in Consumer Goods, 7 months ago
Yes, we have a Data Engineering practice that is separate from the Data Governance team.   

Build data pipelines: Managed data pipelines consist of a series of stages through which data flows (i.e. from endpoints of data acquisition to integration to consumption for analytics). Architecting, creating, and maintaining data pipelines will be the primary responsibility of the data engineer. The data engineers are accountable for ensuring that the mechanisms needed to guarantee data quality, completeness, and accuracy are part of the data pipeline design.

Drive automation: The data engineer will be responsible for using innovative and modern tools, techniques, and architectures to partially or completely automate the most common, repeatable data preparation and integration tasks in order to minimize manual processes and improve productivity. The data engineer will also assist with renovating the data management infrastructure to drive automation in data management and integration.

Maintain the Enterprise Data Warehouse: Data engineers are accountable for the Enterprise Data Warehouse (EDW) architecture, its implementation, optimization, and maintenance. Data engineers are responsible for designing, creating, and delivering datasets within the EDW to support analytics initiatives.

Collaborate across departments: The data engineer will need strong collaboration skills in order to work with various stakeholders within the organization. In particular, the data engineer will work closely with data analysts, business analysts, and data science teams to refine their data requirements and data consumption requirements for various data and analytics initiatives.

Educate and train: The data engineer should be curious and knowledgeable about new data initiatives and how to address them. This includes applying their data and/or domain understanding in addressing new data requirements. They will also be responsible for proposing appropriate (and innovative) data ingestion, preparation, integration and operationalization techniques in addressing these data requirements. The data engineer will be required to train counterparts in these data pipelining and preparation techniques.

Ensure compliance with data governance and security: The data engineer is responsible for ensuring that the datasets provided to users comply with established governance and security policies. Data engineers should work with the data governance and data security teams while creating new data pipelines and maintaining existing ones to guarantee alignment and compliance.
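
As a rough illustration of the "Build data pipelines" accountability above (quality checks designed into the pipeline itself), here is a minimal sketch in which a stage applies named rules and quarantines failing rows. `QualityRule`, `run_stage`, and the example rules and field names are all hypothetical; in the split described in this thread, the governance side would define the rules and the engineers would wire them into the stage.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


@dataclass
class QualityRule:
    name: str                      # rule identifier, e.g. as named by a governance policy
    check: Callable[[Row], bool]   # returns True when the row satisfies the rule


def run_stage(rows: List[Row], rules: List[QualityRule]) -> Tuple[List[Row], List[dict]]:
    """Split rows into those passing every rule and those rejected, with reasons."""
    passed, rejected = [], []
    for row in rows:
        failed = [rule.name for rule in rules if not rule.check(row)]
        if failed:
            rejected.append({"row": row, "failed_rules": failed})
        else:
            passed.append(row)
    return passed, rejected


# Hypothetical rules and sample rows, for illustration only.
rules = [
    QualityRule("account_id_present", lambda r: r.get("account_id") is not None),
    QualityRule(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]
rows = [
    {"account_id": "A1", "amount": 10.0},
    {"account_id": None, "amount": 5.0},
    {"account_id": "A3", "amount": -2},
]
passed, rejected = run_stage(rows, rules)
```

Keeping the rules as data separate from the stage logic is one way to let the two groups work independently: the rule list can change without touching pipeline code.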
Director of Data Management in Consumer Goods, 7 months ago

Agree 100%; we split it out like this as well. Data Governance (DG) sets policies and procedures, and Data Engineering (DE) implements them and sets up the pipelines.

Chief Data Officer in Software, 7 months ago
There may be some overlap within individual data management/governance competencies, where data engineers may be doing some pipeline work for a data management function (like MDM and DQ), but the competencies are distinct enough that I think it makes sense to have separate roles.
Practice Head, Cognitive AI in Banking, 4 months ago
Here are a few high-level responsibilities, based on my experience setting up such teams.

Data custodians enforce data quality at all levels, implement data governance policies and ensure compliance, implement business logic and rules for data storage, aggregation, and sandboxes, catalogue the data, and collaborate with various stakeholders.

Data automation engineers develop the end-to-end pipelines, handle automation, set up the infrastructure and monitor it for performance, usage, IAM, etc., and are experts in various programming languages and cloud/on-premises platforms.
