Slack: Data Engineer
Zendesk: Principal Data Engineer
Creator: Schemata - Data Contract Platform
Author: Data Engineering Weekly
Slide 3
Slide 3 text
What is a Data Contract?
Slide 4
Slide 4 text
A data contract is a collaborative agreement
between the people who create data (producers)
and the people who use data (consumers). A data
contract defines the structure and behavior of the
data, so that it can be exchanged seamlessly
between different systems.
What is a Data Contract?
Slide 5
Slide 5 text
Data contracts are typically written in a
machine-readable format, such as Protobuf, Avro,
YAML or JSON. This makes it easy for computers to
understand the structure of the data and how it
can be used.
What is a Data Contract?
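To make "machine-readable" concrete, here is a minimal Python sketch of a contract expressed as JSON and a function that checks a record against it. The dataset name, owner, field names, and the contract layout itself are all hypothetical and chosen for illustration; real contracts in Protobuf or Avro would rely on those formats' own schema tooling instead.

```python
import json

# Hypothetical contract for an "orders" dataset; every name here is illustrative.
CONTRACT_JSON = """
{
  "dataset": "orders",
  "owner": "checkout-team",
  "fields": {
    "order_id": {"type": "string", "required": true},
    "amount_cents": {"type": "integer", "required": true},
    "coupon_code": {"type": "string", "required": false}
  }
}
"""

# Map the contract's type names to Python types.
TYPE_MAP = {"string": str, "integer": int, "number": float, "boolean": bool}


def validate(record: dict, contract: dict) -> list[str]:
    """Return a list of violations; an empty list means the record conforms."""
    errors = []
    for name, spec in contract["fields"].items():
        if name not in record or record[name] is None:
            if spec["required"]:
                errors.append(f"missing required field: {name}")
            continue
        expected = TYPE_MAP[spec["type"]]
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly for integers.
        if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            errors.append(f"{name}: expected {spec['type']}, got {type(value).__name__}")
    # Fields the contract does not know about are reported as well.
    for name in record:
        if name not in contract["fields"]:
            errors.append(f"unexpected field: {name}")
    return errors


contract = json.loads(CONTRACT_JSON)
good_errors = validate({"order_id": "o-1", "amount_cents": 1999}, contract)
bad_errors = validate({"order_id": 42, "extra": "x"}, contract)
```

Because the contract is plain data, producers and consumers can both run the same check, which is the point of keeping it machine-readable.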
Slide 6
Slide 6 text
What is Data Ownership?
Slide 7
Slide 7 text
Who Owns Your House?
Slide 8
Slide 8 text
A House Is a Social Property
Slide 9
Slide 9 text
What does this have to
do with Data
Contracts & Data
Quality?
Slide 10
Slide 10 text
Data is Inherently Social in Nature
Slide 11
Slide 11 text
Typical Data Model
Slide 12
Slide 12 text
Producer - Consumer(s)
Slide 13
Slide 13 text
Data Practitioners
Slide 14
Slide 14 text
Who Owns the
Data Quality?
Slide 15
Slide 15 text
Is there any better
framework
available than
ownership?
Slide 16
Slide 16 text
RACI Matrix
R (Responsible): someone who is responsible for executing a particular process
C (Consulted): a person who is consulted and provides the data necessary to implement the process
A (Accountable or Approver): someone who is responsible for the result of the work
I (Informed): a person who must be informed of the progress of the work
Slide 17
Slide 17 text
RACI Framework for ET (L)
1. Data Creation
2. Data Transformation
Slide 18
Slide 18 text
Data Creation
Product Managers | Developers | Data Engineers | Data Analyst/Scientist/ML Engineers | Business Stakeholder/Privacy & Governance
R (Responsible) | R (Responsible) | C (Consulted) | A (Accountable or Approver) | I (Informed)
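As a rough illustration, a RACI assignment can be written down as data and sanity-checked. The Python sketch below encodes the Data Creation stage, assuming the five letters on the slide line up with the five role columns from left to right. The rule that a stage needs exactly one Accountable role and at least one Responsible role is a common RACI convention, not something stated in the slides, and the function names are hypothetical.

```python
from collections import Counter

# Data Creation stage, assuming the slide's letters map onto the role
# columns in left-to-right order (an assumption about the slide layout).
DATA_CREATION = {
    "Product Managers": "R",
    "Developers": "R",
    "Data Engineers": "C",
    "Data Analyst/Scientist/ML Engineers": "A",
    "Business Stakeholder/Privacy & Governance": "I",
}

VALID_CODES = {"R", "A", "C", "I"}


def check_raci(stage: dict) -> list[str]:
    """Return a list of problems with a RACI assignment (empty means it passes)."""
    problems = []
    for code in sorted(set(stage.values()) - VALID_CODES):
        problems.append(f"unknown RACI code: {code}")
    counts = Counter(stage.values())
    # Assumed convention: someone must do the work ...
    if counts["R"] == 0:
        problems.append("no Responsible role")
    # ... and exactly one role answers for the result.
    if counts["A"] != 1:
        problems.append(f"expected exactly one Accountable role, found {counts['A']}")
    return problems


def roles_with(stage: dict, code: str) -> list[str]:
    """List the roles holding a given RACI code, in declaration order."""
    return [role for role, assigned in stage.items() if assigned == code]


creation_problems = check_raci(DATA_CREATION)
accountable = roles_with(DATA_CREATION, "A")
```

Under this reading, the role accountable for data creation is the consuming side (analysts, scientists, ML engineers), while product managers and developers carry out the work.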
Slide 19
Slide 19 text
Data Transformation
Product Managers | Developers | Data Engineers | Data Analyst/Scientist/ML Engineers | Business Stakeholder/Privacy & Governance
C (Consulted) | C (Consulted) | R (Responsible) & A (Accountable or Approver) | I (Informed)
Slide 20
Slide 20 text
Data Quality should
always be defined from
the consumer's
perspective.
Slide 21
Slide 21 text
* The logos and trademarks displayed in this presentation are the property of their respective owners. The use of these logos and trademarks does not imply endorsement or sponsorship by the
respective owners.
Consumer Driven Testing in Software Development
Contract Testing
Mock Servers
API Specification Testing
End-to-End Testing
Acceptance Testing
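Carried over from software to data, consumer-driven testing could look roughly like the Python sketch below: each consumer declares what it expects from a dataset, and the producer's sample output is checked against every consumer's expectations. The consumer names, field names, and the `run_consumer_checks` helper are hypothetical and not taken from any specific tool.

```python
from typing import Callable

Expectation = Callable[[dict], bool]

# Hypothetical consumers, each declaring its own expectations of the data.
CONSUMER_EXPECTATIONS: dict[str, dict[str, Expectation]] = {
    "finance-dashboard": {
        "amount_cents is a non-negative integer": lambda r: (
            isinstance(r.get("amount_cents"), int) and r["amount_cents"] >= 0
        ),
    },
    "ml-features": {
        "country is a two-letter code": lambda r: (
            isinstance(r.get("country"), str) and len(r["country"]) == 2
        ),
    },
}


def run_consumer_checks(records: list[dict], expectations: dict) -> dict[str, list[str]]:
    """Return failed expectations per consumer, with the offending row indexes."""
    failures: dict[str, list[str]] = {}
    for consumer, checks in expectations.items():
        for description, check in checks.items():
            failed_rows = [i for i, record in enumerate(records) if not check(record)]
            if failed_rows:
                failures.setdefault(consumer, []).append(
                    f"{description} (rows {failed_rows})"
                )
    return failures


# Sample producer output: the second row violates both consumers' expectations.
sample = [
    {"order_id": "o-1", "amount_cents": 1999, "country": "US"},
    {"order_id": "o-2", "amount_cents": -5, "country": "USA"},
]
failures = run_consumer_checks(sample, CONSUMER_EXPECTATIONS)
```

Running checks like these on the producer side, before data is published, lets a breaking change surface as a failed test rather than as a broken dashboard.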
Slide 22
Slide 22 text
Data Quality is a
Collaborative
Workflow & the Data
Contract is the
Enabler