Slide 7
Slide 7 text
We talk about data science notions of “80-90%” of time spent on
data discovery, assessment, transformation:
● What was the original intent for data collection, the assumptions
made, “Why did we collect this?”, which protocol, what error
margins, what biases, etc. (in addition to the standard who?,
what?, when,? where? metadata for the data collection)
● In the broader sense, this area of concerns regarding metadata
is about developing an understanding of the data enough to
know how and when to trust it enough to use it
● You create a new data product by the process of “data
wrangling” and that should carry the lineage along