Updating Data Programs with Responsible and Ethical AI

AI is a hot topic for the world and often a challenging subject in traditional data programs. What will be your response when AI is introduced to your organization? This workshop features discussions of recent successes and disturbing incidents. We will focus on how they were discovered, how they were handled, and the lessons learned, along with the people and process issues encountered with each incident. We'll end with 10 tips to help you build a strategy for updating your data governance and management programs.

Highlights:
Transparency
Data Quality
Fairness
Human Verification
Privacy and Data Protection

Karen Lopez

June 24, 2025

Transcript

  1. Karen Lopez
    Microsoft MVP, Data Platform
    Microsoft Certified Trainer, vExpert
    Data management expert, space enthusiast, and #TeamData evangelist
    www.datamodel.com
    @datachick.bksy.social
  4. Bjarni Valdimar Tryggvason
    Icelandic-born Canadian engineer and an NRC/CSA astronaut. He served as a Payload Specialist on Space Shuttle mission STS-85 in 1997, a nearly 12-day mission to study changes in the Earth's atmosphere. Bjarni is the first, and as of 2024, only Canadian astronaut of Icelandic birth.
    https://en.wikipedia.org/w/index.php?title=Bjarni_Tryggvason&oldid=1289917372
  5. 2 AI Perspectives
    • Preparing and managing data for AI uses
    • Leveraging AI for data management and governance programs
  6. Why This Matters
    • AI is here
    • We need to be prepared
    • We need to be responsible
    • We might want to be lazy
  7. FAST principles

    | Letter | Core Principle | Explanation |
    | --- | --- | --- |
    | F | Fairness | Avoiding bias, ensuring equity across demographics, and addressing systemic inequalities in data and outcomes. |
    | A | Accountability | Assigning clear responsibility for AI decisions, enabling auditability, and ensuring recourse mechanisms. |
    | S | Sustainability | Minimizing environmental impact, ensuring long-term viability, and supporting social sustainability. |
    | T | Transparency | Making AI systems explainable, understandable, and open to scrutiny by stakeholders. |
  8. By 2015, it was clear that the system was not rating candidates in a gender-neutral way because it was built on data accumulated from CVs submitted to the firm mostly from males, Reuters claimed. The system started to penalise CVs which included the word "women". The program was edited to make it neutral to the term but it became clear that the system could not be relied upon, Reuters was told.
  9. Other fake headlines said that U.S. Secretary of Defence nominee Pete Hegseth had been "fired," that Secretary of State nominee Marco Rubio had been "confirmed," and that Israeli Prime Minister Benjamin Netanyahu had been "arrested." None were true.
  10. IBM Watson for Oncology
    In 2018, IBM’s Watson for Oncology, lauded as a revolutionary tool for AI-enabled personalized cancer treatment, encountered significant setbacks due to inaccuracies and unsafe treatment recommendations. The system's reliance on synthetic data, coupled with limited real-world patient data, underscored the critical importance of data quality and diversity in AI-driven healthcare solutions. Consequently, the accuracy and efficacy of the AI-generated outcomes were insufficient, and IBM ultimately decided to discontinue its Watson for Oncology solution. This case exemplifies the imperative for rigorous data validation protocols to generate high-value recommendations; further, an overreliance on synthetic data can diminish AI effectiveness and model accuracy. Further, this AI setback may also be an example of a problem better left to humans – in this case oncologists with years of specialized training, experience, and highly-contextual knowledge of the most complex of systems, the human body.
    https://www.ethicsc.harvard.edu/blog/post-8-abyss-examining-ai-failures-and-lessons-learned
  11. When Air Canada was taken to court and asked to pay a refund offered by its chatbot, the company tried to argue that “the chatbot is a separate legal entity that is responsible for its own actions.” Air Canada’s argument was that because the chatbot response included a link to a page on the site outlining the policy correctly, Moffat should’ve known better. At the moment, the Air Canada chatbot is not on the website. Feel free to imagine it locked in a room somewhere, having its algorithms hit with hockey sticks, if you like.
  12. The real estate listing company took a $304 million inventory write-down in the third quarter, which it blamed on having recently purchased homes for prices that are higher than it thinks it can sell them. The company saw its stock plunge and it now plans to cut 2,000 jobs, or 25% of its staff. The algorithms continued to assume that the market was still hot and overestimated home prices. In machine learning (ML), this kind of problem is known as “concept drift” and this does appear to be at the heart of the problem with Zillow Offers.
    https://insideainews.com/2021/12/13/the-500mm-debacle-at-zillow-offers-what-went-wrong-with-the-ai-models/
  13. Data Governance Components
    • Business Goals & Stakeholders
    • Ethics and Responsibility
    • Data Privacy, Security, and Compliance
    • Policies & Standards
    • Monitoring and Measuring
    • Data Quality
    • Metadata
    • Reference Data Management
    • Data Catalogs and Inventories
  14. Data Governance
    • Change Management
    • Program and Project Management
    • Data Lifecycle Management
    • Data Literacy & Culture
    • Data Products
    • Data Contracts
    • DG Operating Models
    • Data Governance Governance ☺
  15. Data Management Components
    • Data Architecture
    • Data Modeling & Design
    • Data Storage & Operations
    • Data Security
    • Data Integration & Interoperability
    • Document & Content Management
    • Reference & Master Data
    • Data Warehousing & Business Intelligence
    • Metadata
    • Data Quality
  16. DataOps Components
    • Automated Data Pipelines
    • CI/CD Approaches
    • Data Quality Monitoring
    • Data Testing and Validation
    • Version Control
    • Security & Access
    • Operational Monitoring
    • Orchestration
    • Collaboration
    • Agile & Iterative
  17. People Challenges
    • FUD
    • Trust
    • Job loss potential
    • Change fatigue
    • Loss of control fears
    • Ethical / legal concerns
  18. Fix it
    • Start with Co-Piloting & laziness
    • Encourage traceability analysis and work
    • Engage everyone
    • Measure success rates
    • Lots of training and support
  19. Where AI Can be Used in Data Programs
    • Data Quality
    • Data Classification
    • Data Preparation
    • Data Security
    • Data Observation & Monitoring
    • Data Support and Literacy
    • Data Lineage Detection
    • Glossary Building
    • Auditing and Compliance
    • Anomaly Detection
    • Data Profiling
    • Data Growth and Capacity Planning
    • Generative use cases – test data
    • Semantic Data Mapping
    • Ethical Risks
    • Data Cleansing
  20. What Karen Worries About
    • Not enough time allocated
    • People being people
    • Magical thinking
    • Missed disclosures
    • Biased data & files
    • Skills gaps
    • High-risk blindness
    • Data & Model Poisoning
    • Legislation that can’t keep up
    • AI trained on AI slop (model collapse)
    • Model drift
  21. Data Literacy Challenges
    • Lessons learned from the DW era
    • Data history & stories
    • Stats knowledge
    • Understanding bias
    • Understanding AI limits
    • Dealing with missing data
  22. Data Program AI Readiness
    • Invest in Data Governance
    • Integrate Data Governance with AI Governance
    • Improve Data Quality processes
    • Implement Bias Detection and Monitoring
    • Ensure Documentation
    • Use Synthetic Data
  23. Data Program AI Readiness: Data Management AI Readiness
    • Refocus on metadata programs
    • Strengthen security measures
    • Do more automation
    • Build a data-driven culture
    • Human-in-the-loop
    • Respect Data Privacy
  24. Data Program AI Readiness
    • Use Protection Techniques
    • Protect and Assess Training Data
    • Use Third-party reviewers
    • Design for Accessibility and Inclusion
    • Learn AI Safety Defenses
    • Leverage Ethical Frameworks
    • Share Professional Learning
  25. 10 Tips
    • Do personal learning and education on AI topics
    • Speak up
    • Build up your Data Governance programs
    • Bring metadata methods back to the forefront
    • Communicate, often, the importance of monitoring and auditing
  26. 10 Tips
    • Leverage AI, smartly and ethically, to build these programs
    • Enhance all data management segments with AI
    • Ensure data professionals are part of the strategic planning
    • Evangelize the importance of cross-group collaboration
    • Build Data Literacy programs
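
Slide 12 attributes the Zillow Offers write-down to concept drift, and slides 16 and 20 list operational monitoring and model drift as concerns. As a minimal sketch of what such monitoring might look like, the Python below compares recent prediction error against a baseline error and flags a large increase. The function names `mean_abs_pct_error` and `drift_alert`, the sample prices, and the 2x threshold are all hypothetical choices made for illustration; none of them come from the deck or the cited article.

```python
from statistics import mean


def mean_abs_pct_error(predicted, actual):
    """Mean absolute percentage error for paired predictions and outcomes.

    Assumes every actual value is nonzero (true for sale prices).
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("predicted and actual must be non-empty and equal length")
    return mean(abs(p - a) / abs(a) for p, a in zip(predicted, actual))


def drift_alert(baseline_error, recent_predicted, recent_actual, ratio_threshold=2.0):
    """Return True when recent error exceeds baseline error by the given ratio.

    ratio_threshold is an illustrative assumption; a real program would tune it.
    """
    recent_error = mean_abs_pct_error(recent_predicted, recent_actual)
    return recent_error > baseline_error * ratio_threshold


# Hypothetical data: the model keeps predicting a hot market
# while actual sale prices cool off.
baseline = mean_abs_pct_error([300_000, 410_000, 255_000], [295_000, 400_000, 260_000])
alert = drift_alert(baseline, [320_000, 430_000, 270_000], [290_000, 385_000, 240_000])
print(f"baseline error: {baseline:.3f}, drift alert: {alert}")
```

A check like this only signals that something has changed. Deciding whether to retrain, pause, or escalate to a human reviewer remains a governance decision, in line with the human-in-the-loop item on slide 23.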