The Alan Turing Institute Webinar - The Free Puppy: How to Responsibly Leverage Open Source AI

Over the past few decades, AI and open source software have evolved in tandem, but with the recent explosion of LLMs and foundation models, “open source AI” has become a catchphrase that means different things to different people. Some definitions of open source AI are so specific that they leave out hugely popular projects; other definitions focus specifically on models, and many AI projects marketed as open source don’t actually fit within the industry definition of open source. With so many different definitions floating around, how can you know what you are getting when you engage with open source AI?

In this talk, we will review what to look for when evaluating the health, stability, and practicality of open source AI projects (or any open source project for that matter), and why it’s so important to ensure transparency and explainability when deploying open source AI applications. We will also cover licensing considerations, how and when to leverage open source, and the importance of community.

Maureen McElaney

February 05, 2026

Transcript

  1. The Free Puppy: How to Responsibly Leverage Open Source AI

    Mo McElaney, Open Source @ IBM Thursday, February 5, 2026
  2. Table Of Contents

    01 Open source: What is open source and how was it defined?
    02 Open source AI: Who is calling their AI open source and why?
    03 Definitions and Frameworks: How are people defining this and what is the impact?
  3. Hi! I’m Mo!

    Mo McElaney, Lead, Open Source Developer Programs at IBM Research

    About Me:
    • Open source developer advocate and contributor
    • Open source community builder
    • Certified master gardener through the University of Vermont
    • At IBM Research focusing on open source AI and shepherding open source more broadly across the company
    • This talk contains my own opinions based on my experience, not acting as an official rep of IBM on these topics
  4. Open source origins…

    The Four Freedoms:
    1. Freedom to inspect or view
    2. Freedom to use
    3. Freedom to modify
    4. Freedom to redistribute

    Popular open source licenses:
  5. A little history…

    • 1998: Founding of OSI. To protect the open source definition, which sets criteria for open source licensing.
    • 2007-2015: Open source growth! Cloud computing, enterprise software, machine learning frameworks like PyTorch, etc.
    • 2019-2020: AI is now model driven. What was once software driven has now moved to being model driven, raising questions about licensing.
    • 2021: LLMs! Explosion of Large Language Models like GPT-3 raises concerns about AI’s openness and accessibility.
  6. History continued…

    • 2022: Restricted access to AI models. OpenAI, Google, Microsoft, and other companies restrict access to AI models.
    • Dec 2022: ChatGPT goes viral! Pushing governments and researchers to discuss AI transparency and open access.
    • April 2023: OSI announces effort to define open source AI. OSAID = Open Source AI Definition.
    • Oct 2024: OSI releases v1 of OSAID. The first `stable` version of the open source AI definition.
  7. Challenge #1 - Licensing

    • Permissive Licenses: AI should remain as open as possible for innovation. This maximizes flexibility but risks industry dominance.
    • Restrictive Licenses: Some want tighter control over AI models to prevent misuse. Some argue that this prevents misuse but limits adoption.
  8. Challenge #2 - Licensing LLMs

    Hugging Face Model Hub hosts more than 784k models and only 37% of them specify license information (as of March 2025)
      ◦ 66% of models with licenses are OSI recognized open source: 42% Apache and 18% MIT vs Non-OSI: 24% RAIL and 7.5% CC

    Open source software licenses don’t fully fit large language models! LLMs require:
    • Massive data and compute resources
    • Unique challenges in reproducibility

    Community Data License Agreement (CDLA) is an alternative specifically for data sets.

    OpenMDW License from LF is a global license covering all types: code, data, documentation (see https://openmdw.ai)
  9. Challenge #3 - Open Science vs Open Source

    March 2024 - OSI considered a dual definition approach (Open source AI vs Open science AI)

    Key Distinctions:
    • Open Science has an academic focus: knowledge sharing and transparency
    • Open Source focuses on practical applications and industry adoption

    A single definition may not work due to the conflict between restrictive vs permissive community needs!
  10. Completeness vs Openness

    • Openness: Only considered open if distributed under an open license allowing users to freely access, use, modify, and share.
    • Completeness: Each element is thorough, self-contained, and meaningfully usable without requiring additional resources.
  11. Addressing challenges…

    We see a lot of “open washing”, lack of transparency, illegally converted software licenses being used for models, inappropriate types of licenses, etc. Just because you see a license doesn’t mean a model will fit your definition of open source!
  12. Model Openness Framework and Tooling

    Why MOF and MOT? A straightforward tool for evaluating models against the MOF. See what components are included with each model and the licenses associated with them, providing clarity on what can and cannot be done with the model and its parts. https://isitopen.ai/

    Get involved! If you are interested in contributing to this project you can get involved by joining the Generative AI Commons at LF AI & Data (anyone can join) and registering for the meetings to get them on your calendar! https://genaicommons.org/
  13. Plan celebrating Step 1: Define the Purpose and Goals Step

    2: Choose the Format Step 3: Set the Date and Time Step 4: Plan the Agenda and Time
  14. MOF Openness: Only open licenses accepted

    • To be considered open, every component MUST have an open license
      ◦ a license allowing unrestricted usage, study, modification, and redistribution for any purpose.
    • Every component SHOULD have a type-appropriate open license
      ◦ an open license that is appropriate for the type of component it is covering: code, data, or documentation.
    • Examples:
      ◦ Code: Apache 2.0
      ◦ Data: CDLA-Permissive 2.0
      ◦ Doc: CC-BY 4.0
  15. References

    Model Openness Framework (MOF) Paper
    • Offers guidance to researchers and developers seeking to enhance model transparency and reproducibility while allowing permissive usage.
    • Several updates since March 2024

    MOF Specification
    • Provides a concise definition of the various classes and requirements
    • Focuses on the what rather than the why
    • 1.0 published in January 2025

    Model Openness Tool (MOT)
    • Functional with 200+ models
    • Available at https://isitopen.ai
  16. Additional considerations…

    Other efforts exist:
    • GenAI Commons Responsible GenAI Framework (RGAF)
    • European Open Source AI Index
    • Stanford University Foundation Model Transparency Index
    • Responsible AI Licenses (RAIL)
    • …?

    MOF scope is limited and does not address many other aspects: AI safety (including bias, fairness, and trustworthiness), security and privacy, performance, etc.
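The openness rule from slide 14 (every component MUST have an open license, and SHOULD have one appropriate to its type) can be illustrated with a minimal Python sketch. This is not the Model Openness Tool or any official MOF implementation: the `Component` class, the `check_openness` function, the component names, and the small license allow-list are all hypothetical, built only from the example licenses named in the slides (Apache 2.0 and MIT for code, CDLA-Permissive 2.0 for data, CC-BY 4.0 for documentation). The actual MOF specification covers more component types and licenses.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: a tiny illustrative allow-list taken from the licenses named in
# the slides. The real MOF specification recognizes many more licenses.
TYPE_APPROPRIATE = {
    "code": {"Apache-2.0", "MIT"},
    "data": {"CDLA-Permissive-2.0"},
    "documentation": {"CC-BY-4.0"},
}
OPEN_LICENSES = set().union(*TYPE_APPROPRIATE.values())


@dataclass(frozen=True)
class Component:
    """A hypothetical model component: its name, its type, and its license (if any)."""
    name: str
    kind: str
    license: Optional[str]


def check_openness(components):
    """Return (names failing the MUST rule, names failing only the SHOULD rule)."""
    # MUST: every component needs an open license; a missing license fails too.
    missing_open = [c.name for c in components if c.license not in OPEN_LICENSES]
    # SHOULD: the open license ought to match the component type.
    not_type_appropriate = [
        c.name
        for c in components
        if c.license in OPEN_LICENSES
        and c.license not in TYPE_APPROPRIATE.get(c.kind, set())
    ]
    return missing_open, not_type_appropriate


# Hypothetical release: a code license applied to a dataset, and an unlicensed model card.
example = [
    Component("training code", "code", "Apache-2.0"),
    Component("dataset", "data", "Apache-2.0"),
    Component("model card", "documentation", None),
]
missing_open, not_type_appropriate = check_openness(example)
print("Fails MUST (no open license):", missing_open)
print("Fails SHOULD (not type-appropriate):", not_type_appropriate)
```

In this made-up release the model card fails the MUST rule because it carries no license at all, while the dataset passes MUST but not SHOULD, since a software license is open yet not designed for data. That mirrors the point on slide 11: seeing a license on a model does not by itself tell you whether every part is open.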