
Gender-diversity analysis of the technical contributions in OpenStack


Presented at OpenStack Summit in Austin, Texas.

https://www.openstack.org/summit/austin-2016/summit-schedule/events/8723

Bitergia

April 28, 2016



Transcript

  1. Gender-diversity analysis
    of technical contributions
    Daniel Izquierdo Cortázar
    @dizquierdo
    dizquierdo at bitergia dot com
    OpenStack Summit, Austin 2016


  2. Outline
    Introduction
    First Steps
    Some numbers and method
    Conclusions


  3. Introduction
    A bit about me
    Why this analysis
    What we have so far


  4. /me
    CDO at Bitergia, the software development analytics company
    Involved in the OpenStack Activity Board
    Involved in the OpenStack Quarterly Reports
    Disclaimer: I am not involved in the WOO working group; this is my own
    analysis and interest, and I may have missed some things...


  5. Why this study
    Diversity matters
    I attended some WOO (Women of OpenStack) talks in Tokyo
    There are no numbers on technical contributions (AFAIK)
    How’s this evolving? Is gender-diversity increasing?
    In the end this is all about transparency and improvement


  6. What we have so far
    OpenStack related resources:
    - LinkedIn WOO: 600 members, 137 discussions
    - WOO mailing list: 380 emails, 140 threads, 90 participants
    - WOO wiki


  7. What we have so far
    Others of interest:
    FLOSS Survey in 2013:
    - http://floss2013.libresoft.es/results.en.html
    - 11% of the survey respondents were women
    The Industry Gender Gap by the World Economic Forum:
    - Women are 5% of CEOs, 21% of mid-level roles, 32% of junior roles


  8. Some companies
    Pinterest: engineering-focused employees.
    https://blog.pinterest.com/en/our-plan-more-diverse-pinterest


  9. Some companies
    Google: tech-focused employees.
    http://www.google.com/diversity/


  10. Some companies
    Facebook: tech-focused employees.
    http://newsroom.fb.com/news/2015/06/driving-diversity-at-facebook/


  11. Some companies
    Dropbox: all employees.
    https://blogs.dropbox.com/dropbox/2014/11/strengthening-dropbox-through-diversity/


  12. Summary
    Conclusions are not representative, but:
    - Women represent around 30% to 40% of the workforce in tech companies.
    - And between 10% and 20% when focusing on tech teams.
    - What about OpenStack?


  13. First Steps


  14. Some Definitions
    Technical contributions: commit, upload, Gerrit vote
    Other potential metrics: diversity by company, fairness in the code review among organizations and genders, transparency in the process
    Available but sensitive info: affiliation, countries, time to review

    View Slide

  15. First Steps
    Names databases
    Genderize.io
    Manual analysis
    Focus on main developers
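
The steps listed above (names database, then manual analysis for unclear cases) can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: `SAMPLE_NAMES_DB`, `sample_lookup`, `classify` and the 0.95 threshold are hypothetical, and the lookup is an offline stand-in for a service such as Genderize.io.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical offline stand-in for a names database. The entries and
# probabilities below are invented for illustration only.
SAMPLE_NAMES_DB: Dict[str, Tuple[str, float]] = {
    "maria": ("female", 0.99),
    "daniel": ("male", 0.99),
    "andrea": ("female", 0.79),
}


def sample_lookup(first_name: str) -> Optional[Tuple[str, float]]:
    """Return (gender, probability) for a first name, or None if unknown."""
    return SAMPLE_NAMES_DB.get(first_name.lower())


def classify(
    full_name: str,
    lookup: Callable[[str], Optional[Tuple[str, float]]] = sample_lookup,
    threshold: float = 0.95,  # assumed cut-off, not taken from the talk
) -> Dict[str, object]:
    """Guess a gender from the first name and flag doubtful cases."""
    parts = full_name.split()
    hit = lookup(parts[0]) if parts else None
    if hit is None:
        # Unknown names always go to the manual analysis queue.
        return {"name": full_name, "gender": None,
                "probability": 0.0, "needs_manual_check": True}
    gender, probability = hit
    return {"name": full_name, "gender": gender,
            "probability": probability,
            "needs_manual_check": probability < threshold}


if __name__ == "__main__":
    for name in ["Maria Lopez", "Andrea Rossi", "Unknown Person"]:
        print(classify(name))
```

Passing the lookup as a parameter keeps the sketch testable offline; a real run would swap in a function that queries the names database.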


  16. Architecture
    Original Data Sources
    Mining Tools: CVSAnalY, Bicho, SortingHat
    Info Enrichment: Genderize.io, Pandas, Jupyter Notebooks, Manual work
    Viz: ElasticSearch + Kibana


  17. Architecture
    Original Data Sources
    ● Git and Gerrit repos based on the Governance yaml file
    ● ~ 370k commits
    ● ~ 250k changesets
    ● ~ 840k patchset uploads
    ● ~ 1,124k patch code reviews


  18. Architecture
    Mining Tools: CVSAnalY, Bicho, SortingHat
    ● CVSAnalY and Bicho databases publicly available
    ● Activity Board (http://activity.openstack.org/dash/browser/data/db/)
    ● SortingHat database available on request (sensitive info)
    ● Bonus: now migrating to GrimoireLab


  19. Architecture
    Info Enrichment: Genderize.io, Pandas, Jupyter Notebooks, Manual work
    ● Genderize.io: name database
    ● Pandas: data analysis library
    ● Jupyter Notebook: web app for data analysis
    ● Manual work:


  20. Architecture
    Viz: ElasticSearch + Kibana
    ● ElasticSearch: schemaless database
    ● Kibana: works great with ES
    ● This tandem helps a lot to verify info
    ● Drill-down capabilities
    ● Extra info available (but not displayed)


  21. Validation: manual work
    Check main contributors by hand
    Be sure the WOO Wiki contributors are correct
    Asian names are hard to check ( u_u ) (help needed!)
    Others...


  22. Some numbers
    Git Contributions
    Gerrit Reviews
    Demographics


  23. Git Overview
    ● Aggregated historical data
    ● Repos based on the Governance yaml file


  24. Git Activity and Population
    Women activity (all of the history):
    ~ 10.5% of the population ( ~ 570 developers )
    ~ 6.8% of the activity ( >= 16k commits )
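
The two percentages above are different ratios: one over developers, one over commits. A minimal sketch of how both could be derived from a commit log, assuming a hypothetical `gender_by_author` mapping produced by the enrichment step (the toy data is invented, not the OpenStack dataset):

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def shares(
    commit_authors: Iterable[str],
    gender_by_author: Dict[str, str],
    gender: str = "female",
) -> Tuple[float, float]:
    """Return (population share, activity share) for one gender.

    commit_authors holds one author id per commit, so an author who
    committed three times appears three times.
    """
    commits_per_author = Counter(commit_authors)
    authors = set(commits_per_author)
    selected = {a for a in authors if gender_by_author.get(a) == gender}

    total_commits = sum(commits_per_author.values())
    population_share = len(selected) / len(authors) if authors else 0.0
    activity_share = (
        sum(commits_per_author[a] for a in selected) / total_commits
        if total_commits else 0.0
    )
    return population_share, activity_share


if __name__ == "__main__":
    # Toy data: 4 developers, 10 commits.
    log = ["a", "a", "a", "b", "c", "c", "d", "d", "d", "d"]
    genders = {"a": "female", "b": "male", "c": "male", "d": "male"}
    population, activity = shares(log, genders)
    print(f"population: {population:.1%}, activity: {activity:.1%}")
```

A population share above the activity share, as reported on this slide, means the selected group commits less per person than the average contributor.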


  25. Git Activity and Population
    Women activity (last year):
    ~ 11% of the population ( ~ 340 active developers )
    ~ 9% of the activity ( >= 6k commits )


  26. Git WOO Main Projects
    ● Where do WOO contributions go?
    ● Unfiltered order: Infra, Nova, Neutron, Doc, QA
    ● Lots of activity in Doc, Infra, Neutron, Nova and Horizon


  27. Git WOO Type of Contribution
    ● Where do WOO contributions go?
    ● Unfiltered order: Infra, Nova, Neutron, Doc, QA
    ● Lots of activity in Doc, Infra, Neutron, Nova and Horizon


  28. Git WOO Evolution
    ● Similar trend to the overall evolution
    ● But slightly better during the last year
    ● Peaks during March, June, August and November 2015 (any clue?)


  29. Git WOO Evolution (peaks)
    ● March 2015: Extra activity in Ironic
    ● June 2015: Extra activity in Doc and Puppet OpenStack
    ● August 2015: Extra activity in Infra and Doc
    ● November 2015: Extra activity in Doc and OpenStack Client


  30. Gerrit Overview
    ● As an example, the aggregated history of the project
    ● Repos based on the Governance yaml file


  31. Gerrit Reviews
    ● ~ 1 Million reviews
    ● ~ 400k ‘+2’ reviews
    ● ~ 11k ‘-2’ reviews
    ● ~ 325k ‘+1’ reviews
    ● ~ 207k ‘-1’ reviews


  32. Gerrit Reviews Evolution by WOO
    Continuous increase
    Big jump during the last year (compared to the general trend)


  33. Gerrit Reviews Evolution by WOO
    And that jump is even higher when checking +2 reviews
    Up to 3 times the ‘+2’ activity from 2014 to 2015
    (This behaviour does not follow the general trend)
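
A growth factor like the "up to 3 times" figure above is a ratio of yearly vote counts. The sketch below shows one way to compute it, assuming reviews are available as hypothetical `(year, vote)` pairs; the sample values are invented and only illustrate the arithmetic.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def votes_per_year(reviews: Iterable[Tuple[int, int]], vote: int) -> Counter:
    """Count reviews with the given vote value (e.g. +2), grouped by year."""
    return Counter(year for year, value in reviews if value == vote)


def growth_factor(
    reviews: Iterable[Tuple[int, int]], vote: int, start: int, end: int
) -> Optional[float]:
    """Return count(end) / count(start), or None if start has no votes."""
    counts = votes_per_year(reviews, vote)
    if counts[start] == 0:
        return None
    return counts[end] / counts[start]


if __name__ == "__main__":
    # Invented sample: two '+2' votes in 2014, six in 2015, plus some '+1'.
    sample = [(2014, 2)] * 2 + [(2015, 2)] * 6 + [(2015, 1)] * 3
    print(growth_factor(sample, vote=2, start=2014, end=2015))
```

Running the same function on the whole project and on the WOO subset would show whether a jump follows the general trend.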


  34. Demographics
    Attraction of female developers to the community
    Peak in 2015 Q3 with 62 developers
    [chart measures the first contribution by each developer and groups by quarter]


  35. Demographics
    Female developers leaving the community
    [active developer = at least a commit during the last year]
    [chart measures the last contribution by each developer and groups by quarter]
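
The two demographic charts rest on the first and last contribution of each developer, grouped by quarter. A minimal sketch under the definitions given on these slides (active means at least one commit during the last year); the function names and sample commits are hypothetical:

```python
from collections import Counter
from datetime import date, timedelta
from typing import Dict, Iterable, Tuple

Commit = Tuple[str, date]  # (author id, commit date)


def quarter(day: date) -> str:
    """Label a date with its calendar quarter, e.g. '2015 Q3'."""
    return f"{day.year} Q{(day.month - 1) // 3 + 1}"


def first_and_last(commits: Iterable[Commit]) -> Tuple[Dict[str, date], Dict[str, date]]:
    """Return each author's first and last commit date."""
    first: Dict[str, date] = {}
    last: Dict[str, date] = {}
    for author, day in commits:
        if author not in first or day < first[author]:
            first[author] = day
        if author not in last or day > last[author]:
            last[author] = day
    return first, last


def attracted_per_quarter(commits: Iterable[Commit]) -> Counter:
    """Count developers by the quarter of their first contribution."""
    first, _ = first_and_last(commits)
    return Counter(quarter(day) for day in first.values())


def left_per_quarter(commits: Iterable[Commit], today: date) -> Counter:
    """Count inactive developers by the quarter of their last contribution."""
    _, last = first_and_last(commits)
    cutoff = today - timedelta(days=365)  # no commit since then: not active
    return Counter(quarter(day) for day in last.values() if day < cutoff)


if __name__ == "__main__":
    sample = [
        ("ana", date(2013, 1, 15)), ("ana", date(2013, 2, 1)),
        ("bea", date(2013, 3, 30)), ("bea", date(2016, 2, 10)),
        ("eva", date(2015, 8, 1)),
    ]
    print(attracted_per_quarter(sample))
    print(left_per_quarter(sample, today=date(2016, 4, 28)))
```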


  36. Demographics: extra bonus
    When were the developers who contributed during the last quarter "born" (first contribution)?
    And who are they? Working for? Working at?


  37. Demographics: extra bonus
    And the other way around:
    How good are we at retaining developers who entered in 2013-Q1?
    (And who are they? Working for? Working at?)
    [19 attracted in 2013 Q1. 6 left in that same quarter. 7 are still contributing.
    The other 6 left in other periods]
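
The cohort figures above add up (6 + 7 + 6 = 19), and they translate directly into a retention rate. A small sketch of that arithmetic using only the numbers on this slide; the dictionary keys are illustrative names:

```python
# Cohort of developers whose first contribution was in 2013 Q1,
# using the counts stated on the slide.
cohort = {
    "left_same_quarter": 6,
    "still_contributing": 7,
    "left_later": 6,
}

total = sum(cohort.values())
retention = cohort["still_contributing"] / total
one_quarter_only = cohort["left_same_quarter"] / total

if __name__ == "__main__":
    print(f"cohort size: {total}")
    print(f"retained so far: {retention:.1%}")
    print(f"contributed in a single quarter only: {one_quarter_only:.1%}")
```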


  38. Analysis: Is Outreachy helping to close the gender gap?


  39. Outreachy
    “Outreachy helps people from groups underrepresented in free and
    open source software get involved”
    Is Outreachy helping to decrease the gender gap in OpenStack?
    How is the community performing at retaining these developers?
    And how is the community performing overall at retaining developers?


  40. Outreachy
    Studied 4 periods
    2 devs. still contributing (commits)
    Better retention
    More attracted women (mostly paid by orgs.)


  41. Outreachy


  42. Outreachy
    More women are attracted and retained (also in relative numbers) thanks to
    organizations in OpenStack.
    Even so, some numbers from tech companies show a higher % of women.
    Is it worth exploring investing other resources in companies to kindly let
    them know about this?
    What about exploring actions focused on high schools? (prior to degree studies)
    Disclaimer: Just some ideas!


  43. Conclusions
    Answer to First Questions
    Data to Make Decisions
    Open Paths


  44. Some Answers
    Continuous increase of activity and population (up to 11%)
    Outstanding increase in core review contributions
    Most of the women come as new orgs. join the Foundation
    Tooling is useful to get numbers, compare and make decisions


  45. Conclusions
    Room for improvement of the dataset
    This provides some initial numbers about the current status
    Hopefully useful for the WOO working group and the project


  46. Open Paths
    How this may help with the challenges detailed by the WOO:
    - Close to 550 female developers (more than 200 with 100% probability)
    - Talk to them, send an email, let them participate, have meetings, ask for
    mentorships
    - Detection of new women entering the community, say hello!
    https://wiki.openstack.org/wiki/Women_of_OpenStack


  47. Further Work
    Sensitive info: dashboard still private
    Extra analysis: fairness in time to merge, % of women per company,
    Outreachy follow-ups, quarterly reports, updated data,
    ROI of specific policies, and others.
    This [hopefully] helps to have a better picture
    Looking for sponsors!
