Ask The Ecosystem (PyBay 2019)

Data, trends, and lessons from 250+ FOSS Python applications, applied toward a better development process.

Mahmoud Hashemi

August 18, 2019

Transcript

  1. Ask the Ecosystem - Aug 2019 - bit.ly/AskTheEco

  2.
    ? ? ? ? ? ? ? ? ?

  3.
    ASK THE ECOSYSTEM
    Lessons from 250+ FOSS applications
    Mahmoud Hashemi
    PyBay 2019

  4.
    Recognize any of these names?
    ○ Anki
    ○ BleachBit
    ○ Deluge
    ○ FreeCAD
    ○ Home Assistant
    ○ MusicBrainz Picard
    ○ Odoo
    ○ Reddit
    ○ sabnzbd
    ○ youtube-dl

  5.
    They are in fact...
    1. Popular software
    2. Targeted at a non-programmer audience
    3. Written in open-source Python (but not marketed as such!)
    Pretty inspiring, right?

  6.
    LET’S TALK GOALS
    Why do we code?

  7.
    Reasons to code
    Common inspirations:
    1. GitHub called, it needs one more unit testing util library.
    2. Google deprecated another API
    3. Sales said your product does something it doesn’t
    4. Last quarter your manager said you needed to work on setting more aggressive OKRs
    Right?

  8.
    THERE ARE TWO REASONS TO START WANTING TO CODE
    I want to make a video game.
    I want to be free of Excel.

  9.
    Who came to build?
    Wait, how do we do that again?

  10.
    How do I (X)?
    In the Python community we get a lot of questions about:
    1. Testing
    2. Packaging
    3. Architecture
    4. Performance
    5. Documentation
    etc.

  11.
    How you should learn (X)
    So the community responds with:
    1. Stack Overflow answers
    2. Blog posts
    3. Video tutorials
    4. Tweet rants
    5. Conference talks
    etc.

  12.
    THE X-Y PROBLEM
    We’re building it backwards.
    At scale!

  13.
    An Alternative
    1. Figure out what kind of application you’re building.
    2. Find other applications like that.
    3. Read and reuse.
    Case studies to complement your learning.

  14.
    AWESOME
    PYTHON
    APPLICATIONS

  15.
    screenshot

  16.
    1. AWESOME
    How does one list awesome?

  17.
    Awesome™ Lists
    ○ Lists of links contributed by many
    ○ Back in my day, we called them wikis
    ○ Like wikis, they go stale
        ● One maintainer or many
        ● Curation activities get tedious
    ○ Mostly awesome resource hubs

  18.
    2. PYTHON
    The obvious choice.

  19.
    Are you making the most of it?
    You’re part of the biggest, most successful software platform ever.

  20.
    3. APPLICATIONS
    As opposed to?

  21.
    LIBRARIES VS APPLICATIONS
    Libraries
    ○ Developer-facing
    ○ Compiled (imported)
    ○ pip install
    Applications
    ○ User-facing
    ○ Configured
    ○ Just… install.

  22.
    APA: More than just a README
    ○ Descriptions
    ○ Links
    ○ Tags
    ○ Structured YAML
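
As a rough illustration of what structure adds over a plain README, the sketch below models one list entry as a Python dataclass and groups entries by tag. The field names (`name`, `description`, `links`, `tags`), the example entries and URLs, and the helper `group_by_tag` are all hypothetical, chosen to mirror the bullets above; they are not the APA's actual YAML schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class AppEntry:
    """One application in a curated list (hypothetical field names)."""
    name: str
    description: str
    links: dict = field(default_factory=dict)  # e.g. {"repo": "..."}
    tags: list = field(default_factory=list)


def group_by_tag(entries):
    """Return {tag: [entry names]} so the list can be browsed by category."""
    grouped = defaultdict(list)
    for entry in entries:
        for tag in entry.tags:
            grouped[tag].append(entry.name)
    return dict(grouped)


# Illustrative entries only; names, tags, and URLs are made up for the example.
entries = [
    AppEntry("ExampleTorrent", "A torrent client", {"repo": "https://example.com/a"}, ["gui", "linux"]),
    AppEntry("ExampleWiki", "A wiki server", {"repo": "https://example.com/b"}, ["server"]),
    AppEntry("ExampleNotes", "A note-taking app", {"repo": "https://example.com/c"}, ["gui"]),
]
by_tag = group_by_tag(entries)
```

Once entries are data rather than prose, questions such as "which GUI applications are listed?" become a dictionary lookup instead of a manual read-through.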

  23.
    LET’S LOOK AT THE DATA
    Data last pulled: 2019-08-17T05:52:44Z

  24.
    29 GB
    Even across 255 repos, that’s a lot of code!
    (Takes git, hg, and bzr about 5 hours to clone in parallel.)

  25.
    95%
    Of application repositories have commits in 2019.

  26.
    18,969,992
    Lines of code
    48,542
    Committers
    1,978,109
    Commits

  27.
    1. Architecture
    What hath FOSS wrought?

  28.
    Application Architecture

  29.
    Application Architecture Over Time

  30.
    A Graph About Python Ratio
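
A "Python ratio" can be read as the share of a project's lines of code that are Python. The sketch below computes per-language shares from line counts; the function name `language_shares` and the sample counts are hypothetical, and the deck does not say exactly how its ratio was calculated.

```python
def language_shares(sloc_by_language):
    """Return each language's share of total lines of code, as a fraction."""
    total = sum(sloc_by_language.values())
    if total == 0:
        return {}
    return {lang: count / total for lang, count in sloc_by_language.items()}


# Made-up line counts for a hypothetical mixed-language project.
example = {"Python": 6000, "Haskell": 2000, "Shell": 2000}
shares = language_shares(example)
python_ratio = shares.get("Python", 0.0)
```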

  31.
    SPOTLIGHT: ganeti
    ○ 15,964 commits since 2007-07-16
    ○ Cluster management tool focused on long-lived VMs used for workloads without built-in redundancy
    ○ Developed at Google
    ○ Widely deployed, including at Wikimedia
    ○ 60% Python
    ○ 20% Haskell

  32.
    2. Dependencies
    Which shoulders/turtles are we standing on?

  33.
    GUI Frameworks

  34.
    GUI Frameworks Over Time

  35.
    Server Frameworks

  36.
    Server Frameworks Over Time

  37.
    Concurrency

  38.
    Concurrency over Time

  39.
    Python Compatibility

  40.
    Python Compatibility Over Time

  41.
    Python 3 Compatibility

  42.
    3. Maintainability
    Coping with the commitment.

  43.
    92
    Median number of committers
    41%
    Of applications are mostly written by one committer
    43%
    Median top committer contribution
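
Figures like these can be derived from per-committer commit counts. The sketch below is one plausible way to compute them, not necessarily how the talk's numbers were produced: the function names, the sample data, and the reading of "mostly written by one committer" as "more than 50% of commits" are all assumptions.

```python
from statistics import median


def top_committer_share(commit_counts):
    """Fraction of a project's commits made by its most active committer."""
    total = sum(commit_counts)
    return max(commit_counts) / total if total else 0.0


def summarize(projects):
    """Summarize {project: [commits per committer]} with three headline numbers."""
    shares = [top_committer_share(counts) for counts in projects.values()]
    return {
        "median_committers": median(len(counts) for counts in projects.values()),
        "median_top_share": median(shares),
        # Assumption: "mostly written by one committer" means > 50% of commits.
        "mostly_one_committer": sum(s > 0.5 for s in shares) / len(shares),
    }


# Made-up commit counts per committer for three hypothetical projects.
projects = {
    "solo-tool": [90, 5, 5],
    "team-app": [30, 30, 20, 20],
    "big-platform": [10, 10, 10, 10, 10, 50],
}
summary = summarize(projects)
```

Medians are used rather than means so that a handful of very large projects do not dominate the summary.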

  44.
    SPOTLIGHT: edx-platform
    ○ 50,849 commits since 2011-12-07
    ○ Platform for massive open online courses, powering edx.org
    ○ Third-largest Django project on the APA
    ○ 300 committers
    ○ One of only 3 projects where no single developer has >10% of the commit history

  45.
    4. Licensing
    Spoiler alert: No one wrote their own.

  46.
    Licenses

  47.
    Licenses Over Time

  48.
    Hereditary Licenses

  49.
    Hereditary Licenses Over Time

  50.
    SPOTLIGHT: sentry
    ○ 26,399 commits since 2008-05-12
    ○ Web service and frontend for cross-platform application monitoring, with a focus on error reporting.
    ○ https://sentry.io/
    ○ 1,025,941 lines of Python
    ○ The largest FOSS Django project
        ● 1 million lines of Python
        ● (Including 120,000 vendored)
        ● Largest Flask app is Pagure (110k)
    ○ BSD-3 Licensed

  51.

  52.
    5. Packaging
    Two years in the making.

  53.
    63%
    45 out of 71 GUI applications use freezers.

  54.
    Freezers

  55.
    Freezers Over Time

  56.
    Containerization

  57.
    Containerization Over Time

  58.
    SPOTLIGHT: OnionShare
    ○ 2,560 commits since 2014-05-20
    ○ Secure and anonymous file sharing over Tor services.
    ○ Linux, Windows, and Mac
    ○ Built on Qt5
    ○ Ported from py2exe/py2app to PyInstaller

  59.
    SPOTLIGHT: The Median Python App
    ○ 3,500 commits since 2010-08-14
        ● 9 years old
    ○ Something related to communication or collaboration
    ○ 39 drive-by committers with 1 commit
    ○ Python 3.4+
    ○ 65% Python
    ○ Hereditary License (GPL, MPL, etc.)

  60.
    Methods
    I didn’t make this all up, I swear.

  61.
    METHODS: apatite
    ○ 69 commits since 2019-08-05 (13 days ago)
    ○ CLI for managing and analyzing Awesome™ lists
    ○ Plugin support
        ● tokei for SLOC count
        ● go-license-detector for licenses
        ● vermin for minimum Python detection
    ○ Dozens of heuristics and lots of manual tagging
    ○ pandas for graphs (out of band)
    ○ https://github.com/mahmoud/apatite
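
Committer statistics of the kind shown in this deck can be gathered by parsing the output of `git shortlog -s -n`, which prints a commit count and an author name on each line. The sketch below is an assumption about how such a metric could be collected, not a description of apatite's internals; `parse_shortlog` and the sample text are hypothetical.

```python
def parse_shortlog(text):
    """Parse `git shortlog -s -n` style output into (author, commits) pairs."""
    results = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "<count><whitespace><author name>"; names may contain spaces.
        count, author = line.split(None, 1)
        results.append((author.strip(), int(count)))
    return results


# Sample text in the shape git prints: padded count, a tab, then the author.
sample = "   120\tAlice Example\n    30\tBob Example\n     1\tCarol Example\n"
committers = parse_shortlog(sample)
# "Drive-by" committers: authors with exactly one commit.
drive_by = [author for author, commits in committers if commits == 1]
```

To produce real input, run `git shortlog -s -n HEAD` inside a clone and pass its output to `parse_shortlog`. Naming a revision such as `HEAD` matters when the command runs from a script, because without one `git shortlog` reads from standard input.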

  62.
    What next?
    We’re only getting started with our ecosystem.

  63.
    USING THE APA
    Build
    Reference the list for similar applications when building your application.
    2000+ years of maintenance.
    Cite
    Research your talk, blog post, or tweet for examples of patterns you’re trying to highlight.
    Recruit
    Not all developers have an idea for an original application or library, especially when just starting out.

  64.
    BUILDING THE APA
    Fix Bugs
    Apatite is far from complete. The potential for more metrics is limitless, but also:
    ○ CI / Auto-link checking (~660 APA links atm)
    ○ Project archiving
    ○ Static site generation
    Find Applications
    I’ve got a big list of sources for popular applications that needs trawling.
    Share
    Close the loop on FOSS development.

  65.
    Questions?
    THANKS!
    github.com/mahmoud/awesome-python-applications
    github.com/mahmoud/apatite
    twitter.com/mhashemi
    sedimental.org
    yak.party