How Open Source Software Supports the Largest Computers on the Planet

Ian Lee
July 18, 2018

This talk provides an overview of the work going on at Lawrence Livermore National Laboratory to revamp our open source project offerings, release processes, and engagements across the Department of Energy and the US Government through efforts such as DOECode and Code.gov. We will also discuss ongoing work to make it easier for our staff to engage with open source communities, both by creating new projects and by contributing to existing open source projects.

Transcript

  1. LLNL-PRES-754800
    This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory
    under contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC
    How Open Source Supports
    the Largest Computers on the Planet
    Best Practices for HPC Software Developers
    Ian Lee
    Lawrence Livermore National Laboratory
    July 18, 2018

  4. https://upload.wikimedia.org/wikipedia/commons/a/a8/U.S._National_labs_map.jpg

  5. http://www.ex-astris-scientia.org/articles/new_enterprise/enterprise-warpcore.jpg

  6. https://pixabay.com/get/e833b10d2af4083ed1534705fb0938c9bd22ffd41db612439df7c17ba0/silos-1602209_1920.jpg

  7. 1960s 1970s 1980s 1990s 2000s 2010s
    Pioneering simulations of particle tracking
    CDC 3600
    CDC 7600
    Ozone mixing models
    CRAY 1
    ASCI Blue-Pacific
    Helping the medical community plan radiation treatment
    Unprecedented dislocation dynamics simulations
    BlueGene
    Breakthrough visualizations of mixing fluids
    Dynamics in three dimensions
    Global climate modeling
    Detailed predictions of ecosystems
    Petascale and exascale computing

  8. Top500.org
    § 3 of the 16 #1 systems over the last 20 years
    ASCI White: Nov 2000 – Nov 2001
    BlueGene/L: Nov 2004 – Nov 2007
    Sequoia: June 2012
    https://www.top500.org/resources/top-systems/

  9. Sierra

  10. ZFS on Linux
    § ZFS is an open source filesystem and volume manager designed to address the limitations of existing storage solutions
    § 2011: Available for Linux
    § Ten LLNL filesystems, totaling ~100 PB
    § Ships in Ubuntu 16.04
    http://zfsonlinux.org

  19. https://software.llnl.gov

  20. LLNL Open Source Presence
    https://software.llnl.gov/explore

  21. LLNL Open Source Engagement
    https://software.llnl.gov/explore

  22. LLNL Open Source Activities
    https://software.llnl.gov/explore

  24. Science & Technology Review
    “Our large collection of software is a precious Laboratory asset, one that benefits both Lawrence Livermore, and in many cases, the public at large.”
    - Bruce Hendrickson
    Associate Director, Computation
    https://str.llnl.gov/2018-01/comjan18

  25. https://www.exascaleproject.org/more-on-the-software-that-underpins-the-exascale-computing-project/

  26. Federal Source Code Policy
    § “Federal Source Code Policy: Achieving Efficiency, Transparency, and Innovation through Reusable and Open Source Software”
    — “Agencies shall make custom-developed code available for Government-wide reuse and
    make their code inventories discoverable at https://www.code.gov (“Code.gov”) […]”
    — “[…] establishes a pilot program that requires agencies, when commissioning new custom
    software, to release at least 20 percent of new custom-developed code as Open Source
    Software (OSS) […]”
    https://code.gov & https://sourcecode.cio.gov

  27. https://code.gov

  28. https://osti.gov/doecode

  29. https://government.github.com

  30. US Government Organizations on GitHub
    https://government.github.com/community/

  31. Thank You!
    [email protected]
    @IanLee1521 // @LLNL_OpenSource
    https://speakerdeck.com/IanLee1521

  32. This document was prepared as an account of work sponsored by an agency of the United States
    government. Neither the United States government nor Lawrence Livermore National Security, LLC,
    nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability
    or responsibility for the accuracy, completeness, or usefulness of any information, apparatus,
    product, or process disclosed, or represents that its use would not infringe privately owned rights.
    Reference herein to any specific commercial product, process, or service by trade name, trademark,
    manufacturer, or otherwise does not necessarily constitute or imply its endorsement,
    recommendation, or favoring by the United States government or Lawrence Livermore National
    Security, LLC. The views and opinions of authors expressed herein do not necessarily state or
    reflect those of the United States government or Lawrence Livermore National Security, LLC, and
    shall not be used for advertising or product endorsement purposes.

  33. TOSS – Tri-Lab Operating System Software
    § Built on Red Hat Enterprise Linux
    — Not an HPC distribution
    § Adds LLNL-developed additions and patches to support HPC
    — Low Latency Interconnect: Infiniband
    — Parallel File System: Lustre
    — Resource Manager: SLURM
    § Work closely with open communities
    Components not in TOSS
    Supported Linux Commodity Hardware Platform
    Kernel, Infiniband, Message Passing Interface
    Batch Scheduler (MOAB)
    User Environment
    Lustre File Systems
    Compiler & Development Tools
    Resource Manager (SLURM)
    TOSS Components
    HPSS Hopper
    LLNL-PRES-550311
    TOSS is a software stack for HPC – large, interconnected clusters!

  34. SLURM
    § Began as simple resource manager
    — Now scalable to 1.6M+ cores (Sequoia)
    § Launch and manage parallel jobs
    — Large, parallel jobs, often MPI
    § Queuing and scheduling of jobs
    — Much more work than resources
    http://slurm.schedmd.com
    http://www.ibm.com/developerworks/library/l-slurm-utility/figure3.gif
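
The queuing point on this slide ("much more work than resources") can be illustrated with a toy model. What follows is a minimal Python sketch, assuming a strict first-in, first-out policy on a fixed pool of nodes. The function name `fifo_schedule` and the job tuples are hypothetical, and nothing here reflects how SLURM itself is implemented.

```python
from collections import deque


def fifo_schedule(jobs, total_nodes):
    """Toy strict-FIFO scheduler (hypothetical illustration, not SLURM's algorithm).

    jobs: list of (name, nodes_needed, runtime) tuples in submission order.
    Returns a dict mapping each job name to its start time.
    """
    for name, nodes, _ in jobs:
        if nodes > total_nodes:
            raise ValueError(f"job {name!r} needs more nodes than exist")

    queue = deque(jobs)
    running = []          # (end_time, nodes) for jobs currently holding nodes
    free = total_nodes
    clock = 0
    starts = {}

    while queue:
        name, nodes, runtime = queue[0]
        if nodes <= free:
            # The job at the head of the queue fits: start it now.
            queue.popleft()
            starts[name] = clock
            free -= nodes
            running.append((clock + runtime, nodes))
        else:
            # Not enough free nodes: advance time to the next completion
            # and reclaim the nodes that job was holding.
            running.sort()
            end_time, freed = running.pop(0)
            clock = max(clock, end_time)
            free += freed
    return starts


jobs = [("a", 4, 10), ("b", 4, 5), ("c", 2, 3)]
print(fifo_schedule(jobs, total_nodes=6))  # {'a': 0, 'b': 10, 'c': 10}
```

With 6 nodes, job c waits behind job b even though 2 nodes sit idle while job a runs. Production schedulers typically add priorities and backfill to reduce that kind of waste.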

  35. Flux
    § Family of projects used to build site-customized resource management systems
    § flux-core
    — Implements the communication layer and lowest level services and interfaces
    § flux-sched
    — Consists of an engine that handles all the functionality common to scheduling
    § capacitor
    — A bulk execution manager built on flux-core that runs and monitors thousands of jobs
    http://flux-framework.github.io

  36. SPACK
    § Handles combinatorial explosion of ABI-incompatible packages
    § All versions coexist, binaries work regardless of user’s environment
    § Familiar syntax, reminiscent of brew, yum, etc.
    $ spack install mpileaks  # unconstrained
    $ spack install [email protected]  # @ custom version
    $ spack install [email protected] %[email protected]  # % custom compiler
    $ spack install [email protected] %[email protected] +threads  # +/- build option
    $ spack install [email protected] os=SuSE11  # os=
    $ spack install [email protected] os=CNL10  # os=
    $ spack install [email protected] os=CNL10 target=haswell  # target=
    https://spack.io
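
The annotations on this slide (`@` for a version, `%` for a compiler, `+`/`-` for build options, and `key=value` pairs such as `os=` and `target=`) can be read as a small grammar. The Python sketch below splits a spec string along those sigils. It is an illustration only: `split_spec` is a hypothetical helper, not Spack's parser, and the version numbers in the example are made up.

```python
def split_spec(spec):
    """Split a spec string using the sigils annotated on the slide.

    Hypothetical helper for illustration; Spack's real spec grammar is richer.
    """
    parts = {
        "name": None,
        "version": None,
        "compiler": None,
        "variants": [],
        "settings": {},
    }
    for token in spec.split():
        if token.startswith("%"):
            # % introduces the compiler (optionally with its own @version)
            parts["compiler"] = token[1:]
        elif token[0] in "+-":
            # + / - toggle a build option
            parts["variants"].append(token)
        elif "=" in token:
            # key=value settings such as os= and target=
            key, value = token.split("=", 1)
            parts["settings"][key] = value
        else:
            # package name, optionally followed by @version
            name, _, version = token.partition("@")
            parts["name"] = name
            parts["version"] = version or None
    return parts


# Version numbers below are invented for the example.
print(split_spec("mpileaks@1.0 %gcc@9.1 +threads os=CNL10 target=haswell"))
```

Each token is classified by its leading sigil, so the order of constraints after the package name does not matter in this sketch.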

  37. ESGF
    § Manages the first-ever decentralized database for handling climate science data
    § Multiple petabytes of data at dozens of federated sites worldwide
    § International collaboration for the software that powers most global climate change research
    https://github.com/ESGF
    https://esgf.llnl.gov

  38. VisIt
    § Originally developed to visualize and analyze the results of terascale simulations
    § Interactive, scalable visualization, animation, and analysis tool
    § Powerful, easy-to-use GUI
    § Distributed and parallel architecture allows handling extremely large data sets interactively
    https://visit.llnl.gov

  39. https://computation.llnl.gov/casc

  40. https://code.gov/#/explore-code/agencies/DOE

  41. Public US Government GitHub Data Scrape
    § 252 US Government Orgs
    — U.S. Federal (137)
    — U.S. Military and Intelligence (12)
    — U.S. Research Labs (103)
    § 8716 Open Source Repositories
    https://github.com/LLNL/scraper/pull/3
    LLNL: 5%
    Other US Government: 95%
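
The figures on this slide can be cross-checked with a few lines of arithmetic. This Python sketch sums the three organization categories and estimates LLNL's repository count, assuming the 5% on the chart is a share of the 8716 repositories. The slide does not say what the chart measures, and the percentage is rounded, so the result is only approximate.

```python
# Organization counts as listed on the slide.
org_counts = {
    "U.S. Federal": 137,
    "U.S. Military and Intelligence": 12,
    "U.S. Research Labs": 103,
}
total_orgs = sum(org_counts.values())

total_repos = 8716
llnl_share = 0.05  # rounded chart value; assumed to be a share of repositories

# Rough estimate only, since the 5% figure is rounded.
approx_llnl_repos = round(total_repos * llnl_share)

print(total_orgs)         # 252
print(approx_llnl_repos)  # 436
```

The category counts add up to the 252 organizations stated on the slide.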
