Building and Linking Local, Regional, and National Cyberinfrastructure to Advance Science

SciTech
December 02, 2014

Louisiana researchers and universities have been involved in a concentrated, collaborative effort to advance state-wide cyberinfrastructure: computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks. This effort led to a set of interlinked projects that started making a significant difference to the state and created an environment that encouraged increased collaboration. Part of this environment included participation in the US National TeraGrid infrastructure, and US National networks. This talk describes the overall effort, the new projects and environment, the results, and the lessons learned.

Transcript

  1. www.ci.anl.gov   www.ci.uchicago.edu
     Building and Linking Local, Regional, and National Cyberinfrastructure to Advance Science
     Daniel S. Katz [email protected]
     Senior Fellow, Computation Institute, University of Chicago & Argonne National Laboratory
     Affiliate Faculty, Center for Computation & Technology, Louisiana State University
     Adjunct Associate Professor, Electrical and Computer Engineering, LSU
  2. Advancing Science through CI – [email protected]
     Louisiana
     • Area: 134 382 km² (33/51)
     • Population: 4 533 000 (2010, 25/51)
     • GDP: $208 billion (2009, 24/51)
     • GDP/person: $45 700 (2009, 21/51)
     • In Poverty: 17% (2009, 44/51)
     • High School Degree: 82% (2009, 46/51)
     • BS Degree: 21% (2009, 47/51)
     • Advanced Degree: 7% (2009, 48/51)
     State Goals: talented workforce, great competitiveness, strong educational system, increased economic development
  3. PITAC Report Summary:
     • “Computational science -- the use of advanced computing capabilities to understand and solve complex problems -- is critical to scientific leadership, economic competitiveness, and national security. It is one of the most important technical fields of the 21st century because it is essential to advances throughout society.”
     • “Universities must significantly change organizational structures: multidisciplinary & collaborative research are needed [for US] to remain competitive in global science”
     Complex problems: Innovations will occur at boundaries
  4. Challenges of Complex Problems
     “Third Pillar” of Comp. Science
     • Applications
     • Communities
       – No single group, university, or state can do these problems
       – Must integrate CS, Math, Bio, Sensors, Engineering, more...
     • Data everywhere
       – Supercomputers generate petabytes
  5. Cyberinfrastructure
     • Term first documented in 1998
     • Cyberinfrastructure
       – computing systems
       – data storage systems
       – advanced instruments and data repositories
       – visualization environments
       – people
       – linked together by software and high performance networks
       – to improve research productivity and enable breakthroughs not otherwise possible.
         o Indiana University Cyberinfrastructure Newsletter, Craig Stewart
     • Used for e-Science, e-Research
  6. Gravitational Wave Astronomy
     • LIGO: Laser Interferometer Gravitational-Wave Observatory
     • Ties together theory, computation, and experiment
       – Each drives the other two!
  7. How We Started
     • State commitment: $25M/year for Vision 20/20
       – $9M: LSU -> CCT (similarly, ULL -> LITE)
     • University commitment to build new programs for 21st century
     • State and University willingness to make extraordinary investments
     • Opportunity to build new world class program in interdisciplinary research and education, involving all of LSU
     • Ed Seidel-led vision to instigate state-wide collaboration
  8. Advancing Research
     • Potentially requires advances in three areas, depending on existing strengths
  9. [Organization chart; recoverable labels: CCT Director Office, Edward Seidel, HPC Partnership, McMahon, Cyberinfrastructure Development]
  10. Cyberinfrastructure Development
      • Vision: combine research and infrastructure
        – Research
          o Computer science
          o Applications
          o Tools
        – Infrastructure
          o Hardware
          o Operations
          o Policies
      • Both together have squared growth of either alone
      • CyD staff: PhDs in CS and apps who understand the whole picture and want to grow the ecosystem
  11. Computing in Louisiana
      [Diagram; recoverable labels:]
      • National Lambda Rail; TeraGrid, OSG
      • Sites: UNO, Tulane, UL-L, SUBR, LSU, LA Tech
      • LONI: 40 Gbps network
      • LONI: ~100TF IBM, Dell Supercomputers
      • Cybertools: Tools and Services
      • LONI Institute: People and Collaborations
  12. LONI - Networking & Computing
      [Network map; recoverable labels:]
      • Sites: LSU, La Tech, LSU HSC, ULL, Tulane, SU, UNO, LSU HSC, ULM, McNeese, NSU, SLU, Alex
      • Legend: LONI node; Multiple 10GE; ~500 core Dell cluster & 112 proc. IBM P5 cluster; ~4500 core Dell Cluster
      • Network: partners and customers
  13. LONI Computing Resources (2010)
      • One central Dell cluster (Queen Bee)
        – 5500 IB-connected cores at ISB in Baton Rouge
        – Archival storage contracted through NCSA
        – 50% of allocations dedicated to TeraGrid from 2008
      • Six distributed 512-core Dell clusters
      • Five distributed 14-node (112 procs) IBM P5-575 clusters
      • Distributed PetaShare storage
        – 32 TB disk @ each small Dell cluster
        – 8 TB disk on LSU & LaTech small Dell clusters – for LBRN
        – 8 TB at SC-S & HSC-NO – for LBRN
        – 250 TB tape
      • All run by HPC@LSU, including user support/training
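      The cluster counts listed for 2010 can be tallied with a few lines of Python. This is only an illustrative sketch: the names `ClusterGroup`, `LONI_2010`, and `total_units` are hypothetical, and adding Dell "cores" to IBM P5 "procs" as if they were the same unit is an assumption, not something the slide states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterGroup:
    """A group of identical clusters, as listed on the slide."""
    name: str
    count: int        # number of clusters in the group
    units_each: int   # cores (Dell) or procs (IBM P5) per cluster

    @property
    def total_units(self) -> int:
        return self.count * self.units_each


# Figures copied from the 2010 resource list; nothing else is assumed.
LONI_2010 = [
    ClusterGroup("Queen Bee (central Dell)", 1, 5500),
    ClusterGroup("Distributed Dell", 6, 512),
    ClusterGroup("Distributed IBM P5-575", 5, 112),
]


def total_units(groups):
    """Sum cores/procs over all cluster groups."""
    return sum(g.total_units for g in groups)


if __name__ == "__main__":
    for g in LONI_2010:
        print(f"{g.name}: {g.count} x {g.units_each} = {g.total_units}")
    print("Total cores/procs:", total_units(LONI_2010))
```

      Under these assumptions the tally comes to 5500 + 3072 + 560 = 9132 cores/procs across twelve clusters.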
  14. $12M NSF CyberTools Project: Enabler and Driver
  15. Cactus
      • Component-based HPC framework
        – Freely-available environment for collaborative application development
      • Cutting edge CS
        – Grid computing, petascale, accelerators, steering, remote viz
      • Active user & developer communities
        – 10 year pedigree, >$10M support
        – Numerical Relativity, CFD, Coastal, Reservoir Engineering, …
      • Domain-specific toolkits, e.g. CFD toolkit
        – FD/FV/FE numerical methods
        – Structured, multi-block, unstructured
        – Uses PETSc, Trilinos, MUMPS, HYPRE
        – Used to build Black Oil Toolkit
  16. PetaShare
      • Main concept: data is managed (migrated, moved, replicated, cached, etc.) automatically
      • Data-aware storage systems, data-aware schedulers, cross-domain metadata scheme
      • Provides: 250 TB disk, 400 TB tape storage (and access to national storage facilities)
      • Applications: coastal & environmental modeling, geospatial analysis, bioinformatics, medical imaging, fluid dynamics, petroleum engineering, numerical relativity, high energy physics.
      Credit: Tevfik Kosar
  17. LONI Institute
      “CCT for the Louisiana”
      • $15M 5-year project
        – $7M BoR, $8M from LaTech, LSU, SUBR, Tulane, UNO, ULL
      • Catalyzes new inter-institutional collaborations, ambitious projects and top level hires:
        – LONI network and computing
        – NSF projects: PetaShare, VizTangibles, TeraGrid, Blue Waters
        – EPSCoR: NSF CyberTools, DOE UCoMS, DoD
        – NIH: $17M LBRN
        – Promote collaborative research at interfaces for innovation
  18. LONI Institute Vision
      • LONI investments create world leading infrastructure
      • Create bold new inter-university superstructure
        – New faculty, staff, students; train others. Focus on CS, Bio, Materials, but all disciplines impacted
        – Promote research at interfaces for innovation
      • Draw on, enhance strengths of all universities
        – Strong groups recently created; collectively world-class
        – Solve complex problems through collaboration & computation
        – Much stronger recruiting opportunities for all institutions
        – Statewide interdisciplinary education & research program
      • Create University-Industry Research Centers (UIRCs)
        – Research Triangle, NCSA/UIUC, Bay Area, others
      • Transform Louisiana
        – Such committed cooperation between sites extraordinary
  19. LONI Institute Hiring and Projects
      • Two new faculty at each institution (12 total)
        – Six in CS, six in Comp. Bio/Materials
      • Six Computational Scientists
        – Following Bavarian KONWIHR project
        – Support 70-90 projects over five years; lead to external funding
      • Graduate students
        – 36 new students funded, trained; two years each
      • One Coordinator/economic development
      • All hiring coordinated across state
      • Leading faculty across state create multi-institutional seed projects
      • Building on seeds, dozens of new projects selected, started
      • Exploit common themes, computing environments, tools found in all areas
  20. TeraGrid (XSEDE)
      • TeraGrid: world’s largest open scientific discovery infrastructure
      • Leadership class resources at eleven partner sites combined to create an integrated, persistent computational resource
        – High-performance networks
        – High-performance computers (>1 Pflops (~100,000 cores) -> 1.75 Pflops)
          o And a Condor pool (w/ ~13,000 CPUs)
        – Visualization systems
        – Data Collections (>30 PB, >100 discipline-specific databases)
        – Science Gateways
        – User portal
        – User services - Help desk, training, advanced app support
      • Allocated to US researchers and their collaborators through national peer-review process
        – Generally, review of computing, not science
      • Mid 2011: TeraGrid --> XSEDE
  21. Campus Champions
      • “Champion” is a staff or faculty member on a campus that provides information on XSEDE to his/her colleagues
      • Currently 114 institutions represented by champions
      • Receive training and support from XSEDE staff
      Credit: Scott Lathrop (9/2011)
  22. LONI and National Cyberinfrastructure
      • TeraGrid
        – One of the 11 TeraGrid Resource Providers
        – Playing a role in TG-wide governance (TeraGrid Forum, Executive Steering Committee, various working groups, GIG Director of Science)
        – Contributed administrative software AmieGold (glue between TG account info and local info) and CS software (HARC, PetaShare, SAGA)
      • OSG
        – Currently providing resources
      • XSEDE
        – LONI not a partner in XSEDE, but a service provider
      • Nationally
        – Bringing in new users from the southeast US
        – LONI Institute Computational Scientists -> Campus Champions
  23. Recap (to 2010)
      • Louisiana decides that science and technology can lead to a better future
      • Builds a regional cyberinfrastructure (network, computing, software, ~data, people) that connects to national-scale infrastructure
      • Starts to change culture: infuse computation in academic departments, interdisciplinary hiring, large collaborative projects
  24. Lessons
      • Three triangle facets (infrastructure, computational, interdisciplinary) have to be taken seriously at highest levels, seen as important component of academic research
      • Infrastructure needs to be integrated at all levels (laboratory, campus, regional, national, international): users need to be able to easily move work and data to appropriate systems, and collaborate across locations
      • Education and training of students and faculty is crucial: vast improvements are needed over the small numbers currently reached through HPC center tutorials; computation and computational thinking need to be part of new curricula across all disciplines
      • Emphasis should be placed on broadening participation in computation, not just focusing on high end systems where decreasing numbers of researchers can join in, but making tools much more easily usable and intuitive, freeing all researchers from the limitations of their personal workstations, and providing access to simple tools for large scale parameter studies, data archiving, visualization and collaboration
      • Vision needs to be consistent; it cannot rest on just one person
      • Funding needs to be stable (activities need to be sustainable)
  25. Sources
      • D. S. Katz et al., “Louisiana: A Model for Advancing Regional e-Science through Cyberinfrastructure,” Philosophical Transactions of the Royal Society A, 367(1897), 2009.
        – authors from Louisiana State University, Tulane University, University of Louisiana at Lafayette, Louisiana Tech University, Louisiana Community and Technical College System, Southern University, University of New Orleans
      • G. Allen and D. S. Katz, “Computational science, infrastructure and interdisciplinary research on university campuses: experiences and lessons from the Center for Computation and Technology,” NSF Workshop on Sustainable Funding and Business Models for Academic Cyberinfrastructure Facilities, Cornell University, 2010
      • In addition, this work impacted later thinking in: Daniel S. Katz, David Proctor, “A Framework for Discussing e-Research Infrastructure Sustainability,” http://dx.doi.org/10.6084/m9.figshare.790767, submitted to Workshop on Sustainable Software for Science: Practice and Experiences (http://wssspe.researchcomputing.org.uk) at SC13