Development of New Bioinformatics Courses for Biologists

Barry Grant
September 22, 2017

UCSD 2017 Summer Fellowship Progress Report

Transcript

  1. 2017 Summer Fellowship
    Progress Report
    Barry Grant
    http://thegrantlab.org/bggn213

  2. One month of summer support was requested
    for the development of a new bioinformatics
    laboratory course for biologists.

  3. This included the development of:

    1. Specific learning outcomes for BGGN-213 and
    BIMM-143 in collaboration with invested faculty.

    2. Open learning resources, including public websites,
    web-apps and video screencasts supporting these
    outcomes.

  4. This included the development of:

    1. Specific learning outcomes for BGGN-213 and
    BIMM-143 in collaboration with invested faculty.

    2. Open learning resources, including public websites,
    web-apps and video screencasts supporting these
    outcomes.

  5. Side Note: I joined UCSD on July 6th and I
    received a ‘loaner’ office chair and desk from
    BKM on Sept 13th (last Wednesday).

  6. This included the development of:

    1. Specific learning outcomes for BGGN-213 and
    BIMM-143 in collaboration with invested faculty.

    2. Open learning resources, including public websites,
    web-apps and video screencasts supporting these
    outcomes.

  7. Develop a set of 20 specific learning goals for BGGN-213, of
    which 16 will be shared with BIMM-143.

    • The major difference is that we will delve into more
    advanced UNIX and R programming supporting “big-data”
    analysis in BGGN-213.

    • Successfully applied for a computing grant from NSF/
    XSEDE (Extreme Science and Engineering Discovery
    Environment). This provides us with 17,800 service units
    (equivalent to $3,007) of cloud-based supercomputing
    resources for Fall 2017.

    • We plan to use AWS for BIMM-143, utilizing the Amazon/
    UCSD agreed allocation of $50 per student.

  8. http://thegrantlab.org/bggn213/

  9. http://thegrantlab.org/bggn213/

  10. http://thegrantlab.org/ucsd/
    What essential concepts and skills should
    students attain from this course?
    To Update!

  11. At the end of this course students will:
    • Understand the increasing necessity for computation
    in modern life sciences research.
    • Be able to use and evaluate online bioinformatics
    resources and analysis tools to solve problems in the
    biological sciences.
    • Be able to use the UNIX command line and the R
    environment to analyze bioinformatics data at scale.
    • Be familiar with the research objectives of the
    bioinformatics related sub-disciplines of Genome
    informatics, Transcriptomics and Structural
    informatics.

  12. In short, students will develop a solid foundational
    knowledge of bioinformatics and be able to
    evaluate new biomolecular and genomic information
    using existing bioinformatic tools and resources.

  13. Specific Learning Goals….
    What I want students to know by course end!

  14. Course Structure
    http://thegrantlab.org/ucsd/
    Derived from specific learning goals
    To Update!

  15. Course Structure
    http://thegrantlab.org/ucsd/
    Derived from specific learning goals
    To Update!

  16. Class Details
    Goals, Class material, Screencasts & Homework

  17. Homework
    Goals, Class material, Screencasts & Homework

  18. Homework
    Goals, Class material, Screencasts & Homework

  19. Homework
    Goals, Class material, Screencasts & Homework

  20. Homework
    Goals, Class material, Screencasts & Homework
    Homework is due before the next week’s class!

  21. BGGN-213 Learning Goals….
    Advanced UNIX and R based learning goals

  22. BGGN-213 Learning Goals….
    Delve deeper into “real-world” bioinformatics

  23. These support a major learning objective
    At the end of this course students will:
    • Understand the increasing necessity for computation in
    modern life sciences research.
    • Be able to use and evaluate online bioinformatics
    resources and analysis tools to solve problems in the
    biological sciences.
    • Be able to use the UNIX command line and the R
    environment to analyze bioinformatics data at scale.
    • Be familiar with the research objectives of the
    bioinformatics related sub-disciplines of Genome
    informatics, Transcriptomics and Structural informatics.

  24. How do we actually do Bioinformatics?
    Pre-packaged tools and databases
    • Many online
    • Most are free to use
    • Time consuming methods require downloading…
    Advanced tool application & development
    • Mostly on a UNIX environment
    • Knowledge of programming languages frequently required
    (e.g. R, Python, Perl, C, Java, Fortran)
    • May require specialized high performance computing…

  25. How do we actually do Bioinformatics?
    Pre-packaged tools and databases
    • Many online
    • Most are free to use
    • Time consuming methods require downloading…
    Advanced tool application & development
    • Mostly on a UNIX environment
    • Knowledge of programming languages frequently required
    (e.g. R, Python, Perl, C, Java, Fortran)
    • May require specialized high performance computing…
    ?

  26. NSF Extreme Science and Engineering
    Discovery Environment (XSEDE)

  27. XSEDE Proposal for Jetstream Resources
    XSEDE Educational Allocation Resource Justification
    Title: Teaching Bioinformatics to Biologists at UC San Diego

    PI: Dr. Barry J Grant

    University of California, San Diego

    9/21/2017

    Overview:

    XSEDE Jetstream resources are requested to support the teaching of a new
    bioinformatics graduate course for biologists at UC San Diego. This course,
    BGGN-213 ("Foundations of Bioinformatics") provides a hands-on introduction to the
    computer-based analysis of genomic and biomolecular data. Major topics include:
    Genome informatics, Structural informatics, Transcriptomics, UNIX for bioinformatics,
    and Bioinformatics data analysis with R. Full course details are available at:
    <https://bioboot.github.io/bggn213_f17/>

    Critical Need:
    Modern biomedical research is generating ever increasing quantities of complex
    biological data. As the rate of this data generation continues to outpace the rate at
    which biologists are able to analyze these data, there is a critical need for new
    bioinformatics training to help the next generation of biologists drive the collection and
    analysis of this “big data revolution” in the biosciences.

    Why XSEDE?
    The Division of Biological Sciences at UC San Diego has no suitable UNIX compute
    server to use for this course. Limiting students to their own laptops or departmental
    desktop Windows machines will severely limit the scope and utility of this course.
    Access to XSEDE Jetstream resources will enable students to learn and gain
    proficiency in modern bioinformatics workflows and best practices for reproducible
    research on today's large genomic and biomolecular datasets.

    Resources Requested:
    Between 24 and 30 students will require an estimated maximum of 17,800 Service
    Units. Students will use these resources from week 5 of the course onward (10/12/17
    to 12/12/17).

    A maximum of 32 Virtual Machines and associated public IP addresses (for SSH
    access) will be required along with 2TB of storage space total (students will download
    and store several eukaryotic genomes along with several small molecule and protein
    structure datasets).

    BIOGRAPHICAL SKETCH: BARRY J. GRANT
    A. Professional Preparation
    • Queen’s University of Belfast, UK, Biochemistry, B.Sc. (1999)
    • University of York, UK, Bioinformatics, M.Res. (2000)
    • University of York, UK, Chemistry, Ph.D. (2005)
    • University of California, San Diego, Biophysics, Postdoc (2005-2009)
    B. Appointments
    • Assistant Professor, Division of Biological Sciences (2017-present),
    University of California, San Diego, CA.
    • Assistant Professor, Department of Computational Medicine & Bioinformatics (2011-2017),
    University of Michigan, Ann Arbor, MI.
    • Bioinformatics Specialist (Senior Scientist), Howard Hughes Medical Institute (2009-2011),
    University of California, San Diego, CA.
    • Bioinformatics Scientist, deCODE Genetics Inc., Reykjavik, Iceland (2000).
    C. Publications
    Note. Complete bibliography and full-text options available from: http://thegrantlab.org/publications/
    Publications closely related to project
    • Yao XQ, Skjaerven L, Grant BJ. Rapid characterization of allosteric networks with ensemble normal
    mode analysis. J Phys Chem B. 2016. DOI: 10.1021/acs.jpcb.6b01991.
    • Yao XQ, Malik RU, Griggs NW, Skjaerven L, Traynor JR, Sivaramakrishnan S, Grant BJ. Dynamic
    coupling and allosteric networks in the alpha subunit of heterotrimeric G proteins. J Biol Chem.
    2016;291(9):4742-53. PMCID: 4813496.
    • Scarabelli G, Soppina V, Yao XQ, Atherton J, Moores CA, Verhey KJ, Grant BJ. Mapping the
    processivity determinants of the kinesin-3 motor domain. Biophys J. 2015;109(8):1537-40. PMCID:
    4624112.
    • Scarabelli G, Grant BJ. Mapping the structural and dynamical features of kinesin motor domains.
    PLoS Comput Biol. 2013;9(11):e1003329. PMCID: 3820509.
    • Grant BJ, Gheorghe DM, Zheng W, Alonso M, Huber G, Dlugosz M, McCammon JA, Cross RA.
    Electrostatically biased binding of kinesin to microtubules. PLoS Biol. 2011;9(11):e1001207. PMCID:
    3226556.
    Other significant publications
    • Skjaerven L, Jariwala S, Yao XQ, Grant BJ. Online interactive analysis of protein structure
    ensembles with Bio3D-web. Bioinformatics. 2016; (in press).
    • Skjaerven L, Yao XQ, Scarabelli G, Grant BJ. Integrating protein structural dynamics and
    evolutionary analysis with Bio3D. BMC Bioinformatics. 2014;15:399. PMCID: 4279791.
    • Scarabelli G, Grant BJ. Kinesin-5 allosteric inhibitors uncouple the dynamics of nucleotide,
    microtubule, and neck-linker binding sites. Biophys J. 2014;107(9):2204-13. PMCID: 4223232.
    • Yao XQ, Grant BJ. Domain-opening and dynamic coupling in the alpha-subunit of heterotrimeric G
    proteins. Biophys J. 2013;105(2):L08-10. PMCID: 3714883.
    • Grant BJ, Rodrigues AP, ElSawy KM, McCammon JA, Caves LS. Bio3D: An R package for the
    comparative analysis of protein structures. Bioinformatics. 2006;22(21):2695-6.
    D. Selected Synergistic Activities
    • Excellence in Basic Science Teaching Award, Computational Medicine and Bioinformatics,
    University of Michigan (2013).
    Awarded 17,800 SUs (equivalent to $3,007)
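
As a rough sanity check on the figures quoted in this proposal, the sketch below works through the arithmetic implied by the award. It uses only numbers stated on the slide (17,800 SUs, $3,007, 24 to 30 students, up to 32 virtual machines, 10/12/17 to 12/12/17). The even split across students and virtual machines is an illustrative assumption, not something the proposal specifies, and the function and variable names are hypothetical.

```python
from datetime import date

# Figures taken directly from the proposal text.
TOTAL_SUS = 17_800            # awarded Service Units
DOLLAR_VALUE = 3_007          # stated dollar equivalent of the award
MIN_STUDENTS, MAX_STUDENTS = 24, 30
MAX_VMS = 32                  # maximum number of virtual machines requested
START, END = date(2017, 10, 12), date(2017, 12, 12)


def allocation_summary():
    """Return simple per-unit breakdowns of the award.

    Assumption (not from the proposal): SUs are divided evenly across
    students, and across virtual machines and days of the usage window.
    """
    days = (END - START).days
    return {
        "dollars_per_su": DOLLAR_VALUE / TOTAL_SUS,
        "sus_per_student_min_class": TOTAL_SUS / MIN_STUDENTS,
        "sus_per_student_max_class": TOTAL_SUS / MAX_STUDENTS,
        "days_in_window": days,
        "sus_per_vm_per_day": TOTAL_SUS / (MAX_VMS * days),
    }


if __name__ == "__main__":
    for name, value in allocation_summary().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the per-student share depends only on class size, so it is the first number to revisit if enrollment changes.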

  28. What is Jetstream?
    • A new cloud computing environment based at Indiana
    University and the Texas Advanced Computing Center
    (TACC) providing on-demand access to interactive
    computing and data analysis resources.

  29. Jetstream tutorials
    Developed user friendly labs for Jetstream basics

  30. Jetstream tutorials
    Developed user friendly labs for Jetstream basics

  31. Jetstream tutorials
    Developed user friendly labs for Jetstream basics

  32. Basics  File Control         Viewing & Editing Files  Misc. useful  Power commands  Process related
      ls      mv                   less                     chmod         grep            top
      cd      cp                   head                     echo          find            ps
      pwd     mkdir                tail                     wc            sed             kill
      man     rm                   nano                     curl          uniq            Ctrl-c
      ssh     | (pipe)             touch                    source        git             Ctrl-z
              > (write to file)                             cat           R               bg
              < (read from file)                            tmux          python          fg
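
To illustrate what the pipe entry in this command summary does, here is a minimal Python sketch of chaining two small filters, loosely analogous to running `grep ">" seqs.fa | wc -l` to count the records in a FASTA file. The helper names, the file name `seqs.fa`, and the example sequences are hypothetical and are not taken from the course material.

```python
def grep(pattern, lines):
    """Yield only the lines containing `pattern`, similar to `grep pattern`."""
    return (line for line in lines if pattern in line)


def wc_l(lines):
    """Count the lines it receives, similar to `wc -l`."""
    return sum(1 for _ in lines)


# Hypothetical FASTA content: each record starts with a '>' header line.
FASTA = """>seq1 example
ATGCGT
>seq2 example
GGCTAA
>seq3 example
TTAGGC
"""


def count_fasta_records(text):
    """Feed the output of one filter into the next, as a shell pipe would."""
    return wc_l(grep(">", text.splitlines()))


if __name__ == "__main__":
    print(count_fasta_records(FASTA))
```

Each function consumes the output of the previous one lazily, which is the same idea the shell pipe expresses: small single-purpose tools combined into a larger analysis.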

  33. Jetstream tutorials
    R & RStudio running remotely on Jetstream :-)

  34. This included the development of:

    1. Specific learning outcomes for BGGN-213 and
    BIMM-143 in collaboration with invested faculty.

    2. Open learning resources, including public websites,
    video screencasts and web-apps supporting these
    outcomes.

  35. Pre-class Screencast Videos
    Addressing variability in student background knowledge
    • Laptop (with webcam and microphone),

    • Blue screen/Green screen,

    • ScreenFlow software,

    • Lots of patience.

  36. Pre-class Screencast Videos
    Addressing variability in student background knowledge

  37. Partnering with QUBES
    • Currently a group of 7 faculty from around the
    US interested in developing video tutorials for
    computational genomics education.
    • Basically a mentoring network started by Hong
    Qin (Associate Professor of Computer Science
    and Biology, University of Tennessee-
    Chattanooga), who has developed hundreds of
    YouTube educational videos.
    • Thus far we have had three virtual meetings and
    have one NSF grant proposal in embryonic form.

  38. Prototype Web Apps

  39. Summary
    • Consulted with invested faculty to develop a set of 20 specific learning
    goals for BGGN-213, of which 16 will be shared and adapted for
    BIMM-143.

    • Obtained NSF/XSEDE (Extreme Science and Engineering Discovery
    Environment) cloud-based computing grant to support BGGN-213.

    • Published http://thegrantlab.org/bggn213/ with all open online
    bioinformatics teaching materials and joined NIBLSE (Network for
    Integrating Bioinformatics into Life Science Education).

    • Developed an initial set of video screencasts for BGGN-213 and joined
    the QUBES project for developing video tutorials for computational
    genomics education.

    • Developed local web server infrastructure and prototyped interactive
    web apps for teaching.
