Database Integration to Improve Accessibility to Public High-throughput Sequencing Data

A Presentation at National Institute of Genetics, Japan Retreat 2014

Tazro Inutano Ohta

July 04, 2014

Transcript

  1. 1: Reliability. Data should be archived correctly, with explicit metadata.

    2: Accessibility. Data should be accessible to anyone, without special tricks.
  2. 1: Reliability needs curation. Data should be archived correctly, with explicit metadata.

    2: Accessibility needs a good interface. Data should be accessible to anyone, without special tricks.
  5. ???

  6. Publications can carry details of the sequencing process; sequence read quality can be a source of data quality.

    Diagram labels: DDBJ Read Archive, PubMed, PMC, Extracted Read Quality
  7. 83% of seq reads satisfied an average quality over 30.

    0.03% of seq reads have over 50% N content.
  8. 1: Reliability from paper/data quality. More description brings more proof.

    2: Accessibility from text search. Search that includes publications brings flexibility.
  10. 1: Beyond Raw Data. The archive is going to handle alignment data.

    2: Analysis Reproducibility. A public repository for analysis pipelines is required.
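
The two read-quality figures on slide 7 (average quality over 30, more than 50% N content) can be illustrated with a minimal Python sketch. The slides do not describe how the numbers were actually computed, so everything here is an assumption: 4-line FASTQ records, Phred+33 quality encoding, an arithmetic mean of per-base scores, and the hypothetical helper names `read_fastq`, `mean_phred`, `n_fraction`, and `summarize`.

```python
import io


def read_fastq(handle):
    """Yield (sequence, quality_string) pairs, assuming 4 lines per record."""
    while True:
        header = handle.readline()
        if not header:
            break  # end of input
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        yield seq, qual


def mean_phred(qual, offset=33):
    """Arithmetic mean of per-base Phred scores (Phred+33 is an assumption)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def n_fraction(seq):
    """Fraction of bases in the read that are the ambiguous base N."""
    return seq.upper().count("N") / len(seq)


def summarize(handle, min_quality=30, max_n=0.5):
    """Count reads above the quality threshold and reads that are mostly N."""
    total = high_quality = mostly_n = 0
    for seq, qual in read_fastq(handle):
        if not seq or not qual:
            continue  # skip empty or malformed records
        total += 1
        if mean_phred(qual) > min_quality:
            high_quality += 1
        if n_fraction(seq) > max_n:
            mostly_n += 1
    return {"reads": total, "high_quality": high_quality, "mostly_n": mostly_n}


# Two made-up reads, for illustration only (not data from the talk).
example = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nNNNNNACG\n+\n########\n"
)
summary = summarize(example)
print(summary)
```

Dividing `high_quality` and `mostly_n` by `reads` gives percentages of the kind quoted on the slide. The thresholds are keyword arguments so that other cut-offs can be tried.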