Identifiers and Open Science / JATS-Con-Asia-20151019-01-hideaki-takeda

JATS-Con Asia
Monday, October 19, 2015
http://jats-con-asia.strikingly.com/

Keynote Speaker 1 "Identifiers and Open Science"
-Hideaki Takeda, National Institute of Informatics (Board of Directors, ORCID)
Abstract: http://jats-con-asia.strikingly.com/#speakers
Materials: https://speakerdeck.com/jatsconasiasc/jats-con-asia-20151019-01-hideaki-takeda
Video: https://vimeo.com/150207019

Transcript

  1. So Science is becoming Open Science • Open science can

    be discussed from philosophical, political, methodological, or any other point of view. • “Open Science NOW” is driven and realized by the Internet as Architecture • So data sharing is the core of Open Science
  2. Researcher in Future: Research = Data Supply-chain

    [Figure: data use, data publishing, and integration of papers & data, drawn over a background of binary digits]
  3. Data Life Cycle • Data is created, shared, published, and

    archived • But, just “published” is not enough, it should be “openly published” (open data) [Figure: data life cycle (Create, Share, Publish, Archive); research phases (In Progress, Results)]
  4. Open Data • “A piece of data or content is

    open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.” http://opendefinition.org/ • Open data is data publication with some open license – An open license ensures the above condition
  5. Data Life Cycle • Different tools for different stages of

    life cycle – Data sharing: generating, federating, … – Data publishing: searching, harvesting, … – Data archiving: migration, … • The architecture CAN be shared [Figure: data life cycle (Create, Share, Publish, Preserve); research phases (In Progress, Results); stakeholders (Research Institute, Researcher/R. Group)]
  6. Four reasons for openness of research data • Demands from

    Society – Knowledge sharing among society – Accountability of public money • Demands in Science – Future development of Science itself • “Standing on the shoulders of giants” (nanos gigantum humeris insidentes) – Reproducibility
  7. Dimensions of Science

    [Figure: two axes, Local vs. Global and Authorized vs. Open, positioning Government, University, Publisher, and Citizen Science/Open Science] • Various stakeholders take different positions on Open Science
  8. Architecture of data sharing

    [Figure: layers Repository, Identifier, Data Format, Metadata, Metadata Schema] • Systematic integration across the layers • Interoperability on each layer
  9. Architecture of data sharing

    [Figure: layers Repository, Identifier, Data Format, Metadata, Metadata Schema] • Metadata: description language, collection and sharing, conversion • Schema: description language, collection and sharing, conversion • System: development, community • Management: organization, systems, ID federation • Examples shown: DOI, ORCID, FundRef, DataCite, CrossRef, JaLC, Dublin Core, DCAT, CKAN, Linked Data, DSpace, Fedora, WEKO • Aspects: Organization, Schema, System, Technology • Coordination and Competition
  10. Research Activities and Related Entities

    [Figure: research activities (Survey, Article Writing, Acquiring Data, Publishing Data) connected to related entities: Data, Digital Articles, Digital objects, Topics, Funding agencies, Projects (supported), Research Institutions (affiliated), Academic Societies]
  11. Research Activities and Related Entities

    [Figure: the same diagram (Survey, Article Writing, Acquiring Data, Publishing Data; Data, Digital Articles, Digital objects, Topics, Funding agencies, Projects, Research Institutions, Academic Societies), with eleven markers attached to its elements]
  12. Research Activities and Related Entities

    [Figure: the same diagram rearranged (Survey, Article Writing, Acquiring Data, Publishing Data; Funding agencies, Projects, Data, Digital Articles, Research Institutions, Academic Societies, Topics), with eleven markers attached to its elements]
  13. Identifiers for research • A research activity is represented with

    a structure of identifiers – Planned and submitted – Organized and executed – Concluded and evaluated [Figure: a structure of IDs]
  14. Identifiers for research

    [Figure: a structure of IDs] • ID for – Article – Data – Researcher – Institutions, affiliation – Funding agency, funded project – Academic society – Topic – …
  15. Nature of IDs for research

    [Figure: IDs positioned on two axes, Local vs. Global and Authorized vs. Open: DOI, ORCID, Institution Member ID, URI, ResearchGate/Academia.edu/…, Grant ID, Kaken Grant ID, Kaken Researcher ID, PubMed ID, ResearchMap, Facebook]
  16. Nature of IDs for Science • Balance in some features

    – Global vs. Local • Global: Unified service • Local: Specialized service – Authorized vs. Open • Authorized: Trusted, restricted • Open: no restrictions – Charged vs. Free • Multiple IDs can co-exist in a single category • How to manage multiple IDs – Integration/mapping/associating/discovering – Control/Manage/Authorize – Private/Share/Open
  17. DOI

  18. DOI in Architecture of Data Sharing

    [Figure: layers Repository, Identifier, Data Format, Metadata, Metadata Schema, labeled with DOI, DataCite Metadata Schema, JaLC Metadata Schema, JaLC and DataCite metadata, members (data providers), domain-specific metadata schemata, and management]
  19. DOI (Digital Object Identifier) • Service to translate DOI names

    to URIs containing digital objects • Service managed by International DOI Foundation (IDF) • Initially started by STM publishers to share identifiers for digital publications • Distributed management – Delegation of registration tasks to Registration Agencies (RAs)
  20. DOI (Digital Object Identifier) • Service to translate DOI names

    to URIs containing digital objects • DOI: doi: 10.1007/978-3-642-21616-9_30 → URL: http://www.springerlink.com/content/xkj2386758245u85/ • DOI as URL: http://doi.org/10.1007/978-3-642-21616-9_30 → URL: http://www.springerlink.com/content/xkj2386758245u85/
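
The step from a DOI name to its URL form on this slide can be sketched in Python. This is a minimal sketch, not the DOI system's own implementation: the function name `doi_to_url` is hypothetical, the `http://doi.org/` proxy prefix is the one shown on the slide, and characters that would need URL encoding are not handled.

```python
def doi_to_url(doi_name: str, proxy: str = "http://doi.org/") -> str:
    """Turn a DOI name such as 'doi: 10.1007/...' into its URL form."""
    name = doi_name.strip()
    # Drop an optional 'doi:' label, as written on the slide.
    if name.lower().startswith("doi:"):
        name = name[4:].strip()
    # A DOI name is '<prefix>/<suffix>', and DOI prefixes begin with '10.'.
    prefix, sep, suffix = name.partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI name: {doi_name!r}")
    return proxy + name
```

Calling `doi_to_url("doi: 10.1007/978-3-642-21616-9_30")` gives the "DOI as URL" form from the slide; following that URL to the publisher's page is left to the resolver.
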
  21. Management Structure of DOI • Three layers: International DOI Foundation

    (IDF), Registration Agencies (RAs), members • RAs contribute to IDF by registering to the Registry DBs, managing the Registry DBs, and paying membership fees • RAs offer DOI registration services to their members • Members can register DOIs for their digital objects through RAs [Figure: IDF at the top; RAs below it (CrossRef, DataCite, JaLC); members below the RAs (publishers under CrossRef; university, library, research institute under DataCite; publisher, university, academic society under JaLC)]
  22. Roles of DOI • Provide resolvable, persistent, interoperable links –

    Resolvable: standard syntax + mapping by the Handle System – Persistent • Technically: management of registry DBs • Socially: organizational operations and duties for members – Interoperable: sharing a data model
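
The "resolvable" and "persistent" roles above can be illustrated with a toy registry. This is only a sketch under stated assumptions: `ToyRegistry` is a hypothetical in-memory stand-in for a registry DB, not the Handle System, and the second URL is invented for illustration.

```python
class ToyRegistry:
    """In-memory stand-in for a registry DB that maps DOI names to URLs."""

    def __init__(self):
        self._locations = {}

    def register(self, doi, url):
        # Registering the same DOI again updates where it points.
        self._locations[doi] = url

    def resolve(self, doi):
        # Raises KeyError for a DOI that was never registered.
        return self._locations[doi]


registry = ToyRegistry()
doi = "10.1007/978-3-642-21616-9_30"
registry.register(doi, "http://www.springerlink.com/content/xkj2386758245u85/")
# The content moves: only the registry entry changes, not the DOI itself.
registry.register(doi, "http://publisher.example/new-location")  # hypothetical URL
```

A citation that carries the DOI keeps working after the move, because `registry.resolve(doi)` now returns the new location. Keeping that entry up to date is the organizational duty the slide calls social persistence.
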
  23. Registration Agencies (RAs) • Airiti, Inc. • CrossRef • China National Knowledge Infrastructure

    (CNKI) • DataCite • EIDR (Entertainment Identifier Registry) • ISTIC (The Institute of Scientific and Technical Information of China) • JaLC (Japan Link Center) • mEDRA (Multilingual European DOI Registration Agency) • OP (Publications Office of the European Union)
  24. CrossRef • Ensure accessibility and citation of articles and books

    in STM publications • Started in 1999 • Largest and oldest RA of IDF – Most DOIs are registered via CrossRef – Members in over 70 countries, most of them publishers • Functions – DOI Registration – Metadata Management • Bibliographic metadata • Citation – Services with metadata • Search for bibliographic metadata and citation • Reverse look up
  25. Japan Link Center (JaLC) • Founded in March 2012 •

    Aimed to register DOIs for academic content produced in Japan or in Japanese, to circulate information in Japan and overseas. • Controlled by four national organizations: – Japan Science and Technology Agency (JST) – National Institute for Materials Science (NIMS) – National Institute of Informatics (NII) – National Diet Library (NDL) • Operated by JST • Membership system (academic societies, publishers, university libraries, etc.) • External coordination: JaLC is a member of CrossRef and DataCite (Mar. 2014) • Over 1,300,000 DOIs registered
  26. Content categories (category: content type, start of registration)

    • Journal articles: Journal articles (Dec. 2012 -), University bulletins (Sep. 2014 -), Conference proceedings (Mar. 2012 -) • Books: Books (Jan. 2015 -), Doctoral theses (Mar. 2014 -) • Reports: Technical reports (Jan. 2015 -), Governmental reports (Jan. 2015 -) • Research data (Jan. 2015 -) • e-learning resources (Jan. 2015 -)
  27. Data DOI Registration Flow

    [Figure: JaLC members submit DOI + article metadata or DOI + data metadata to JaLC, which holds JaLC, CrossRef, and DataCite metadata; DOI + CrossRef metadata goes to CrossRef and DOI + DataCite metadata goes to DataCite; DOIs are registered with IDF]
  28. Experiment Project to register DOIs for Research Data • Goal

    − Establish operation flows to register DOIs for research data and achieve stable operation • Objectives − Set policies for registering DOIs for research data − Establish operation flows to register DOIs for research data with the next version of the JaLC system, and verify them by performing registration tests − October 2014 – October 2015
  29. Members of the project • National Bioscience Database Center (NBDC),

    Japan Science and Technology Agency (JST) • National Institute of Polar Research (NIPR) • National Institute of Informatics (NII) • DIAS-P Project (National Institute of Informatics (NII)) – Japan Agency for Marine-Earth Science and Technology (JAMSTEC) – University of Tokyo – Kyoto University – National Institute for Environmental Studies (NIES) • National Institute of Advanced Industrial Science and Technology (AIST) • National Institute of Information and Communications Technology (NICT) – Kyoto University – National Institute of Informatics (NII) – InfoProto Co., Ltd. – Japan Aerospace Exploration Agency (JAXA) – National Institute of Polar Research (NIPR) • Chiba University Library • National Institute for Materials Science (NIMS) • Neuroinformatics Japan Center, Brain Science Institute (BSI), RIKEN
  30. Issues in Data DOI • Flow of operations • Persistent

    access • Granularity of data in registration • Dynamics of data • Landing page • Quantity of data • Applications
  31. Issues in Data DOI • Flow of operations: Who, When,

    How − Who registers data?: Researcher/Project manager/Librarian − When is data registered? − How is metadata provided for data? • Persistent access − What persistency can we expect for data? − Can time-limited projects participate? Who will ensure the persistency of the data? E.g.: – The representative institute takes over all of the data – Registering DOIs only for data managed by real organizations among the members of the project
  32. Life cycle of data and stakeholders (in case of literature)

    [Figure: operations (Register, Create, Modify, Save, Publish, Remove) on ID, metadata, and data, assigned to Researcher, Library, and Institutional Repository]
  33. Life cycle of data and stakeholders (in case of data)

    [Figure: operations (Register, Create, Modify, Save, Publish, Remove) on ID, metadata (JaLC Metadata, Domain Metadata), and data, assigned to Researcher, Library, Research Institution, and Project]
  34. Issues in Data DOI (cont’d) • Granularity of data in

    registration – Some aspects for granularity of data • Good for citation • Granularity of data itself – Observation data/Experiment data/Simulation data • Easy for access • Easy for management • Quantity of data
  35. Issues in Data DOI (cont’d) • Dynamics of data −

    Adding data after registration of DOI − Some options: − Different DOIs − Add relationship metadata to denote the relation to the original DOIs − Use the original DOI − Versioning: add the link to the new data while keeping the link to the original data − History of changes in the single DOI − No descriptions (e.g., data under ongoing observation)
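
The "different DOIs plus relationship metadata" option above can be sketched as follows. This is an illustrative sketch only: `DataRecord` and `register_new_version` are hypothetical names, the DOIs are made up, and the relation labels are modeled on DataCite-style relation types as an assumption, not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class DataRecord:
    doi: str
    # Each relation is a (relation_type, related_doi) pair.
    relations: list = field(default_factory=list)


def register_new_version(original: DataRecord, new_doi: str) -> DataRecord:
    """Give the updated data its own DOI and link both records."""
    updated = DataRecord(doi=new_doi)
    updated.relations.append(("IsNewVersionOf", original.doi))
    original.relations.append(("IsPreviousVersionOf", new_doi))
    return updated


v1 = DataRecord(doi="10.5555/example-dataset.v1")  # hypothetical DOI
v2 = register_new_version(v1, "10.5555/example-dataset.v2")
```

With this option a citation to `v1` keeps pointing at exactly the data that was cited, while the relation lets a reader discover `v2`. Reusing the original DOI instead trades that precision for simpler management.
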
  36. Issues in Data DOI (cont’d) • Landing page − Metadata

    description − For open/closed data • Quantity of data − Registering DOI for a large amount of data • Applications − Citing DOIs for research data − Developing other applications
  37. Recommendations for Data DOIs • Recognition of variety of the

    nature of data • Minimal Commitment – Persistency, Interoperability, Usability, manageability • Design own DOI registration policy
  38. ORCID (Open Researcher and Contributor Identifier) • ID for researchers

    and contributors of research, to identify them uniquely • Managed by ORCID, Inc. (NPO) 2011- – Members: STM publishers, universities, funding agencies • Service started in October, 2012 • How to use ORCID – When submitting manuscripts – Author information in articles – Faculty Management – …
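
A system that accepts ORCID iDs, for example at manuscript submission, can at least check that an iD is well formed. The sketch below is not from the slides: it relies on the published ORCID iD structure (16 characters, the last one a check character computed with ISO 7064 MOD 11-2), and the function names are hypothetical.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the check character for the first 15 digits (ISO 7064 MOD 11-2)."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check length, digits, and check character of an iD like 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    if not all(c in "0123456789" for c in compact[:15]):
        return False
    return orcid_check_character(compact[:15]) == compact[15]
```

This only catches typing errors. Whether an iD is actually registered, and to whom, still has to be confirmed with the ORCID registry.
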
  39. Linked Data • Network of metadata • Sharing metadata among

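    RA – CrossRef – DataCite – (JaLC) [Figure: linked-data graph of a museum work. Nodes: Work URI, Creator URI, Museum/Place URI, wikipedia. Properties: Title, Description, Date, Category, Image, Creator, Is_located_in, Label, Address, Phone, Name, E-address. Values shown: “Midsummer Sun” (真夏の太陽), Isamu Noguchi, [email protected], 1989, Yokohama Museum, 3-4-1, Minato Mirai, Nishi-ku, Yokohama, 045-221-0300, and the description “See ‘Walking in Silence’ through ‘Midnight Sun’, which somehow makes you want to peer in as you approach. You can encounter the works in a different way than usual.”]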
  40. Summary • Open Science backed by data-sharing • Data-sharing architecture

    – Interoperability should be guaranteed – Layers • ID/Metadata Schema/Metadata/Data format/Data/Repository – Cooperation and Competition • DOI is a promising ID for data, but its use differs from that for literature – DOI registration policy is needed