Identifiers and Open Science / JATS-Con-Asia-20151019-01-hideaki-takeda

JATS-Con Asia
Monday, October 19, 2015
http://jats-con-asia.strikingly.com/

Keynote Speaker 1 "Identifiers and Open Science"
-Hideaki Takeda, National Institute of Informatics (Board of Directors, ORCID)
Abstract: http://jats-con-asia.strikingly.com/#speakers
Materials: https://speakerdeck.com/jatsconasiasc/jats-con-asia-20151019-01-hideaki-takeda
Video: https://vimeo.com/150207019

Transcript

  1. Open Science and Identifiers Hideaki Takeda National Institute of Informatics

    takeda@nii.ac.jp ORCID: 0000-0002-2909-7163
  2. Internet changes our life

  3. Law Norm Market Architecture four modalities of regulation (Lawrence Lessig)

  4. So our society is becoming an Open Society: Globalism, Borderless, Cross-culture,

    Nomad life, …
  5. Internet changes science

  6. Law Norm Market Architecture four modalities of regulation (Lawrence Lessig)

  7. So Science is becoming Open Science • Open science can

    be discussed from philosophical, political, methodological, or any other point of view. • “Open Science NOW” is driven and realized by the Internet as Architecture • So data sharing is the core of Open Science
  8. Data sharing

  9. Researcher before Digital Age papers data research target Survey Paper

    working Research & Writing
  10. Researchers now

    Data use Data publishing Research, Writing & Data publishing papers data research target Survey Paper working
  11. Researcher in Future

    Data Data use Data publishing Integration of papers & data Data publishing Research = Data Supply-chain
  12. Data sharing

  13. Data Sharing? or Data Publication? or Open Data?

  14. Data Life Cycle • Data is created, shared, published, and

    archived • But just “published” is not enough; it should be “openly published” (open data) Data Share Create Publish Archive Research Phase In Progress Results
  15. Open Data • “A piece of data or content is

    open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.” http://opendefinition.org/ • Open data is data publication with some open license – Open license ensures the above condition
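The definition quoted on this slide can be read as a simple rule: a license is open if the only conditions it imposes are attribution and/or share-alike. A minimal sketch of that rule follows. The condition labels (`"attribution"`, `"share-alike"`, `"non-commercial"`) are hypothetical names chosen for illustration, not terms from any license registry.

```python
# Conditions that the Open Definition quoted above allows a license to impose.
# The string labels are hypothetical, chosen only for this illustration.
ALLOWED_CONDITIONS = {"attribution", "share-alike"}


def is_open(license_conditions):
    """Return True if the license imposes, at most, attribution and/or share-alike."""
    return set(license_conditions) <= ALLOWED_CONDITIONS


# Hypothetical license descriptions.
print(is_open([]))                                 # no conditions at all
print(is_open(["attribution", "share-alike"]))     # both allowed conditions
print(is_open(["attribution", "non-commercial"]))  # adds a restriction on use
```

The subset test mirrors the "at most" wording: any extra condition makes the data merely published, not openly published.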
  16. Data Life Cycle • Different tools for different stages of

    life cycle – Data sharing: generating, federating, … – Data publishing: searching, harvesting, … – Data archiving: migration, … • The architecture CAN be shared Data Share Create Publish Preserve Research Phase In Progress Results Stakeholder Research Institute Researcher/R. Group
  17. Why should research data be open? But still

  18. Four reasons for openness of research data • Demands from

    Society – Knowledge sharing among society – Accountability of public money • Demands in Science – Future development of Science itself • “Standing on the shoulders of giants” (nanos gigantum humeris insidentes) – Reproducibility
  19. Dimensions of Science • Local - Global • Open -

    Authorized
  20. Dimensions of Science Local Global Authorized Open

  21. Dimensions of Science Local Global Authorized Open Government University Publisher

    Citizen Science/Open Science
  22. Dimensions of Science Local Global Authorized Open Government University Publisher

    Citizen Science/Open Science Various stakeholders take different positions on Open Science
  23. Architecture of data sharing

  24. Repository Architecture of data sharing Identifier Data Format Metadata Metadata

    Schema Systematic Integration across the layers Interoperability on each layer
  25. Metadata Description Language, Collection and sharing, Conversion Schema Description

    Language, collection and sharing, conversion System Development, Community Management Organization, systems, ID federation Repository Architecture of data sharing Identifier Data Format Metadata Metadata Schema DOI ORCID FundRef DataCite CrossRef JaLC Dublin Core DCAT CKAN Linked Data Organization Schema System Technology Coordination and Competition DSpace Fedora WEKO
  26. Research Activities and Related Entities Survey Article Writing Data Digital

    Articles Acquiring Data Publishing Data Funding agencies Research Institutions affiliated Projects Supported Academic Societies Digital objects Digital objects Topics
  27. Research Activities and Related Entities Survey Article Writing Data Digital

    Articles Acquiring Data Publishing Data Funding agencies Projects Research Institutions affiliated Supported Academic Societies Digital objects Digital objects Topics
  28. Research Activities and Related Entities Survey Article Writing Acquiring Data

    Publishing Data Funding agencies Projects affiliated Supported Data Digital Articles Research Institutions Academic Societies Topics
  29. Identifiers for research • A research activity is represented with

    a structure of identifiers – Planned and submitted – Organized and executed – Concluded and evaluated ID ID ID ID ID ID ID ID ID ID ID
  30. Identifiers for research ID ID ID ID ID ID ID

    ID ID ID ID • ID for – Article – Data – Researcher – Institutions, affiliation – Funding agency, funded project – Academic society – Topic – …
  31. Nature of IDs for research Local Global Authorized Open DOI

    ORCID Institution Member ID URI ResearchGate/Academia.edu/… Grant ID Kaken Grant ID Kaken Researcher ID PubMed ID ResearchMap Facebook
  32. Nature of IDs for Science • Balance in some features

    – Global vs. Local • Global: Unified service • Local: Specialized service – Authorized vs. Open • Authorized: Trusted, restricted • Open: no restrictions – Charged vs. Free • Multiple IDs can co-exist in a single category • How to manage multiple IDs – Integration/mapping/associating/discovering – Control/Manage/Authorize – Private/Share/Open
  33. DOI

  34. DOI in Architecture of Data Sharing

    Repository Identifier Data Format Metadata Schema DOI DataCite Metadata Schema JaLC Metadata Schema JaLC DataCite Metadata Members (data providers) Domain-specific metadata schemata
  35. DOI (Digital Object Identifier) • Service to translate DOI names

    to URIs containing digital objects • Service managed by International DOI Foundation (IDF) • Initially started by STM publishers to share identifiers for digital publications • Distributed management – Delegation of registration tasks to Registration Agencies (RAs)
  36. DOI (Digital Object Identifier) • Service to translate DOI names

    to URIs containing digital objects • DOI: doi:10.1007/978-3-642-21616-9_30 → URL: http://www.springerlink.com/content/xkj2386758245u85/ • DOI as URL: http://doi.org/10.1007/978-3-642-21616-9_30 → URL: http://www.springerlink.com/content/xkj2386758245u85/
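The "DOI as URL" form on this slide is just the resolver address followed by the DOI name. A minimal sketch of that conversion follows, using the DOI name and the `http://doi.org/` resolver shown on the slide. The function names are hypothetical, and the validity check is deliberately loose (a `10.` prefix, a slash, and a non-empty suffix), not a full implementation of the DOI syntax rules. No network request is made; resolution to the publisher's page is left to the resolver.

```python
from urllib.parse import quote

# Resolver address as shown on the slide.
RESOLVER = "http://doi.org/"


def split_doi(doi_name):
    """Split a DOI name into (prefix, suffix) at the first slash."""
    name = doi_name.strip()
    if name.lower().startswith("doi:"):
        name = name[4:].strip()
    prefix, sep, suffix = name.partition("/")
    # Loose sanity check only: DOI prefixes start with "10.".
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI name: {doi_name!r}")
    return prefix, suffix


def doi_to_url(doi_name):
    """Build the 'DOI as URL' form; characters unsafe in URLs are percent-encoded."""
    prefix, suffix = split_doi(doi_name)
    return RESOLVER + prefix + "/" + quote(suffix, safe="/")


print(doi_to_url("doi: 10.1007/978-3-642-21616-9_30"))
```

The prefix (`10.1007` here) identifies the registrant and the suffix is chosen by that registrant, which is what allows registration to be delegated.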
  37. Management Structure of DOI • Three Layers: International DOI Foundation

    (IDF), Registration Agency (RA), members • RAs contribute to IDF by registration to Registry DBs, management of Registry DBs, and membership fees • RAs offer services for DOI registration to their members • Members can register DOIs for their digital objects through RAs Members RAs IDF CrossRef Publishers Publishers Publishers Publishers DataCite University Library Research Institute JaLC Publisher University Academic Society
  38. Roles of DOI • Provide resolvable, persistent, interoperable links –

    Resolvable: standard syntax + mapping by the Handle System – Persistent • Technically: management of registry DBs • Socially: organizational operations and duties for members – Interoperable: sharing a data model
  39. Registration Agencies (RAs) Airiti, Inc. CrossRef China National Knowledge Infrastructure

    (CNKI) DataCite EIDR (Entertainment Identifier Registry) ISTIC (The Institute of Scientific and Technical Information of China) JaLC (Japan Link Center) mEDRA (Multilingual European DOI Registration Agency) OP (Publications Office of the European Union)
  40. CrossRef • Ensure accessibility and citation of articles and books

    in STM publications • Started in 1999 • Largest and oldest RA of IDF – Most DOIs are registered via CrossRef – Members in over 70 countries, most of them publishers • Functions – DOI Registration – Metadata Management • Bibliographic metadata • Citation – Services with metadata • Search for bibliographic metadata and citation • Reverse look up
  41. DataCite • IDF RA for research data • a not-for-profit

    organization since 1 December 2009
  42. Japan Link Center (JaLC) • Founded in March 2012 •

    Aims to register DOIs for academic content produced in Japan or in Japanese, and to circulate information in Japan and overseas. • Controlled by four national organizations: – Japan Science and Technology Agency (JST) – National Institute for Materials Science (NIMS) – National Institute of Informatics (NII) – National Diet Library (NDL) • Operated by JST • Membership system (Academic societies, Publishers, University libraries, etc.) • External coordination: JaLC is a member of CrossRef and DataCite (Mar. 2014) • Over 1,300,000 DOIs registered
  43. Content categories

    Category (DOI registration available since): • Journal articles: Dec. 2012 • University bulletins: Sep. 2014 • Conference proceedings: Mar. 2012 • Books: Jan. 2015 • Doctoral theses: Mar. 2014 • Reports – Technical reports: Jan. 2015 – Governmental reports: Jan. 2015 • Research data: Jan. 2015 • e-learning resources: Jan. 2015
  44. Data DOI Registration Flow DOI IDF DOI DOI Article

    CrossRef DOI + CrossRef Metadata - JaLC - CrossRef - DataCite Metadata DOI DOI + Article Metadata DOI + Data Metadata DataCite DOI + DataCite Metadata JaLC Mem.
  45. Experiment Project to register DOIs for Research Data • Goal

    − Establish operation flows to register DOIs for research data and achieve stable operation • Objectives − Set policies for registering DOIs for research data − Establish operation flows to register DOIs for research data with the next version of the JaLC system, and verify them by performing registration tests − October 2014 – October 2015
  46. Members of the project 9 projects with 14 organizations

  47. Members of the project • National Bioscience Database Center (NBDC),

    Japan Science and Technology Agency (JST) • National Institute of Polar Research (NIPR) • National Institute of Informatics (NII) • DIAS-P Project (National Institute of Informatics (NII)) – Japan Agency for Marine-Earth Science and Technology (JAMSTEC) – University of Tokyo – Kyoto University – National Institute for Environmental Studies (NIES) • National Institute of Advanced Industrial Science and Technology (AIST) • National Institute of Information and Communications Technology (NICT) – Kyoto University – National Institute of Informatics (NII) – InfoProtoCo.,Ltd. – Japan Aerospace Exploration Agency (JAXA) – National Institute of Polar Research (NIPR) • Chiba University Library • National Institute for Materials Science (NIMS) • Neuroinformatics Japan Center, Brain Science Institute (BSI), RIKEN
  48. Issues in Data DOI • Flow of operations • Persistent

    access • Granularity of data in registration • Dynamics of data • Landing page • Quantity of data • Applications
  49. Issues in Data DOI • Flow of operations: Who, When,

    How − Who registers data?: Researcher/Project manager/Librarian − When is data registered? − How is metadata provided for data? • Persistent access − What persistency can we expect for data? − Can time-limited projects participate? Who will ensure the persistency of the data? (e.g.) – The representative institute takes over all of the data – Register DOIs only for data managed by real organizations among the members of the project
  50. ID metadata Data Register Create Register Modify save Create publish

    Modify remove Researcher Library Institutional Repository Life cycle of data and stakeholders - in case of literature -
  51. ID metadata Data Register Create Register Modify save Create publish

    Modify remove Life cycle of data and stakeholders - in case of data - Create Register Modify Researcher Library Research Institution Project JaLC Metadata Domain Metadata
  52. Issues in Data DOI (cont’d) • Granularity of data in

    registration – Some aspects for granularity of data • Good for citation • Granularity of data itself – Observation data/Experiment data/Simulation data • Easy for access • Easy for management • Quantity of data
  53. Issues in Data DOI (cont’d) • Dynamics of data −

    Adding data after registration of DOI − Some options: − Different DOIs − Add relationship metadata to denote the relation to the original DOIs − Use the original DOI − Versioning: add the link to the new data while keeping the link to the original data − History of changes in the single DOI − No descriptions (e.g., data still being observed)
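The "different DOIs plus relationship metadata" option above can be pictured as a chain of records, each pointing back at the version it replaces. A minimal sketch follows. The DOI names and the record layout are hypothetical, and the relation name `IsNewVersionOf` is borrowed from the relation types in the DataCite metadata schema as an assumption, not taken from the slides.

```python
# Hypothetical DOI names and a simplified record layout, for illustration only.
records = {
    "10.1234/obs.v1": {"related": []},
    "10.1234/obs.v2": {"related": [("IsNewVersionOf", "10.1234/obs.v1")]},
    "10.1234/obs.v3": {"related": [("IsNewVersionOf", "10.1234/obs.v2")]},
}


def latest_version(doi, records):
    """Follow IsNewVersionOf links in reverse to find the newest DOI in the chain."""
    # Map each DOI to the DOI that supersedes it.
    newer = {}
    for name, record in records.items():
        for relation, target in record["related"]:
            if relation == "IsNewVersionOf":
                newer[target] = name
    seen = set()  # guards against accidental cycles in the metadata
    while doi in newer and doi not in seen:
        seen.add(doi)
        doi = newer[doi]
    return doi


print(latest_version("10.1234/obs.v1", records))
```

With this option a citation to the first DOI stays stable, while a reader can still discover that newer data exists.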
  54. Issues in Data DOI (cont’d) • Landing page − Metadata

    description − For open/closed data • Quantity of data − Registering DOI for a large amount of data • Applications − Citing DOIs for research data − Developing other applications
  55. Recommendations for Data DOIs • Recognition of variety of the

    nature of data • Minimal Commitment – Persistency, Interoperability, Usability, manageability • Design own DOI registration policy
  56. ID for Researchers

  57. ORCID (Open Researcher and Contributor Identifier) • ID to uniquely identify researchers

    and contributors to research • Managed by ORCID, Inc. (NPO) 2011- – Members: STM publishers, universities, funding agencies • Service started in October, 2012 • How to use ORCID – When submitting manuscripts – Author information in articles – Faculty Management – …
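One practical detail when accepting ORCID iDs (for example at manuscript submission) is that the last character is a check character, so mistyped iDs can be caught before any lookup. The slides do not describe this. The sketch below assumes the ISO 7064 MOD 11-2 checksum that ORCID documents for its identifiers, and uses the ORCID iD printed on the title slide as the test value. The function names are hypothetical.

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check the length, the digits, and the final check character of an ORCID iD."""
    compact = orcid.replace("-", "")
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == compact[15].upper()


print(is_valid_orcid("0000-0002-2909-7163"))  # iD from the title slide
print(is_valid_orcid("0000-0002-2909-7164"))  # last character altered
```

This only verifies that the string is well formed; whether the iD belongs to the claimed person still has to be confirmed through ORCID itself.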
  58. None
  59. None
  60. None
  61. Metadata Management

  62. Linked Data • Network of metadata • Sharing metadata among

    RA – CrossRef – DataCite – (JaLC) Image Title Yokohama Museum Isamu Noguchi isamu@noguchi.jp 1989 “Come close and you somehow want to peek through: look at ‘Walking in Silence’ through ‘Midnight Sun’. You will encounter the works in a different way than usual.” Description Work URI URI Creator URI 3-4-1, Minato Mirai, Nishi-ku, Yokohama 045-221-0300 Museum Place URI Midsummer Sun Date Creator Is_located_in Label Address Phone Category Image Image Name E-address wikipedia
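The diagram on this slide describes a work, its creator, and the museum that holds it as separate records joined by URIs. A minimal sketch of that "network of metadata" follows, using plain Python tuples as subject-predicate-object triples. All URIs are hypothetical `example.org` placeholders, the property names are simplified from the diagram labels, and which value attaches to which node is a guess from the flattened diagram.

```python
# All URIs below are hypothetical placeholders, not real identifiers.
WORK = "http://example.org/work/1"
CREATOR = "http://example.org/person/isamu-noguchi"
MUSEUM = "http://example.org/museum/yokohama"

# Each triple is (subject, predicate, object); objects are URIs or literal values.
triples = [
    (WORK, "creator", CREATOR),
    (WORK, "date", "1989"),
    (WORK, "is_located_in", MUSEUM),
    (CREATOR, "name", "Isamu Noguchi"),
    (MUSEUM, "label", "Yokohama Museum"),
    (MUSEUM, "address", "3-4-1, Minato Mirai, Nishi-ku, Yokohama"),
]


def objects(subject, predicate):
    """Return every object recorded for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def museum_label_of(work):
    """Follow work -> is_located_in -> label across two separate records."""
    return [label
            for museum in objects(work, "is_located_in")
            for label in objects(museum, "label")]


print(museum_label_of(WORK))
```

Because the museum is referenced by URI and not copied into the work's record, metadata held by different providers can be combined without duplicating it.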
  63. Summary • Open Science backed by data-sharing • Data-sharing architecture

    – Interoperability should be guaranteed – Layers • ID/Metadata Schema/Metadata/Data format/Data/Repository – Cooperation and Competition • DOI is a promising ID for data, but its use differs from that for literature – A DOI registration policy is needed