Identifiers and Open Science / JATS-Con-Asia-20151019-01-hideaki-takeda

JATS-Con Asia
Monday, October 19, 2015
http://jats-con-asia.strikingly.com/

Keynote Speaker 1 "Identifiers and Open Science"
-Hideaki Takeda, National Institute of Informatics (Board of Directors, ORCID)
Abstract: http://jats-con-asia.strikingly.com/#speakers
Materials: https://speakerdeck.com/jatsconasiasc/jats-con-asia-20151019-01-hideaki-takeda
Video: https://vimeo.com/150207019

Transcript

  1. Open Science and Identifiers
    Hideaki Takeda
    National Institute of Informatics
    [email protected]
    ORCID: 0000-0002-2909-7163


  2. Internet changes our life


  3. Law
    Norm
    Market
    Architecture
    four modalities of regulation (Lawrence Lessig)


  4. So our society is becoming an Open Society
    Globalism, Borderless, Cross-culture, Nomad life, …


  5. Internet changes science


  6. Law
    Norm
    Market
    Architecture
    four modalities of regulation (Lawrence Lessig)


  7. So Science is becoming Open Science
    • Open science can be discussed from philosophical, political,
    methodological, or any other point of view.
    • “Open Science NOW” is driven and realized
    by the Internet as Architecture
    • So data sharing is the core of Open Science


  8. Data sharing


  9. Researcher before Digital Age
    papers
    data
    research target
    Survey Paper working
    Research & Writing


  10. Researchers now
    [Figure: background streams of binary digits]
    Data use Data publishing
    Research, Writing & Data publishing
    papers
    data
    research target
    Survey Paper working


  11. Researcher in Future
    [Figure: background streams of binary digits]
    Data
    Data use Data publishing
    Integration of
    papers & data
    Data publishing
    Research = Data Supply-chain


  12. Data sharing


  13. Data Sharing?
    or
    Data Publication?
    or
    Open Data?


  14. Data Life Cycle
    • Data is created, shared, published, and archived
    • But being just “published” is not enough; data should be
    “openly published” (open data)
    [Diagram] Data: Create, Share, Publish, Archive
    Research Phase: In Progress, Results


  15. Open Data
    • “A piece of data or content is open if anyone is free
    to use, reuse, and redistribute it — subject only, at
    most, to the requirement to attribute and/or share-alike.”
    http://opendefinition.org/
    • Open data is data publication under an open license
    – An open license ensures the above condition


  16. Data Life Cycle
    • Different tools for different stages of life cycle
    – Data sharing: generating, federating, …
    – Data publishing: searching, harvesting, …
    – Data archiving: migration, …
    • The architecture CAN be shared
    [Diagram] Data: Create, Share, Publish, Preserve
    Research Phase: In Progress, Results
    Stakeholder: Research Institute, Researcher/R. Group


  17. But still, why should research data be open?


  18. Four reasons for openness of research data
    • Demands from Society
    – Knowledge sharing among society
    – Accountability of public money
    • Demands in Science
    – Future development of Science itself
    • “Standing on the shoulders of giants”
    (nanos gigantum humeris insidentes)
    – Reproducibility


  19. Dimensions of Science
    • Local - Global
    • Open - Authorized


  20. Dimensions of Science
    Local
    Global
    Authorized
    Open


  21. Dimensions of Science
    Local
    Global
    Authorized
    Open
    Government
    University
    Publisher
    Citizen Science/Open Science


  22. Dimensions of Science
    Local
    Global
    Authorized
    Open
    Government
    University
    Publisher
    Citizen Science/Open Science
    Various stakeholders take different positions on Open Science


  23. Architecture of data sharing


  24. Architecture of data sharing
    Layers: Identifier, Metadata Schema, Metadata, Format, Data, Repository
    • Systematic integration across the layers
    • Interoperability on each layer


  25. Architecture of data sharing
    Layers: Identifier, Metadata Schema, Metadata, Format, Data, Repository
    • Metadata: description language, collection and sharing, conversion
    • Schema: description language, collection and sharing, conversion
    • System development, community
    • Management: organization, systems, ID federation
    Aspects: Organization, Schema, System, Technology
    Examples: DOI, ORCID, FundRef; DataCite, CrossRef, JaLC; Dublin Core, DCAT, CKAN, Linked Data; DSpace, Fedora, WEKO
    Coordination and Competition


  26. Research Activities and Related Entities
    Survey
    Article Writing
    Data
    Digital
    Articles
    Acquiring Data
    Publishing Data
    Funding agencies
    Research
    Institutions
    affiliated
    Projects
    Supported
    Academic Societies
    Digital objects Digital objects
    Topics


  27. Research Activities and Related Entities
    Survey
    Article Writing
    Data
    Digital
    Articles
    Acquiring Data
    Publishing Data
    Funding agencies Projects
    Research
    Institutions
    affiliated
    Supported
    Academic Societies
    Digital objects Digital objects
    Topics


  28. Research Activities and Related Entities
    Survey
    Article Writing
    Acquiring Data
    Publishing Data
    Funding agencies Projects
    affiliated
    Supported
    Data
    Digital
    Articles
    Research
    Institutions
    Academic Societies
    Topics


  29. Identifiers for research
    • A research activity is represented as a
    structure of identifiers
    – Planned and submitted
    – Organized and executed
    – Concluded and evaluated
    [Diagram: 11 nodes, each labeled "ID"]


  30. Identifiers for research
    [Diagram: 11 nodes, each labeled "ID"]
    • ID for
    – Article
    – Data
    – Researcher
    – Institutions, affiliation
    – Funding agency, funded project
    – Academic society
    – Topic
    – …
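
A minimal sketch of how a research activity might be recorded as a structure of identifiers, as the slide suggests. The class names, field names, and scheme labels are assumptions made for illustration. The ORCID iD is the one shown on the title slide; the grant number and both DOIs are placeholders, not real registrations.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identifier:
    scheme: str  # illustrative scheme name such as "orcid", "doi", "grant"
    value: str


@dataclass
class ResearchActivity:
    """A research activity described purely as a structure of identifiers."""
    researchers: list = field(default_factory=list)
    institutions: list = field(default_factory=list)
    grants: list = field(default_factory=list)
    articles: list = field(default_factory=list)
    datasets: list = field(default_factory=list)

    def all_ids(self):
        """Return every identifier of the activity in a fixed order."""
        return (self.researchers + self.institutions + self.grants
                + self.articles + self.datasets)


# The ORCID iD is the one on the title slide; every other value is a placeholder.
activity = ResearchActivity(
    researchers=[Identifier("orcid", "0000-0002-2909-7163")],
    grants=[Identifier("grant", "EXAMPLE-GRANT-001")],
    articles=[Identifier("doi", "10.1234/example-article")],
    datasets=[Identifier("doi", "10.1234/example-dataset")],
)
```

Keeping only identifiers in the record, and no descriptive metadata, mirrors the idea that each entity is described elsewhere and merely linked here.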


  31. Nature of IDs for research
    Local
    Global
    Authorized
    Open
    DOI
    ORCID
    Institution Member ID
    URI
    ResearchGate/Academia.edu/…
    Grant ID
    Kaken Grant ID
    Kaken Researcher ID
    PubMed ID
    ResearchMap
    Facebook


  32. Nature of IDs for Science
    • Balance in some features
    – Global vs. Local
    • Global: Unified service
    • Local: Specialized service
    – Authorized vs. Open
    • Authorized: Trusted, restricted
    • Open: no restrictions
    – Charged vs. Free
    • Multiple IDs can co-exist in a single category
    • How to manage multiple IDs
    – Integration/mapping/associating/discovering
    – Control/Manage/Authorize
    – Private/Share/Open


  33. DOI


  34. DOI in Architecture of Data Sharing
    Layers: Identifier, Metadata Schema, Metadata, Format, Data, Repository
    Identifier: DOI
    Metadata Schema: DataCite Metadata Schema, JaLC Metadata Schema,
    domain-specific metadata schemata
    Metadata: JaLC, DataCite, Members (data providers)
    Management


  35. DOI (Digital Object Identifier)
    • Service that resolves DOI names to the URIs
    where the digital objects are located
    • Service managed by International DOI
    Foundation (IDF)
    • Initially started by STM publishers to share
    identifiers for digital publications
    • Distributed management
    – Delegation of registration tasks to Registration
    Agencies (RAs)


  36. DOI (Digital Object Identifier)
    • Service that resolves DOI names to the URIs
    where the digital objects are located
    DOI: doi: 10.1007/978-3-642-21616-9_30
    URL: http://www.springerlink.com/content/xkj2386758245u85/
    DOI as URL: http://doi.org/10.1007/978-3-642-21616-9_30
    URL: http://www.springerlink.com/content/xkj2386758245u85/
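
The "DOI as URL" form can be built mechanically from a DOI name. The sketch below is illustrative only: the function names are hypothetical, it uses the `https://doi.org/` proxy (the slide shows the `http://` form), and it only constructs the resolver URL without contacting the network or following the redirect to the publisher's page.

```python
from urllib.parse import quote

RESOLVER = "https://doi.org/"  # public DOI proxy; the slide shows the http:// form


def split_doi(doi):
    """Split a DOI name into (prefix, suffix) at the first slash."""
    name = doi.strip()
    if name.lower().startswith("doi:"):
        name = name[4:].strip()  # tolerate the "doi:" label used on the slide
    prefix, sep, suffix = name.partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI name: {doi!r}")
    return prefix, suffix


def doi_to_url(doi):
    """Build the resolver URL for a DOI name without contacting the network."""
    prefix, suffix = split_doi(doi)
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return RESOLVER + prefix + "/" + quote(suffix, safe="/")


print(doi_to_url("doi: 10.1007/978-3-642-21616-9_30"))
```

The prefix (`10.1007` here) identifies the registrant, and the suffix is chosen by that registrant, which is why only the suffix needs escaping.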


  37. Management Structure of DOI
    • Three layers: International DOI Foundation (IDF), Registration
    Agencies (RAs), members
    • RAs contribute to IDF by registering to the Registry DBs,
    managing the Registry DBs, and paying membership fees
    • RAs offer DOI registration services to their members
    • Members can register DOIs for their digital objects through
    RAs
    [Diagram] Layers: IDF, RAs, Members
    CrossRef: Publishers
    DataCite: University, Library, Research Institute
    JaLC: Publisher, University, Academic Society


  38. Roles of DOI
    • Provide resolvable, persistent, interoperable
    links
    – Resolvable: standard syntax + mapping by handle
    system
    – Persistent
    • Technically: management of registry DBs
    • Socially: organizational operations and duties for
    members
    – Interoperable: sharing a data model


  39. Registration Agencies (RAs)
    • Airiti, Inc.
    • CrossRef
    • China National Knowledge Infrastructure (CNKI)
    • DataCite
    • EIDR (Entertainment Identifier Registry)
    • ISTIC (The Institute of Scientific and Technical Information of China)
    • JaLC (Japan Link Center)
    • mEDRA (Multilingual European DOI Registration Agency)
    • OP (Publications Office of the European Union)


  40. CrossRef
    • Ensure accessibility and citation of articles and
    books in STM publications
    • Started in 1999
    • Largest and oldest RA of IDF
    – Most DOIs are registered via CrossRef
    – Members in over 70 countries, most of them publishers
    • Functions
    – DOI Registration
    – Metadata Management
    • Bibliographic metadata
    • Citation
    – Services with metadata
    • Search for bibliographic metadata and citation
    • Reverse look up


  41. DataCite
    • IDF RA for research data
    • a not-for-profit organization since 1 December
    2009


  42. Japan Link Center (JaLC)
    • Founded in March 2012
    • Aimed to register DOIs for academic contents produced
    in Japan or in Japanese, to circulate information in Japan and overseas.
    • Controlled by four national organizations:
    – Japan Science and Technology Agency (JST)
    – National Institute for Materials Science (NIMS)
    – National Institute of Informatics (NII)
    – National Diet Library (NDL)
    • Operated by JST
    • Membership system
    (Academic societies, Publishers, University libraries, etc)
    • External coordination
    JaLC is a member of CrossRef and DataCite (Mar. 2014)
    Over 1,300,000 DOIs registered


  43. Content categories
    Category
    • Journal articles
    – Journal articles: Dec. 2012 -
    – University bulletins: Sep. 2014 -
    – Conference proceedings: Mar. 2012 -
    • Books
    – Books: Jan. 2015 -
    – Doctoral theses: Mar. 2014 -
    • Reports
    – Technical reports: Jan. 2015 -
    – Governmental reports: Jan. 2015 -
    • Research data: Jan. 2015 -
    • e-learning resources: Jan. 2015 -


  44. DOI Registration Flow
    [Diagram labels: Mem. (members), JaLC, CrossRef, DataCite, IDF]
    Article: DOI + Article Metadata; CrossRef DOI + CrossRef Metadata
    Data: DOI + Data Metadata; DataCite DOI + DataCite Metadata
    Metadata: JaLC, CrossRef, DataCite
    DOI: IDF


  45. Experiment Project to register DOIs for Research Data
    • Goal
    − Establish operation flows to register DOIs for
    research data and operate them stably
    • Objectives
    − Set policies for registering DOIs for research data
    − Establish operation flows to register DOIs for
    research data with the next version of the JaLC system,
    and verify them by performing registration tests
    • Period
    − October 2014 – October 2015


  46. Members of the project
    9 projects with 14 organizations


  47. Members of the project
    • National Bioscience Database Center (NBDC), Japan Science and Technology
    Agency (JST)
    • National Institute of Polar Research (NIPR)
    • National Institute of Informatics (NII)
    • DIAS-P Project (National Institute of Informatics (NII))
    – Japan Agency for Marine-Earth Science and Technology (JAMSTEC)
    – University of Tokyo
    – Kyoto University
    – National Institute for Environmental Studies (NIES)
    • National Institute of Advanced Industrial Science and Technology (AIST)
    • National Institute of Information and Communications Technology (NICT)
    – Kyoto University
    – National Institute of Informatics (NII)
    – InfoProto Co., Ltd.
    – Japan Aerospace Exploration Agency (JAXA)
    – National Institute of Polar Research (NIPR)
    • Chiba University Library
    • National Institute for Materials Science (NIMS)
    • Neuroinformatics Japan Center, Brain Science Institute (BSI), RIKEN


  48. Issues in Data DOI
    • Flow of operations
    • Persistent access
    • Granularity of data in registration
    • Dynamics of data
    • Landing page
    • Quantity of data
    • Applications


  49. Issues in Data DOI
    • Flow of operations: Who, When, How
    − Who registers data?: Researcher/Project
    manager/Librarian
    − When is data registered?
    − How is metadata provided for data?
    • Persistent access
    − What persistency can we expect for data?
    − Can time-limited projects participate? Who will ensure the
    persistency of the data?
    (ex.)
    – The representative institute takes over all of the data
    – Registering DOIs only for data managed by real organizations
    among the members of the project


  50. Life cycle of data and stakeholders - in case of literature -
    [Diagram labels]
    ID: Register
    metadata: Create, Register, Modify
    Data: save; Create, publish, Modify, remove
    Stakeholders: Researcher, Library, Institutional Repository


  51. Life cycle of data and stakeholders - in case of data -
    [Diagram labels]
    ID: Register
    metadata: Create, Register, Modify
    Data: save; Create, publish, Modify, remove
    JaLC Metadata, Domain Metadata: Create, Register, Modify
    Stakeholders: Researcher, Library, Research Institution, Project


  52. Issues in Data DOI (cont’d)
    • Granularity of data in registration
    – Some aspects for granularity of data
    • Good for citation
    • Granularity of data itself
    – Observation data/Experiment data/Simulation data
    • Easy for access
    • Easy for management
    • Quantity of data


  53. Issues in Data DOI (cont’d)
    • Dynamics of data
    − Adding data after registration of DOI
    − Some options:
    − Different DOIs
    − Add relationship metadata to denote the relation to the original
    DOIs
    − Use the original DOI
    − Versioning: add the link to the new data while keeping the link to the
    original data
    − History of changes in the single DOI
    − No descriptions (e.g., data still under observation)
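
The "different DOIs plus relationship metadata" option can be sketched as follows. This is an illustration under assumptions, not JaLC's actual procedure: the `DataRecord` class, the in-memory `registry`, and the DOI values are hypothetical, while the relation names `IsNewVersionOf` and `IsPreviousVersionOf` are borrowed from the relation types of the DataCite Metadata Schema.

```python
from dataclasses import dataclass, field


@dataclass
class DataRecord:
    doi: str
    relations: list = field(default_factory=list)  # (relation_type, other_doi)


def register_new_version(registry, old_doi, new_doi):
    """Register new_doi as a new version of old_doi and link both ways."""
    old = registry[old_doi]
    new = DataRecord(new_doi, [("IsNewVersionOf", old_doi)])
    old.relations.append(("IsPreviousVersionOf", new_doi))
    registry[new_doi] = new
    return new


def latest_version(registry, doi):
    """Follow IsPreviousVersionOf links until no newer version exists."""
    current = doi
    while True:
        newer = [d for rel, d in registry[current].relations
                 if rel == "IsPreviousVersionOf"]
        if not newer:
            return current
        current = newer[-1]


# Placeholder DOIs, not real registrations.
registry = {"10.1234/data.v1": DataRecord("10.1234/data.v1")}
register_new_version(registry, "10.1234/data.v1", "10.1234/data.v2")
register_new_version(registry, "10.1234/data.v2", "10.1234/data.v3")
```

With this layout a citation of the first DOI stays valid and still points at the data as cited, while a reader can walk the links to the most recent version.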


  54. Issues in Data DOI (cont’d)
    • Landing page
    − Metadata description
    − For open/closed data
    • Quantity of data
    − Registering DOI for a large amount of data
    • Applications
    − Citing DOIs for research data
    − Developing other applications


  55. Recommendations for Data DOIs
    • Recognition of variety of the nature of data
    • Minimal Commitment
    – Persistency, Interoperability, Usability,
    manageability
    • Design own DOI registration policy


  56. ID for Researchers


  57. ORCID
    (Open Researcher and Contributor Identifier)
    • ID that uniquely identifies researchers and
    contributors to research
    • Managed by ORCID, Inc. (NPO) 2011-
    – Members: STM publishers, universities, funding
    agencies
    • Service started in October, 2012
    • How to use ORCID
    – When submitting manuscripts
    – Author information in articles
    – Faculty Management
    – …
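
An ORCID iD is 16 characters long and its last character is a check character computed with ISO 7064 MOD 11-2, so a system that accepts iDs (for example at manuscript submission) can catch typing errors before looking anything up. The function names below are hypothetical; the sketch checks only format and checksum, not whether the iD is actually registered.

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check the format and checksum of an ORCID iD like 0000-0002-2909-7163."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


# The iD from the title slide passes the check.
print(is_valid_orcid("0000-0002-2909-7163"))
```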



  61. Metadata Management


  62. Linked Data
    • Network of metadata
    • Sharing metadata among RAs
    – CrossRef
    – DataCite
    – (JaLC)
    [Diagram: example metadata network]
    Work (URI): Title "Midsummer Sun" (真夏の太陽); Date 1989; Image;
    Creator (link to Creator); Is_located_in (link to Place);
    Description "Come closer and you somehow want to peer in: look at
    'Walking in Silence' through 'Midnight Sun'. You can encounter the
    works in a way that differs from usual."
    Creator (URI): Name Isamu Noguchi; E-address [email protected]; Image; wikipedia
    Place (URI): Label Yokohama Museum; Address 3-4-1, Minato Mirai, Nishi-ku,
    Yokohama; Phone 045-221-0300; Category Museum; Image
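
A network of metadata like the one in this example can be written down as subject-predicate-object triples. The sketch below uses plain Python tuples rather than an RDF library; the `example.org` URIs and the helper names are placeholders invented for illustration, and the property names are taken from the diagram labels.

```python
# Each triple is (subject, predicate, object); example.org URIs are placeholders.
WORK = "http://example.org/work/1"
CREATOR = "http://example.org/person/1"
MUSEUM = "http://example.org/place/1"

triples = [
    (WORK, "Creator", CREATOR),
    (WORK, "Date", "1989"),
    (WORK, "Is_located_in", MUSEUM),
    (CREATOR, "Name", "Isamu Noguchi"),
    (MUSEUM, "Label", "Yokohama Museum"),
    (MUSEUM, "Phone", "045-221-0300"),
]


def objects(subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def follow(subject, *path):
    """Walk a chain of predicates from subject and return the end values."""
    frontier = [subject]
    for predicate in path:
        frontier = [o for node in frontier for o in objects(node, predicate)]
    return frontier


print(follow(WORK, "Is_located_in", "Label"))
```

Because the creator and the place are themselves identified by URIs, metadata held by different parties can be joined simply by following shared identifiers.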


  63. Summary
    • Open Science backed by data-sharing
    • Data-sharing architecture
    – Interoperability should be guaranteed
    – Layers
    • ID/Metadata Schema/Metadata/Data
    format/Data/Repository
    – Cooperation and Competition
    • DOI is a promising ID for data, but it is used
    differently from DOIs for literature
    – DOI registration policy is needed
