Netscity workshop - Geocollab Project

MarionMai
March 10, 2022

Transcript

  1. NETSCITY tutorial – GEOCOLLAB Workshop 2022
    Unveiling world scale scientific production and collaborations between
    cities

  2. GEOCOLLAB project
    Two main research areas:
    o Marine science (research on seaweeds and algae)
    o Gene editing
    o Else?
    Possible sources (what do we already have access to?):
    o Bibliographic databases (WoS, Scopus, …)
    o Conference data (to be identified)
    o Research project data (ERC, ANR, Research in Svalbard…)
    o Contracts (science-industry partnerships); Patents…
    o PhD theses (Theses.fr, etc.)

  3. Spatial information
    • Postal address:
    5 cours des Humanités, 93000, Aubervilliers, FR
    • Organisation (without a city field):
    University of Edinburgh, Scotland, UK (It can be in Roslin or Edinburgh)
    • Organisation + ROR/Grid ID (Dimensions.ai; OpenAlex)
    Caltech, https://ror.org/05dxps055
    Max Planck Society, https://ror.org/01hhn8329 (with many child institutes)

  4. Existing software for bibliometric mapping
    CiteSpace, Leydesdorff’s programs and Sci2 Tool
    o geocoding data at the street level
    o mapping network data using Google Earth Maps and Yahoo! Maps using KML files
    From Chen, 2016, A practical guide for mapping scientific literature

  5. Bibliographic data – what is the raw material we need?
    « CITY, PROVINCE, COUNTRY »
    city          province  country
    Monterotondo  RM        Italy
    Milan                   Italy
    Rome                    Italy
    Khania                  Greece
    POSTAL ADDRESSES
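
A minimal sketch of how a « CITY, PROVINCE, COUNTRY » string could be split into the three columns shown above, with the province left empty when it is missing. The names `Place` and `parse_place` are hypothetical and only serve the illustration; this is not how NETSCITY itself parses addresses.

```python
from typing import NamedTuple, Optional


class Place(NamedTuple):
    """One row of the city / province / country table (hypothetical type)."""
    city: str
    province: Optional[str]
    country: str


def parse_place(text: str) -> Place:
    """Split 'CITY, PROVINCE, COUNTRY' or 'CITY, COUNTRY' into fields."""
    # Drop surrounding whitespace and ignore empty fragments such as ", ,".
    parts = [p.strip() for p in text.split(",") if p.strip()]
    if len(parts) == 3:
        return Place(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        # No province given: keep the field empty rather than guessing it.
        return Place(parts[0], None, parts[1])
    raise ValueError(f"unexpected place string: {text!r}")
```

For example, `parse_place("Monterotondo, RM, Italy")` keeps `RM` as the province, while `parse_place("Khania, Greece")` leaves the province as `None`.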

  6. Other types of sources: e.g. conference attendance
    Netconf project
    With B. Bernela
    & F. Briatte

  7. Map by M. Maisonobe, CNRS. Data: LORD & TAI-NUI

  8. Data processing:
    Input: file with bibliographic metadata
    (sources: Web of Science, Scopus, or personal files in .csv)
    NETSCITY:
    1) Extraction of addresses
    2) Geocoding
    3) Clustering at the urban area/country levels
    Outputs:
    ● Maps
    ● Tables
    ● Export file
    Context

  9. Many heterogeneities, transliteration issues and data entry errors

  10. Why should we aggregate the geocoded data at the urban area level?
    The output of the geocoding process for 2012 Web of Science publications – Québec area

  11. Why should we aggregate the geocoded data at the urban area level?
    The output of the geocoding process for 2012 Web of Science publications – London area

  12. The case of Rome

  13. The variable administrative fragmentation of the territory at the world level

  14. Methods: grouping into agglomerations
    Issues
    ▪ Group together publication sites that are in the same urban area.
    ▪ Globally comparable urban areas, despite very different urban realities
    ➢ Search for a delimitation adapted to the urban phenomenon
    ➢ Delimitation by spatial crossing between urban population density and
    scientific publications’ spatial distribution
    Maisonobe, Jégou & Eckert, 2018, Delineating urban agglomerations across the world: a dataset for
    studying the spatial distribution of academic research at city level
    DOI : 10.4000/cybergeo.29637

  15. View Slide

  16. Counting methods:
    choosing between full and fractional counting
    References: Van Hooydonk, 1997; Gauffriau et al., 2008; Leydesdorff & Park, 2017
    • Full: every address/urban area/country involved in a publication receives a full credit of one
    • Fractional: the credit of each publication is split so that the fractions sum to one (avoiding double counts)
    → With NETSCITY, the reference unit for normalisation can be the address, the urban area
    or the country
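
The difference between the two counting methods can be sketched as follows. This is an illustration under stated assumptions, not NETSCITY's code: the function name `count_publications` is hypothetical, each publication is given as a list of entities (here urban areas), and an entity appearing several times in one publication is counted once.

```python
from collections import Counter, defaultdict


def count_publications(publications):
    """Return (full, fractional) publication counts per entity.

    publications: list of lists, one list of entities per publication.
    """
    full = Counter()
    fractional = defaultdict(float)
    for entities in publications:
        unique = set(entities)  # assumption: one credit per entity per publication
        if not unique:
            continue
        for entity in unique:
            full[entity] += 1                      # full counting: one credit each
            fractional[entity] += 1 / len(unique)  # fractional: credits sum to one
    return dict(full), dict(fractional)
```

With two publications, one signed from Paris, London and Rome and one from Paris only, the full counts add up to four while the fractional counts add up to two, the number of publications.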

  17. Counting methods:
    choosing between full and fractional counting
    Two variables can be normalised and mapped with NETSCITY:
    1. Number of publications/projects per geographical entity (normalised by the total number
    of geographical entities involved in a publication/project)
    2. Intensity of scientific collaboration between geographical entities (normalised by the total
    number of links between the geographical entities involved in a
    publication/project)
    For instance, if a given publication stems from three different urban areas, each inter-urban link receives
    1/3 as a weight for this publication. More generally, if a publication is co-signed from $n$ urban areas, each
    pair of urban areas (A, B), with A < B, is assigned a value $l$ equal to:
    $l = \frac{1}{n(n-1)/2} = \frac{2}{n(n-1)}$
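
The link weighting described on this slide can be sketched in a few lines. The function name `link_weights` and the input layout (one list of urban areas per publication) are assumptions made for the illustration; the weight 2/(n(n − 1)) per pair is the one given on the slide.

```python
from collections import defaultdict
from itertools import combinations


def link_weights(publications):
    """Sum fractional link weights over publications.

    publications: list of lists of urban areas, one list per publication.
    Returns a dict mapping each pair (A, B), with A < B, to its total weight.
    """
    weights = defaultdict(float)
    for areas in publications:
        unique = sorted(set(areas))  # sorting guarantees A < B in every pair
        n = len(unique)
        if n < 2:
            continue  # a single urban area produces no inter-urban link
        share = 2 / (n * (n - 1))  # one publication spread over n(n-1)/2 links
        for pair in combinations(unique, 2):
            weights[pair] += share
    return dict(weights)
```

For a publication signed from three urban areas, each of the three links receives 1/3, so the weights of one publication always sum to one.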

  18. Weighted projection method
    This method is the normalised counting method used in the web application NETSCITY (Maisonobe et al. 2019)
    A variant: « Newman » projection method (2001). See https://toreopsahl.com/tnet/two-mode-networks/
    (Figure: a 2-mode network projected onto a 1-mode network)
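
A hedged sketch of the projection from a 2-mode network (publications × urban areas) to a 1-mode network (urban areas only), comparing the two weightings mentioned on this slide. The function name `project` and its `variant` parameter are hypothetical. The "netscity" branch uses the 2/(n(n − 1)) weight given on the previous slide; the "newman" branch uses 1/(n − 1) per link, which is how Newman's (2001) method is usually stated.

```python
from collections import defaultdict
from itertools import combinations


def project(publications, variant="netscity"):
    """Project a publication x urban-area network onto urban areas."""
    if variant not in ("netscity", "newman"):
        raise ValueError(f"unknown variant: {variant!r}")
    weights = defaultdict(float)
    for areas in publications:
        nodes = sorted(set(areas))
        n = len(nodes)
        if n < 2:
            continue  # no link to create
        if variant == "netscity":
            share = 2 / (n * (n - 1))  # links of one publication sum to 1
        else:
            share = 1 / (n - 1)        # links around each node sum to 1
        for pair in combinations(nodes, 2):
            weights[pair] += share
    return dict(weights)
```

For a publication with three urban areas, each link weighs 1/3 under the first weighting and 1/2 under the Newman variant: the former keeps the total weight of a publication equal to one, the latter keeps the total weight attached to each urban area equal to one.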

  19. Query on the Web of Science Core Collection

  20. Export format (Tab-delimited)
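
Before uploading a tab-delimited export, it can be useful to check that the address column is filled in. The sketch below reads such a file with Python's standard `csv` module. It assumes that the first line holds the column names and that addresses sit in a column named `C1`; both the column name and the function name `read_addresses` are assumptions to check against the actual export.

```python
import csv


def read_addresses(handle, address_field="C1"):
    """Return the raw address string of each record in a tab-delimited export.

    handle: an open text file (or any iterable of lines).
    address_field: name of the address column (assumed, check your file).
    """
    # QUOTE_NONE keeps quotation marks inside titles from being misread.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row.get(address_field) or "" for row in reader]
```

A typical call would be `read_addresses(open("export.txt", encoding="utf-8-sig"))`, where `export.txt` is a placeholder file name and `utf-8-sig` drops a byte-order mark if the file starts with one.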

  21. https://www.irit.fr/netscity/
    The example of scientific production about Ectocarpus indexed in the WoS CC between 1920 & 2022

  22. Select the source format

  23. Wait while the geocoding runs

  24. Geocoding report

  25. Manual correction

  26. Export and/or check the address table

  27. View Slide

  28. Normalised nb of publications per country

  29. Normalised nb of publications per urban area

  30. Normalised nb of collab. bet. countries

  31. Normalised nb of collab. bet. urban areas

  32. Charts & histograms (.csv and .jpg can be exported)

  33. Stock map at the country level

  34. Stock map at the urban area level

  35. Flow map bet. countries

  36. Map of fractional collaborations between urban areas

  37. Network of collab. bet. countries

  38. Network of collab. bet. urban areas

  39. Application case
    Go to the page: https://geoscimo.univ-tlse2.fr/analysis-of-the-ectocarpus-corpus-march-2022/

  40. Seaweeds and algae research
    Topic query:
    TS = (seaweed* OR alga OR kelp* OR algae* OR algal* OR seagrass* OR sea plant* OR phyco*
    OR Chlorell* OR Protothec* OR Charophy* OR Chlorophyt* OR Rhodophy* OR Cryptophy* OR
    Haptophy* OR Charophy* OR Chlorarachniophy* OR Glaucophy* OR rockweed* OR dulse* OR
    dillisk* OR dilsk* OR carragheen moss* OR sea lettuce* OR Chondrus)
    Should we add: microalga, macroalga, phytoplankton, cyanobacteria? Else?
    See: https://www.sciencedirect.com/science/article/pii/B9780128170762000135

  41. The scientific production on algae and seaweeds in Scotland

  42. Distribution of Scotland’s publications on algae per scientific specialities

  43. View Slide

  44. View Slide

  45. View Slide

  46. Prospects
    • Collaborative space: internal repository for the project with
    useful datasets and scripts
    • R workshop and tutorial (tidyverse + cartography)
    • Issue of corpus delineation

  47. Reference
    Marion Maisonobe, Laurent Jégou, Nikita Yakimovich, Guillaume Cabanac (2019).
    NETSCITY: a geospatial application to analyse and map world scale production and collaboration data between cities.
    In ISSI’19: Proceedings of the 17th International Conference on Scientometrics and Informetrics, Tome 1, p. 631-642, Rome: Edizioni Efesto.
