NETSCITY tutorial – GEOCOLLAB Workshop 2022 Unveiling world scale scientific production and collaborations between cities

GEOCOLLAB project
Two main research areas:
o Marine science (research on seaweeds and algae)
o Gene editing
o Else?
Possible sources (what do we already have access to?):
o Bibliographic databases (WoS, Scopus, …)
o Conference data (to be identified)
o Research projects data (ERC, ANR, Research in Svalbard…)
o Contracts (science-industry partnerships); patents…
o PhD theses (, etc.)

Spatial information
• Postal address: 5 cours des Humanités, 93000, Aubervilliers, FR
• Organisation (without a city field): University of Edinburgh, Scotland, UK (it can be in Roslin or Edinburgh)
• Organisation + ROR/GRID ID (; OpenAlex): Caltech, Max Planck Society (with many child institutes)

Existing software for bibliometric mapping
CiteSpace, Leydesdorff’s programs and Sci2 Tool:
o geocoding data at the street level
o mapping network data with Google Earth Maps and Yahoo! Maps via KML files
From Chen, 2016, A Practical Guide for Mapping Scientific Literature

Bibliographic data – what is the raw material we need?
« CITY, PROVINCE, COUNTRY »

city          province  country
Monterotondo  RM        Italy
Milan         -         Italy
Rome          -         Italy
Khania        -         Greece

POSTAL ADDRESSES
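As an illustration of how such « CITY, PROVINCE, COUNTRY » strings could be split into fields, here is a minimal Python sketch. The `Place` type and the `parse_place` helper are hypothetical names introduced for this example; they are not part of NETSCITY, and real address strings are usually messier than this.

```python
from typing import NamedTuple, Optional


class Place(NamedTuple):
    """Hypothetical record for one parsed location string."""
    city: str
    province: Optional[str]
    country: str


def parse_place(raw: str) -> Place:
    """Split a 'CITY, PROVINCE, COUNTRY' string; the province may be absent."""
    parts = [p.strip() for p in raw.split(",") if p.strip()]
    if len(parts) == 3:
        city, province, country = parts
        return Place(city, province, country)
    if len(parts) == 2:
        city, country = parts
        return Place(city, None, country)
    raise ValueError(f"Unexpected place string: {raw!r}")


rows = ["Monterotondo, RM, Italy", "Milan, Italy", "Rome, Italy", "Khania, Greece"]
places = [parse_place(r) for r in rows]
for place in places:
    print(place)
```

Strings with fewer than two or more than three fields are rejected rather than guessed at, which leaves them for manual correction.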

Other types of sources: e.g. conference attendance Netconf project With B. Bernela & F. Briatte

Map by M. Maisonobe, CNRS. Data: LORD & TAI-NUI

Data processing:
1) Extraction of addresses
2) Geocoding
3) Clustering at the urban area/country levels
Input: file with bibliographic metadata (sources: Web of Science, Scopus, or personal files in .csv)
NETSCITY outputs:
● Maps
● Tables
● Export file
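The clustering step can be pictured with a small Python sketch. It assumes the addresses have already been extracted and geocoded to a (locality, country) pair. The `URBAN_AREA` lookup table is entirely hypothetical and only serves the illustration; NETSCITY relies on its own delineation of urban areas, not on a table like this one.

```python
from collections import Counter

# Hypothetical lookup from (locality, country) to an urban area label.
# These assignments are assumptions made for the example only.
URBAN_AREA = {
    ("Roslin", "UK"): "Edinburgh",
    ("Edinburgh", "UK"): "Edinburgh",
    ("Monterotondo", "Italy"): "Rome",
    ("Rome", "Italy"): "Rome",
}


def to_urban_area(locality: str, country: str) -> str:
    """Return the urban area of a locality, falling back to the locality itself."""
    return URBAN_AREA.get((locality, country), locality)


# Geocoded addresses, one (locality, country) pair per address.
addresses = [
    ("Roslin", "UK"),
    ("Edinburgh", "UK"),
    ("Monterotondo", "Italy"),
    ("Rome", "Italy"),
    ("Milan", "Italy"),
]

# Number of addresses per urban area after clustering.
counts = Counter(to_urban_area(loc, country) for loc, country in addresses)
print(counts)
```

Grouping localities this way means that two addresses in neighbouring municipalities are counted under the same urban area instead of appearing as separate places.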

Many heterogeneities, transliteration issues and data entry errors

Why should we aggregate the geocoded data at the urban area level? The output of the geocoding process for 2012 Web of Science publications – Québec area

Why should we aggregate the geocoded data at the urban area level? The output of the geocoding process for 2012 Web of Science publications – London area

The case of Rome

The variable administrative fragmentation of the territory at the world level

Methods: grouping into agglomerations
Issues
▪ Group together publication sites that are in the same urban area.
▪ Globally comparable urban areas, despite very different urban realities
➢ Search for a delimitation adapted to the urban phenomenon
➢ Delimitation by spatially overlaying urban population density with the spatial distribution of scientific publications
Maisonobe, Jégou & Eckert, 2018, Delineating urban agglomerations across the world: a dataset for studying the spatial distribution of academic research at city level. DOI: 10.4000/cybergeo.29637

Counting methods: arbitrating between full & fractional counting
References: Van Hooydonk, 1997; Gauffriau et al., 2008; Leydesdorff & Park, 2017
• Full: each address/urban area/country involved in a publication receives one full credit, so a publication counts as many times as it has addresses/urban areas/countries
• Fractional: the credit for a publication is split so that the fractions sum to one (avoiding double counts)
→ With NETSCITY the reference unit for normalisation can be the address, the urban area or the country
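The difference between the two counting methods can be made concrete with a short Python sketch. It takes the urban area as the reference unit and uses made-up publications. The function name `count_publications` is hypothetical; this is an illustration of the principle, not NETSCITY's code.

```python
from collections import defaultdict
from fractions import Fraction


def count_publications(pubs):
    """Return (full, fractional) publication counts per urban area.

    `pubs` is a list of publications, each given as a list of urban areas.
    """
    full = defaultdict(int)
    fractional = defaultdict(Fraction)
    for areas in pubs:
        distinct = set(areas)  # an urban area is credited once per publication
        for area in distinct:
            full[area] += 1
            fractional[area] += Fraction(1, len(distinct))
    return dict(full), dict(fractional)


# Made-up example: three publications signed from one to three urban areas.
pubs = [
    ["Paris", "Edinburgh", "Rome"],
    ["Paris", "Edinburgh"],
    ["Paris"],
]
full, frac = count_publications(pubs)
print(full)
print(frac)
print(sum(frac.values()))  # equals the number of publications
```

With full counting the credits add up to more than the number of publications (6 credits for 3 publications here), whereas the fractional credits add up to exactly 3.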

Counting methods: arbitrating between full & fractional counting
Two variables can be normalised and mapped with NETSCITY:
1. Number of publications/projects per geographical entity (normalised by the total number of geographical entities involved in a publication/project)
2. Intensity of scientific collaboration between geographical entities (normalised by the total number of links between the geographical entities involved in a publication/project)
For instance, if a given publication stems from three different urban areas, each inter-urban link receives 1/3 as a weight for this publication. More generally, if a publication is co-signed from 𝑛 urban areas, each pair of urban areas (A, B), with A < B, is assigned a value 𝑙 equal to: 𝑙 = 1 / (𝑛(𝑛 − 1)/2) = 2 / (𝑛(𝑛 − 1))
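The link weight 2 / (n(n − 1)) can be checked with a small Python sketch that accumulates it over made-up publications. The function name `link_weights` is hypothetical, and pairs are stored in alphabetical order to respect the A < B convention.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def link_weights(pubs):
    """Sum the fractional weight of each inter-urban link over all publications."""
    weights = defaultdict(Fraction)
    for areas in pubs:
        distinct = sorted(set(areas))  # sorted so that each pair satisfies A < B
        n = len(distinct)
        if n < 2:
            continue  # a single urban area creates no inter-urban link
        w = Fraction(2, n * (n - 1))  # 1 divided by the number of pairs
        for a, b in combinations(distinct, 2):
            weights[(a, b)] += w
    return dict(weights)


# Made-up example: one publication from three urban areas, one from two.
pubs = [
    ["Paris", "Edinburgh", "Rome"],
    ["Paris", "Edinburgh"],
]
weights = link_weights(pubs)
for pair, w in sorted(weights.items()):
    print(pair, w)
```

Each collaborative publication distributes a total weight of one across its links, so the weights above sum to 2, the number of multi-area publications in the example.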

Weighted projection method
This method is the normalised counting method used in the web application NETSCITY (Maisonobe et al., 2019).
A variant: « Newman » projection method (2001). See
[Figure: projection from a 2-mode to a 1-mode network]
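To see how the two projections differ, the sketch below compares the weight given to each link of a publication with n urban areas. It assumes that the « Newman » variant refers to the weighting in which each link of an n-member publication receives 1/(n − 1); that reading is an assumption of this example, as are the function names.

```python
from fractions import Fraction


def weighted_projection_link(n: int) -> Fraction:
    """Per-link weight 2 / (n(n - 1)): the links of one publication sum to one."""
    if n < 2:
        raise ValueError("at least two urban areas are needed to form a link")
    return Fraction(2, n * (n - 1))


def newman_link(n: int) -> Fraction:
    """Per-link weight 1 / (n - 1), assumed here to be the Newman (2001) variant."""
    if n < 2:
        raise ValueError("at least two urban areas are needed to form a link")
    return Fraction(1, n - 1)


def total_per_publication(link_weight: Fraction, n: int) -> Fraction:
    """Total weight one publication spreads over its n(n - 1)/2 links."""
    return link_weight * (n * (n - 1) // 2)


for n in (2, 3, 5):
    wp = weighted_projection_link(n)
    nm = newman_link(n)
    print(n, wp, nm, total_per_publication(wp, n), total_per_publication(nm, n))
```

Under this assumption, the weighted projection gives every publication a total link weight of one, while the Newman-style weighting gives a total of n/2, so publications with many urban areas weigh more in the resulting network.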

Query on the Web of Science Core Collection

Export format (Tab-delimited)

Select the source format

Wait while the geocoding runs

Geocoding report

Manual correction

Export and/or check the address table

Normalised number of publications per country

Normalised number of publications per urban area

Normalised number of collaborations between countries

Normalised number of collaborations between urban areas

Charts & histograms (.csv and .jpg can be exported)

Stock map at the country level

Stock map at the urban area level

Flow map between countries

Map of fractional collaborations between urban areas

Network of collaborations between countries

Network of collaborations between urban areas

Application case
Go to the page: ectocarpus-corpus-march-2022/

Seaweeds and algae research
Topic query: TS = (seaweed* OR alga OR kelp* OR algae* OR algal* OR seagrass* OR sea plant* OR phyco* OR Chlorell* OR Protothec* OR Charophy* OR Chlorophyt* OR Rhodophy* OR Cryptophy* OR Haptophy* OR Chlorarachniophy* OR Glaucophy* OR rockweed* OR dulse* OR dillisk* OR dilsk* OR carragheen moss* OR sea lettuce* OR Chondrus)
Should we add: microalga, macroalga, phytoplankton, cyanobacteria? Else?
See:
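A long term list like this one is easier to maintain when the query string is assembled from a list. The Python sketch below is one possible way to do it: it drops repeated terms and joins the rest with OR. The helper `build_topic_query` is a hypothetical name, and the term list is a shortened sample with one deliberately repeated entry to show the deduplication.

```python
def build_topic_query(terms):
    """Assemble a 'TS = (... OR ...)' topic query, keeping each term once."""
    unique = list(dict.fromkeys(terms))  # preserves order, drops repeats
    return "TS = (" + " OR ".join(unique) + ")"


# Shortened sample of terms; "Charophy*" is repeated on purpose.
terms = [
    "seaweed*",
    "alga",
    "kelp*",
    "Charophy*",
    "Chlorophyt*",
    "Charophy*",
    "sea lettuce*",
    "Chondrus",
]
query = build_topic_query(terms)
print(query)
```

Candidate additions such as microalga or phytoplankton can then be tested by appending them to the list and regenerating the query.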

The scientific production on algae and seaweeds in Scotland

Distribution of Scotland’s publications on algae per scientific specialities

Prospects
• Collaborative space: internal repository for the project with useful datasets and scripts
• R workshop and tutorial (tidyverse + cartography)
• Issue of corpus delineation

Reference
Marion Maisonobe, Laurent Jégou, Nikita Yakimovich, Guillaume Cabanac (2019). NETSCITY: a geospatial application to analyse and map world scale production and collaboration data between cities. In ISSI’19: Proceedings of the 17th International Conference on Scientometrics and Informetrics, Vol. 1, pp. 631–642, Rome: Edizioni Efesto.