Slide 1

Slide 1 text

NETSCITY tutorial – GEOCOLLAB Workshop 2022
Unveiling world-scale scientific production and collaborations between cities

Slide 2

Slide 2 text

GEOCOLLAB project
Two main research areas:
o Marine science (research on seaweeds and algae)
o Gene editing
o Else?
Possible sources (what do we already have access to?):
o Bibliographic databases (WoS, Scopus, …)
o Conference data (to be identified)
o Research project data (ERC, ANR, Research in Svalbard…)
o Contracts (science-industry partnerships); patents…
o PhD theses (Theses.fr, etc.)

Slide 3

Slide 3 text

Spatial information
• Postal address: 5 cours des Humanités, 93000, Aubervilliers, FR
• Organisation (without a city field): University of Edinburgh, Scotland, UK (it can be in Roslin or Edinburgh)
• Organisation + ROR/GRID ID (Dimensions.ai; OpenAlex):
  Caltech, https://ror.org/05dxps055
  Max Planck Society, https://ror.org/01hhn8329 (with many child institutes)

Slide 4

Slide 4 text

Existing software for bibliometric mapping
CiteSpace, Leydesdorff’s programs and Sci2 Tool:
o geocoding data at the street level
o mapping network data using Google Earth Maps and Yahoo! Maps using KML files
From Chen, 2016, A practical guide for mapping scientific literature

Slide 5

Slide 5 text

Bibliographic data – what is the raw material we need?
POSTAL ADDRESSES
« CITY, PROVINCE, COUNTRY »
city          province  country
Monterotondo  RM        Italy
Milan                   Italy
Rome                    Italy
Khania                  Greece
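A minimal Python sketch of how a « CITY, PROVINCE, COUNTRY » string could be split into the three columns shown on this slide. The function name `split_place` and the rule that the province field is simply optional are assumptions made for illustration; this is not NETSCITY's own parser.

```python
def split_place(place: str) -> dict:
    """Split 'CITY, PROVINCE, COUNTRY' (province optional) into its parts."""
    parts = [p.strip() for p in place.split(",") if p.strip()]
    if len(parts) not in (2, 3):
        raise ValueError(f"expected 'CITY[, PROVINCE], COUNTRY', got {place!r}")
    city, country = parts[0], parts[-1]
    # The province is only present when three fields are given.
    province = parts[1] if len(parts) == 3 else ""
    return {"city": city, "province": province, "country": country}


rows = [
    split_place(s)
    for s in ["Monterotondo, RM, Italy", "Milan, Italy", "Khania, Greece"]
]
```

Real postal addresses carry more fields (street, postcode, institution), so a production parser would need rules per country; the sketch only covers the already-trimmed tail of the address.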

Slide 6

Slide 6 text

Other types of sources: e.g. conference attendance
Netconf project, with B. Bernela & F. Briatte

Slide 7

Slide 7 text

Map by M. Maisonobe, CNRS. Data: LORD & TAI-NUI

Slide 8

Slide 8 text

Context
Input: file with bibliographic metadata (sources: Web of Science, Scopus, or personal files in .csv)
Data processing in NETSCITY:
1) Extraction of addresses
2) Geocoding
3) Clustering at the urban area/country levels
Outputs:
● Maps
● Tables
● Export file
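The three processing steps can be sketched in Python as follows. Everything here is an assumption for illustration: the two lookup tables are hypothetical stand-ins for a real gazetteer and for the urban-area delineation, the coordinates are approximate, and assigning Roslin to the Edinburgh urban area is a hypothetical choice, not a statement about NETSCITY's perimeter.

```python
from collections import Counter

# Hypothetical gazetteer: (city, country) -> approximate (lat, lon).
GAZETTEER = {
    ("edinburgh", "uk"): (55.95, -3.19),
    ("roslin", "uk"): (55.86, -3.16),
}
# Hypothetical lookup: geocoded locality -> urban area label.
URBAN_AREA = {
    ("edinburgh", "uk"): "Edinburgh",
    ("roslin", "uk"): "Edinburgh",
}


def extract(address: str) -> tuple:
    """Step 1: keep the last two comma-separated fields as (city, country)."""
    parts = [p.strip().lower() for p in address.split(",")]
    return parts[-2], parts[-1]


def geocode(key: tuple):
    """Step 2: look up coordinates; returns None for an unknown locality."""
    return GAZETTEER.get(key)


def cluster(key: tuple) -> str:
    """Step 3: map a locality to its urban area, or flag it as unassigned."""
    return URBAN_AREA.get(key, "unassigned")


addresses = ["Roslin Institute, Roslin, UK", "Univ Edinburgh, Edinburgh, UK"]
keys = [extract(a) for a in addresses]
coords = [geocode(k) for k in keys]
areas = Counter(cluster(k) for k in keys)
```

With these lookups, both addresses end up in the same urban area even though they were geocoded to two different localities.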

Slide 9

Slide 9 text

Many heterogeneities, transliteration issues and data entry errors

Slide 10

Slide 10 text

Why should we aggregate the geocoded data at the urban area level?
The output of the geocoding process for 2012 Web of Science publications – Québec area

Slide 11

Slide 11 text

Why should we aggregate the geocoded data at the urban area level?
The output of the geocoding process for 2012 Web of Science publications – London area

Slide 12

Slide 12 text

The case of Rome

Slide 13

Slide 13 text

The variable administrative fragmentation of the territory at the world level

Slide 14

Slide 14 text

Methods: grouping into agglomerations
Issues
▪ Group together publication sites that are in the same urban area.
▪ Obtain globally comparable urban areas, despite very different urban realities.
➢ Search for a delimitation adapted to the urban phenomenon
➢ Delimitation by spatial overlay of urban population density and the spatial distribution of scientific publications
Maisonobe, Jégou & Eckert, 2018, Delineating urban agglomerations across the world: a dataset for studying the spatial distribution of academic research at city level. DOI: 10.4000/cybergeo.29637

Slide 16

Slide 16 text

Counting methods: choosing between full and fractional counting
References: Van Hooydonk, 1997; Gauffriau et al., 2008; Leydesdorff & Park, 2017
• Full: each address/urban area/country involved in a publication receives one full credit, so a publication is counted once per entity
• Fractional: the credit is split between the entities so that the fractions sum to one per publication (avoiding double counts)
→ With NETSCITY, the reference unit for normalisation can be the address, the urban area or the country
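A short Python sketch contrasting the two counting methods. It assumes the reference unit is the urban area and that each publication is given as a list of urban areas; the function name `count_publications` and the three sample publications are invented for illustration.

```python
from collections import defaultdict


def count_publications(publications, fractional=True):
    """Credit each urban area for the publications it takes part in.

    publications: list of lists of urban areas (one list per publication).
    fractional=False gives full counting (one credit per area);
    fractional=True splits one credit equally between the areas.
    """
    totals = defaultdict(float)
    for areas in publications:
        unique = set(areas)  # an area is credited once per publication
        if not unique:
            continue
        credit = 1.0 / len(unique) if fractional else 1.0
        for area in unique:
            totals[area] += credit
    return dict(totals)


pubs = [["Paris", "Edinburgh", "Rome"], ["Paris"], ["Paris", "Rome"]]
full = count_publications(pubs, fractional=False)
frac = count_publications(pubs, fractional=True)
```

With full counting the credits add up to 6 for 3 publications; with fractional counting they add up to 3, one per publication.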

Slide 17

Slide 17 text

Counting methods: choosing between full and fractional counting
Two variables can be normalised and mapped with NETSCITY:
1. Number of publications/projects per geographical entity (normalised by the total number of geographical entities involved in a publication/project)
2. Intensity of scientific collaboration between geographical entities (normalised by the total number of links between the geographical entities involved in a publication/project)
For instance, if a given publication stems from three different urban areas, each inter-urban link receives a weight of 1/3 for this publication. More generally, if a publication is co-signed from $n$ urban areas, each pair of urban areas (A, B), with A < B, is assigned a value $l$ equal to:
$$l = \frac{1}{n(n-1)/2} = \frac{2}{n(n-1)}$$
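The link weighting above can be sketched in Python as follows. The function name `link_weights` and the input layout (one list of urban areas per publication) are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def link_weights(publications):
    """Sum, over publications, a weight of 2 / (n * (n - 1)) per pair of areas."""
    weights = defaultdict(float)
    for areas in publications:
        unique = sorted(set(areas))  # sorting enforces A < B within each pair
        n = len(unique)
        if n < 2:
            continue  # a single-area publication creates no inter-urban link
        w = 2.0 / (n * (n - 1))
        for a, b in combinations(unique, 2):
            weights[(a, b)] += w
    return dict(weights)


one_publication = link_weights([["Rome", "Milan", "Khania"]])
```

For a publication from three urban areas there are three pairs, each weighted 1/3, so the publication contributes a total link weight of one.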

Slide 18

Slide 18 text

Weighted projection method
This method is the normalised counting method used in the web application NETSCITY (Maisonobe et al. 2019).
A variant: the « Newman » projection method (2001). See https://toreopsahl.com/tnet/two-mode-networks/
[Figure: projection of a 2-mode network onto a 1-mode network]
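A hedged Python sketch of the projection from a 2-mode network (publications and urban areas) to a 1-mode network (urban areas only). The normalised weight is the 2/(n(n − 1)) value given earlier in the deck; for the Newman variant the sketch assumes a weight of 1/(n − 1) per pair, which is how that method is usually described. Function names and the toy edge list are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations


def project(edges, weight):
    """Project (publication, urban_area) edges onto weighted area-area links."""
    members = defaultdict(set)
    for publication, area in edges:
        members[publication].add(area)
    one_mode = defaultdict(float)
    for areas in members.values():
        n = len(areas)
        if n < 2:
            continue  # no link to create for a single-area publication
        for pair in combinations(sorted(areas), 2):
            one_mode[pair] += weight(n)
    return dict(one_mode)


def normalised_weight(n):
    """Each publication contributes a total link weight of one."""
    return 2.0 / (n * (n - 1))


def newman_weight(n):
    """Assumed form of the Newman (2001) weight: 1 / (n - 1) per pair."""
    return 1.0 / (n - 1)


edges = [("p1", "A"), ("p1", "B"), ("p1", "C"), ("p2", "A"), ("p2", "B")]
normalised = project(edges, normalised_weight)
newman = project(edges, newman_weight)
```

The two weightings agree for publications with two urban areas and diverge as soon as three or more are involved.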

Slide 19

Slide 19 text

Query on the Web of Science Core Collection

Slide 20

Slide 20 text

Export format (Tab-delimited)
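A minimal Python sketch of reading such a tab-delimited export and pulling out the addresses. It assumes the address field is the column tagged `C1` and that each address is preceded by a bracketed author list; the sample record is entirely invented, and `addresses_from_c1` is a hypothetical helper name.

```python
import csv
import io
import re

HEADER = ["PT", "AU", "TI", "C1"]
ROW = [
    "J",
    "Doe, A; Roe, B",
    "An invented title",
    "[Doe, A] Univ A, Dept X, Edinburgh, Scotland; [Roe, B] Inst B, Paris, France",
]
# Invented one-record sample standing in for a real export file.
SAMPLE = "\t".join(HEADER) + "\n" + "\t".join(ROW) + "\n"


def addresses_from_c1(c1: str) -> list:
    """Drop the bracketed author lists, then split the addresses on ';'."""
    without_authors = re.sub(r"\[[^\]]*\]", "", c1)
    return [a.strip() for a in without_authors.split(";") if a.strip()]


reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t", quoting=csv.QUOTE_NONE)
records = [addresses_from_c1(row["C1"]) for row in reader]
```

The author lists are removed before splitting because they contain semicolons themselves. For a real file, replace the `io.StringIO` object with an opened file handle.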

Slide 21

Slide 21 text

https://www.irit.fr/netscity/
The example of scientific production about Ectocarpus indexed in the WoS CC between 1920 and 2022

Slide 22

Slide 22 text

Select the source format

Slide 23

Slide 23 text

Wait while the geocoding runs

Slide 24

Slide 24 text

Geocoding report

Slide 25

Slide 25 text

Manual correction

Slide 26

Slide 26 text

Export and/or check the address table

Slide 28

Slide 28 text

Normalised number of publications per country

Slide 29

Slide 29 text

Normalised number of publications per urban area

Slide 30

Slide 30 text

Normalised number of collaborations between countries

Slide 31

Slide 31 text

Normalised number of collaborations between urban areas

Slide 32

Slide 32 text

Charts and histograms (.csv and .jpg files can be exported)

Slide 33

Slide 33 text

Stock map at the country level

Slide 34

Slide 34 text

Stock map at the urban area level

Slide 35

Slide 35 text

Flow map between countries

Slide 36

Slide 36 text

Map of fractional collaborations between urban areas

Slide 37

Slide 37 text

Network of collaborations between countries

Slide 38

Slide 38 text

Network of collaborations between urban areas

Slide 39

Slide 39 text

Application case
Go to the page: https://geoscimo.univ-tlse2.fr/analysis-of-the-ectocarpus-corpus-march-2022/

Slide 40

Slide 40 text

Seaweeds and algae research
Topic query: TS = (seaweed* OR alga OR kelp* OR algae* OR algal* OR seagrass* OR sea plant* OR phyco* OR Chlorell* OR Protothec* OR Charophy* OR Chlorophyt* OR Rhodophy* OR Cryptophy* OR Haptophy* OR Charophy* OR Chlorarachniophy* OR Glaucophy* OR rockweed* OR dulse* OR dillisk* OR dilsk* OR carragheen moss* OR sea lettuce* OR Chondrus)
Should we add: microalga, macroalga, phytoplankton, cyanobacteria? Else?
See: https://www.sciencedirect.com/science/article/pii/B9780128170762000135
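Since the term list is still under discussion, it may help to keep it as data and generate the query string from it. The Python sketch below does this and drops repeated terms (the list on this slide contains Charophy* twice). The helper name `build_topic_query` is an assumption; the terms are copied from the slide.

```python
TERMS = [
    "seaweed*", "alga", "kelp*", "algae*", "algal*", "seagrass*",
    "sea plant*", "phyco*", "Chlorell*", "Protothec*", "Charophy*",
    "Chlorophyt*", "Rhodophy*", "Cryptophy*", "Haptophy*", "Charophy*",
    "Chlorarachniophy*", "Glaucophy*", "rockweed*", "dulse*", "dillisk*",
    "dilsk*", "carragheen moss*", "sea lettuce*", "Chondrus",
]


def build_topic_query(terms):
    """Join the terms with OR inside a TS = (...) topic query, without repeats."""
    unique = list(dict.fromkeys(terms))  # keeps the first occurrence, in order
    return "TS = (" + " OR ".join(unique) + ")"


unique_terms = list(dict.fromkeys(TERMS))
query = build_topic_query(TERMS)
```

Candidate additions such as microalga or phytoplankton can then be tested by appending them to `TERMS` and regenerating the query.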

Slide 41

Slide 41 text

The scientific production on algae and seaweeds in Scotland

Slide 42

Slide 42 text

Distribution of Scotland’s publications on algae by scientific speciality

Slide 46

Slide 46 text

Prospects
• Collaborative space: internal repository for the project with useful datasets and scripts
• R workshop and tutorial (tidyverse + cartography)
• Issue of corpus delineation

Slide 47

Slide 47 text

Reference
Marion Maisonobe, Laurent Jégou, Nikita Yakimovich, Guillaume Cabanac (2019). NETSCITY: a geospatial application to analyse and map world scale production and collaboration data between cities. In ISSI’19: Proceedings of the 17th International Conference on Scientometrics and Informetrics, Tome 1, p. 631-642, Rome: Edizioni Efesto.