Slide 1

Slide 1 text

Long-term Progress of DOI Links on Wikipedia
Comparative Analysis of English and Japanese Wikipedia from 2015 to 2023

A-LIEP 2023 – Session 1: Organization of Data, Information, and Knowledge

Jiro Kikkawa, Masao Takaku, Fuyuki Yoshikane
{ jiro, masao, fuyuki } @
University of Tsukuba, Japan

Slide 2

Slide 2 text

Background #1

Mass digitization of scholarly communication
• Various communities and people, including readers beyond the traditional audience of researchers and specialists, can utilize scholarly documents.
• Wikipedia offers numerous references and access to scholarly documents.

Scholarly references on Wikipedia
• complement and improve the quality of Wikipedia content.

Figure. Example of a scholarly reference on English Wikipedia (article "Library and information science"). The article text cites Chua & Yang (2008) as reference [10], and the corresponding entry in the reference list carries a DOI link:
10. Chua, Alton Y.K.; Yang, Christopher C. (November 2008). "The shift towards multi-disciplinarity in information science". Journal of the American Society for Information Science and Technology. 59 (13): 2156–2170. doi:10.1002/asi.20929.

Slide 3

Slide 3 text

Background #2

Few studies have focused on long-term observation and analysis of scholarly communication on the Web
• Traditional citation analysis in bibliometrics has focused on long-term analysis of citation relationships among scholarly articles (e.g., cited half-life / citation half-life).
• However, long-term observation and analysis of scholarly communication through the Web, including long-term trends in scholarly references on Wikipedia, is still lacking.

Slide 4

Slide 4 text

Background #3

Most DOI links on Japanese Wikipedia articles were imported through translation from English Wikipedia articles

Result: Overlap analysis of unique DOI links between the two language Wikipedias (Kikkawa, Takaku, & Yoshikane, 2016; ICADL 2016)

| Target | jawiki - enwiki | % | enwiki - jawiki | % |
|---|---|---|---|---|
| Difference set | 5,259 | 20.7 | 499,551 | 96.1 |
| Product set | 20,185 | 79.3 | 20,185 | 3.9 |
| Total | 25,444 | 100.0 | 519,736 | 100.0 |

Figure. Venn diagram of DOI links in jawiki and DOI links in enwiki.

• As of March 2015, out of 25,444 unique DOI links referenced on Japanese Wikipedia (jawiki), 79.3% (20,185 DOI links) overlapped with those referenced on English Wikipedia (enwiki).
• The majority of these DOI links were added by translating English Wikipedia articles.

Slide 5

Slide 5 text

Background #3

Most DOI links on Japanese Wikipedia articles were imported through translation from English Wikipedia articles
• Whether this trend has continued over time is unclear
• The number of DOIs registered by Japan Link Center, the only DOI registration agency in Japan, has been increasing rapidly since 2016
  - 3 million DOIs as of January 2016 → 11 million DOIs as of October 2023
  - The number of scholarly references corresponding to these DOIs could be growing in Japanese Wikipedia

We clarify long-term progress in scholarly references on Wikipedia through a comparative analysis of DOI links referenced on English and Japanese Wikipedia as of March 2015 and July 2023

Slide 6

Slide 6 text

Purpose

We clarify long-term progress in scholarly references on Wikipedia through a comparative analysis of DOI links referenced on English and Japanese Wikipedia as of March 2015 and July 2023

We examined the following topics:
1. changes in the number of DOI links and Wikipedia articles containing them
2. changes in the overlap of DOI links within/between English and Japanese Wikipedia from 2015 to 2023
3. the top contents highly referenced on English and Japanese Wikipedia in 2015 and 2023
4. the relationship between DOI registration year and published year of each content and addition to Wikipedia articles

Slide 7

Slide 7 text

Materials and Methods #1

Step 1: Wikipedia dump files → pairs of article title and DOI

| Article title | DOI |
|---|---|
| Cardigan Welsh Corgi | 10.1093/jhered/esn085 |
| Kominkan | 10.1241/johokanri.54.808 |

• We extracted DOI links and the Wikipedia articles containing them from Wikipedia dump files.
• We filtered the target to the DOI links referenced in the main namespace, i.e., the namespace for encyclopedia articles.
• The targets were Japanese Wikipedia as of March 13, 2015, and July 1, 2023, and English Wikipedia as of March 4, 2015, and July 1, 2023.
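Step 1 can be sketched roughly as follows. The slides do not say how the DOIs were pulled out of the dump files, so the regular expression, the function names `extract_dois` and `doi_links_by_article`, and the wikitext snippets are all illustrative assumptions, not the authors' pipeline. The only fixed point is that the main (article) namespace has number 0 in MediaWiki.

```python
import re
from urllib.parse import unquote

# Hypothetical DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
# The suffix stops at whitespace and at characters that delimit wikitext or URLs.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s|\]}<>"]+')

MAIN_NAMESPACE = 0  # namespace number of encyclopedia articles in MediaWiki


def extract_dois(wikitext: str) -> list[str]:
    """Return the DOIs found in one page's wikitext, in order, without duplicates."""
    found = []
    for match in DOI_PATTERN.finditer(unquote(wikitext)):
        # Drop trailing punctuation and normalize case (DOIs are case-insensitive).
        doi = match.group(0).rstrip(".,;)").lower()
        if doi not in found:
            found.append(doi)
    return found


def doi_links_by_article(pages):
    """pages: iterable of (namespace, title, wikitext). Keep main-namespace pages only."""
    links = []
    for namespace, title, wikitext in pages:
        if namespace != MAIN_NAMESPACE:
            continue
        for doi in extract_dois(wikitext):
            links.append((title, doi))
    return links


# Hand-written snippets standing in for pages read from a dump file.
sample_pages = [
    (0, "Cardigan Welsh Corgi", "{{cite journal |doi=10.1093/jhered/esn085}}"),
    (0, "Kominkan", "[https://doi.org/10.1241/johokanri.54.808 source]"),
    (1, "Talk:Kominkan", "see doi:10.1000/182."),
]
print(doi_links_by_article(sample_pages))
```

The talk page is skipped because it is outside the main namespace, which mirrors the filtering described on the slide.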

Slide 8

Slide 8 text

Materials and Methods #2

Step 2: Which RA?

| DOI | RA |
|---|---|
| 10.1093/jhered/esn085 | Crossref |
| 10.1241/johokanri.54.808 | JaLC |
| 10.13026/9cft-hg92 | DataCite |

• We identified the Registration Agency (RA) for each DOI and removed invalid DOIs using the Web API “Which RA?”
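A minimal sketch of Step 2, assuming the "Which RA?" service is reached at `https://doi.org/doiRA/<DOI>` and answers with a JSON list of records that carry an `RA` field for valid DOIs and no `RA` field otherwise. Both the endpoint and the field names are stated here as assumptions and should be checked against the service documentation. The sample payload is hand-written, and `fetch_ra` is included only to show where a network call would go.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed endpoint of the "Which RA?" service.
WHICH_RA_ENDPOINT = "https://doi.org/doiRA/"


def which_ra_url(doi: str) -> str:
    """Build the lookup URL for one DOI, keeping the prefix/suffix slash intact."""
    return WHICH_RA_ENDPOINT + quote(doi, safe="/")


def parse_which_ra(payload: str) -> dict:
    """Map each DOI in a response to its RA name, or None when no RA is reported."""
    return {record["DOI"]: record.get("RA") for record in json.loads(payload)}


def fetch_ra(doi: str) -> dict:
    """Query the service for one DOI (performs a network request)."""
    with urlopen(which_ra_url(doi), timeout=30) as response:
        return parse_which_ra(response.read().decode("utf-8"))


# Hand-written sample response, not captured from the real service.
sample_payload = json.dumps([
    {"DOI": "10.1093/jhered/esn085", "RA": "Crossref"},
    {"DOI": "10.9999/not-a-real-doi", "status": "DOI does not exist"},
])

ra_by_doi = parse_which_ra(sample_payload)
# Keep only DOIs for which an RA was reported, i.e. drop invalid DOIs.
valid = {doi: ra for doi, ra in ra_by_doi.items() if ra is not None}
print(valid)
```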

Slide 9

Slide 9 text

Materials and Methods #3

Step 3: Bibliographic metadata

Crossref metadata: journal title, paper title, author names, published year, pages, ISSN numbers, etc.

• We obtained Crossref metadata for each DOI whose RA was Crossref (Crossref DOI) using the Crossref REST API and extracted ISSN numbers.
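Step 3 might look like the sketch below. It assumes the `works` route of the Crossref REST API (`https://api.crossref.org/works/<DOI>`), whose JSON response wraps the record in a `message` object with an `ISSN` list. The payload here is a hand-written, heavily trimmed stand-in rather than a captured response, and the helper names are hypothetical.

```python
import json
from urllib.parse import quote

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_work_url(doi: str) -> str:
    """Build the URL of the Crossref metadata record for one DOI."""
    return CROSSREF_WORKS + quote(doi, safe="/")


def extract_issns(payload: str) -> list[str]:
    """Return the ISSNs listed in a Crossref works response, or [] if there are none."""
    message = json.loads(payload).get("message", {})
    return list(message.get("ISSN", []))


# Hand-written sample in the shape of a works response (most fields omitted).
sample_payload = json.dumps({
    "status": "ok",
    "message": {
        "DOI": "10.1093/jhered/esn085",
        "container-title": ["Journal of Heredity"],
        "ISSN": ["0022-1503", "1465-7333"],
    },
})

print(crossref_work_url("10.1093/jhered/esn085"))
print(extract_issns(sample_payload))
```

Records without an `ISSN` field (for example books) simply yield an empty list, so they drop out of the ISSN-based matching in the next step.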

Slide 10

Slide 10 text

Materials and Methods #4

Step 4: ESI journal list

| ISSN | ESI category |
|---|---|
| 0022-1503 | Molecular Biology & Genetics |
| 1465-7333 | Molecular Biology & Genetics |

• We associated the research fields of the Essential Science Indicators (ESI) categories with each DOI by matching ISSN numbers to the ESI journal list.
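The ISSN matching in Step 4 amounts to a dictionary lookup. The sketch below assumes the ESI journal list has already been loaded into a mapping from ISSN to category; the function name `esi_category` and the choice to return `None` for unmatched DOIs are illustrative assumptions.

```python
from typing import Optional

# ISSN -> ESI category, using the two rows shown on the slide.
esi_journal_list = {
    "0022-1503": "Molecular Biology & Genetics",
    "1465-7333": "Molecular Biology & Genetics",
}


def esi_category(issns: list[str], journal_list: dict[str, str]) -> Optional[str]:
    """Return the ESI category of the first ISSN found in the journal list."""
    for issn in issns:
        if issn in journal_list:
            return journal_list[issn]
    return None  # no ISSN matched, so no research field can be assigned


# ISSNs per DOI as they would come out of Step 3 (second entry is a made-up DOI without ISSNs).
issns_by_doi = {
    "10.1093/jhered/esn085": ["0022-1503", "1465-7333"],
    "10.9999/example-without-issn": [],
}

fields = {doi: esi_category(issns, esi_journal_list) for doi, issns in issns_by_doi.items()}
print(fields)
```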

Slide 11

Slide 11 text

Results and Discussion

Slide 12

Slide 12 text

1. Changes in the number of DOI links and Wikipedia articles containing them #1

Table. Basic statistics of the datasets

| | English Wikipedia 2015 | English Wikipedia 2023 | Growth | Japanese Wikipedia 2015 | Japanese Wikipedia 2023 | Growth |
|---|---|---|---|---|---|---|
| # of total DOIs | 1,467,903 | 2,491,399 | 169.73% | 28,509 | 206,819 | 725.43% |
| # of unique DOIs | 515,441 | 1,696,943 | 329.22% | 25,185 | 149,681 | 594.33% |
| # of unique Wikipedia articles | 165,897 | 569,364 | 343.20% | 9,665 | 45,862 | 475.52% |

• The total and unique DOIs and the number of corresponding Wikipedia articles increased consistently from 2015 to 2023 for both English and Japanese Wikipedia.
• The growth in the total number of DOIs on Japanese Wikipedia is 725.43%, a noticeable increase.
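The Growth column appears to report the 2023 value as a percentage of the 2015 value (so 100% would mean no change). This reading is an inference from the numbers, not something the slide states. The sketch below applies it to the two unique-DOI rows, which it reproduces to two decimals; `growth_percent` is a hypothetical helper name.

```python
def growth_percent(value_2015: int, value_2023: int) -> float:
    """2023 value expressed as a percentage of the 2015 value, rounded to 2 decimals."""
    return round(value_2023 / value_2015 * 100, 2)


# Unique DOIs on English and Japanese Wikipedia, taken from the table.
english_unique = growth_percent(515_441, 1_696_943)
japanese_unique = growth_percent(25_185, 149_681)
print(english_unique, japanese_unique)
```

Under this reading, a value of 594.33% means the count grew to roughly 5.9 times its 2015 level, not that it increased by 594%.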

Slide 13

Slide 13 text

1. Changes in the number of DOI links and Wikipedia articles containing them #2

Table. Total number of DOIs by Registration Agencies on English and Japanese Wikipedia

| Registration Agency | English Wikipedia 2015 | English Wikipedia 2023 | Difference | Japanese Wikipedia 2015 | Japanese Wikipedia 2023 | Difference |
|---|---|---|---|---|---|---|
| Airiti | 2 | 223 | 221 | 0 | 12 | 12 |
| CNKI | 0 | 520 | 520 | 0 | 22 | 22 |
| Crossref | 1,462,704 | 2,460,860 | 998,156 | 26,202 | 144,989 | 118,787 |
| Crossref via JaLC | 3,839 | 5,216 | 1,377 | 1,732 | 24,669 | 22,937 |
| DataCite | 452 | 19,700 | 19,248 | 13 | 465 | 452 |
| DataCite via JaLC | 0 | 0 | 0 | 0 | 1 | 1 |
| EIDR | 0 | 100 | 100 | 0 | 0 | 0 |
| ISTIC | 95 | 652 | 557 | 0 | 32 | 32 |
| JaLC | 11 | 1,130 | 1,119 | 551 | 36,442 | 35,891 |
| KISTI | 49 | 447 | 398 | 2 | 64 | 62 |
| mEDRA | 220 | 1,817 | 1,597 | 1 | 75 | 74 |
| OP | 178 | 716 | 538 | 2 | 33 | 31 |
| Public | 353 | 18 | -335 | 6 | 15 | 9 |

• With the exception of “Public” on English Wikipedia, the total number of DOIs registered by each Registration Agency increased or stayed the same from 2015 to 2023.

Slide 14

Slide 14 text

1. Changes in the number of DOI links and Wikipedia articles containing them #3

Table. Total number of DOIs by Registration Agencies on English and Japanese Wikipedia (same table as on the previous slide)

• The diversity of Registration Agencies for DOI links increased on both English and Japanese Wikipedia from 2015 to 2023.
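One simple way to read "diversity" is the number of Registration Agency labels with at least one DOI link in a given year. That reading is an assumption, since the slides do not define the measure. The sketch below counts it for Japanese Wikipedia using the per-agency totals from the table.

```python
# Total DOIs on Japanese Wikipedia by Registration Agency: (2015, 2023), from the table.
jawiki_totals = {
    "Airiti": (0, 12),
    "CNKI": (0, 22),
    "Crossref": (26_202, 144_989),
    "Crossref via JaLC": (1_732, 24_669),
    "DataCite": (13, 465),
    "DataCite via JaLC": (0, 1),
    "EIDR": (0, 0),
    "ISTIC": (0, 32),
    "JaLC": (551, 36_442),
    "KISTI": (2, 64),
    "mEDRA": (1, 75),
    "OP": (2, 33),
    "Public": (6, 15),
}


def agencies_in_use(totals: dict, year_index: int) -> int:
    """Count agency labels that have at least one DOI link in the chosen year column."""
    return sum(1 for counts in totals.values() if counts[year_index] > 0)


in_2015 = agencies_in_use(jawiki_totals, 0)
in_2023 = agencies_in_use(jawiki_totals, 1)
print(in_2015, in_2023)
```

The column sums also match the totals in the basic statistics table (28,509 and 206,819), which is a quick consistency check on the transcribed values.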

Slide 15

Slide 15 text

1. Changes in the number of DOI links and Wikipedia articles containing them #3

Table. Total number of DOIs by Registration Agencies on English and Japanese Wikipedia

| Registration Agency | English Wikipedia 2015 | English Wikipedia 2023 | Difference | Japanese Wikipedia 2015 | Japanese Wikipedia 2023 | Difference |
|---|---|---|---|---|---|---|
| Crossref | 1,462,704 | 2,460,860 | 998,156 | 26,202 | 144,989 | 118,787 |
| Crossref via JaLC * | 3,839 | 5,216 | 1,377 | 1,732 | 24,669 | 22,937 |
| DataCite | 452 | 19,700 | 19,248 | 13 | 465 | 452 |
| DataCite via JaLC * | 0 | 0 | 0 | 0 | 1 | 1 |
| EIDR | 0 | 100 | 100 | 0 | 0 | 0 |
| ISTIC | 95 | 652 | 557 | 0 | 32 | 32 |
| JaLC * | 11 | 1,130 | 1,119 | 551 | 36,442 | 35,891 |

* Items in which Japan Link Center (JaLC) is involved in the DOI registration (highlighted in yellow on the slide)

• Crossref DOIs showed the largest increase in the number of references on both English and Japanese Wikipedia.
• References to content from Japanese publishers and academic societies increased particularly strongly on Japanese Wikipedia from 2015 to 2023.

Slide 16

Slide 16 text

2. Overlaps of unique DOI links within the same language version of Wikipedia #1

Table. Overlaps of unique DOI links within English Wikipedia as of 2015 and 2023

| English Wikipedia | 2015 - 2023 | % | 2023 - 2015 | % |
|---|---|---|---|---|
| Difference set | 35,365 | 6.86% | 1,216,867 | 71.71% |
| Product set | 480,076 | 93.14% | 480,076 | 28.29% |
| Total | 515,441 | 100.00% | 1,696,943 | 100.00% |

• As for English Wikipedia, 71.71% of all unique DOI links referenced as of 2023 were not referenced as of 2015.
• 6.86% of all unique DOI links referenced as of 2015 were not referenced as of 2023.
→ DOI links were replaced and deleted on English Wikipedia.
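The difference set and product set in these overlap tables correspond to ordinary set difference and intersection over the unique DOIs of two snapshots. The sketch below uses a toy example with made-up DOI labels; `overlap_summary` is a hypothetical helper, not the authors' code.

```python
def overlap_summary(base: set, other: set) -> dict:
    """Split `base` into DOIs absent from `other` (difference) and shared with it (product)."""
    difference = base - other
    product = base & other
    total = len(base)

    def pct(count: int) -> float:
        return round(count / total * 100, 2) if total else 0.0

    return {
        "difference": len(difference),
        "difference_pct": pct(len(difference)),
        "product": len(product),
        "product_pct": pct(len(product)),
        "total": total,
    }


# Toy snapshots with made-up DOI labels.
dois_2015 = {"doi-a", "doi-b", "doi-c", "doi-d"}
dois_2023 = {"doi-c", "doi-d", "doi-e", "doi-f", "doi-g", "doi-h"}

removed_since_2015 = overlap_summary(dois_2015, dois_2023)  # "2015 - 2023" column
added_since_2015 = overlap_summary(dois_2023, dois_2015)    # "2023 - 2015" column
print(removed_since_2015)
print(added_since_2015)
```

The product set has the same size in both directions, but its percentage differs because each column divides by its own total, which is why the table shows 93.14% in one column and 28.29% in the other.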

Slide 17

Slide 17 text

2. Overlaps of unique DOI links within the same language version of Wikipedia #2

Table. Overlaps of unique DOI links within Japanese Wikipedia as of 2015 and 2023

| Japanese Wikipedia | 2015 - 2023 | % | 2023 - 2015 | % |
|---|---|---|---|---|
| Difference set | 393 | 1.56% | 124,889 | 83.44% |
| Product set | 24,792 | 98.44% | 24,792 | 16.56% |
| Total | 25,185 | 100.00% | 149,681 | 100.00% |

• As for Japanese Wikipedia, 83.44% of all unique DOI links referenced as of 2023 were not referenced as of 2015.
• 1.56% of all unique DOI links referenced as of 2015 were not referenced as of 2023.
→ Compared to English Wikipedia, DOI links seem to be rarely replaced or deleted.

Slide 18

Slide 18 text

3. Overlaps of unique DOI links between English and Japanese Wikipedia

Table. Overlaps of unique DOI links between English and Japanese Wikipedia as of 2015 and 2023

| Japanese Wikipedia - English Wikipedia | As of 2015 | % | As of 2023 | % |
|---|---|---|---|---|
| Difference set | 5,023 | 19.94% | 51,420 | 34.35% |
| Product set | 20,162 | 80.06% | 98,261 | 65.65% |
| Total | 25,185 | 100.00% | 149,681 | 100.00% |

• The number of DOI links referenced on Japanese Wikipedia that do not overlap with English Wikipedia increased approximately 10 times (from 5,023 to 51,420).
• Their share of all unique DOI links on Japanese Wikipedia increased by 14.41 percentage points (from 19.94% to 34.35%).
→ DOI links not derived from translations of English Wikipedia articles have been increasing since 2015 on Japanese Wikipedia.
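The two headline figures on this slide can be rechecked directly from the table: a fold change of the difference set, and a change in its share measured in percentage points. The sketch below is only that arithmetic; the variable names are illustrative.

```python
def share(part: int, total: int) -> float:
    """Share of `part` in `total`, in percent."""
    return part / total * 100


# Japanese Wikipedia DOI links not found on English Wikipedia, and unique totals (from the table).
only_ja_2015, total_2015 = 5_023, 25_185
only_ja_2023, total_2023 = 51_420, 149_681

fold_change = only_ja_2023 / only_ja_2015
point_change = share(only_ja_2023, total_2023) - share(only_ja_2015, total_2015)

print(round(fold_change, 2))   # how many times larger the difference set became
print(round(point_change, 2))  # change of its share, in percentage points
```

Reporting the share change in percentage points avoids confusion with a relative change: the share itself grew by a factor of about 1.7, which is a different statement from the 14.41-point difference.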

Slide 19

Slide 19 text

4. Top 3 highly referenced DOIs on Japanese Wikipedia as of 2023 #1

Table. Top 3 highly referenced DOIs on Japanese Wikipedia as of 2023

| # | DOI | Registration Agency | Research field | # of appearances | # of Wikipedia articles | Bibliographic information |
|---|---|---|---|---|---|---|
| 1 | 10.11501/1873236 | JaLC | N/A | 1,474 | 738 | 日本国有鉄道停車場一覧 昭和41年3月現在. 日本国有鉄道. |
| 2 | 10.11238/mammalianscience.58.s1 | JaLC | N/A | 310 | 289 | 川田伸一郎, 岩佐真宏, 福井大, et al. (2018). 世界哺乳類標準和名目録. 哺乳類科学. |
| 3 | 10.1051/0004-6361/201833051 | Crossref | Space Science | 115 | 64 | Gaia Collaboration, Brown, A. G. A., Vallenari, A., et al. (2018). Gaia Data Release 2: Summary of the contents and survey properties. Astronomy & Astrophysics, 616, A1. |

• #1 and #2 are contents written in Japanese.
• These DOIs are registered by JaLC (Japan Link Center).

Slide 20

Slide 20 text

4. Top 3 highly referenced DOIs on Japanese Wikipedia as of 2023 #2

Table. Top 3 highly referenced DOIs on Japanese Wikipedia as of 2023 (same table as on the previous slide)

• #1 (10.11501/1873236) is “A list of stations of railways, automobiles, and sea routes as of March 1966”, published by the Japanese National Railways (now the Japan Railways Group).
• The DOI assigned to the electronic version hosted by the digital collection of the National Diet Library in Japan is referenced as an information source for the office management codes of the stations.

Slide 21

Slide 21 text

4. Top 3 highly referenced DOIs on Japanese Wikipedia as of 2023 #3

Table. Top 3 highly referenced DOIs on Japanese Wikipedia as of 2023 (rows 1 and 2)

| # | DOI | Registration Agency | Research field | # of appearances | # of Wikipedia articles | Bibliographic information |
|---|---|---|---|---|---|---|
| 1 | 10.11501/1873236 | JaLC | N/A | 1,474 | 738 | 日本国有鉄道停車場一覧 昭和41年3月現在. 日本国有鉄道. |
| 2 | 10.11238/mammalianscience.58.s1 | JaLC | N/A | 310 | 289 | 川田伸一郎, 岩佐真宏, 福井大, et al. (2018). 世界哺乳類標準和名目録. 哺乳類科学. |

• #2 is the Catalogue of standard Japanese names for the mammals of the world.
• This content is referenced as a source of information on Japanese names in Japanese Wikipedia articles on mammal species.

Slide 22

Slide 22 text

Conclusion

We conducted a comparative analysis of DOI links on English and Japanese Wikipedia as of March 2015 and July 2023
1. The total number of DOI links and the diversity of Registration Agencies for DOI links increased on both English and Japanese Wikipedia from 2015 to 2023.
2. On Japanese Wikipedia, original DOI links, i.e., those not derived from translations of English Wikipedia articles, have been increasing since 2015.
3. On Japanese Wikipedia as of 2023, two of the top three most-referenced contents are JaLC DOIs, both written in Japanese.
   → This result suggests that a rise in native-language references on Japanese Wikipedia leads to a decrease in overlap with English Wikipedia.
4. For DOI links added after 2015, the majority of these references on both English and Japanese Wikipedia were DOIs registered before 2015.
