Slide 1

Slide 1 text

Stats of 2020.02.08

Slide 2

Slide 2 text

Summary
- A survey of papers submitted to arXiv
  - By discipline, information science has grown remarkably since 2017.
  - More than 40% of the submitted papers have DOIs.
    - This suggests that more than 40% of the papers are eventually accepted for publication.
    - However, the rate varies widely by field; in informatics it is about 20% of submissions.
    - The interval to publication also varies greatly by field, but it is about 6 months from arXiv registration, and about a year or less even in the slowest fields.

Slide 3

Slide 3 text

Data desc.
- arXiv
  - Collected every item available through the API as of Jan 21, 2020.
- Total number of records: 1,622,763
  - Items: Title, Abstract, Author, Field, DOI, etc.
  - Period: Apr 25, 1986* - Jan 17, 2020
    - Cited references were also obtained through Semantic Scholar.
    - If a DOI is assigned:
      - The journal name, publication date, etc. were also collected separately using CrossRef's API.
* arXiv started in 1991, but some of the submission dates are earlier than that.
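
As a rough illustration of the collection step, the sketch below parses an Atom response of the kind the arXiv API returns and keeps the items listed above. It is a minimal sketch, not the code used for these slides: the function name `parse_entries` and the output keys are hypothetical, and downloading and paging are left out.

```python
import xml.etree.ElementTree as ET

# XML namespaces used in arXiv API Atom responses.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}


def parse_entries(atom_xml):
    """Return one dict per <entry> (hypothetical helper, illustrative keys)."""
    root = ET.fromstring(atom_xml)
    records = []
    for entry in root.findall("atom:entry", NS):
        primary = entry.find("arxiv:primary_category", NS)
        title = entry.findtext("atom:title", default="", namespaces=NS)
        records.append({
            "id": entry.findtext("atom:id", default="", namespaces=NS),
            # Titles may be wrapped over several lines; collapse whitespace.
            "title": " ".join(title.split()),
            "published": entry.findtext("atom:published", default="", namespaces=NS),
            "authors": [
                a.findtext("atom:name", default="", namespaces=NS)
                for a in entry.findall("atom:author", NS)
            ],
            # The first field is the primary category, when present.
            "first_field": primary.get("term") if primary is not None else None,
            # None when no DOI has been registered for the paper.
            "doi": entry.findtext("arxiv:doi", default=None, namespaces=NS),
        })
    return records
```

Records with a non-empty `doi` would then be the ones passed on to a separate CrossRef lookup.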

Slide 4

Slide 4 text

Number of recorded data
[Figure: number of records; panels: by field, and has/hasn't DOI]
* Counts are not cumulative. Only the first field of each paper is counted.

Slide 5

Slide 5 text

12 Fields of these slides
The 153 arXiv fields are restructured into 12 categories. For details, see the Appendix.

Meta Class  Description
astro-ph    Astrophysics
cond-mat    Materials
cs          Computer Science
econ        Economics
hep         High Energy Physics
math        Mathematics
nlin        Nonlinear
nucl        Nuclear
physics     Physics
q-bio       Biology
q-fin       Finance
stat        Statistics
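
The grouping into 12 categories can be sketched as a small lookup. The rules below are read off the Appendix tables (for example, gr-qc is grouped under astro-ph, quant-ph under physics, and the eess fields under cs); the function names are hypothetical, and the `|` separator between fields is assumed from the Top 15 table.

```python
# Stand-alone archives that the Appendix places under another meta class.
SPECIAL = {"gr-qc": "astro-ph", "quant-ph": "physics", "math-ph": "math"}


def meta_class(category):
    """Map one arXiv field code (e.g. 'cs.LG') to one of the 12 meta classes."""
    if category in SPECIAL:
        return SPECIAL[category]
    archive = category.split(".")[0]  # 'cs.LG' -> 'cs', 'hep-th' -> 'hep-th'
    if archive == "eess":
        return "cs"  # eess.AS, eess.IV, eess.SP are listed under cs
    for prefix in ("hep", "nucl"):
        if archive.startswith(prefix + "-"):
            return prefix  # 'hep-ex' -> 'hep', 'nucl-th' -> 'nucl'
    return archive


def meta_class_of_paper(fields):
    """Only the first field of a paper is counted."""
    return meta_class(fields.split("|")[0])
```

For example, `meta_class_of_paper("cs.CL|cs.LG|stat.ML")` is counted once, under cs.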

Slide 6

Slide 6 text

Research grants/awards in DOI information
[Figure series: number of papers with DOI; with award information; with award information including "Japan" in the list]

Slide 7

Slide 7 text

Rate of DOI granting per field
- Calculations are based on five years of submissions, from 2014 to 2018.
Only a small percentage of papers in mathematics and computer science are published in journals.
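
The rate itself is a simple ratio per field. A minimal sketch, assuming each record is a dict with hypothetical keys `year` (submission year), `field` (one of the 12 categories) and `doi` (empty or missing when no DOI is assigned):

```python
from collections import Counter


def doi_rate_by_field(records, first_year=2014, last_year=2018):
    """Share of submissions in [first_year, last_year] that carry a DOI."""
    total = Counter()
    with_doi = Counter()
    for rec in records:
        if first_year <= rec["year"] <= last_year:
            total[rec["field"]] += 1
            if rec.get("doi"):
                with_doi[rec["field"]] += 1
    # Fields with no submissions in the window never enter `total`,
    # so the division is always defined.
    return {field: with_doi[field] / n for field, n in total.items()}
```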

Slide 8

Slide 8 text

Time from submission to publication with DOI
- Calculations are based on 18 years of submissions, from 2000 to 2017.
The time between preprint submission and journal publication is long in mathematics.
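
One way to summarize this interval is the median delay per field, in months. The sketch below is an assumption about the method, not the slides' actual calculation: the keys `submitted`, `published` and `field` are hypothetical, the median is one possible summary statistic, and papers whose journal date precedes the arXiv date are kept as negative delays.

```python
from collections import defaultdict
from datetime import date
from statistics import median

DAYS_PER_MONTH = 30.4375  # average month length (365.25 / 12)


def median_delay_months(records, first_year=2000, last_year=2017):
    """Median months from arXiv submission to journal publication, per field."""
    delays = defaultdict(list)
    for rec in records:
        published = rec.get("published")
        if published is None:
            continue  # no DOI / no publication date known
        if not first_year <= rec["submitted"].year <= last_year:
            continue
        days = (published - rec["submitted"]).days
        delays[rec["field"]].append(days / DAYS_PER_MONTH)
    return {field: median(values) for field, values in delays.items()}
```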

Slide 9

Slide 9 text

Top 5 DOI recipients (journals) by field
Period: 1986-2020

ctg        count   title
astro-ph   66168   The Astrophysical Journal
astro-ph   46747   Monthly Notices of the Royal Astronomical Society
astro-ph   34640   Physical Review D
astro-ph   29896   Astronomy & Astrophysics
astro-ph    9880   Journal of Cosmology and Astroparticle Physics
cond-mat   74769   Physical Review B
cond-mat   34033   Physical Review Letters
cond-mat   20297   Physical Review E
cond-mat   11216   Physical Review A
cond-mat    6801   Applied Physics Letters
cs          3983   Electronic Proceedings in Theoretical Computer Science
cs          1143   IEEE Transactions on Signal Processing
cs          1060   IEEE Transactions on Information Theory
cs           583   Logical Methods in Computer Science
cs           488   IEEE Transactions on Wireless Communications
hep        65614   Physical Review D
hep        33701   Journal of High Energy Physics
hep        27796   Physics Letters B
hep        14027   Nuclear Physics B
hep        10340   Physical Review Letters
math        7674   Journal of Mathematical Physics
math        6842   Communications in Mathematical Physics
math        6132   Journal of Physics A: Mathematical and Theoretical
math        2962   Journal of Statistical Physics
math        2757   Journal of High Energy Physics
nlin        4411   Physical Review E
nlin        1942   Physical Review Letters
nlin         940   Journal of Physics A: Mathematical and Theoretical
nlin         815   Journal of Physics A: Mathematical and General
nlin         805   Physics Letters A
nucl       14503   Physical Review C
nucl        4835   Physical Review D
nucl        4555   Nuclear Physics A
nucl        4094   Physics Letters B
nucl        3130   Physical Review Letters
physics    27012   Physical Review A
physics    14807   Physical Review Letters
physics     8539   Physical Review E
physics     5882   Physical Review B
physics     3941   New Journal of Physics
q-bio       1836   Physical Review E
q-bio        411   Physical Review Letters
q-bio        317   PLoS ONE
q-bio        269   The Journal of Chemical Physics
q-bio        231   PLoS Computational Biology
q-fin        647   Physica A: Statistical Mechanics and its Applications
q-fin        119   Physical Review E
stat        1385   The Annals of Statistics
stat         897   The Annals of Applied Statistics
stat         524   Bernoulli
stat         411   Statistical Science
stat         208   IEEE Transactions on Signal Processing

Slide 10

Slide 10 text

Number of citations per field
- Calculations are based on five years of submissions, from 2014 to 2018.
Computer science has a high number of citations.

Slide 11

Slide 11 text

Number of citations per field (Top 15)

 #  aid            date     category                   cite  title
 1  1502.03167v3   2015-02  cs.LG                      9999  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
 2  1409.4842v1    2014-09  cs.CV                      9998  Going Deeper with Convolutions
 3  1201.0490v4    2012-01  cs.LG|cs.MS                9997  Scikit-learn: Machine Learning in Python
 4  1310.4546v1    2013-10  cs.CL|cs.LG|stat.ML        9997  Distributed Representations of Words and Phrases and their Compositionality
 5  1409.1556v6    2014-09  cs.CV                      9996  Very Deep Convolutional Networks for Large-Scale Image Recognition
 6  1412.6980v9    2014-12  cs.LG                      9996  Adam: A Method for Stochastic Optimization
 7  1512.03385v1   2015-12  cs.CV                      9996  Deep Residual Learning for Image Recognition
 8  1409.0575v3    2014-09  cs.CV|I.4.8; I.5.2         9994  ImageNet Large Scale Visual Recognition Challenge
 9  1506.01497v3   2015-06  cs.CV                      9994  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
10  1301.3781v3    2013-01  cs.CL                      8977  Efficient Estimation of Word Representations in Vector Space
11  1408.5093v1    2014-06  cs.CV|cs.LG|cs.NE          8977  Caffe: Convolutional Architecture for Fast Feature Embedding
12  1409.0473v7    2014-09  cs.CL|cs.LG|cs.NE|stat.ML  8727  Neural Machine Translation by Jointly Learning to Align and Translate
13  1406.5823v1    2014-06  stat.CO                    8708  Fitting Linear Mixed-Effects Models using lme4
14  1311.2524v5    2013-11  cs.CV                      8145  Rich feature hierarchies for accurate object detection and semantic segmentation
15  1505.04597v1   2015-05  cs.CV                      7797  U-Net: Convolutional Networks for Biomedical Image Segmentation

Highly cited papers are biased toward information science fields.

Slide 12

Slide 12 text

Frequency by number of citations, 2014-2018

Slide 13

Slide 13 text

Frequency by number of citations (by field), 2014-2018

Slide 14

Slide 14 text

Frequency by number of citations (by field), 2014-2018

Slide 15

Slide 15 text

Frequency by number of citations (by field), 2014-2018

Slide 16

Slide 16 text

Time from publication to citation (years), 2011-2015

Slide 17

Slide 17 text

Time from publication to citation (years), 2011-2015

Slide 18

Slide 18 text

Time from publication to citation (years), 2011-2015

Slide 19

Slide 19 text

Time from publication to citation (years), 2011-2015

Slide 20

Slide 20 text

Years from publication to citation, 1986-2020

Slide 21

Slide 21 text

Percentage of email addresses detected by category
- Approximately 75% of papers include a contact email address.
Linking regions to papers based on email addresses:
- If there are multiple email addresses, only the first one is used.
- Gmail and Hotmail addresses are classified as unknown.
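
The two rules above can be sketched as follows. How a domain is turned into a region is not stated on the slide, so the mapping here is an assumption: two-letter country-code domains map to that country, `.edu` and `.gov` are treated as US, and everything else is unknown. The names `region_from_text`, `FREE_MAIL` and `GENERIC_TO_REGION` are hypothetical.

```python
import re

# Local part, '@', then a dotted domain captured as group 1.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")

# Free-mail providers carry no regional information.
FREE_MAIL = {"gmail.com", "hotmail.com"}

# Assumption: generic domains that are treated as US institutions.
GENERIC_TO_REGION = {"edu": "US", "gov": "US"}


def region_from_text(text):
    """Region code for the first email address in `text`, or None if absent."""
    match = EMAIL_RE.search(text)  # search() returns the first address only
    if match is None:
        return None
    domain = match.group(1).lower()
    if domain in FREE_MAIL:
        return "unknown"
    tld = domain.rsplit(".", 1)[-1]
    if len(tld) == 2:
        return tld.upper()  # country-code domain, e.g. 'jp' -> 'JP'
    return GENERIC_TO_REGION.get(tld, "unknown")
```

With this sketch, a paper listing `alice@u-tokyo.ac.jp, bob@mit.edu` is linked to JP, because only the first address is considered.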

Slide 22

Slide 22 text

US
- First or second place in all fields
- Computer Science is the field with the largest number of papers
  - Artificial Intelligence categories (cs.LG, stat.ML, cs.AI) dominate in number
- Ranked 1st in 97 of 153 fields
[Table: distribution of rankings in 153 fields; labels: Rank, Count, Pct, Pct, Over, None]
[Table: high-ranking, high-count fields; columns: category, cnt, rank]
Categories listed: cs.LG, stat.ML, cs.CV, cs.CL, cs.AI, math.OC, cond-mat.mtrl-sci, cs.SY, cs.RO, astro-ph.GA, eess.SP, cs.CR, cond-mat.mes-hall, astro-ph.HE, cs.IT
[Other table labels: Fields with zero publications; Fields below 10th place]
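
A ranking summary of this kind can be sketched as below, assuming a hypothetical nested dict `counts_by_field` of the form `{field: {region: paper_count}}`. Ties share the better rank; the slides do not say how ties were handled, so that is an assumption.

```python
def rank_of(region, counts_by_region):
    """1-based rank of `region` by paper count; None if it has no papers."""
    own = counts_by_region.get(region, 0)
    if own == 0:
        return None
    # Rank = 1 + number of regions with strictly more papers.
    return 1 + sum(1 for n in counts_by_region.values() if n > own)


def rank_summary(region, counts_by_field, cutoff=10):
    """Count the fields where `region` is 1st, below `cutoff`, or absent."""
    summary = {"first": 0, "below_cutoff": 0, "zero": 0}
    for counts in counts_by_field.values():
        rank = rank_of(region, counts)
        if rank is None:
            summary["zero"] += 1
        elif rank == 1:
            summary["first"] += 1
        elif rank > cutoff:
            summary["below_cutoff"] += 1
    return summary
```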

Slide 23

Slide 23 text

Japan
Ratio of the number of papers, with the U.S. set to 1
[Table: distribution of rankings; labels: Rank, Count, Pct, Pct, Over, None]
[Table columns: category, cnt, rank]
Categories listed: nlin.CG, cond-mat.supr-con, math.GT, hep-lat, q-bio.CB, cond-mat.str-el, cond-mat.stat-mech, nucl-th, math.AC, math.AT, math.OA, math.SG, math.GN, cond-mat.other, cs.MM, q-fin.ST, q-bio.MN, cond-mat.mtrl-sci, math.AG, cond-mat.mes-hall, math.DG, math.RT
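
The normalization used on the country slides (paper counts expressed relative to the U.S., with the U.S. equal to 1) can be sketched for one field as follows. The function name and the region codes are hypothetical.

```python
def ratio_to_us(counts_by_region):
    """Express each region's paper count as a multiple of the US count."""
    us = counts_by_region.get("US", 0)
    if us == 0:
        return {}  # the ratio is undefined when the US has no papers
    return {region: n / us for region, n in counts_by_region.items()}
```

For example, counts of 200 (US), 100 (CN) and 50 (JP) become 1.0, 0.5 and 0.25.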

Slide 24

Slide 24 text

China
Ratio of the number of papers, with the U.S. set to 1
[Table: distribution of rankings; labels: Rank, Count, Pct, Pct, Over, None]
[Table columns: category, cnt, rank]
Categories listed: hep-ph, nucl-th, cond-mat.supr-con, cond-mat.quant-gas, cs.MM, cs.CV, quant-ph, cond-mat.mtrl-sci, cond-mat.mes-hall, cs.IT, math.IT, eess.SP, physics.optics, math.AP, gr-qc, eess.IV, cond-mat.str-el, physics.app-ph, hep-ex, cs.NI

Slide 25

Slide 25 text

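France
Ratio of the number of papers, with the U.S. set to 1
[Table: distribution of rankings; labels: Rank, Count, Pct, Pct, Over, None]
[Table columns: category, cnt, rank]
Categories listed: q-fin.CP, cs.SC, math.PR, math-ph, math.MP, math.NT, math.ST, stat.TH, cond-mat.dis-nn, math.SP, physics.class-ph, physics.geo-ph, cs.FL, physics.ao-ph, q-bio.TO, math.HO, cs.MM, q-fin.GN, q-bio.CB, q-bio.SC, q-bio.OT, nlin.CG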

Slide 26

Slide 26 text

Germany
Ratio of the number of papers, with the U.S. set to 1
[Table: distribution of rankings; labels: Rank, Count, Pct, Pct, Over, None]
[Table columns: category, cnt, rank]
Categories listed: cs.MS, physics.atm-clus, math.NA, cond-mat.soft, physics.chem-ph, physics.ins-det, physics.atom-ph, cs.LO, hep-lat, physics.bio-ph, cs.DM, math.AT, cs.CE, physics.acc-ph, math.KT, cs.FL, cs.DL, q-bio.CB, physics.pop-ph, cs.GL, nlin.CG

Slide 27

Slide 27 text

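UK
Ratio of the number of papers, with the U.S. set to 1
[Table: distribution of rankings; labels: Rank, Count, Pct, Pct, Over, None]
[Table columns: category, cnt, rank]
Categories listed: q-fin.RM, q-bio.CB, q-fin.MF, q-fin.TR, q-fin.PM, q-fin.PR, q-bio.OT, astro-ph.GA, astro-ph.CO, astro-ph.SR, astro-ph.EP, physics.flu-dyn, stat.ME, stat.AP, cs.CY, math.GR, cs.MA, q-bio.NC, stat.CO, q-bio.QM, math.AT, q-bio.PE, econ.EM, cs.ET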

Slide 28

Slide 28 text

Italy
Ratio of the number of papers, with the U.S. set to 1
[Table: distribution of rankings; labels: Rank, Count, Pct, Pct, Over, None]
[Table columns: category, cnt, rank]
Categories listed: nlin.CG, cs.DL, physics.ed-ph, physics.pop-ph, cs.GL, nlin.AO, physics.class-ph, physics.space-ph, q-bio.CB, q-bio.OT, q-fin.PR, stat.OT, astro-ph.HE, math-ph, math.AC, math.AP, math.CV, math.GN, math.HO, math.MP, physics.hist-ph, physics.ins-det, physics.soc-ph, q-fin.GN, q-fin.MF, q-fin.RM, q-fin.ST

Slide 29

Slide 29 text

Apdx: Correspondence between 153 fields and 12 categories

Meta Class  Class                Description
astro-ph    astro-ph             Astrophysics
astro-ph    astro-ph.CO          Cosmology and Nongalactic Astrophysics
astro-ph    astro-ph.EP          Earth and Planetary Astrophysics
astro-ph    astro-ph.GA          Astrophysics of Galaxies
astro-ph    astro-ph.HE          High Energy Astrophysical Phenomena
astro-ph    astro-ph.IM          Instrumentation and Methods for Astrophysics
astro-ph    astro-ph.SR          Solar and Stellar Astrophysics
astro-ph    gr-qc                General Relativity and Quantum Cosmology
cond-mat    cond-mat.dis-nn      Disordered Systems and Neural Networks
cond-mat    cond-mat.mes-hall    Mesoscale and Nanoscale Physics
cond-mat    cond-mat.mtrl-sci    Materials Science
cond-mat    cond-mat.other       Other Condensed Matter
cond-mat    cond-mat.quant-gas   Quantum Gases
cond-mat    cond-mat.soft        Soft Condensed Matter
cond-mat    cond-mat.stat-mech   Statistical Mechanics
cond-mat    cond-mat.str-el      Strongly Correlated Electrons
cond-mat    cond-mat.supr-con    Superconductivity

Slide 30

Slide 30 text

Apdx: Correspondence between 153 fields and 12 categories (continued)

Meta Class  Class                Description
cs          cs.AI                Artificial Intelligence
cs          cs.AR                Hardware Architecture
cs          cs.CC                Computational Complexity
cs          cs.CE                Computational Engineering, Finance, and Science
cs          cs.CG                Computational Geometry
cs          cs.CL                Computation and Language
cs          cs.CR                Cryptography and Security
cs          cs.CV                Computer Vision and Pattern Recognition
cs          cs.CY                Computers and Society
cs          cs.DB                Databases
cs          cs.DC                Distributed, Parallel, and Cluster Computing
cs          cs.DL                Digital Libraries
cs          cs.DM                Discrete Mathematics
cs          cs.DS                Data Structures and Algorithms
cs          cs.ET                Emerging Technologies
cs          cs.FL                Formal Languages and Automata Theory
cs          cs.GL                General Literature
cs          cs.GR                Graphics
cs          cs.GT                Computer Science and Game Theory
cs          cs.HC                Human-Computer Interaction
cs          cs.IR                Information Retrieval

Slide 31

Slide 31 text

Apdx: Correspondence between 153 fields and 12 categories (continued)

Meta Class  Class                Description
cs          cs.IT                Information Theory
cs          cs.LG                Learning
cs          cs.LO                Logic in Computer Science
cs          cs.MA                Multiagent Systems
cs          cs.MM                Multimedia
cs          cs.MS                Mathematical Software
cs          cs.NA                Numerical Analysis
cs          cs.NE                Neural and Evolutionary Computing
cs          cs.NI                Networking and Internet Architecture
cs          cs.OH                Other Computer Science
cs          cs.OS                Operating Systems
cs          cs.PF                Performance
cs          cs.PL                Programming Languages
cs          cs.RO                Robotics
cs          cs.SC                Symbolic Computation
cs          cs.SD                Sound
cs          cs.SE                Software Engineering
cs          cs.SI                Social and Information Networks
cs          cs.SY                Systems and Control
cs          eess.AS              Audio and Speech Processing
cs          eess.IV              Image and Video Processing
cs          eess.SP              Signal Processing

Slide 32

Slide 32 text

Apdx: Correspondence between 153 fields and 12 categories (continued)

Meta Class  Class                Description
econ        econ.EM              Econometrics
hep         hep-ex               High Energy Physics - Experiment
hep         hep-lat              High Energy Physics - Lattice
hep         hep-ph               High Energy Physics - Phenomenology
hep         hep-th               High Energy Physics - Theory

Slide 33

Slide 33 text

Apdx: Correspondence between 153 fields and 12 categories (continued)

Meta Class  Class                Description
math        math-ph              Mathematical Physics
math        math.AC              Commutative Algebra
math        math.AG              Algebraic Geometry
math        math.AP              Analysis of PDEs
math        math.AT              Algebraic Topology
math        math.CA              Classical Analysis and ODEs
math        math.CO              Combinatorics
math        math.CT              Category Theory
math        math.CV              Complex Variables
math        math.DG              Differential Geometry
math        math.DS              Dynamical Systems
math        math.FA              Functional Analysis
math        math.GM              General Mathematics
math        math.GN              General Topology
math        math.GR              Group Theory
math        math.GT              Geometric Topology
math        math.HO              History and Overview
math        math.IT              Information Theory
math        math.KT              K-Theory and Homology
math        math.LO              Logic

Slide 34

Slide 34 text

Apdx: Correspondence between 153 fields and 12 categories (continued)

Meta Class  Class                Description
math        math.MG              Metric Geometry
math        math.MP              Mathematical Physics
math        math.NA              Numerical Analysis
math        math.NT              Number Theory
math        math.OA              Operator Algebras
math        math.OC              Optimization and Control
math        math.PR              Probability
math        math.QA              Quantum Algebra
math        math.RA              Rings and Algebras
math        math.RT              Representation Theory
math        math.SG              Symplectic Geometry
math        math.SP              Spectral Theory
math        math.ST              Statistics Theory

Slide 35

Slide 35 text

Apdx: Correspondence between 153 fields and 12 categories (continued)

Meta Class  Class                Description
nlin        nlin.AO              Adaptation and Self-Organizing Systems
nlin        nlin.CD              Chaotic Dynamics
nlin        nlin.CG              Cellular Automata and Lattice Gases
nlin        nlin.PS              Pattern Formation and Solitons
nlin        nlin.SI              Exactly Solvable and Integrable Systems
nucl        nucl-ex              Nuclear Experiment
nucl        nucl-th              Nuclear Theory

Slide 36

Slide 36 text

Apdx: Correspondence between 153 fields and 12 categories (continued)

Meta Class  Class                Description
physics     physics.acc-ph       Accelerator Physics
physics     physics.ao-ph        Atmospheric and Oceanic Physics
physics     physics.app-ph       Applied Physics
physics     physics.atm-clus     Atomic and Molecular Clusters
physics     physics.atom-ph      Atomic Physics
physics     physics.bio-ph       Biological Physics
physics     physics.chem-ph      Chemical Physics
physics     physics.class-ph     Classical Physics
physics     physics.comp-ph      Computational Physics
physics     physics.data-an      Data Analysis, Statistics and Probability
physics     physics.ed-ph        Physics Education
physics     physics.flu-dyn      Fluid Dynamics
physics     physics.gen-ph       General Physics
physics     physics.geo-ph       Geophysics
physics     physics.hist-ph      History and Philosophy of Physics
physics     physics.ins-det      Instrumentation and Detectors
physics     physics.med-ph       Medical Physics
physics     physics.optics       Optics
physics     physics.plasm-ph     Plasma Physics
physics     physics.pop-ph       Popular Physics
physics     physics.soc-ph       Physics and Society
physics     physics.space-ph     Space Physics
physics     quant-ph             Quantum Physics

Slide 37

Slide 37 text

Apdx: Correspondence between 153 fields and 12 categories (continued)

Meta Class  Class                Description
q-bio       q-bio.BM             Biomolecules
q-bio       q-bio.CB             Cell Behavior
q-bio       q-bio.GN             Genomics
q-bio       q-bio.MN             Molecular Networks
q-bio       q-bio.NC             Neurons and Cognition
q-bio       q-bio.OT             Other Quantitative Biology
q-bio       q-bio.PE             Populations and Evolution
q-bio       q-bio.QM             Quantitative Methods
q-bio       q-bio.SC             Subcellular Processes
q-bio       q-bio.TO             Tissues and Organs
q-fin       q-fin.CP             Computational Finance
q-fin       q-fin.EC             Economics
q-fin       q-fin.GN             General Finance
q-fin       q-fin.MF             Mathematical Finance
q-fin       q-fin.PM             Portfolio Management
q-fin       q-fin.PR             Pricing of Securities
q-fin       q-fin.RM             Risk Management
q-fin       q-fin.ST             Statistical Finance
q-fin       q-fin.TR             Trading and Market Microstructure

Slide 38

Slide 38 text

Apdx: Correspondence between 153 fields and 12 categories (continued)

Meta Class  Class                Description
stat        stat.AP              Applications
stat        stat.CO              Computation
stat        stat.ME              Methodology
stat        stat.ML              Machine Learning
stat        stat.OT              Other Statistics
stat        stat.TH              Statistics Theory