(KNAW), The Hague • Amsterdam University • Royal Library, The Hague • Data Archiving and Networked Services (DANS) (KNAW), The Hague
Funding: • NWO (Dutch governmental funding) • Additional: CLARIN-NL & CLARIN-EU
structure different sets of letters of 17th-century scholars, in such a way that the production, circulation and use of knowledge can be analyzed and visualized in a wider international context? • Can we recognize the themes and stakeholders that were important at the time in the scholarly debates in space and time?
505
Isaac Beeckman 21
René Descartes 727
Hugo de Groot (Grotius) 8,034
Christiaan Huygens 3,080
Constantijn Huygens 7,119
Antoni van Leeuwenhoek 282
Dirck Rembrantsz van Nierop 80
Jan Swammerdam 172
Total 20,020
editions • Corpus is not uniform in language (Latin, English, German, French, Italian, Dutch) • No uniformity in spelling • No uniformity in available metadata! • What to do with figures and formulas? • Et cetera … • CURATION REQUIRED A LOT OF MANUAL WORK
normalization. • Removal of definite and indefinite articles, conjugations, etc. • Topic modelling and keyword analysis. • Named Entity Recognition. • Network analysis. • Co-citation module. • Several visualization modules. • … • (Most of the work was done by Dr. Walter Ravenek, our main ICT developer.)
topic is a group of words. The co-occurrence of a word with a topic can be analyzed mathematically. Topics are used to identify in the corpus: • Similar words • Comparable documents • Documents that resemble a text fragment that the researcher offers to the tool • First experiments were carried out with Latent Dirichlet Allocation (LDA)
model, proposed in 2003, that explains why some parts of the data are related. LDA is a variant of probabilistic latent semantic analysis (PLSA), a statistical technique for analyzing the relations between data. However, LDA disregards the mutual proximity of words.
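The point about word proximity can be illustrated with a minimal sketch. This is not LDA itself; it only shows the bag-of-words representation that LDA works on, in which two texts with the same words in a different order are indistinguishable. The function name and the two sample sentences are invented for illustration.

```python
from collections import Counter

def bag_of_words(text):
    """Reduce a text to word counts, discarding word order (hypothetical helper)."""
    return Counter(text.lower().split())

# Two invented sentences with the same words in a different order.
a = "the ring surrounds the planet"
b = "the planet surrounds the ring"

# Both reduce to the same counts, so a model that only sees these
# counts cannot tell which word stood next to which.
same = bag_of_words(a) == bag_of_words(b)
print(same)
```

This may be why adding language technology that takes proximity into account helps: information about which words stand close together is lost in the counts alone.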
me the texts; then I can figure it out myself. • The tool provides no suggestions that I recognize as important. • I cannot recognize the words that the tool presents as occurring in the text. • I cannot search the texts the way I do in Google.
‘topic modelling’, in combination with language technology (e.g. the mutual proximity of words). • The search possibilities were enhanced. • Historical researchers were more closely involved in the development of the tool (frequent test sessions). • Several suggestions for the user interface and for visualizations were followed.
(unfortunately in many cases far too small for really good results). • Faceted search & ‘Google search’. • The tool provides new suggestions after each search. • The possibility of looking for related paragraphs. • Visualizations & co-citations. • No ‘collaboratory’. • No annotations. • But very convenient for tracing some discussions.
short: when a number of persons are found together in the same paragraph, this co-occurrence is called a ‘co-citation’. • Since its introduction in the 1970s, co-citation analysis has become an important method for studying the structure of a scientific debate.
from each paragraph in the corpus of letters. • This has been made possible by software-driven ‘Named Entity Recognition’, followed by semi-automatic identification of the names of persons in the letters.
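The co-citation idea can be sketched in a few lines, assuming that Named Entity Recognition and name identification have already produced a list of persons per paragraph. This is an illustrative sketch, not the ePistolarium implementation; the function name and the sample paragraphs are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(paragraph_persons):
    """Count how often each pair of persons occurs in the same paragraph."""
    counts = Counter()
    for persons in paragraph_persons:
        # A sorted set counts each pair once per paragraph and keeps
        # the order of the two names within a pair stable.
        for pair in combinations(sorted(set(persons)), 2):
            counts[pair] += 1
    return counts

# Hypothetical input: persons identified in four paragraphs.
paragraphs = [
    ["Huygens", "Hevelius", "Boulliau"],
    ["Huygens", "Hevelius"],
    ["Divini", "Fabri", "Huygens"],
    ["Huygens"],
]

counts = cocitation_counts(paragraphs)
top = counts.most_common(1)
print(top)
```

The resulting pair counts can serve as edge weights in a network of persons, which is one way a co-citation visualization could be built.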
functionality, I looked at whom the ePistolarium identifies as the major figures in the scientific debate concerning the discovery of the ring structure and the moon of Saturn. • Christiaan Huygens played a crucial role in this debate. • Christiaan’s correspondence is part of the ePistolarium, so this case is well suited to testing the co-citation functionality of the ePistolarium tool.
knowledge is necessary to evaluate the reliability of the ePistolarium as a research tool for such a complicated corpus of letters in various languages, partly in historical spelling.
2 & 3 (Iapetus & Rhea)
VI. 1675 Discovery of the separation in Saturn’s ring (Cassini Division)
VII. 1684 Cassini discovers moons 4 & 5 (Tethys & Dione)
Saturni" NOT "diem Saturni" NOT "diebus Saturni" • 6 letters • Only Galilei is mentioned as Saturn’s first observer. • Problem: Saturn can also be used in an astrological context.
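The effect of the NOT clauses in such a query can be sketched as a simple filter. This is not how the ePistolarium search is implemented; it is a minimal illustration using plain substring matching. The opening of the query is cut off in the source, so the include term "Saturni" is an assumption, and the letter snippets are invented, not quotations from the corpus.

```python
def matches_query(text, include, exclude_phrases):
    """Return True if text contains `include` and none of `exclude_phrases`."""
    lowered = text.lower()
    # Any excluded phrase rules the text out, whatever else it contains.
    if any(phrase.lower() in lowered for phrase in exclude_phrases):
        return False
    return include.lower() in lowered

# Invented snippets for illustration only.
letters = {
    "letter_a": "de annulo Saturni",
    "letter_b": "post diem Saturni",
    "letter_c": "Diebus Saturni et Solis",
    "letter_d": "de luna et stellis",
}

hits = [
    name for name, text in letters.items()
    if matches_query(text, "Saturni", ["diem Saturni", "diebus Saturni"])
]
print(hits)
```

Excluding the day-name phrases removes one source of false hits, but, as the slide notes, astrological uses of Saturn would still pass such a filter.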
1. Descartes - provided theoretical framework
2. Boulliau - observer & astronomer
3. Gassendi - observer & astronomer
4. Hevelius - observer & astronomer
5. De Montmort - mathematician; leader of the ‘Montmor Academy’
6. De Roberval - mathematician with his own theory of Saturn
7. Heinsius - go-between
8. Van Schooten - go-between
9. Fermat - mathematician; worked on ellipses
10. Pascal - mathematician; worked on ellipses
1. Boulliau - observer & astronomer
2. Hevelius - observer & astronomer
3. Huygens - discussed Saturn’s ring
4. De Bessy - discussed Saturn’s ring
5. Wren - discussed Saturn’s ring
6. Fabri - opposed Huygens’ observations
7. Divini - telescope maker; opposed Huygens’ observations
8. Boyle - go-between
9. De Medici - dedication of the ‘Systema Saturni’
10. Thévenot - leader of an informal scholarly society in Paris
11. Vossius - go-between
1. Cassini - astronomer (discoverer of the moons of Saturn)
2. Campani - constructor of large telescopes
3. Colbert - appointed Cassini at the Observatoire
4. Wallis - mathematician; discussed the shape of Saturn
5. Descartes - provided cosmological model of explanation
6. Copernicus - provided cosmological model of explanation
7. La Hire - French astronomer
8. Alhazen - Arab natural philosopher & mathematician
9. Sluze - mathematician; worked on curves
10. Catelan - author of a book on scales
11. Mariotte - ‘physicist’ (worked on watches)
12. Galois - ‘physicist’ (worked on mechanics)
important players in the discussions on Saturn. • In a few cases, a person not mentioned earlier in the literature on Saturn still seems to have been involved in some way.
can: • accelerate and broaden the possibilities for research in the humanities. • generate (in the future) new questions and answers. • offer a quick visual overview of relevant stakeholders and places (now already). • …
small for relevant results. • What is not digitally available is missed! • Expectations must not be too high (the question must fit the tool). • Digital maintenance and expansion of project-financed tools have not been satisfactorily guaranteed (at least in our case). • Standardisation of metadata is essential. • International cooperation is required. • …
data enrichment and annotation, and new user interfaces in a virtual research environment • Linking letters to other documents, such as: • Notes and working papers of scholars • Grotius Information Master • Early periodicals, to study the impact of the letter format on the development of the periodical • Adding more letters and metadata to create the critical mass necessary to test/falsify existing theories of the Republic of Letters and to develop new questions