Slide 1

Slide 1 text

Computer-Assisted Approaches in the Humanities
Reconciling Computational and Classical Research
Johann-Mattis List and Simon Greenhill
Matchmaking Workshop, December 6/7, 2017, Berlin
Max Planck Institute for the Science of Human History

Slide 2

Slide 2 text

What do biologists do, if they want to make an analysis involving data?

Slide 3

Slide 3 text

• They download the data from some server...

Slide 4

Slide 4 text

• They download the data from some server...
• they rearrange the data according to their needs...

Slide 5

Slide 5 text

• They download the data from some server...
• they rearrange the data according to their needs...
• using simple command line tools...

Slide 6

Slide 6 text

• They download the data from some server...
• they rearrange the data according to their needs...
• using simple command line tools...
• they run an analysis with some software...

Slide 7

Slide 7 text

• They download the data from some server...
• they rearrange the data according to their needs...
• using simple command line tools...
• they run an analysis with some software...
• they refine the data manually, if needed...

Slide 8

Slide 8 text

• They download the data from some server...
• they rearrange the data according to their needs...
• using simple command line tools...
• they run an analysis with some software...
• they refine the data manually, if needed...
• they write a paper...

Slide 9

Slide 9 text

• They download the data from some server...
• they rearrange the data according to their needs...
• using simple command line tools...
• they run an analysis with some software...
• they refine the data manually, if needed...
• they write a paper...
• and publish it, along with their derived dataset.
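
The "rearrange" and "derived dataset" steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is inlined rather than downloaded, the values are invented, and the column names (taxon, trait, value) and function names are hypothetical.

```python
import csv
import io

# Toy stand-in for a downloaded file; all values are invented for illustration.
DOWNLOADED = "taxon\ttrait\tvalue\nA\tlength\t10\nB\tlength\t12\nA\tweight\t3\n"


def pivot(raw_text):
    """Rearrange long-format rows into one dictionary of traits per taxon."""
    wide = {}
    for row in csv.DictReader(io.StringIO(raw_text), delimiter="\t"):
        wide.setdefault(row["taxon"], {})[row["trait"]] = row["value"]
    return wide


def write_derived(wide, traits):
    """Serialize the rearranged table as TSV, ready to accompany a paper."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["taxon"] + traits)
    for taxon in sorted(wide):
        # Missing observations stay visible as empty cells.
        writer.writerow([taxon] + [wide[taxon].get(t, "") for t in traits])
    return out.getvalue()


wide = pivot(DOWNLOADED)
derived = write_derived(wide, ["length", "weight"])
print(derived)
```

The derived table stays a plain text file, so it can be inspected by hand and refined manually before publication.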

Slide 10

Slide 10 text

What do linguists do, if they want to make an analysis involving data?

Slide 11

Slide 11 text

• They hire a programmer to help them build a database...

Slide 12

Slide 12 text

• They hire a programmer to help them build a database...
• the programmer creates an online interface, so they can easily insert the data...

Slide 13

Slide 13 text

• They hire a programmer to help them build a database...
• the programmer creates an online interface, so they can easily insert the data...
• but they end up using Excel, since the interface is too complicated to be used...

Slide 14

Slide 14 text

• They hire a programmer to help them build a database...
• the programmer creates an online interface, so they can easily insert the data...
• but they end up using Excel, since the interface is too complicated to be used...
• and they know how to use Excel anyway (more or less)...

Slide 15

Slide 15 text

• They hire a programmer to help them build a database...
• the programmer creates an online interface, so they can easily insert the data...
• but they end up using Excel, since the interface is too complicated to be used...
• and they know how to use Excel anyway (more or less)...
• the programmer writes an upload routine to import data from Excel...

Slide 16

Slide 16 text

• they promise to their colleagues that they will publish the database soon...

Slide 17

Slide 17 text

• they promise to their colleagues that they will publish the database soon...
• but the programmer has left the project in order to join Google or Facebook...

Slide 18

Slide 18 text

• they promise to their colleagues that they will publish the database soon...
• but the programmer has left the project in order to join Google or Facebook...
• the demo version still runs on an old server...

Slide 19

Slide 19 text

• they promise to their colleagues that they will publish the database soon...
• but the programmer has left the project in order to join Google or Facebook...
• the demo version still runs on an old server...
• many colleagues know the URL and use the data occasionally...

Slide 20

Slide 20 text

• they promise to their colleagues that they will publish the database soon...
• but the programmer has left the project in order to join Google or Facebook...
• the demo version still runs on an old server...
• many colleagues know the URL and use the data occasionally...
• but they cannot use it officially, since they don’t know how to quote it...

Slide 21

Slide 21 text

• they promise to their colleagues that they will publish the database soon...
• but the programmer has left the project in order to join Google or Facebook...
• the demo version still runs on an old server...
• many colleagues know the URL and use the data occasionally...
• but they cannot use it officially, since they don’t know how to quote it...
• the linguists decide to apply for more funding to finish the database.

Slide 22

Slide 22 text

What do philologists do, if they want to make an analysis involving data?

Slide 23

Slide 23 text

• They hire a programmer to build a database...
• the programmer creates an online interface, so they can easily insert the data...
• but they end up using Excel, since the interface is too complicated to be used...
• and they know how to use Excel anyway (more or less)...
• the programmer writes an upload routine to import data from Excel...

Slide 24

Slide 24 text

STOP!

Slide 25

Slide 25 text

Problems of Computational Approaches in the Humanities
We have

Slide 26

Slide 26 text

Problems of Computational Approaches in the Humanities
We have
• a strong divide between computational and classical experts...

Slide 27

Slide 27 text

Problems of Computational Approaches in the Humanities
We have
• a strong divide between computational and classical experts...
• who often mistrust each other...

Slide 28

Slide 28 text

Problems of Computational Approaches in the Humanities
We have
• a strong divide between computational and classical experts...
• who often mistrust each other...
• with classical scientists seeing computational scientists as servants to create their databases...

Slide 29

Slide 29 text

Problems of Computational Approaches in the Humanities
We have
• a strong divide between computational and classical experts...
• who often mistrust each other...
• with classical scientists seeing computational scientists as servants to create their databases...
• or as people who abuse their data for shiny but senseless publications...

Slide 30

Slide 30 text

Problems of Computational Approaches in the Humanities
We have
• a strong divide between computational and classical experts...
• who often mistrust each other...
• with classical scientists seeing computational scientists as servants to create their databases...
• or as people who abuse their data for shiny but senseless publications...
• and computational scientists seeing classical scientists as stubborn relics from the last century...

Slide 31

Slide 31 text

Problems of Computational Approaches in the Humanities
We have
• a strong divide between computational and classical experts...
• who often mistrust each other...
• with classical scientists seeing computational scientists as servants to create their databases...
• or as people who abuse their data for shiny but senseless publications...
• and computational scientists seeing classical scientists as stubborn relics from the last century...
• who don’t understand the usefulness of computational methods.

Slide 32

Slide 32 text

Problems of Computational Approaches in the Humanities
Computational scientists often lack understanding

Slide 33

Slide 33 text

Problems of Computational Approaches in the Humanities
Computational scientists often lack understanding
• for the specifics of the problems in the humanities...

Slide 34

Slide 34 text

Problems of Computational Approaches in the Humanities
Computational scientists often lack understanding
• for the specifics of the problems in the humanities...
• where scholars have been working for centuries on their individual problems...

Slide 35

Slide 35 text

Problems of Computational Approaches in the Humanities
Computational scientists often lack understanding
• for the specifics of the problems in the humanities...
• where scholars have been working for centuries on their individual problems...
• and often have gained great insights into those problems...

Slide 36

Slide 36 text

Problems of Computational Approaches in the Humanities
Computational scientists often lack understanding
• for the specifics of the problems in the humanities...
• where scholars have been working for centuries on their individual problems...
• and often have gained great insights into those problems...
• and rightfully demand that computational scientists respect the nature of their problems...

Slide 37

Slide 37 text

Problems of Computational Approaches in the Humanities
Computational scientists often lack understanding
• for the specifics of the problems in the humanities...
• where scholars have been working for centuries on their individual problems...
• and often have gained great insights into those problems...
• and rightfully demand that computational scientists respect the nature of their problems...
• and take classical approaches seriously.

Slide 38

Slide 38 text

Problems of Computational Approaches in the Humanities
Classical scientists often do not understand

Slide 39

Slide 39 text

Problems of Computational Approaches in the Humanities
Classical scientists often do not understand
• that computational approaches do not threaten their jobs...

Slide 40

Slide 40 text

Problems of Computational Approaches in the Humanities
Classical scientists often do not understand
• that computational approaches do not threaten their jobs...
• or question the work they have done so far...

Slide 41

Slide 41 text

Problems of Computational Approaches in the Humanities
Classical scientists often do not understand
• that computational approaches do not threaten their jobs...
• or question the work they have done so far...
• but instead could offer the chance to gain new insights...

Slide 42

Slide 42 text

Problems of Computational Approaches in the Humanities
Classical scientists often do not understand
• that computational approaches do not threaten their jobs...
• or question the work they have done so far...
• but instead could offer the chance to gain new insights...
• or to speed up the tedious process of qualitative analysis...

Slide 43

Slide 43 text

Problems of Computational Approaches in the Humanities
Classical scientists often do not understand
• that computational approaches do not threaten their jobs...
• or question the work they have done so far...
• but instead could offer the chance to gain new insights...
• or to speed up the tedious process of qualitative analysis...
• by providing practical help in tasks which even classical scientists will consider repetitive and boring.

Slide 44

Slide 44 text

General Misunderstandings Between the Two Camps
Taken from: https://xkcd.com/1831/
Thanks to Matthew Scarborough for sharing.

Slide 45

Slide 45 text

General Misunderstandings
What computational scientists misunderstand or ignore:
• problems in the humanities can be extremely hard

Slide 46

Slide 46 text

General Misunderstandings
What computational scientists misunderstand or ignore:
• problems in the humanities can be extremely hard
• big data approaches do not necessarily work on small data

Slide 47

Slide 47 text

General Misunderstandings
What computational scientists misunderstand or ignore:
• problems in the humanities can be extremely hard
• big data approaches do not necessarily work on small data
• black box approaches are satisfying for industry applications but not for scientific endeavour

Slide 48

Slide 48 text

General Misunderstandings
What computational scientists misunderstand or ignore:
• problems in the humanities can be extremely hard
• big data approaches do not necessarily work on small data
• black box approaches are satisfying for industry applications but not for scientific endeavour
• being a mathematician does not qualify one automatically to solve problems in the humanities

Slide 49

Slide 49 text

General Misunderstandings
What classical scientists misunderstand or ignore:
• computational approaches do not exclude qualitative approaches

Slide 50

Slide 50 text

General Misunderstandings
What classical scientists misunderstand or ignore:
• computational approaches do not exclude qualitative approaches
• computational solutions can increase the consistency of “manual” data inspection

Slide 51

Slide 51 text

General Misunderstandings
What classical scientists misunderstand or ignore:
• computational approaches do not exclude qualitative approaches
• computational solutions can increase the consistency of “manual” data inspection
• computational approaches may provide a fresh perspective on long-standing problems in the humanities

Slide 52

Slide 52 text

General Misunderstandings
What classical scientists misunderstand or ignore:
• computational approaches do not exclude qualitative approaches
• computational solutions can increase the consistency of “manual” data inspection
• computational approaches may provide a fresh perspective on long-standing problems in the humanities
• there is no reason to be proud if one doesn’t understand basic mathematics

Slide 53

Slide 53 text

But how can we integrate the two camps?

Slide 54

Slide 54 text

By shifting the paradigm!

Slide 55

Slide 55 text

Instead of computer-based vs. classical computer-less approaches, we need a paradigm of computer-assisted approaches as they are already common in biology and other disciplines.

Slide 56

Slide 56 text

Computer-Assisted Approaches in the Humanities
Computer-Assisted Language Comparison (List 2017-2022, ERC STG)

Slide 58

Slide 58 text

Main Features of CAAH
• data must be human- and machine-readable; scholars should never lose contact with the original data (compare the talk by Robert Forkel)

Slide 59

Slide 59 text

Main Features of CAAH
• data must be human- and machine-readable; scholars should never lose contact with the original data (compare the talk by Robert Forkel)
• interfaces must be lightweight and not disguise the nature of the real data

Slide 60

Slide 60 text

Main Features of CAAH
• data must be human- and machine-readable; scholars should never lose contact with the original data (compare the talk by Robert Forkel)
• interfaces must be lightweight and not disguise the nature of the real data
• software must be adapted to the specific needs of research in the humanities and produce transparent results that can be manually inspected and corrected by the human researchers
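
To make the first and third features concrete, here is a minimal sketch of data that is both human- and machine-readable: a small tab-separated wordlist that can be read in any text editor and checked by a script. The column names (ID, DOCULECT, CONCEPT, FORM, COGID) and the two checks are assumptions chosen for illustration, not a prescribed format.

```python
import csv
import io

# Toy wordlist in plain tab-separated text; column names are illustrative.
WORDLIST = (
    "ID\tDOCULECT\tCONCEPT\tFORM\tCOGID\n"
    "1\tGerman\thand\tHand\t1\n"
    "2\tEnglish\thand\thand\t1\n"
    "3\tDutch\thand\t\t1\n"
    "4\tEnglish\thand\thand\t1\n"
)


def load(text):
    """Parse the wordlist into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def check(rows):
    """Return readable warnings; nothing is corrected automatically."""
    warnings = []
    seen = set()
    for row in rows:
        key = (row["DOCULECT"], row["CONCEPT"], row["FORM"])
        if not row["FORM"]:
            warnings.append(f"row {row['ID']}: empty FORM")
        elif key in seen:
            warnings.append(f"row {row['ID']}: duplicate of an earlier entry")
        seen.add(key)
    return warnings


rows = load(WORDLIST)
problems = check(rows)
for warning in problems:
    print(warning)
```

The script only reports problems and leaves every correction to the researcher, which keeps the result transparent and open to manual inspection.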

Slide 61

Slide 61 text

Examples for CAAH: CALC-Project (List 2017-2022)
EDICTOR: Etymological Dictionary Editor (List 2017)

Slide 62

Slide 62 text

Examples for CAAH: CALC-Project (List 2017-2022)
Data underlying the EDICTOR

Slide 63

Slide 63 text

Examples for CAAH: CALC-Project (List 2017-2022)
LingPy software for sequence comparison in linguistics (List et al. 2017)
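
To give a rough idea of what sequence comparison means here, the sketch below computes a normalized edit distance between two words represented as lists of sound segments. This is a plain Levenshtein distance used as a stand-in; it does not use LingPy's API and is not the method LingPy implements. The function names and the simplified transcriptions are assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of sound segments."""
    previous = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        current = [i]
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def normalized_distance(a, b):
    """Scale the distance by the longer sequence so scores fall in [0, 1]."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


# Simplified transcriptions of German "Hand" and English "hand".
german = ["h", "a", "n", "t"]
english = ["h", "æ", "n", "d"]
score = normalized_distance(german, english)
print(score)
```

Working on segment lists rather than raw strings keeps multi-character sounds intact, and a low score is only a hint for the researcher to inspect, not a decision about cognacy.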

Slide 64

Slide 64 text

Examples for CAAH: CALC-Project (List 2017-2022)
Cookbook: Recipes for LingPy (List 2016)

Slide 65

Slide 65 text

What We Can Learn From Biologists

Slide 66

Slide 66 text

What We Can Learn From Biologists
• foster training in computational basics (command line, shell programming, interfaces, data handling)

Slide 67

Slide 67 text

What We Can Learn From Biologists
• foster training in computational basics (command line, shell programming, interfaces, data handling)
• propagate standard formats for data

Slide 68

Slide 68 text

What We Can Learn From Biologists
• foster training in computational basics (command line, shell programming, interfaces, data handling)
• propagate standard formats for data
• offer solutions for collaborative data storage accompanying publications

Slide 69

Slide 69 text

What We Can Learn From Biologists
• foster training in computational basics (command line, shell programming, interfaces, data handling)
• propagate standard formats for data
• offer solutions for collaborative data storage accompanying publications
• foster interdisciplinary teams in which classical and computational scientists collaborate

Slide 70

Slide 70 text

Collaborative Potential for the MM Workshop

Slide 71

Slide 71 text

Collaborative Potential for the MM Workshop
• rethink the big data vs. small data problem and the challenge for computer science

Slide 72

Slide 72 text

Collaborative Potential for the MM Workshop
• rethink the big data vs. small data problem and the challenge for computer science
• foster a smart application of machine learning tools (no proof of concept but actual help for data pre-processing in computer-assisted frameworks)

Slide 73

Slide 73 text

Collaborative Potential for the MM Workshop
• rethink the big data vs. small data problem and the challenge for computer science
• foster a smart application of machine learning tools (no proof of concept but actual help for data pre-processing in computer-assisted frameworks)
• discuss generalizability of workflows for similar tasks (e.g., digitization)

Slide 74

Slide 74 text

Collaborative Potential for the MM Workshop
• rethink the big data vs. small data problem and the challenge for computer science
• foster a smart application of machine learning tools (no proof of concept but actual help for data pre-processing in computer-assisted frameworks)
• discuss generalizability of workflows for similar tasks (e.g., digitization)
• improve computational training of scholars (with a focus on basic tasks like command line)

Slide 75

Slide 75 text

Collaborative Potential for the MM Workshop
• rethink the big data vs. small data problem and the challenge for computer science
• foster a smart application of machine learning tools (no proof of concept but actual help for data pre-processing in computer-assisted frameworks)
• discuss generalizability of workflows for similar tasks (e.g., digitization)
• improve computational training of scholars (with a focus on basic tasks like command line)
• promote new scientific profiles (scholars who can bridge the gap and have training in classical and computational approaches)

Slide 77

Slide 77 text

http://calc.digling.org
Thank you for your attention!