Computer-assisted approaches in the humanities

Talk held at the workshop "Research questions in the humanities as challenges to computer science" (Max Planck Institute for the History of Science, Berlin, 2017/12/06-07).

Johann-Mattis List

December 06, 2017

Transcript

  1. Computer-Assisted Approaches in the Humanities: Reconciling Computational and Classical Research

    Johann-Mattis List and Simon Greenhill
    Matchmaking Workshop, December 6/7, 2017, Berlin
    Max Planck Institute for the Science of Human History
  2. What do biologists do if they want to carry out an analysis involving data?
  9. • They download the data from some server...
    • they rearrange the data according to their needs...
    • using simple command line tools...
    • they run an analysis with some software...
    • they refine the data manually, if needed...
    • they write a paper...
    • and publish it, along with their derived dataset.
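The workflow on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the table `RAW` stands in for a file downloaded from a server, the column names are hypothetical, and counting differing sites is only a toy replacement for a real analysis.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw table standing in for a file downloaded from a server.
RAW = """\
ID\tTAXON\tGENE\tSEQUENCE
1\tHuman\tcytb\tATGACC
2\tChimp\tcytb\tATGACT
3\tHuman\tcox1\tATGTTC
4\tChimp\tcox1\tATGTTC
"""


def rearrange(raw_text):
    """Regroup the rows by gene: {gene: {taxon: sequence}}."""
    by_gene = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(raw_text), delimiter="\t"):
        by_gene[row["GENE"]][row["TAXON"]] = row["SEQUENCE"]
    return dict(by_gene)


def analyse(by_gene):
    """Toy analysis: count the differing sites between the two taxa per gene."""
    results = {}
    for gene, seqs in by_gene.items():
        a, b = seqs["Human"], seqs["Chimp"]
        results[gene] = sum(1 for x, y in zip(a, b) if x != y)
    return results


def export(results):
    """Write the derived dataset as plain TSV so it can accompany a paper."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["GENE", "DIFFERENCES"])
    for gene in sorted(results):
        writer.writerow([gene, results[gene]])
    return out.getvalue()


derived = export(analyse(rearrange(RAW)))
print(derived)
```

Each step reads and writes plain text, so the intermediate results can be opened in any editor and refined manually before the next step runs.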
  10. What do linguists do if they want to carry out an analysis involving data?
  15. • They hire a programmer to help them build a database...
    • the programmer creates an online interface, so they can easily insert the data...
    • but they end up using Excel, since the interface is too complicated to be used...
    • and they know how to use Excel anyway (more or less)...
    • the programmer writes an upload routine to import data from Excel...
  21. • they promise their colleagues that they will publish the database soon...
    • but the programmer has left the project in order to join Google or Facebook...
    • the demo version still runs on an old server...
    • many colleagues know the URL and use the data occasionally...
    • but they cannot use it officially, since they don’t know how to cite it...
    • the linguists decide to apply for more funding to finish the database.
  22. What do philologists do if they want to carry out an analysis involving data?
  23. • They hire a programmer to build a database...
    • the programmer creates an online interface, so they can easily insert the data...
    • but they end up using Excel, since the interface is too complicated to be used...
    • and they know how to use Excel anyway (more or less)...
    • the programmer writes an upload routine to import data from Excel...
  24. STOP!

  31. Problems of Computational Approaches in the Humanities

    We have
    • a strong divide between computational and classical experts...
    • who often mistrust each other...
    • with classical scientists seeing computational scientists as servants who build their databases...
    • or as people who abuse their data for shiny but senseless publications...
    • and computational scientists seeing classical scientists as stubborn relics from the last century...
    • who don’t understand the usefulness of computational methods.
  37. Problems of Computational Approaches in the Humanities

    Computational scientists often lack understanding
    • of the specifics of the problems in the humanities...
    • where scholars have been working for centuries on their individual problems...
    • and have often gained great insights into those problems...
    • and rightfully demand that computational scientists respect the nature of their problems...
    • and take classical approaches seriously.
  43. Problems of Computational Approaches in the Humanities

    Classical scientists often do not understand
    • that computational approaches do not threaten their jobs...
    • or question the work they have done so far...
    • but instead could offer the chance to gain new insights...
    • or to speed up the tedious process of qualitative analysis...
    • by providing practical help with tasks that even classical scientists consider repetitive and boring.
  44. General Misunderstandings Between the Two Camps

    Taken from: https://xkcd.com/1831/
    Thanks to Matthew Scarborough for sharing.
  48. General Misunderstandings

    What computational scientists misunderstand or ignore:
    • problems in the humanities can be extremely hard
    • big data approaches do not necessarily work on small data
    • black box approaches are satisfactory for industry applications but not for scientific endeavour
    • being a mathematician does not automatically qualify one to solve problems in the humanities
  52. General Misunderstandings

    What classical scientists misunderstand or ignore:
    • computational approaches do not exclude qualitative approaches
    • computational solutions can increase the consistency of “manual” data inspection
    • computational approaches may provide a fresh perspective on long-standing problems in the humanities
    • there is no reason to be proud if one doesn’t understand basic mathematics
  53. But how can we integrate the two camps?

  54. ... By shifting the paradigm!

  55. Instead of computer-based vs. classical computer-less approaches, we need a paradigm of computer-assisted approaches, as is already common in biology and other disciplines.
  56. Computer-Assisted Approaches in the Humanities

    Computer-Assisted Language Comparison (List 2017-2022, ERC STG)
  60. Main Features of CAAH

    • data must be human- and machine-readable; scholars should never lose contact with the original data (compare the talk by Robert Forkel)
    • interfaces must be lightweight and not disguise the nature of the real data
    • software must be adapted to the specific needs of research in the humanities and produce transparent results that can be manually inspected and corrected by the human researchers
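A small Python sketch may help to make these features concrete. It is only an illustration under stated assumptions: the tab-separated wordlist, its column names (`AUTO_COGNATE`, `MANUAL_COGNATE`) and the helper functions are hypothetical and not the format or API of any particular tool. The point is that the same plain-text table can be read in a text editor and parsed by a program, and that a manual correction always overrides the automatic judgement.

```python
import csv
import io

# Hypothetical wordlist; all column names are assumptions for illustration.
# AUTO_COGNATE is what a program proposed, MANUAL_COGNATE is the scholar's
# correction (empty if the automatic judgement was accepted).
WORDLIST = """\
ID\tLANGUAGE\tCONCEPT\tFORM\tAUTO_COGNATE\tMANUAL_COGNATE
1\tGerman\thand\thant\t1\t
2\tEnglish\thand\thænd\t1\t
3\tGerman\thead\tkɔpf\t2\t
4\tEnglish\thead\thɛd\t2\t3
"""


def load(text):
    """Parse the tab-separated wordlist into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def final_cognate(row):
    """Prefer the scholar's manual judgement over the automatic one."""
    return row["MANUAL_COGNATE"] or row["AUTO_COGNATE"]


def cognate_sets(rows):
    """Group the word forms by their final cognate identifier."""
    sets = {}
    for row in rows:
        label = row["LANGUAGE"] + ":" + row["FORM"]
        sets.setdefault(final_cognate(row), []).append(label)
    return sets


for cognate_id, words in sorted(cognate_sets(load(WORDLIST)).items()):
    print(cognate_id, ", ".join(words))
```

Because the automatic column is never overwritten, the original proposal stays visible next to the correction, and both can be inspected later.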
  61. Examples for CAAH: CALC-Project (List 2017-2022)

    EDICTOR: Etymological Dictionary Editor (List 2017)
  62. Examples for CAAH: CALC-Project (List 2017-2022)

    Data underlying the EDICTOR

  63. Examples for CAAH: CALC-Project (List 2017-2022)

    LingPy software for sequence comparison in linguistics (List et al. 2017)
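To give an idea of what sequence comparison means here, the following Python sketch computes a plain edit distance between two words represented as lists of sound segments. It is not LingPy's API and not the algorithms LingPy implements; the function names and the segmented example words are assumptions chosen for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of sound segments."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute (cost 0 on a match)
            ))
        previous = current
    return previous[-1]


def normalized_distance(a, b):
    """Edit distance divided by the length of the longer sequence."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


# Hypothetical segmented transcriptions of German and English "hand".
german = ["h", "a", "n", "t"]
english = ["h", "æ", "n", "d"]

print(edit_distance(german, english))
print(normalized_distance(german, english))
```

Treating every substitution as equally costly is the simplest possible choice; it ignores that some sounds are more similar than others, which is the kind of linguistic knowledge that dedicated software needs to take into account.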
  64. Examples for CAAH: CALC-Project (List 2017-2022)

    Cookbook: Recipes for LingPy (List 2016)
  69. What We Can Learn From Biologists

    • foster training in computational basics (command line, shell programming, interfaces, data handling)
    • propagate standard formats for data
    • offer solutions for collaborative data storage accompanying publications
    • foster interdisciplinary teams in which classical and computational scientists collaborate
  76. Collaborative Potential for the MM Workshop

    • rethink the big data vs. small data problem and the challenge it poses for computer science
    • foster a smart application of machine learning tools (not a proof of concept but actual help with data pre-processing in computer-assisted frameworks)
    • discuss the generalizability of workflows for similar tasks (e.g., digitization)
    • improve the computational training of scholars (with a focus on basic tasks like using the command line)
    • promote new scientific profiles (scholars who can bridge the gap and have training in both classical and computational approaches)
  77. http://calc.digling.org

    Thank you for your attention!