Computer-assisted approaches in the humanities

Talk held at the workshop "Research questions in the humanities as challenges to computer science" (Max Planck Institute for the History of Science, Berlin, 2017/12/06-07).

Johann-Mattis List

December 06, 2017

Transcript

  1. Computer-Assisted Approaches in the
    Humanities
    Reconciling Computational and Classical Research
    Johann-Mattis List and Simon Greenhill
    Matchmaking Workshop, December 6/7, 2017, Berlin
    Max Planck Institute for the Science of Human History

  2. What do biologists do, if
    they want to make an
    analysis involving data?

  9. • They download the data from some server...
    • they rearrange the data according to their
    needs...
    • using simple command line tools...
    • they run an analysis with some software...
    • they refine the data manually, if needed...
    • they write a paper...
    • and publish it, along with their derived
    dataset.
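
The workflow listed above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the inline `RAW` table stands in for a file downloaded from a server, and the column names and the "mean length" analysis are hypothetical, not taken from the talk.

```python
import csv
import io

# Stand-in for a file downloaded from some server (hypothetical content).
RAW = """id\tspecies\tgene\tlength
1\tmouse\tA\t120
2\thuman\tA\t118
3\tmouse\tB\t300
4\thuman\tB\t310
"""


def load(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def rearrange(rows):
    """Regroup the rows by gene, mapping each species to an integer length."""
    by_gene = {}
    for row in rows:
        by_gene.setdefault(row["gene"], {})[row["species"]] = int(row["length"])
    return by_gene


def analyse(by_gene):
    """Compute the mean length per gene (a toy stand-in for a real analysis)."""
    return {gene: sum(v.values()) / len(v) for gene, v in by_gene.items()}


def export(result):
    """Serialize the derived dataset as TSV so it can accompany the paper."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["gene", "mean_length"])
    for gene in sorted(result):
        writer.writerow([gene, f"{result[gene]:.1f}"])
    return out.getvalue()


derived = export(analyse(rearrange(load(RAW))))
print(derived)
```

Every step reads and writes plain tab-separated text, so the intermediate data can also be inspected or refined by hand in a text editor or spreadsheet.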

  10. What do linguists do, if
    they want to make an
    analysis involving data?

  15. • They hire a programmer to help them build
    a database...
    • the programmer creates an online interface,
    so they can easily insert the data...
    • but they end up using Excel, since the
    interface is too complicated to be used...
    • and they know how to use Excel anyway
    (more or less)...
    • the programmer writes an upload routine to
    import data from Excel...

  21. • they promise to their colleagues that they
    will publish the database soon...
    • but the programmer has left the project in
    order to join Google or Facebook...
    • the demo version still runs on an old server...
    • many colleagues know the URL and use the
    data occasionally...
    • but they cannot use it officially, since they
    don’t know how to cite it...
    • the linguists decide to apply for more
    funding to finish the database.

  22. What do philologists do, if
    they want to make an
    analysis involving data?

  23. • They hire a programmer to build a
    database...
    • the programmer creates an online interface,
    so they can easily insert the data...
    • but they end up using Excel, since the
    interface is too complicated to be used...
    • and they know how to use Excel anyway
    (more or less)...
    • the programmer writes an upload routine to
    import data from Excel...

  24. STOP!

  31. Problems of Computational Approaches in the Humanities
    We have
    • a strong divide between computational and classical experts...
    • who often mistrust each other...
    • with classical scientists seeing computational scientists as
    servants to create their databases...
    • or as people who abuse their data for shiny but senseless
    publications...
    • and computational scientists seeing classical scientists as
    stubborn relics from the last century...
    • who don’t understand the usefulness of computational
    methods.

  37. Problems of Computational Approaches in the Humanities
    Computational scientists often lack understanding
    • of the specifics of the problems in the humanities...
    • where scholars have been working for centuries on their
    individual problems...
    • and often have gained great insights into those problems...
    • and rightfully demand that computational scientists respect the
    nature of their problems...
    • and take classical approaches seriously.

  43. Problems of Computational Approaches in the Humanities
    Classical scientists often do not understand
    • that computational approaches do not threaten their jobs...
    • or question the work they have done so far...
    • but instead could offer the chance to gain new insights...
    • or to speed up the tedious process of qualitative analysis...
    • by providing practical help with tasks which even classical
    scientists will consider repetitive and boring.

  44. General Misunderstandings Between the Two Camps
    Taken from: https://xkcd.com/1831/
    Thanks to Matthew Scarborough for sharing.

  48. General Misunderstandings
    What computational scientists misunderstand or ignore:
    • problems in the humanities can be extremely hard
    • big data approaches do not necessarily work on small data
    • black box approaches may suffice for industry applications
    but not for scientific endeavour
    • being a mathematician does not qualify one automatically to
    solve problems in the humanities

  52. General Misunderstandings
    What classical scientists misunderstand or ignore:
    • computational approaches do not exclude qualitative
    approaches
    • computational solutions can increase the consistency of
    “manual” data inspection
    • computational approaches may provide a fresh perspective on
    long-standing problems in the humanities
    • there is no reason to be proud if one doesn’t understand basic
    mathematics

  53. But how can we integrate
    the two camps?

  54. By shifting the paradigm!

  55. Instead of computer-based vs. classical
    computer-less approaches, we need a
    paradigm of computer-assisted approaches as
    they are already common in biology and other
    disciplines.

  56. Computer-Assisted Approaches in the Humanities
    Computer-Assisted Language Comparison (List 2017-2022, ERC STG)

  60. Main Features of Computer-Assisted Approaches in the Humanities (CAAH)
    • data must be human- and machine-readable; scholars should
    never lose contact with the original data (compare the talk by
    Robert Forkel)
    • interfaces must be lightweight and not disguise the nature of
    the real data
    • software must be adapted to the specific needs of research in
    the humanities and produce transparent results that can be
    manually inspected and corrected by the human researchers
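
As a minimal sketch of what these features can mean in practice, the following Python example keeps a small wordlist in plain tab-separated text, applies a scholar's manual correction to an automatically assigned cognate set, and writes the result back in the same readable form. The column names (`ID`, `DOCULECT`, `CONCEPT`, `FORM`, `COGID`, `NOTE`), the sample rows, and the helper functions are assumptions made for illustration; they are not taken from the talk or from any specific tool.

```python
import csv
import io

# Hypothetical wordlist; COGID stands for an automatically assigned cognate set.
WORDLIST = """ID\tDOCULECT\tCONCEPT\tFORM\tCOGID
1\tGerman\thand\thant\t1
2\tEnglish\thand\thænd\t1
3\tGerman\ttooth\ttsaːn\t2
4\tEnglish\ttooth\ttuːθ\t3
"""


def read_wordlist(text):
    """Read the TSV text into a dictionary keyed by the integer row ID."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {int(row["ID"]): row for row in rows}


def apply_corrections(wordlist, corrections):
    """Return a corrected copy and mark every row a human has changed."""
    corrected = {}
    for idx, row in wordlist.items():
        new_row = dict(row)
        if idx in corrections:
            new_row["COGID"] = str(corrections[idx])
            new_row["NOTE"] = "manually corrected"
        else:
            new_row["NOTE"] = ""
        corrected[idx] = new_row
    return corrected


def write_wordlist(wordlist):
    """Write the wordlist back to TSV, so it stays readable for humans too."""
    fields = ["ID", "DOCULECT", "CONCEPT", "FORM", "COGID", "NOTE"]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fields, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for idx in sorted(wordlist):
        writer.writerow(wordlist[idx])
    return out.getvalue()


wordlist = read_wordlist(WORDLIST)
# The scholar judges English "tooth" to be cognate with German "Zahn".
fixed = apply_corrections(wordlist, {4: 2})
print(write_wordlist(fixed))
```

The original input is never modified, and the `NOTE` column records which judgments come from a human, so the automatic and the manual steps both remain transparent.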

  61. Examples for CAAH: CALC-Project (List 2017-2022)
    EDICTOR: Etymological Dictionary Editor (List 2017)

  62. Examples for CAAH: CALC-Project (List 2017-2022)
    Data underlying the EDICTOR

  63. Examples for CAAH: CALC-Project (List 2017-2022)
    LingPy software for sequence comparison in linguistics (List et al. 2017)
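
To give a rough idea of what sequence comparison on words involves, here is a generic sketch in plain Python. It is not LingPy's API or algorithm, which the slide does not describe; it simply computes a normalized Levenshtein (edit) distance over sound segments, and the function names and example segmentations are assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of sound segments."""
    previous = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        current = [i]
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def normalized_distance(a, b):
    """Edit distance divided by the length of the longer sequence."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


german = ["h", "a", "n", "t"]   # German "Hand", segmented by hand
english = ["h", "æ", "n", "d"]  # English "hand", segmented by hand
print(normalized_distance(german, english))
```

Working on lists of segments rather than raw strings keeps the comparison transparent: a scholar can see and correct the segmentation before any score is computed.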

  64. Examples for CAAH: CALC-Project (List 2017-2022)
    Cookbook: Recipes for LingPy (List 2016)

  69. What We Can Learn From Biologists
    • foster training in computational basics (command line, shell
    programming, interfaces, data handling)
    • propagate standard formats for data
    • offer solutions for collaborative data storage accompanying
    publications
    • foster interdisciplinary teams in which classical and
    computational scientists collaborate

  75. Collaborative Potential for the MM Workshop
    • rethink the big data vs. small data problem and the challenge
    for computer science
    • foster a smart application of machine learning tools (not as a
    proof of concept, but as actual help with data preprocessing in
    computer-assisted frameworks)
    • discuss generalizability of workflows for similar tasks (e.g.,
    digitization)
    • improve computational training of scholars (with a focus on
    basic skills such as using the command line)
    • promote new scientific profiles (scholars who can bridge the gap
    and have training in classical and computational approaches)

  77. http://calc.digling.org
    Thank you for your attention!
