Talk held at the workshop "Research questions in the humanities as challenges to computer science" (Max Planck Institute for the History of Science, Berlin, 2017/12/06-07).
rearrange the data according to their needs...
• using simple command line tools...
• they run an analysis with some software...
• they refine the data manually, if needed...
• they write a paper...
• and publish it, along with their derived dataset.
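The "rearrange the data" step of this workflow can be sketched in a few lines. This is only an illustration under stated assumptions: the sample data, the column names (`language`, `concept`, `form`), and the helper functions `rearrange` and `to_tsv` are hypothetical and not taken from the talk. The sketch pivots a long-format table into one row per concept, so the result stays a plain-text file that can be inspected and refined by hand.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw data in long format: one row per language and concept.
RAW = """language\tconcept\tform
German\thand\tHand
English\thand\thand
German\tfoot\tFuß
English\tfoot\tfoot
"""


def rearrange(raw_text):
    """Pivot long-format rows into one row per concept, one column per language."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter="\t")
    table = defaultdict(dict)
    languages = []
    for row in reader:
        # Remember languages in order of first appearance.
        if row["language"] not in languages:
            languages.append(row["language"])
        table[row["concept"]][row["language"]] = row["form"]
    header = ["concept"] + languages
    rows = [
        [concept] + [forms.get(lang, "") for lang in languages]
        for concept, forms in table.items()
    ]
    return header, rows


def to_tsv(header, rows):
    """Serialize the rearranged table as tab-separated text."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


header, rows = rearrange(RAW)
print(to_tsv(header, rows))
```

The derived table can then be handed to the analysis software and published alongside the paper in the same plain-text form.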
database...
• the programmer creates an online interface, so they can easily insert the data...
• but they end up using Excel, since the interface is too complicated to be used...
• and they know how to use Excel anyway (more or less)...
• the programmer writes an upload routine to import data from Excel...
the database soon...
• but the programmer has left the project in order to join Google or Facebook...
• the demo version still runs on an old server...
• many colleagues know the URL and use the data occasionally...
• but they cannot use it officially, since they don’t know how to cite it...
• the linguists decide to apply for more funding to finish the database.
a strong divide between computational and classical experts...
• who often mistrust each other...
• with classical scientists seeing computational scientists as servants to create their databases...
• or as people who abuse their data for shiny but senseless publications...
• and computational scientists seeing classical scientists as stubborn relics from the last century...
• who don’t understand the usefulness of computational methods.
lack understanding
• for the specifics of the problems in the humanities...
• where scholars have been working for centuries on their individual problems...
• and often have gained great insights into those problems...
• and rightfully demand that computational scientists respect the nature of their problems...
• and take classical approaches seriously.
do not understand
• that computational approaches do not threaten their jobs...
• or question the work they have done so far...
• but instead could offer the chance to gain new insights...
• or to speed up the tedious process of qualitative analysis...
• by providing practical help with tasks that even classical scientists consider repetitive and boring.
in the humanities can be extremely hard
• big data approaches do not necessarily work on small data
• black box approaches are satisfying for industry applications but not for scientific endeavour
• being a mathematician does not automatically qualify one to solve problems in the humanities
approaches do not exclude qualitative approaches
• computational solutions can increase the consistency of “manual” data inspection
• computational approaches may provide a fresh perspective on long-standing problems in the humanities
• there is no reason to be proud if one doesn’t understand basic mathematics
machine-readable, scholars should never lose contact with the original data (compare the talk by Robert Forkel)
• interfaces must be lightweight and not disguise the nature of the real data
• software must be adapted to the specific needs of research in the humanities and produce transparent results that can be manually inspected and corrected by the human researchers
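One way to read the demand for transparent, correctable results is sketched below. It is a minimal illustration, not a description of any existing tool: the table layout, the column names (`auto_label`, `manual_label`), and the function `resolve` are all assumptions made for this example. The automatic judgment and the human correction live side by side in a plain-text table, the manual entry takes precedence, and the source of each final value stays visible.

```python
import csv
import io

# Hypothetical output of an automatic analysis, with an empty column
# reserved for corrections by a human expert.
AUTO = """id\tform\tauto_label\tmanual_label
1\tHand\tA\t
2\thand\tA\t
3\tFuß\tA\tB
"""


def resolve(tsv_text):
    """Prefer the manual label where one was entered; record where each label came from."""
    resolved = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        manual = (row["manual_label"] or "").strip()
        resolved.append({
            "id": row["id"],
            "form": row["form"],
            "label": manual or row["auto_label"],
            "source": "manual" if manual else "automatic",
        })
    return resolved


for entry in resolve(AUTO):
    print(entry["id"], entry["form"], entry["label"], entry["source"], sep="\t")
```

Because the automatic label is never overwritten, a later reader can still see what the software proposed and where the researcher disagreed.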
computational basics (command line, shell programming, interfaces, data handling)
• propagate standard formats for data
• offer solutions for collaborative data storage accompanying publications
• foster interdisciplinary teams in which classical and computational scientists collaborate
data vs. small data problem and the challenge for computer science
• foster a smart application of machine learning tools (not proofs of concept, but actual help with data pre-processing in computer-assisted frameworks)
• discuss generalizability of workflows for similar tasks (e.g., digitization)
• improve computational training of scholars (with a focus on basic tasks like using the command line)
• promote new scientific profiles (scholars who can bridge the gap and have training in classical and computational approaches)