
Crowdsourcing Historical Research

Presented at Drupal Downunder 2012.

Claudine Chionh

April 19, 2012

Transcript

  1. Founders and Survivors • Study of the 73,000 convicts transported to Van Diemen's Land (Tasmania) between 1803 and 1853 • Records from the convict system and elsewhere • Health, environment, lifestyle, wellbeing • Effects on health and resilience of descendants • http://foundersandsurvivors.org/
  2. Goals of the project • Compile (health and demographic) data about this population from a range of sources • Enable other researchers to use this data • Explore quantitative and geographic tools and analyses that are not commonly used in historical research • Combine professional expertise with the enthusiasm of volunteers
  3. Some research projects • Morbidity and mortality on the voyage to Australia • Crime and convicts in Tasmania, 1853-1900 • Fertility decline in late C19 Tasmania • Prostitution and female convicts • Tracing convicts' descendants who served in WWI • http://foundersandsurvivors.org/research
  4. Who are our users? • Research team • Other interested researchers • Genealogists/family historians • Local historians
  5. Data sources • Conduct records • Surgeons' journals • Newspaper reports • Births, deaths, marriages • Parish records • Family histories, memories, legends
  6. Official/formal sources • Records from the convict system: Trial, conviction documents • Conduct records • Ship surgeons' journals • Permissions to marry • Ticket of leave • Outside the convict system: Births, deaths, marriages • Later convictions
  7. The Founders and Survivors database • XML (based on Text Encoding Initiative http://www.tei-c.org/) • BaseX XML database engine http://basex.org/
  8. Experimenting with Drupal • Used an older version of Migrate to import some tabular data as nodes • Problem of scale: 73,000 convicts • XML approach proved to be more efficient
  9. Getting data into our system • Formal sources: Collected by archives and individual researchers • CSV, Excel, Filemaker, Access ... • Incorporated into BaseX database with Perl scripts • Informal sources: Individual convicts' life histories are captured in a Drupal content type ('Community contributed content') • Some sub-projects also capture summary data in Google spreadsheets
  10. Viewing data • Master database in BaseX: presented via XSLT, with different views for logged-in researchers and others • Community contributed content (CCC): Drupal • Two-way link between master database and CCC • Google spreadsheets prepopulated with links to corresponding records in master database
  11. Data capture • Convict biographies captured in Drupal – Community Contributed Content (CCC) • Linked to entry in XML database • Perl scripts to incorporate CCC records into master database
  12. Ships (batches of data) • Tracing all convicts on a ship • Summary data in Google Spreadsheets • Spreadsheets are prepopulated from the master database
  13. Where Drupal is appropriate for our project • Web frontend • Data capture • Collaboration, forums
  14. Summary • Massive XML database with complex relations • Drupal for capturing slightly complex data and facilitating collaboration • Google Spreadsheets for capturing tabular data
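
The data flow described in slides 9, 10 and 12 (tabular records converted to XML, then spreadsheets prepopulated with links back to the master records) could look roughly like the sketch below. It is illustrative only, not the project's code: the deck says the real import used Perl scripts and a TEI-based schema, whereas this uses Python's standard library. The column names, the sample rows, the choice of TEI-style elements (listPerson, person, persName, event) and the example.org base URL are all assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Qualified name that ElementTree serialises as the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented sample rows standing in for a CSV export from an archive.
# Column names and values are hypothetical, not the project's data.
SAMPLE_CSV = """convict_id,surname,forename,ship,arrival_year
c00001,Doe,Jane,Example Ship,1830
c00002,Roe,Richard,Example Ship,1830
"""


def rows_to_list_person(csv_text):
    """Convert tabular rows into a TEI-style <listPerson> element."""
    list_person = ET.Element("listPerson")
    for row in csv.DictReader(io.StringIO(csv_text)):
        person = ET.SubElement(list_person, "person", {XML_ID: row["convict_id"]})
        name = ET.SubElement(person, "persName")
        ET.SubElement(name, "forename").text = row["forename"]
        ET.SubElement(name, "surname").text = row["surname"]
        # Record the voyage as an event; the structure is an assumption.
        event = ET.SubElement(
            person, "event", {"type": "arrival", "when": row["arrival_year"]}
        )
        ET.SubElement(event, "label").text = row["ship"]
    return list_person


def record_url(base_url, convict_id):
    """Build a link to one master record (hypothetical URL scheme)."""
    return f"{base_url.rstrip('/')}/{convict_id}"


def ship_sheet_rows(list_person, base_url):
    """Summary rows for a ship: id, surname, link back to the master record."""
    rows = []
    for person in list_person.iter("person"):
        convict_id = person.get(XML_ID)
        surname = person.findtext("persName/surname")
        rows.append([convict_id, surname, record_url(base_url, convict_id)])
    return rows


if __name__ == "__main__":
    people = rows_to_list_person(SAMPLE_CSV)
    print(ET.tostring(people, encoding="unicode"))
    for sheet_row in ship_sheet_rows(people, "https://example.org/record"):
        print(sheet_row)
```

Keeping the record identifier as the single key in both the XML and the spreadsheet rows is what makes the two-way link possible: a volunteer's row can always be traced back to one entry in the master database.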