EAD Ingest and Heracles Workflow

dbrower
September 10, 2012

A five-minute overview of the EAD ingest process for the digital library at the University of Notre Dame. The ingest runs on our new workflow system, Heracles, which is also described briefly.

Technologies used include ActiveFedora and Resque.

Transcript

  1. EAD Ingest and Heracles Workflow System

    Rajesh Balekai, Donald Brower
    Digital Library Services, Hesburgh Libraries, University of Notre Dame
    Hydra Partners’ Meeting, September 2012
  2. EAD Ingest

    We have many Encoded Archival Description (EAD) files describing various special collections. Every night we scan each EAD file and:
    - Ingest any new or changed item descriptions
    - Ingest any new or changed images
    - Create a new finding aid document, if needed
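
The slide does not say how "new or changed" items are detected. One common approach, sketched here purely as an assumption, is to compare a digest of each item's content with the digest recorded on the previous run. The names `digest`, `plan_ingest`, and the item ids are hypothetical and are not taken from the Notre Dame code.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return a hex SHA-256 digest of an item's serialized content."""
    return hashlib.sha256(content).hexdigest()


def plan_ingest(items: dict[str, bytes], seen: dict[str, str]) -> dict[str, list[str]]:
    """Classify tonight's items against the digests recorded on the last run.

    items: item id -> current content (hypothetical representation)
    seen:  item id -> digest stored after the previous ingest
    """
    plan = {"new": [], "changed": [], "unchanged": []}
    for item_id, content in sorted(items.items()):
        previous = seen.get(item_id)
        if previous is None:
            plan["new"].append(item_id)        # never ingested before
        elif previous != digest(content):
            plan["changed"].append(item_id)    # content differs from last run
        else:
            plan["unchanged"].append(item_id)  # nothing to do
    return plan


# Digests recorded by the previous nightly run (made-up sample data).
last_run = {
    "c01": digest(b"<c01>Letters</c01>"),
    "c02": digest(b"<c02>Photos</c02>"),
}
# Content found in tonight's scan: c02 was edited, c03 is new.
tonight = {
    "c01": b"<c01>Letters</c01>",
    "c02": b"<c02>Photographs</c02>",
    "c03": b"<c03>Maps</c03>",
}
plan = plan_ingest(tonight, last_run)
```

Only the ids under `new` and `changed` would need to be ingested. Under the same assumption, a fresh finding aid would be needed whenever either list is non-empty.
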
  3. Fedora Object Model

    [Diagram] Object types: Collection, Component, FindingAid, FileAsset (HTML file), Image. Relationship labels: :has_members / :is_member_of and :has_parts / :is_part_of.
  4. EAD Process Flow Chart, Top

    [Flow chart]
  5. EAD Process Flow Chart, Bottom

    [Flow chart]
  6. Image Ingest

    [Diagram]
  7. Heracles

    We implement the EAD ingestion process on top of our new background job system, Heracles. We plan to develop more services that use it:
    - Ingest EAD files and the content specified therein
    - Handle digital reformatting of brittle items
    - Automate as much ingest processing as possible
    - Oversee scanning and digitization workflows
    - Run background integrity checks of stored content
  8. How

    - Each job has a state machine to control execution of the job’s tasks
    - Job state and information are stored in a relational database
    - Tasks are idempotent units of execution that do the actual work
    - Distributing the tasks is handled by Resque
    - Worker tasks use ActiveFedora to interface with Fedora and Solr
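
As a rough illustration of the design above, the following is a minimal sketch of a job whose state machine advances through idempotent tasks. It is not the Heracles code. The state names, the `Job` class, and both tasks are hypothetical. Job state lives in memory here rather than in a relational database, and `step()` is called directly where the real system would hand each task to a Resque worker.

```python
from dataclasses import dataclass, field
from typing import Callable

Task = Callable[[dict], None]


@dataclass
class Job:
    tasks: list[tuple[str, Task]]
    context: dict = field(default_factory=dict)
    state: str = "pending"  # pending -> running -> done | failed
    next_task: int = 0

    def step(self) -> None:
        """Run the next task and move the state machine one transition."""
        if self.state in ("done", "failed"):
            return
        if self.next_task >= len(self.tasks):
            self.state = "done"
            return
        self.state = "running"
        name, task = self.tasks[self.next_task]
        try:
            task(self.context)
        except Exception as exc:
            self.state = "failed"
            self.context["error"] = f"{name}: {exc}"
            return
        self.next_task += 1
        if self.next_task == len(self.tasks):
            self.state = "done"

    def retry(self) -> None:
        """Resume a failed job at the task that failed."""
        if self.state == "failed":
            self.state = "running"


def store_description(ctx: dict) -> None:
    # Idempotent: adding the same id to a set twice leaves the same result.
    ctx.setdefault("stored", set()).add(ctx["item_id"])


attempts = {"index": 0}


def flaky_index(ctx: dict) -> None:
    # Fails on the first call to simulate a temporary outage, then succeeds.
    attempts["index"] += 1
    if attempts["index"] == 1:
        raise RuntimeError("index unavailable")
    ctx["indexed"] = True


job = Job(tasks=[("store", store_description), ("index", flaky_index)],
          context={"item_id": "c01"})
job.step()   # runs "store"
job.step()   # "index" fails; the job moves to "failed"
failed_state = job.state
job.retry()
job.step()   # "index" succeeds on the second attempt; the job is "done"
```

Because each task is idempotent, retrying a failed job is safe even if the failed task had partly completed before the error.
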
  9. Open Issues

    - We have not determined the best way to automate a test of the whole stack
    - We are still working out deployment details, using Jenkins, Capistrano, and resque-pool
  10. Sharing the Code

    Code will be on GitHub. Two repositories:
    - Heracles: the workflow state machine and general framework (https://github.com/ndlib/heracles)
    - TBD: our worker tasks. The goal is to make tasks reusable between workflows.