Slide 1

Slide 1 text

After Cancellation: Reconstructing Knowledge Bases With OpenRefine
ALCTS Electronic Resources Interest Group, ALA Annual 2018
Angela Galvan | Electronic Resources Manager | Brown University
@panoptigoth | asgalvan.com

Slide 2

Slide 2 text

The first problem
• Significant changes to our ScienceDirect agreement.
• Loss of institutional memory about serials/database relationships.
• ARL privilege = reactive response to cuts.
• Lost access to the Freedom Collection, a package of about 2k titles.
• Me: “Great! I’ll uncheck the box in Serials Solutions for that database.”

Slide 3

Slide 3 text

ARL privilege
One-time money/end-of-year funds ‘solve’ problems.
Collections are static, not living.
eResources are “on or off,” without a sense of the underlying workflow/data.

Slide 4

Slide 4 text

Four goals
I like to share things, so it was important to develop a solution with these goals in mind:
1. Should be portable to other libraries with Serials Solutions.
2. Use tools available to most library workers.
3. Free, with well-supported communities, like OpenRefine, MarcEdit, or similar.
4. Workflow contained within a single department/person.

Slide 5

Slide 5 text

Bespoke, artisanal Serials Solutions workflow
• Previous documentation noted ScienceDirect was “heavily customized, use ScienceDirect database and not the smaller Freedom Collection.”
• This means staff were previously checking thousands of titles by hand between the knowledge base and the titles list attached to our license, and adding them manually to the ScienceDirect database in Serials Solutions.

Slide 6

Slide 6 text

The second problem
• Me: “I’ll deliver a Knowledge Base and Related Tools (KBART) file from Elsevier to Serials Solutions to resolve our ScienceDirect entitlements.”

Slide 7

Slide 7 text

KBART in Serials Solutions
KBART uploads are “on the roadmap” for Serials Solutions.
ProQuest endorsed the KBART standard in 2010.
Most of the KBART-to-ODSE (Offline Date and Status Editor) conversion is header changes…except dates.
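The header-change part of that conversion can be sketched outside OpenRefine. This is a minimal illustration, not the workflow from the talk: the KBART field names on the left come from the KBART recommended practice, but the ODSE names on the right are placeholders I made up for the example, so copy the real headers from your own ODSE export before using anything like this. The function name `rename_headers` is also hypothetical.

```python
import csv
import io

# KBART field names (left) are standard. The ODSE names (right) are
# HYPOTHETICAL placeholders; replace them with the headers from a real
# ODSE export for your Serials Solutions profile.
HEADER_MAP = {
    "publication_title": "Title",
    "print_identifier": "ISSN",
    "online_identifier": "eISSN",
    "date_first_issue_online": "Custom Date From",
    "date_last_issue_online": "Custom Date To",
}


def rename_headers(kbart_text, header_map=HEADER_MAP):
    """Rename the header row of a tab-delimited KBART file.

    Columns missing from header_map keep their original name, and the
    data rows are copied through unchanged.
    """
    reader = csv.DictReader(io.StringIO(kbart_text), delimiter="\t")
    original = list(reader.fieldnames or [])
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([header_map.get(name, name) for name in original])
    for row in reader:
        writer.writerow([row[name] for name in original])
    return out.getvalue()
```

Dates are deliberately left alone here, since they need more than a rename.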

Slide 8

Slide 8 text

KBART in OpenRefine
• Pick an identifier for the cell.cross function or use the VIB-Bits extension.
• Dates are by far the biggest issue, because Serials Solutions rejects KBART-formatted dates.
• value.toString()
• value.replace('Jan ', '01/').replace('Feb ', '02/')….
• Change headers to match ODSE.
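For anyone who would rather script the date and identifier steps, here is a rough Python sketch of the same ideas. It is not the OpenRefine recipe from the talk. It assumes the input dates are in KBART's ISO 8601 form (YYYY, YYYY-MM, or YYYY-MM-DD) and that the target is MM/DD/YYYY, which is my reading of the '01/' replacements above rather than a confirmed Serials Solutions requirement. Padding a missing month or day with 01 is also an assumption and may not suit end dates. The function names are hypothetical.

```python
import re

# KBART dates are ISO 8601: YYYY, YYYY-MM, or YYYY-MM-DD.
_KBART_DATE = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def kbart_to_slash_date(value):
    """Convert a KBART date to MM/DD/YYYY (ASSUMED target format).

    A missing month or day is padded with 01 (an assumption). An empty
    value, which KBART uses for open-ended coverage, is returned as "".
    """
    value = value.strip()
    if not value:
        return ""
    match = _KBART_DATE.match(value)
    if match is None:
        raise ValueError(f"not a KBART date: {value!r}")
    year, month, day = match.groups()
    return f"{month or '01'}/{day or '01'}/{year}"


def cross_by_identifier(kbart_rows, license_rows, key="online_identifier"):
    """Keep the KBART rows whose identifier appears in the license list.

    This is roughly the lookup a cell.cross expression performs between
    two OpenRefine projects, written as a plain set membership test.
    Rows are dicts, e.g. as produced by csv.DictReader.
    """
    licensed = {(row.get(key) or "").strip() for row in license_rows}
    licensed.discard("")  # blank identifiers should never match
    return [row for row in kbart_rows
            if (row.get(key) or "").strip() in licensed]
```

For example, `kbart_to_slash_date("2005-03")` gives `"03/01/2005"`, and anything that is not a KBART-style date raises `ValueError` so bad rows surface instead of loading silently.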

Slide 9

Slide 9 text

Future prevention
• Writing documentation as case studies, instead of “how to.”
• Surfacing complexities of eResources work whenever possible.
• Fully transparent work log.
• Eventually: Python script.

Slide 10

Slide 10 text

Thank you!
• I can’t do an OpenRefine workshop in 15 minutes, but you can send me your questions at: [email protected]
• Full steps and JSON available at asgalvan.com following #alaac18
• Thanks to Cassie Schmitt for this post on dates in OpenRefine: https://icantiemyownshoes.wordpress.com/2014/04/24/clean-up-dates-and-openrefine/
• Thanks to LITA/PLA AvramCamp for supporting my attendance.