Bullcrap people believe about digitization and digital preservation

Dorothea Salo
September 20, 2019

For LIS 668 "Digital Curation and Collections"


  1. Nonsense people believe about digitization and digital preservation

  2. Nitwitted cluelessness about processes and costs is NOT OKAY, Wired.

  3. Here’s Brazil’s reality.
    • The museum had been deeply underfunded for years.
    • It did not have the money to keep its fire equipment working. That’s why the fire was so devastating.
    • So tell me, Emily Dreyfuss of Wired: where the {censored} {censored} {CENSORED} is a museum that hasn’t even got the money for fire prevention supposed to find the money to digitize all its holdings?!
    • And even if it did, Emily Dreyfuss of Wired, do you have any idea how many decades it would take to digitize an entire national museum’s highly heterogeneous collections to acceptable quality?
    • (Rhetorical question. Of course she doesn’t; she’s a clueless nitwit. You will once you’re through this class, though.)
  4. “Digitization saves everything! Digitization roolz, conservation droolz. DIGITIZE ALL THE THINGS!”
    • Shut up, Emily Dreyfuss of Wired. Shut up, Daniel Caron. Just shut up.
    • I have no patience remaining for this. I don’t think the folks I teach (i.e. you) need to have any either.
  5. “Digitization means we don’t need analog processing!”
    • Subtext here is always, ALWAYS: “so we can fire all our archivists and librarians now, yay!”
    • Shut UP, Daniel Caron. Shut up forever!
    • Digitization requires analog processing. You can’t digitize without processing the analog stuff in some way!
    • Does it have to be “exactly the same way we process analog stuff now?” No. I’m seeing some interesting digitize-first experimental workflows in archives.
    • That said, processing for digitization is usually more time- and effort-intensive than processing for archival access!
    • Also, there is no Magic Backlog Processing Fairy! Not even if you digitize!
  6. “Digitization is cheap!”
    • Only true for high-speed single-size document scanning. Records managers, you may legitimately get to say this.
    • Academic libraries and archives also do this for low-use, low-artifact-value paper.
    • Everybody else: if you want the results to suck unbelievably and you’re okay with a lot of your materials getting damaged or destroyed, sure, cheap out.
    • Mind you, this can sometimes be a defensible decision! Low-artifact-value bound materials may be “guillotined” for high-speed scanning (mentioned above), for example. And “MPLP digitization” is a thing.
    • Especially laughable for A/V and film digitization, which is not cheap at aaaaaaaaaaaaall. (We’ll discuss why not later.)
    • High-fidelity 3D digitization (e.g. of realia) ain’t cheap either.
  7. “Digitization is the cheapest preservation method!”
    • Leaving aside preservation-by-digitization…
    • (basically, where digitization is the only feasible preservation method)
    • … usually not.
    • For many analog originals, basic conservation is immensely cheaper.
    • For low-use documents, microfilming is immensely cheaper (environmentally as well as economically) compared to ongoing digital-preservation costs.
    • This CAN be true, however, for paper records where the paper itself carries little or no value and the digitized version is fairly high-use. Hi, records managers!
    • Especially laughable for decently-stable film stock.
    • (Polyester, not nitrate or acetate.) Film digitization is WHOA EXPENSIVE.
    • This is usually someone who thinks there is a Magic Digitization Fairy. Getting a quote from a reputable vendor (or two or three!) tends to bring them back to earth fast.
  8. “Digitization is unskilled labor!”
    • It’s often treated that way, yes. Read the howls from humanists about scan and OCR quality in the Google Books project for where that ends.
    • (In Google’s defense, they never once pretended to be doing anything but fast-and-dirty scanning, and they’ve repeatedly redone OCR as they improve the software.)
    • You get what you pay for.
    • This is especially laughable for A/V and film digitization. I’ve been doing A/V for five years, and I still barely know what I’m doing with video; I have a lot more to learn.
    • Audio’s not too horribly bad. Video is really, seriously annoying to do right.
  9. “Digitization is information-lossless!”
    • Not when the digitization is done cheaply and crappily.
    • Not for realia (including scientific samples, material artifacts).
    • This should be obvious, but Emily Dreyfuss exists and Wired published her clueless crap about digitizing everything in Brazil’s museum, so.
    • It is sometimes possible to digitize in ways that increase the information gathered from a material object (e.g. digitally unrolling the Dead Sea Scrolls). That takes a lot of work, expense, and experimentation, though, and it's hardly the usual case!
    • Basically, any time the physicality/materiality of an object matters, a digitized version loses information.
    • Physicality/materiality doesn’t always matter, of course… but especially for museum and special-collections holdings, it often will.
  10. “Digitization costs = equipment and the people who use it!”
    • I tell you three times, and what I tell you three times is true:
    • There is no magic metadata fairy.
    • There is no magic metadata fairy.
    • THERE IS NO MAGIC METADATA FAIRY.
    • If you’re starting from scratch, metadata can cost more (time AND money) than digitization.
    • Where this is signally not true: A/V digitization, because audio and video digitize in real time. Metadata will typically be lots faster than digitization!
    • “Text encoding” (e.g. TEI) as a method of document digitization is also skilled (ergo expensive) work.
  11. “Digitized files won't take much storage space!”
    • This is typically someone who doesn’t understand the distinction between archival-master and access formats.
    • Which we’ll get to in a subsequent module, but the basic idea is: high-quality digitization into lossless file formats creates very large files.
    • (IT often knows this. “Someone” is probably a boss. Or Emily Dreyfuss.)
    • They also aren’t thinking about backup copies.
    • Do the back-of-envelope math for them.
    • I’ll be showing you file-size calculators.
    • IT is often an ally here: they can laugh uproariously so you don’t have to.
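Here is a rough sketch of that back-of-envelope math in Python. It is not one of the file-size calculators from the course, and every number in it is an assumption chosen for illustration: letter-size pages, 600 ppi, 24-bit color, uncompressed masters, and three copies in total. The function names are hypothetical too. Lossless compression will shrink real files somewhat, so treat the result as an upper-end estimate.

```python
def uncompressed_image_bytes(width_in, height_in, ppi, bits_per_channel=8, channels=3):
    """Approximate size in bytes of one uncompressed raster scan."""
    # Pixel dimensions follow from physical size times scanning resolution.
    pixels = round(width_in * ppi) * round(height_in * ppi)
    return pixels * channels * bits_per_channel // 8


def collection_bytes(items, bytes_per_item, copies=3):
    """Total storage for a collection, counting every copy kept (master plus backups)."""
    return items * bytes_per_item * copies


def human(n_bytes):
    """Format a byte count using decimal (power-of-1000) units."""
    for unit in ("B", "KB", "MB", "GB", "TB", "PB"):
        if n_bytes < 1000 or unit == "PB":
            return f"{n_bytes:.1f} {unit}"
        n_bytes /= 1000


# Assumed scenario: 100,000 letter-size pages at 600 ppi, 24-bit color, 3 copies.
page = uncompressed_image_bytes(8.5, 11, 600)
total = collection_bytes(100_000, page, copies=3)
print(human(page))   # 101.0 MB per page
print(human(total))  # 30.3 TB for the collection
```

Under these assumptions a single page master is about 101 MB, and a modest 100,000-page project needs roughly 30 TB once copies are counted. Swap in your own dimensions, resolution, bit depth, and copy count before showing the number to anyone.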
  12. “Digital preservation is just backups!”
    • We’ll explore this particular fallacy in greater technical detail later in the course.
    • For now: NO. JUST NO. Only if you like losing some or all of your stuff.
    • Not that backups aren’t important; they are! There’s just more to it than what IT typically means by “backups.”
  13. None
  14. Why did I tell you all this? Why now?
    • Because these fallacies often cause people (even well-meaning people more clueful than Emily Dreyfuss) to under-resource digitization and digital preservation.
    • Or to resource digitization without considering digital preservation.
    • In your readings: me calling out a proposal for a Wisconsin-wide digitization service on exactly this (among other things). I wasn’t nice about it, but I am TIRED of being nice about this nonsense.
    • Bluntly: if your bosses hold these fallacies unchallenged, you will not get what you need to do your work right.
    • That's a real sustainability issue, and this is our sustainability module, so.
  15. I’ll be posting this lecture to my Speakerdeck. Feel free to show it to people and blame me for it. Emily Dreyfuss and Wired can bite my shiny metal butt.
  16. Thanks!
    • Copyright 2019 by Dorothea Salo.
    • This lecture and slide deck are licensed under a Creative Commons Attribution 4.0 International License.