
Bullcrap people believe about digitization and digital preservation

Dorothea Salo
September 20, 2019


For LIS 668 "Digital Curation and Collections"




  1. Nonsense people believe
    about digitization
    and digital preservation


  2. Nitwitted cluelessness
    about processes and costs
    is NOT OKAY, Wired.


  3. Here’s Brazil’s reality.
    • The museum had been deeply underfunded for years.

    • It did not have the money to keep its fire equipment
    working. That’s why the fire was so devastating.

    • So tell me, Emily Dreyfuss of Wired: where the {censored}
    {censored} {CENSORED} is a museum that hasn’t even got
    the money for fire prevention supposed to find the money
    to digitize all its holdings?!

    • And even if it did, Emily Dreyfuss of Wired, do you have
    any idea how many decades it would take to digitize an
    entire national museum’s highly heterogeneous
    collections to acceptable quality?

    • (Rhetorical question. Of course she doesn’t; she’s a clueless nitwit. You will
    once you’re through this class, though.)


  4. “Digitization saves everything!
    Digitization roolz, conservation droolz!”
    • Shut up, Emily Dreyfuss of Wired. Shut up, Daniel Caron.
    Just shut up.

    • I have no patience remaining for this. I don’t think
    the folks I teach (i.e. you) need to have any either.


  5. “Digitization means we don’t
    need analog processing!”
    • Subtext here is always, ALWAYS: “so we can fire all our
    archivists and librarians now, yay!”

    • Shut UP, Daniel Caron. Shut up forever!

    • Digitization requires analog processing. You can’t digitize
    without processing the analog stuff in some way!

    • Does it have to be “exactly the same way we process analog stuff now?” No.
    I’m seeing some interesting digitize-first experimental workflows in archives.
    • That said, processing for digitization is usually more time- and effort-
    intensive than processing for archival access!
    • Also, there is no Magic Backlog Processing Fairy! Not even
    if you digitize!


  6. “Digitization is cheap!”
    • Only true for high-speed single-size document scanning.
    Records managers, you may legitimately get to say this.

    • Academic libraries and archives also do this for low-use, low-artifact-value paper.
    • Everybody else: if you want the results to suck
    unbelievably and you’re okay with a lot of your materials
    getting damaged or destroyed, sure, cheap out.

    • Mind you, this can sometimes be a defensible decision! Low-artifact-value
    bound materials may be “guillotined” for high-speed scanning (mentioned
    above), for example. And “MPLP digitization” is a thing.
    • Especially laughable for A/V and film digitization, which is
    not cheap at aaaaaaaaaaaaall. (We’ll discuss why not later.)

    • High-fidelity 3D digitization (e.g. of realia) ain’t cheap either.


  7. “Digitization is the cheapest
    preservation method!”
    • Leaving aside preservation-by-digitization…

    • (basically, where digitization is the only feasible preservation method)
    • … usually not.

    • For many analog originals, basic conservation is immensely cheaper.
    • For low-use documents, microfilming is immensely cheaper (environmentally as
    well as economically) compared to ongoing digital-preservation costs.
    • This CAN be true, however, for paper records where the paper itself carries little or
    no value and the digitized version is fairly high-use. Hi, records managers!
    • Especially laughable for decently-stable film stock.

    • (Polyester, not nitrate or acetate.) Film digitization is WHOA EXPENSIVE.
    • This is usually someone who thinks there is a Magic
    Digitization Fairy. Getting a quote from a reputable vendor
    (or two or three!) tends to bring them back to earth fast.


  8. “Digitization is unskilled labor!”
    • It’s often treated that way, yes. Read the howls from
    humanists about scan and OCR quality in the Google Books
    project for where that ends.

    • (In Google’s defense, they never once pretended to be doing anything but
    fast-and-dirty scanning, and they’ve repeatedly redone OCR as they improve
    the software.)
    • You get what you pay for.

    • This is especially laughable for A/V and film digitization.
    I’ve been doing A/V for five years, and I barely know what
    I’m doing with video; I have a lot more to learn.

    • Audio’s not too horribly bad. Video is really, seriously annoying to do right.


  9. “Digitization is
    • Not when the digitization is done cheaply and crappily.

    • Not for realia (including scientific samples, material

    • This should be obvious, but Emily Dreyfuss exists and Wired published her
    clueless crap about digitizing everything in Brazil’s museum, so.
    • It is sometimes possible to digitize in ways that increase the information
    gathered from a material object (e.g. digitally unrolling the Dead Sea Scrolls).
    That takes a lot of work, expense, and experimentation, though, and it's
    hardly the usual case!
    • Basically, any time the physicality/materiality of an
    object matters, a digitized version loses information.

    • Physicality/materiality doesn’t always matter, of course… but especially for
    museum and special-collections holdings, it often will.


  10. “Digitization costs = equipment
    and the people who use it!”
    • I tell you three times, and what I tell you three times is true:

    • There is no magic metadata fairy.
    • There is no magic metadata fairy.
    • There is no magic metadata fairy.
    • If you’re starting from scratch, metadata can cost more (time
    AND money) than digitization.

    • Where this is signally not true: A/V digitization, because audio and video
    digitize in real time. Metadata will typically be lots faster than digitization!
    • “Text encoding” (e.g. TEI) as a method of document
    digitization is also skilled (ergo expensive) work.


  11. “Digitized files won't take
    much storage space!”
    • This is typically someone who doesn’t understand the
    distinction between archival-quality and access formats.

    • Which we’ll get to in a subsequent module, but the basic idea is: high-
    quality digitization into lossless file formats creates very large files.
    • (IT often knows this. “Someone” is probably a boss. Or Emily Dreyfuss.)
    • They also aren’t thinking about backup copies.

    • Do the back-of-envelope math for them.

    • I’ll be showing you file-size calculators.
    • IT is often an ally here — they can laugh uproariously so you
    don’t have to.
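
The back-of-envelope math for uncompressed scans can be sketched in a few lines of Python. This is a rough illustration, not one of the file-size calculators shown later in the course. Every input is a hypothetical assumption chosen for the example: letter-size pages, 400 ppi, 24-bit color, 100,000 pages, and three copies. Real archival formats add header overhead, and lossless compression may shrink files somewhat.

```python
def uncompressed_image_bytes(width_in, height_in, ppi, bits_per_pixel):
    """Size of one uncompressed raster image, ignoring file-format overhead."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


def collection_bytes(page_count, bytes_per_page, copies):
    """Total storage for every page, multiplied by the number of copies kept."""
    return page_count * bytes_per_page * copies


# Hypothetical inputs (assumptions, not figures from the lecture):
# letter-size pages, 400 ppi, 24-bit color, 100,000 pages,
# three copies (one working copy plus two backups).
per_page = uncompressed_image_bytes(8.5, 11, 400, 24)
total = collection_bytes(100_000, per_page, 3)

print(f"{per_page / 1e6:.1f} MB per page")         # 44.9 MB per page
print(f"{total / 1e12:.2f} TB for the collection")  # 13.46 TB for the collection
```

Swap in your own page dimensions, resolution, bit depth, and copy count. The point is that the total grows with the square of the resolution and linearly with the number of copies.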


  12. “Digital preservation is
    just backups!”
    • We’ll explore this particular fallacy in greater technical
    detail later in the course.

    • For now: NO. JUST NO. Only if you like losing some or all
    of your stuff.

    • Not that backups aren’t important — they are! There’s just more to it than what
    IT typically means by “backups.”



  14. Why did I tell you all this?
    Why now?
    • Because these fallacies often cause people — even well-
    meaning people more clueful than Emily Dreyfuss — to
    under-resource digitization and digital preservation.

    • Or to resource digitization without considering digital preservation.
    • In your readings: me calling out a proposal for a Wisconsin-wide digitization
    service on exactly this (among other things). I wasn’t nice about it, but I am
    TIRED of being nice about this nonsense.
    • Bluntly: if your bosses hold these fallacies unchallenged,
    you will not get what you need to do your work right.

    • That's a real sustainability issue, and this is our
    sustainability module, so.


  15. I’ll be posting this lecture
    to my Speakerdeck.
    Feel free to show it to people
    and blame me for it.
    Emily Dreyfuss and Wired
    can bite my shiny metal butt.


  16. Thanks!
    • Copyright 2019 by Dorothea Salo.

    • This lecture and slide deck are licensed under a Creative
    Commons Attribution 4.0 International License.
