an amalgam of ✦ course slidedecks ✦ earlier talks ✦ a forthcoming article ✦ ongoing research (datadoubles.org, and for clarity, I do not represent this project or its other investigators today) ✦ It won’t hold together as well as I like my talks to do. It certainly doesn’t have pretty slides! ✦ I’m sorry. I ask for and appreciate your patience. ✦ Silver lining: I don’t mind tangents! They can’t interrupt a ﬂow that doesn’t exist! So ask all the questions you like whenever you like.
a learner in my Information Security and Privacy course. I was originally asked to catalog privacy dangers and demonstrate threat models. ✦ I don’t want to do that right now, though. I’m raw and tired, and I know I’m not the only one. ✦ Recommended, if you want this: Morrone et al.’s https://dataprivacyproject.org/learning-modules/risk-assessment/ ✦ So, instead, here’s my plan: ✦ Foundations: why privacy in libraries? ✦ Situation report: what are today’s threats to library privacy speciﬁcally? (spoiler: there are lots!) ✦ Blameless post-mortem: how did we let this happen? ✦ Testing a heuristic: “physical-equivalent privacy.” How can we think diﬀerently so that this stops happening?
a privacy-themed issue of Serials Review. I don’t know exactly when. ✦ I can’t make it open-access until publication. Honestly, I’m chewing my ﬁngernails about that. But as soon as it goes live, I’ll put my accepted manuscript in MINDS@UW. ✦ I also have no room to criticize the publication schedule, because I turned in my manuscript a month late! (Love you, SR editors!) ✦ But if you want a preview (beyond this talk)… ✦ … go look at the slides from my NASIG 2015 keynote, especially the slide about video surveillance, because that’s where the idea began. ✦ https://speakerdeck.com/dsalo/aint-nobodys-business-if-i-do-read-serials-with-notes
of personal data, and conﬁdentiality in the relationship between the user and library…” ✦ https://www.ifla.org/publications/node/10056 and it’s excellent, the best and most situationally-aware document libraries have ✦ ALA: “We protect each library user's right to privacy and conﬁdentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.” ✦ ACRL: “The privacy of library users is and must be inviolable.”
deontological. Here Are Your Principles, Go Forth And Observe Them. ✦ Fine as far as it goes… but doesn’t explain why why WHY these are the principles! ✦ Much less how to operationalize them. (Which, fair: operationalization changes constantly, but ethics codes shouldn’t.) ✦ Or what to do when principles collide. Which principle wins? ✦ I mention this because in my estimation, privacy has been taking a back seat to several other principles lately. I don’t approve. ✦ Allows empty lip service.
✦ Steve Witt. “The Evolution of Privacy within the American Library Association, 1906–2002.” Library Trends 65:4 (2017). ✦ My next ﬁve slides derive entirely from this piece. ✦ Turns out to be pragmatic consequentialism: without privacy, patrons got in trouble… and so did libraries. ✦ 1906: Immigrant Henry Melnek, suspected of anarchism, arrested. Chief Librarian helped with the arrest, even testiﬁed against Melnek in court, disclosing his library information habits! ✦ Russian czarist agents were also involved (weird echoes today, right?). And a newspaper called libraries “schools of anarchism” for having anarchist materials available. Criticism of libraries went on for years!
ﬁles it has a valuable selected list of names and addresses which may be of service in various ways either as a MAILING-LIST or as a DIRECTORY. ✦ “Probably there are no two opinions regarding the impropriety of allowing the list to be used for COMMERCIAL PURPOSES along either line. ✦ (Me, today: … really? I wish there weren’t!) ✦ “The use as a directory may occasionally be legitimate and is allowable after investigation and report to someone in authority. ✦ (Me, today: really? when? what investigation? which authorities?)
library registration lists ✦ by the police, to ﬁnd a fugitive from justice; ✦ by private detectives, ostensibly on the same errand; ✦ by a wife, looking for her runaway husband; ✦ by persons searching for lost relatives; ✦ and by creditors on the trail of debtors in hiding. ✦ (Take a moment. How many of these scenarios matter today? Which do you trust? Not trust?) ✦ (Deﬁnitely notice Bostwick’s “ostensibly.” Today I’d extend this to the other points too! People and organizations LIE OFTEN and CHANGE THEIR STORIES about why they want data and how they use it!)
obedience to an order of court, it is not only unjust, but ENTIRELY INEXPEDIENT from the library’s standpoint to betray to anyone a user’s whereabouts against that user's wishes or even where there is a mere possibility of his objection. ✦ (Me, today: just whereabouts? much more is knowable!) ✦ “If it were clearly understood that such consequences might follow the holding of a library card, we should doubtless LOSE MANY READERS that we especially desire to attract and hold.” ✦ (Me, today: Is this still true? I believe it is, but I don’t have an all-encompassing answer. That’s part of why I signed on to Data Doubles.)
Great Depression, and librarian labor was suﬀering. ✦ Response: demonstrate that not just anybody could be a librarian! ✦ This gets deeper into questions of how professions work than I want to get, fascinating though I ﬁnd labor history. ✦ But ethics codes were deﬁnitely a step toward professioning up. ✦ I mention this because protecting one another as workers is also deeply salient today! Can we use privacy as something that sets us apart? ✦ To do so, we’d have to be actively protecting it, of course! It won’t help us to trumpet promises we aren’t keeping.
drafting the Code was not a one-and-done thing. Editing by committee! ✦ Privacy and conﬁdentiality all but disappeared from some drafts. ✦ There was debate within the profession over privacy! Many librarians believed turning in anarchists was the right thing to do, for example. ✦ Relevant to today? Yes, absolutely. ✦ Privacy versus security ✦ Privacy versus “customer relationship management” ✦ Privacy versus assessment and analytics ✦ Privacy versus improved (?) service ✦ I’m hardcore about this: PRIVACY SHOULD WIN, hands down and without question. But not every librarian today is me!
say that. (this will be very incomplete; see also the work of e.g. DLF Privacy and Ethics in Technology group, Digital Shred, Alison Macrina/Library Freedom Project, Yasmeen Shorish, Sarah Lamdan, Scott Young, Heather Shipman, Melissa Morrone, Kyle M. L. Jones and collaborators, and so many, many more)
all library websites and services over HTTPS, not HTTP. ✦ Prefer wired to wiﬁ access on in-library patron and staﬀ machines. Secure wiﬁ as best we can. ✦ What have we done about this? ✦ Breeding 2018: 7.9% of academic libraries and 18.3% of public libraries serve HTTP websites, not HTTPS. WE ARE BEHIND. ✦ Wiﬁ protection in libraries: no systematic investigation I know of, so we don’t know much, but I’m not sanguine. ✦ (It doesn’t help that wiﬁ protocols leak privacy like sieves presently. This will change, but not as quickly as I’d like.)
our in-library browsers away from Google toward DuckDuckGo or Qwant or searX. ✦ Stop using other Google services, especially YouTube and Google Analytics (use Matomo or another privacy-aware alternative instead). ✦ Dump Facebook. (At least stop advertising it!) ✦ Educate and advocate. ✦ What are we doing about this? Nothing.
tracker-blockers in browsers on in-library machines. ✦ Refuse facial recognition and other biometrics outright. ✦ Academic libraries: refuse ID-card tracking outright. ✦ Refuse the Internet of Things outright. It’s not secure! It’s not private! ✦ Educate and advocate. ✦ What are we doing about this? Nothing. ✦ (with the exception of a few — too few! — advocates and educators)
# library databases accessed ✦ # academic journals accessed ✦ Appointments with peer tutors ✦ Chat reference transactions ✦ Interlibrary loan transactions ✦ One LA (learning analytics) project, identified (!) data on all undergraduates: # of classes attended with library instruction
very, very clear about what “conﬁdential” means. I see too many librarians extending it past all sense: “patron data are still conﬁdential because I decided they could have it!” for many values of “they.” ✦ (Several privacy interpretations of library ethics codes fall into this trap. I’d like to see that ﬁxed. Simple heuristic, for starters: if the data’s seeing use outside the library, IT AIN’T CONFIDENTIAL!) ✦ Train our people better. All our people. It’s not enough for me to yell at my students (though I do!). Not all library employees have ALA-accredited degrees, and “not having the degree” is no excuse for this. ✦ Stop letting unethical patron-data use in research, both internal and for publication, slide by. ✦ Refuse to add patron data to campus or municipal data warehouses. ✦ What are we doing about this? Not half enough.
License terms. Model licenses, model license language. ✦ Stop letting NISO write these! Stop letting NISO say it speaks for libraries! NISO is not a library organization; it is also underwritten by vendors. This is an inherent, structural conﬂict of interest. ✦ Audit vendors. They have to do accessibility VPATs; why don’t we have a privacy analogue to VPATs? ✦ Educate and advocate. ✦ What are we doing? Nothing.
obviously biased) ✦ Me: *gives keynote at MnLA Annual 2019* ✦ Me: *brings evidence of poor privacy practices in speciﬁc libraries/consortia in Minnesota* ✦ Keynote: *goes over like lead balloon* (they can’t all be winners) ✦ WiLS: “Hey, Dorothea, favor? Would you give this talk as a webinar for us?” Me: “Sure.” ✦ A Minnesota librarian: “Hey, WiLS, Dorothea brought evidence! It was awful!” ✦ WiLS: “Hey, Dorothea… no evidence from speciﬁc libraries/consortia in your webinar, plzkthx.” ✦ Me: “I withdraw the webinar.” ✦ Me: *posts slides to SpeakerDeck anyway, because why not*
literal, actual DECADES to ﬁgure out privacy around physical libraries and materials. ✦ We’re not even done ﬁguring it out yet! Though we have a (curiously implicit, often) shared understanding of best practices. ✦ No surprise we haven’t ﬁgured it out for online yet. It’s a lot to get our heads around! ✦ That said, I could wish we’d put a lot more eﬀort toward it, as a profession… but that’s water under the bridge. ✦ I have an idea about how to make it more tractable. Hold that thought; I’ll get to it.
“Dark [design] patterns:” underlie a lot of privacy dangers, online and oﬀ-, in and outside libraries. ✦ Intentionally misleading/deceptive/untransparent design choices ✦ Secrecy and outright lies from Big Tech ✦ Secrecy and outright lies from Big Data pushers ✦ Secrecy and outright lies from Big Content ✦ among whom I count many library content and service vendors ✦ Secrecy and outright lies from government agencies ✦ It’s a complicated environment! Transparency would sure help!
Becky Yoose, in addition to folks I’ve previously mentioned. ✦ LDH Consulting Services: https://ldhconsultingservices.com/ ✦ I’m trying. So are Alison Macrina, Digital Shred, Melissa Morrone, ALA OIF/Erin Berman, DLF… ✦ But the intersection of privacy, technology, and libraries is hideously complicated. “Expert” is a legitimately hard place to reach! ✦ I’m not sure I’m there, and I both research and teach this stuﬀ! ✦ I do know I can’t get somebody there in the fourteen weeks of a three-credit no-tech-prereqs course. Don’t come at me with “it’s all LIS education’s fault!” You will not like my answer.
feel this especially hard as an educator right now. The situation with pandemic exam proctoring is just appalling. ✦ All praise to Z Smith Reynolds Library at Wake Forest University! ✦ Real thing I heard from a real librarian once about patron-data analytics: “Finally I can speak to my administrators in language they understand!” ✦ The environments libraries exist in do not usually share or even understand library ethics! ✦ The people and services libraries rely on (IT, vendors, standards bodies) do not usually share or even understand library ethics!
they come from a place of (real, justiﬁed) fear. ✦ We are afraid of being disintermediated, erased and made invisible… and let’s be blunt: ﬁred. ✦ We’re grasping at anything and everything to prevent that… and surveillance / data analysis is hot right now. ✦ This is one place where a clash of deontological principles turns up. ✦ Accountability is also a principle we believe in! What happens when that appears to mean compromising on privacy?
that can be a trap. ✦ Deontological principle clash, again! ✦ (with an apologetic nod to Scott Young, who points out that “service” is not actually an ethical principle, but a practice) ✦ If we posit that surveilling patron behavior and analyzing patron data are the best/only ways to learn how to serve them… how do we decide not to do that? ✦ Now, that’s a really big “if” there — I don’t actually believe it for an instant! The evidence base for service interventions based on surveillance and Big Data is absolutely ABYSMAL. ✦ But that still leaves “if it DOES work, does that mean we should?”
✦ very, very “about us without us” (RA21: zero librarians until the comment stage. Seamless Access: tokenized librarians) ✦ very, very dangerous (to more than privacy!) ✦ some very, very untrustworthy people and organizations involved ✦ the Sci-Hub wars ✦ I do not like what I see out of this SNSI thing. ✦ CRM: OrangeBoy, OCLC WISE, Gale Analytics… ✦ Open access —> patron data exploitation ✦ Sam Popowich has a devastating piece on this. Recommended. ✦ https://journals.library.ualberta.ca/jcie/index.php/JCIE/article/view/29410
out-of-sight, out-of-mind… unlike (most) physical privacy dangers. ✦ Libraries have fairly solid best practices around the privacy of using information in physical carriers. ✦ I’m not claiming perfection! I’m claiming thought and procedure. ✦ So… maybe it makes sense to ﬁgure out what the physical analogue to online patron-data capture/storage/use looks like? ✦ To make it easier to evaluate whether we’re okay with it?
considered PHYSICAL-EQUIVALENT only when a patron using an information-equivalent physical resource would enjoy no more privacy than the same patron using the e-resource. ✦ (The distinction is really online/oﬄine, not physical/digital. I know this, okay? I wanted the alliteration. Nitpickers step oﬀ, please.)
waterproof, or free of weird edge cases. It’s not! ✦ That’s okay, though. I’m not trying for that! ✦ In my Twitter bio: “Ethicists are scalpels. I am a buster sword.” ✦ I’m trying for a quick-and-dirty thought process (based on long-standing, time-tested practices) that librarians can use as a handy yardstick. ✦ Term of art for this, from psychology and neuroscience: “HEURISTIC.”
data is captured/stored/analyzed/used/shared/sold around a given online information use. ✦ This is deﬁnitely the hard part, not least because of all the secrecy and lies around it. ✦ I suggest methods in my forthcoming article, but for today’s exercises I’ll just be giving you this up-front! ✦ Step 2: What would have to happen for this amount of data to be captured (etc.) about a patron using an analogous physical object? ✦ Step 3: Is that scenario okay? If not, the analogous online scenario probably isn’t either.
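One rough way to make Steps 2 and 3 mechanical is to treat each scenario as a set of captured data elements and look at what the online scenario captures beyond the physical one. This is only a sketch of the comparison, not the method from the forthcoming article; the function names and the data-element labels are assumptions invented for illustration.

```python
def extra_capture(online, physical):
    """Data elements the online scenario captures beyond the physical one."""
    return set(online) - set(physical)


def is_physical_equivalent(online, physical):
    """True when the online scenario captures nothing the physical one does not."""
    return not extra_capture(online, physical)


# Illustrative labels only (assumptions, not an audited inventory).
physical_loan = {"patron id", "item borrowed"}
online_ebook = {"patron id", "item borrowed", "device id", "pages viewed"}

# Whatever lands in `extras` is what Step 3 asks you to judge.
extras = extra_capture(online_ebook, physical_loan)
```

A set difference obviously cannot capture retention periods, sharing, or sale; it only flags which elements need the "is that okay?" conversation.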
✦ Adobe 2014 ✦ University of Minnesota learning analytics ✦ Which I called all the way out in the aforementioned keynote. ✦ Was I right? Was I wrong? You make the call. ✦ (I’ve been wrong before. I think I’m also Data Doubles’s biggest privacy hawk; even my co-investigators don’t always agree with me!)
Wireshark) on the same local network: ✦ Full content of all OPAC pages browsed, including search-results pages and individual-item pages ✦ All URLs browsed (for securely-served OPACs a sniﬀer still sees the hostname, and full URLs still leak via browser history, server logs, and Referer headers! it makes me rethink OPAC item permalinks…) ✦ All search terms entered into search forms (or in URL query strings, which frankly no library web tool should be using in 2020) ✦ All items requested via holds, delivery, or save-this-for-later features ✦ Easily traceable to the device being used (including devices belonging to and used by only one patron, like a phone). ✦ Okay. Capture this amount of info about a patron browsing the card catalog and library shelves. Go!
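To see why search terms in URL query strings are a problem, here is a minimal Python sketch of what anyone holding the URL can read back out of it. On an OPAC served over plain HTTP, that includes a sniffer on the same network. The OPAC hostname and parameter names are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def terms_visible_in_url(url):
    """Return the query parameters readable by anyone who sees this URL."""
    query = urlsplit(url).query
    # parse_qs decodes percent-escapes and '+' back into readable text.
    return parse_qs(query)


# Hypothetical OPAC search URL, for illustration only.
example_url = "http://opac.example.org/search?q=bankruptcy+law&type=keyword"
visible = terms_visible_in_url(example_url)
```

No special tooling is needed beyond the standard library, which is rather the point.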
for library ebooks. ✦ In 2014, caught sending the following user information across the Internet, sniﬀ-vulnerable: ✦ user and device identiﬁers ✦ each ebook accessed ✦ length of time spent reading the ebook ✦ percentage of ebook read ✦ exact pages viewed ✦ Capture this information about a patron reading a physical book. Leak the info equally broadly. ✦ Wherever the patron does the reading! In-library or out of it!
Editions and Adobe servers. ✦ No more sniﬃng! ✦ That’s it. ✦ As far as we know, they’re still collecting the data. ✦ We still don’t know what they did or are doing with it. ✦ Did I mention that Adobe is a major data broker? ✦ And an Adobe partner/subsidiary (Mobilewalla) published a report geolocating and tracking George Floyd protesters?
data points I had up earlier? It was from… ✦ UMinnesota’s library learning analytics project. ✦ I based the list on their public publications! No inside intel! ✦ They did not notify students. There was no opt-out, much less actual informed consent. ✦ The library-use data was combined with identiﬁed demographic, GPA, transcript, and other university data. ✦ And in C&RL, some of the published statistics are for very low-n populations, raising the chances of individual reidentiﬁcation. (I’m pretty sure I could do it, and I’m not experienced at reidentiﬁcation.) ✦ C&RL was told of this and chose to do nothing. NOT OKAY, C&RL.
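One common safeguard against the low-n problem is small-cell suppression: withhold any published count below a minimum group size. The sketch below assumes a threshold of 10, which is an illustrative choice and not something the project or C&RL specified; the group labels and counts are made up.

```python
def suppress_small_cells(counts, min_n=10):
    """Replace counts below min_n with None so they are not published."""
    return {group: (n if n >= min_n else None) for group, n in counts.items()}


# Made-up counts for illustration only.
example = {"group A": 412, "group B": 7, "group C": 58}
published = suppress_small_cells(example)
```

Suppression alone is not a guarantee against reidentification (totals and cross-tabulations can still give small groups away), but it is a cheap first check before statistics go to print.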