hardware is HORRENDOUS. •The load of disposing of it is even worse. •Not least because wealthy developed countries dump technotrash on poorer countries. It’s just gross. •Feel guilty when you buy gadgets, please! •“Replacement cycles” should be as long as practical. •Try to buy fixable stuff. (It’s hard! Check out iFixit, though.) •Other things to consider: •K-12/youth folks: Kids break, lose, and get bored with stuff. Buy carefully. Try before you buy.
•Happens also with web-based services, sometimes •It can make sense, but be aware: •Your organization is now subject to whatever security risks device owners introduce. •This sometimes means that orgs will demand the ability to know what’s on your device (privacy? what privacy?), and even remote-wipe it. •On balance, I don’t think BYOD is a good idea. •If work wants you available by mobile, work should buy it.
in the day: you bought a server, racked it, cooled it, installed everything on it… •It can be more effective (and environmentally friendly due to economies of scale) not to do this. Buy server capacity, not the actual server. •I’ll talk more about Software as a Service in a bit, but I do want to mention some pitfalls to watch out for: • SECURITY. SECURITY SECURITY SECURITY. I don’t even bookmark “somebody had their business database hanging in the breeze on an Amazon server” stories. There are too many! • Control. • Terms of Service policies (along several axes, but privacy is an important one)
management, acquisitions, etc. •“OPAC” means “Online Public Access Catalog.” Which sounds quaint these days—what other kind of catalog would there be? But you’ll still hear people say it, so. •“Discovery layer:” search tool that bridges the catalog and article-level e-resource databases •ERM(S): E-Resource Management (System) •License handling, catalog-record handling, usage reports, database lists… •“Link resolver:” given a citation/link, get a patron to full text. •“Proxy server:” for off-campus access to e-resources •The vendor needs to know that the patron is entitled to access!
for these that I’m aware of! Archivematica is one, though, and so is ArchivesSpace. •Helps with accessioning, finding-aid construction, patron search and browse, sometimes MARC record construction (where that’s needed) •“Collections management” software •What it sounds like. Keep track of your (physical) stuff!
That Goes On The Website •But can be expanded into e.g. document management, policies/procedures management, knowledgebases, etc. •“Digital Asset Management System” (DAMS) •Grew out of advertising agencies, but becoming common in a lot of industries that produce a lot of Digital Stuff that needs to be Managed in some way •I mean, the alternative is “people leave stuff lying around on their hard drives and random servers and Dropbox and…” •Digital collections •Usually for showing digitized (sometimes born-digital) content online. •Preservation system •What it says on the box! Designed for ensuring that digital stuff persists into the future. •WARNING: CMSes, DAMS, and digital collections systems are NOT PRESERVATION SYSTEMS, okay?
a huge challenge! •Some systems I’m seeing: •Preservation systems, naturally, but with the added wrinkle that stuff must be deleted on schedule •Email management •“E-discovery:” given all the places information relevant to a given Legal Situation may be lurking, find it! •Automated(ish) e-records(ish) transfer(ish) systems •A whole lot of scrambling, ad-hoc roll-your-own “systems”
end-user is using (desktop, laptop, server, tablet, phone, etc.) •Firewalls, anti-malware, other defensive tools •“Security Information and Event Management” systems (SIEM) •Gather up All The Logs from All The Endpoints (and the network) in one place
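To make "gather up All The Logs in one place" a little more concrete, here is a minimal sketch in Python. The endpoint names and log entries are invented for illustration, and a real SIEM does far more (parsing, correlation, alerting); this only shows the basic move of merging per-source logs into one timeline.

```python
import heapq
from datetime import datetime

# Hypothetical per-source logs: (ISO timestamp, message), each already in
# time order, as log files usually are.
endpoint_logs = {
    "laptop-17": [
        ("2024-03-01T09:00:05", "login ok: alice"),
        ("2024-03-01T09:14:40", "anti-malware: quarantined file"),
    ],
    "firewall": [
        ("2024-03-01T09:00:01", "blocked inbound connection"),
        ("2024-03-01T09:14:38", "outbound traffic spike"),
    ],
}

def tagged(source, entries):
    """Yield (datetime, source, message) so entries sort by time first."""
    for stamp, message in entries:
        yield (datetime.fromisoformat(stamp), source, message)

def merge_logs(logs_by_source):
    """Merge already-sorted per-source logs into one chronological list."""
    streams = [tagged(src, entries) for src, entries in logs_by_source.items()]
    return list(heapq.merge(*streams))

timeline = merge_logs(endpoint_logs)
for when, source, message in timeline:
    print(when.isoformat(), source, message)
```

Seen side by side, the firewall's traffic spike and the laptop's quarantine event land two seconds apart, which is the kind of connection nobody notices when the logs live on separate machines.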
have to say this, but I have heard it from high-level academic-library administrators who should know better. •I’ve also heard vendors say it. They shouldn’t. •No software “runs itself.” No human process involving software “runs itself.” This is not a thing! Please don’t expect it!
•What does the software need to do? (ASK PEOPLE who interact with the software about this. Don’t just assume.) Don’t forget legal issues such as user-interface accessibility. •What is the maximum load on the software likely to be? (So you can ask whether it will handle that.) •What support and training will your organization need? What is the maximum staff time you can give to installing and supporting this software? •Prioritization: which features are dealbreakers, and which are just nice-to-have •Smart to do: ask around (quietly) for other people’s and agencies’ experiences •Off the record. We’re usually too nice to go on the record with negative experiences.
containing your requirements. •Lets software companies “bid” to serve your needs. •Note that the RFP process heavily disadvantages open-source software, because it assumes every software package has a vendor behind it. This is often a bad thing. •If you don’t have to go through the whole RFP rigmarole, you evaluate software yourself based on the requirements you discovered. •DO NOT evaluate or adopt software without figuring out your requirements first! Doing this is the NUMBER ONE reason we end up with bad software. •Be ESPECIALLY careful on conference exhibit floors!
forced… by RFPs. •Real-world case: NCSU and Endeca • Several innovations: record deduping, LCSH drilldown, relevance ranking of results (!), faceted browsing • But Endeca was not a library vendor! Why not? NCSU couldn’t get library vendors to do what they wanted! • But after everybody oohed and aahed over the Endeca catalog, RFPs started to include Endeca-ish features, so ILS/discovery-layer vendors had to build them in order to be competitive. •Today, ILS vendors: “We’ll do linked data when our customers ask us to.” That means RFPs. •THE RFP PROCESS IS NEARLY THE ONLY TIME YOU HAVE POWER OVER VENDORS. • Use that power wisely and well, okay?
refers to SOFTWARE •Open standard: refers to RULES/SPECIFICATIONS for protocols, file formats, software, etc. •“Reference implementation:” software that shows how software that complies with a particular standard should work •Open access: refers to SCHOLARLY LITERATURE •Please don’t confuse these. Thanks. •Yeah, yeah, everybody else does. Well, they’re ignorant and you’re not. •And beware of “openwashing!”
humans write for computers to follow •“Compiled code” or “binary code” = source code that has been munged to be directly understandable by the computer •Not interpretable by humans any more! •This is the only form in which proprietary software is distributed (usually), and why you can’t peek under its hood. •“Compiler,” “interpreter,” “virtual machine” all bits and pieces of the source-code to compiled-code transformation. •“API” = program offers “hooks” to hang custom code on. •So you can do things the original developers didn’t envision.
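A toy illustration of the "hooks" idea, in Python. Everything here (the `Catalog` class, `register_checkout_hook`, the patron and item names) is hypothetical and not taken from any real ILS; real APIs are usually web services or plugin systems, but the principle is the same: the program calls your code at a point its developers chose to expose.

```python
class Catalog:
    """Stand-in for an application that exposes one API hook."""

    def __init__(self):
        self._on_checkout = []

    def register_checkout_hook(self, func):
        """The 'hook': outside code asks to be called on every checkout."""
        self._on_checkout.append(func)
        return func

    def checkout(self, patron, item):
        # The original developers only wrote this method; whatever is
        # registered on the hook runs without them having envisioned it.
        for hook in self._on_checkout:
            hook(patron, item)
        return f"{item} checked out to {patron}"


catalog = Catalog()
notices = []

@catalog.register_checkout_hook
def queue_due_date_notice(patron, item):
    """Custom code hung on the hook: queue a reminder."""
    notices.append(f"Remind {patron} about {item}")

result = catalog.checkout("patron-42", "Moby-Dick")
print(result)
print(notices)
```

Note that the custom function never touches the internals of `Catalog`. That is what lets you extend software whose source code you cannot read or change.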
download and install it without paying. •You can (legally) read the code. •You can (legally) change the code. •You can (legally) resell it (sometimes with caveats). •Developers “license” their code under one of a number of open-source licenses •Commonest: GNU General Public License (GPL), which has a resharing sting in its tail •Also notable: BSD license, Artistic License •http://opensource.org/ maintains a vetted list of open-source licenses. If you care. •Overhead: open-source software organizations •Often ask for dues; sometimes sell services
source? •Do you benefit when other people hack on the software? •With open source, quite possibly yes. •If there’s a good API or plugin infrastructure, quite possibly yes. •With API-less proprietary software, rarely and only indirectly. •What happens when a software company goes out of business? Or kills a product? •Proprietary software: decay and obsolescence. •Open-source software: new companies, forks, options. •Security, maybe •Security-through-obscurity doesn’t work. No software is perfectly secure, but OSS has a good track record of fast patches. •However, Heartbleed and Shellshock are worth thinking about. “Many eyes make bugs shallow” only works if the eyes actually exist.
There are tradeoffs. •$$$ vs. staff time/expertise: “free as in kittens” (consider also the cost of supporting an OSS community) •Ease of use/installation vs. control •Professional support vs. ad-hoc online communities •You can’t always know what your experience will be. •Some vendor support is horrible. Some is great. Some online communities are horrible. Some are great. •Some open-source projects move fast. Some don’t. Some vendors move fast. Most don’t (most can’t!). •Only you understand your workplace’s situation. •ASK AROUND before you invest, either way.
be involved in software choice for your employer. •How your software was built affects: •how much you pay for it, up-front and ongoing (“TCO”) •which chunk of budget those costs come from •how much you can do with and to it •how much it will cost to support and train people on it •how much control you have over your data and how your data are presented to your patrons •how good it is •There is no one right answer. There are only tradeoffs, which you need to understand.
own software. Go them! •Some orgs do it by accident! •One bright tinkerer whomps something up. •The library/archives comes to depend on it. •... and then the tinkerer leaves. Oops. •... or the computing world changes such that the whomped-up thing no longer works. Oops. •... or the library/archives misses a chance to adopt a better tool. Oops. •Tinkerers are great. But make them document. And have a plan for transitioning off or supporting the continued development of the whomped-up thing! •This is particularly common in webspace. Make SURE you know what your library’s website is built on.
by for-profit companies •Though small developers and shareware makers are still out there! Especially on mobile! •Certain expectations of performance, stability, polish, documentation •May vary somewhat depending on customer base •May rely on proprietary file formats for customer lock-in •Less likely these days, but it does still happen. •Pricing: usually “per seat” or “site licensed” •TRACK YOUR SOFTWARE LICENSES. ALL OF THEM. •If you are ever audited, you NEED this documentation.
can’t sell enough “seats” to make money. •... e.g. ILS software for libraries! Also learning-management systems! •Starting to turn up in digital-preservation space. •You pay to run the software AND for a certain level of customer service. •Installation help •Employee training, user groups, conferences •Technical support (up to and including Software as a Service) •You’ll still need local tech staff, often! •Installing and customizing these things is a HASSLE. •The larger your userbase, the more localfolk have to tweak to scale. •Make sure you take localfolk into account when determining TCO. •But there will be strict limits on what you can do.
happens on your vendor’s servers and in your web browser. •Can be a godsend for small organizations with minimal (or uncooperative) IT staff. • Many libraries and archives borrow IT from parent organization; these folks are generally not attuned to info-org-specific needs. •Can also be a jail cell. Have an exit strategy! • Make SURE you can get your data out! By testing, not by trusting a vendor’s assertion! •Web/CMS, digital-library, IR, ILS software available this way. •I don’t need to tell you to take privacy and security issues seriously, do I? Good. Didn’t think so.
breaks, or needs patching, or whatever, it’s your problem, not the software vendor’s or cloud vendor’s. •But instead of running software on your own server machine, you’re running it on Amazon’s or Google’s or Microsoft’s. •Similar to but slightly different from “server virtualization.” •Question is, who’s readying the server to run your software? With server virtualization, it’s you; with cloud computing, it’s the vendor. •Code often needs some rewriting to run in the cloud. •Can protect a web app from traffic spikes •Can cost more than running one’s own server, though. •Security/privacy questions, too; data is leaving your local space!
acquired/used under any of the previous purchasing models! •You can build it yourself! And quite a few libraries and archives do. •You can buy/download it off the shelf, e.g. some Linux OS distributions, and run it yourself. •Some vendors build or rely on it, e.g. Equinox for Evergreen ILS. •It can appear as Software as a Service, e.g. omeka.net. •It can be run in the cloud. •“We don’t do open source around here” is obtuse obstructionism, also probably untrue. •Unfortunately, that doesn’t stop some people...
library operations around analog materials and patrons. •Archives: you may catalog collections into one, or use one for your circulating materials if any. •“Modules” •Acquisitions •Cataloguing •OPAC •Circulation/patron management •Also (mostly academic libraries): serials, metasearch, e-resource managers (sometimes), link resolvers, ILL... separately or bundled •Underneath: enormous relational database! •Which means ALL THE HEADACHES with MARC data.
(Voyager), Ex Libris (Alma), SirsiDynix (Horizon) •Open-source ILSes •Koha: geared toward individual public libraries •Evergreen: geared toward library consortia, building code for academic libraries (e.g. serials management) •Software as a Service •WorldCat Local •LibraryThing for Libraries •The discovery-layer thing •Primo Central, EBSCO Discovery Service (EDS), Serials Solutions Summon, VuFind, Blacklight •Typical ILS replacement cycle: 5 to 10 years
libraries turned to outside vendors, homegrown solutions •NCSU: contracted with Endeca, who are a web-commerce firm •UVa: Solr/Flare/Blacklight (ha ha ha) •Scriblio, VuFind, etc. •What were they looking for? •USABILITY! •Faceted searching/browsing •Better associations among records (quasi-FRBRization) •Better correlation between user language and controlled vocabularies •Generally: making the data work harder!
books, maps, sheet music •Title-level journals/magazines/newspapers (“serials”) •Maybe govdocs, theses/dissertations, collection records for stuff in special collections and/or archives •What’s not? •The rest of the information world! Including digital collections, stuff on the web, article-level access to serials, finding aids... •The information world is bigger than it was! •So is the ILS/OPAC an INVENTORY tool, or a DISCOVERY tool? •If the latter, can we compete with Google? On what basis? •And what is our inventory, really?
search? With all their different interfaces? •Metasearch to the rescue! or something. •Single search interface presented to the user. •Sends user’s query to various databases; receives, processes (deduping, relevance ranking), and presents the results. •Some databases use search protocols like Z39.50 and SRU/SRW. Others have to be screenscraped. •Lousy solution •Slow, not always good at processing results, deduping is chancy, coverage not always the best, advanced-search functions gone.
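The metasearch steps above (send the query out, collect results, dedupe, rank, present) can be sketched in a few lines of Python. This is a toy under loud assumptions: the two "databases" are in-memory lists with invented records, the "remote search" is a substring match, and the dedupe and ranking rules are deliberately crude. Real metasearch talks to remote services over protocols such as Z39.50 or SRU, or by screenscraping.

```python
# Hypothetical databases with invented records.
DB_A = [
    {"title": "Linked Data for Libraries", "year": 2012},
    {"title": "Metadata Basics", "year": 2008},
]
DB_B = [
    {"title": "linked data for libraries ", "year": 2012},  # same item, messier
    {"title": "Library Linked Data in Practice", "year": 2014},
]

def search(db, name, query):
    """Pretend remote search: case-insensitive substring match on title."""
    q = query.lower()
    return [dict(rec, source=name) for rec in db if q in rec["title"].lower()]

def dedupe_key(rec):
    """Records with the same normalized title and year count as one item."""
    return (" ".join(rec["title"].lower().split()), rec["year"])

def score(rec, query):
    """Crude relevance: share of title words that are query words."""
    title_words = rec["title"].lower().split()
    hits = sum(1 for word in query.lower().split() if word in title_words)
    return hits / len(title_words)

def metasearch(query):
    merged = {}
    for name, db in (("DB A", DB_A), ("DB B", DB_B)):
        for rec in search(db, name, query):
            merged.setdefault(dedupe_key(rec), rec)  # first copy seen wins
    return sorted(merged.values(), key=lambda r: score(r, query), reverse=True)

results = metasearch("linked data")
for rec in results:
    print(rec["title"], rec["year"], rec["source"])
```

Even in the toy you can see where the pain comes from. Deduping only works because both copies happen to normalize to the same string, and the ranking knows nothing the individual databases knew about relevance.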
• Which data sources can you legally build your index from? • Of those, how many have an API to their metadata? Or will you be stuck screenscraping HTML? • Or do you have to work with your link resolver? • Is the metadata any good? Will it play nicely with other metadata? (Hint: Often not!) • Mind you, the software to do this is open-source: Blacklight, Umlaut. The problem is harvesting decent metadata legally. •See also: Google Scholar • Essentially this is what GS does. They make special arrangements to crawl publisher sites, even behind firewalls.
or ILS add-ins) that purport to offer one-stop shopping: OPAC, digital collections, serials, etc. •Serials Solutions: Summon •WorldCat Local •Ex Libris: Primo Central •EBSCO: EBSCO Discovery Service (EDS) •First question: is this a SEARCH TOOL or a CONTENT/METADATA DATABASE or both? •Next question: coverage? •Players VERY close-mouthed about serials coverage. •For now, this is an academic-library thing.
How do you find out if the library has the article among its e-resources? •It may be in multiple databases, full text or not... •OpenURL: protocol for checking citation information against a library’s list of vendor-provided e-journals and article databases •Pack citation info into a URL or a teeny XML document •Link resolver: gizmo that takes in an OpenURL and returns list of available copies. •SFX (Ex Libris) current market leader
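A minimal sketch of what the link-resolver "gizmo" does once the citation has been unpacked, in Python. The holdings table, provider names, and coverage dates are all invented for illustration; a real resolver such as SFX works from a large vendor-maintained knowledge base and returns links, not just provider names.

```python
# Hypothetical knowledge base: which provider covers which journal, and when.
# "to": None means coverage runs to the present.
HOLDINGS = [
    {"issn": "0024-2594", "provider": "Provider A", "from": 2001, "to": None},
    {"issn": "0024-2594", "provider": "Provider B", "from": 1990, "to": 2005},
    {"issn": "1111-2222", "provider": "Provider A", "from": 1995, "to": None},
]

def resolve(citation):
    """Return providers whose coverage includes the cited journal and year."""
    copies = []
    for holding in HOLDINGS:
        if holding["issn"] != citation["issn"]:
            continue
        if citation["year"] < holding["from"]:
            continue
        if holding["to"] is not None and citation["year"] > holding["to"]:
            continue
        copies.append(holding["provider"])
    return copies

print(resolve({"issn": "0024-2594", "year": 2008}))  # only Provider A covers 2008
print(resolve({"issn": "0024-2594", "year": 2003}))  # both providers cover 2003
print(resolve({"issn": "0024-2594", "year": 1985}))  # nobody does
```

Matching on ISSN and year is the easy case. Once the resolver has to match on titles or author names, it is comparing strings, and that is where things go wrong.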
2008?” •http://muse.jhu.edu.ezproxy.library.wisc.edu/cgi-bin/resolve_openurl.cgi?genre=&eissn=1559-0682&issn=0024-2594&date=2008 •EISSN: International Standard Serial Number, electronic •ISSN: regular ISSN •date •Lots more you can pack in! •Author, article title, journal title •Several of these are string matches, so they fail a lot. (No authority control in this environment yet!)
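Packing citation info into a URL, and unpacking it again on the resolver side, looks roughly like this in Python. The resolver address and the article title are made up. The key names (`genre`, `issn`, `eissn`, `date`, `atitle`) follow the older key/value flavor of OpenURL as I understand it, so treat them as an assumption and check your own resolver's documentation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical resolver address, for illustration only.
BASE = "https://resolver.example.edu/openurl"

def build_openurl(base, **citation):
    """Pack citation fields into the query string of a URL."""
    return base + "?" + urlencode(citation)

def parse_openurl(url):
    """Unpack the query string back into a simple dict of fields."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}

url = build_openurl(
    BASE,
    genre="article",
    issn="0024-2594",
    eissn="1559-0682",
    date="2008",
    atitle="Some Article Title",  # invented title
)
fields = parse_openurl(url)

print(url)
print(fields)
```

The ISSNs and the date are identifiers or near-identifiers, so they match reliably. The `atitle` value is a free-text string, so a stray subtitle or a typo is enough to make the lookup miss.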
bundle of journals from a publisher. How do you update holdings and URLs in your OPAC? How do you update your link resolver to know what you have? •How do you keep track of who bought what out of which fund? Or who to call when something breaks? Or usage stats? •Market leader: Serials Solutions •Service (auto-holdings-updating), not just product. •These are starting to be built into ILSes rather than sold separately.
As a file format, it’s LONG past its sell-by date. • Does not fit into the web universe at all. • Making it work with current-gen technology is a tremendous resource drain. • Decisions made so that MARC could easily output human-readable catalog cards are hurting us badly now that catalog cards aren’t what we want any more, and machines need to understand our data. •That said, libraries have a lot of data in MARC. Many archives do too! • If you become a cataloger, you will be involved in a mass data migration. Have fun! (Believe me, I feel your pain.) •Migration to what? Well, that’s the question. • The answer is probably multiple, which is scary all by itself. But RDA is part of the answer. So is linked data/RDF.
to AACR2 •Does not assume MARC or ISBD underneath! •Diane Hillmann, others actively working on linked-data/RDF expressions. They… sort of work. Hold that thought. •Claimed benefits •Expand the universe of what is describable •Spend less time on rules pilpul, punctuation (ISBD), and other cruft •Less emphasis on “record,” more on linkages •Ability to make our records work with/for outside world •FRBRization
data model for catalog records. •“Relational” as in “relational database.” •Recognizes that not all parts of a bibliographic record describe the same thing •Author: of a “work” •Page count: of an “edition” •“FRBRizing” a catalog means drawing all those relationship arrows between records, and then doing something with them for patrons. •We can do this mechanically. Sort of. Some of it.
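Here is roughly what "we can do this mechanically, sort of" looks like, as a Python sketch. The records are flat, MARC-ish dictionaries with invented years and page counts, and the matching rule (same author, same title once punctuation and spacing are stripped) is a naive assumption, not how any production FRBRization tool works.

```python
from collections import defaultdict

# Flat catalog records. Years and page counts are invented for illustration.
records = [
    {"author": "Melville, Herman", "title": "Moby-Dick", "year": 1851, "pages": 635},
    {"author": "Melville, Herman", "title": "Moby Dick", "year": 1992, "pages": 654},
    {"author": "Austen, Jane", "title": "Emma", "year": 1815, "pages": 474},
]

def work_key(rec):
    """Work-level identity: author plus title with punctuation/spaces removed."""
    title = "".join(ch for ch in rec["title"].lower() if ch.isalnum())
    return (rec["author"].lower(), title)

def frbrize(recs):
    """Group edition-level facts (year, pages) under their shared work."""
    works = defaultdict(list)
    for rec in recs:
        works[work_key(rec)].append({"year": rec["year"], "pages": rec["pages"]})
    return dict(works)

works = frbrize(records)
for (author, title), editions in works.items():
    print(author, title, editions)
```

The author hangs off the work, the page count hangs off each edition, and the two Melville records collapse into one work with two editions. The "sort of" is in `work_key`: translations, variant titles, and inconsistent name forms all defeat string matching like this.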
its smallest, most computer-digestible parts… in such a way that the data can be published, read, queried, and reused across a global network. •RDF: Resource Description Framework, the major W3C standard underlying linked data. (There are other relevant standards!) •Imagine a single world-spanning database, with data from everywhere. That’s the idea behind linked data. •Relies heavily on globally-unique identifiers… •… for people, places, things, concepts… •… often aggregated into “(linked-data) vocabularies.” •Identifiers are often (though not quite always) URLs.
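To see what "smallest, most computer-digestible parts" means, here is a sketch in plain Python with no RDF library. Each statement is a three-part tuple (subject, predicate, object). The `example.org` identifiers are made up; the Dublin Core Terms and FOAF namespaces are real vocabularies, used here only to show what vocabulary URLs look like.

```python
EX = "http://example.org/"              # hypothetical namespace for our things
DCT = "http://purl.org/dc/terms/"       # Dublin Core Terms vocabulary
FOAF = "http://xmlns.com/foaf/0.1/"     # FOAF vocabulary

# Each statement is one (subject, predicate, object) triple.
triples = {
    (EX + "book/1", DCT + "title", "Moby-Dick"),
    (EX + "book/1", DCT + "creator", EX + "person/melville"),
    (EX + "book/2", DCT + "title", "Typee"),
    (EX + "book/2", DCT + "creator", EX + "person/melville"),
    (EX + "person/melville", FOAF + "name", "Herman Melville"),
}

def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# "Which things have this person as creator?" No name strings involved.
books = [s for s, _, _ in match(p=DCT + "creator", o=EX + "person/melville")]
print(books)
```

Because the creator is an identifier rather than the string "Herman Melville", anyone else's data that uses the same identifier joins up with ours automatically. That is the world-spanning-database idea in miniature.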
a universe of human-readable “strings” of letters and numbers. Human language, in other words. •Computers do not understand human languages. •Yes, yes, natural-language processing. It doesn’t work for library purposes. Hush. •Computers understand “this is a thing, reliably and permanently identified with this unique identifier.” •To function in a computerized environment, library data needs to IDENTIFY ALL ITS THINGS. •And honestly, given our history of authority control, you’d think we’d be cool with that! Apparently not so much...
copyrightable? •Does it pass copyright’s originality test? •What about a collection of records? Compilation copyright? •Best current guess: transcribed ﬁelds no, other ﬁelds... maybe, compilation... maybe •Contract law with cataloging vendors can limit what libraries do, even with their own records! •What cataloging vendors don’t want: “their” recordbase on the open Web •Disintermediation! •Though the LoC’s records are more or less freely available, so I’m not sure how much I can endorse this argument...
in the US. •But OCLC didn’t author most of the records! •Flap in early 2010s about who can use/remix those records, with or without permission. •Open-records initiatives sprang up in protest •Open Library, Michigan •National Library of Sweden broke its OCLC cataloging contract over this. •“We have to share MARC records with the libraries that depend on us. It’s a lot of our reason for existing!” •To be clear: legal restrictions on reuse and mashups damage librarianship’s presence online. We can’t afford not to settle this.
•This is not entirely a catalog problem. •What about our digitized collections? Born-digital holdings? Finding aids? Usage data? Authority data? •What are our APIs? •To what extent do we NEED local catalogs? •Uncomfortable but necessary question! Do we need to reinvent Google? If so, how do we exchange records for stuff that isn’t in our ILS? •Are we overinvested in the ILS? •How do we facilitate appropriate reuse of our data? Do we/can we bar inappropriate reuse?