Custom metadata plugins for Calibre: cataloguing an old paper library by Adrianna Pińska

Pycon ZA
October 12, 2018

Calibre is a cross-platform program for managing an e-book library: organising the books, annotating them with metadata, converting them between different formats and moving them between devices. Its organisation and metadata functionality can also be used to catalogue a collection of paper books.

By default, Calibre fetches its metadata from a few large, popular online sources which focus on recently published English-language books, and often have little to no information about older editions or books in other languages. However, there are many user-created custom metadata plugins which make it possible to integrate Calibre with more specialised book databases. Calibre is written in Python, and so are the plugins!

In this talk I will give an overview of how to find resources to help you start writing your own plugin, and describe how I forked and re-wrote a plugin for downloading metadata from the Internet Speculative Fiction Database.

Transcript

  1. What is Calibre? https://calibre-ebook.com/
    Cross-platform e-book cataloguing program
    Written in Python (2, but with a lot of future imports)
    Extensible with custom plugins created and shared by users
    Where the magic happens: https://www.mobileread.com/forums/forumdisplay.php?f=237 (follow links from the Calibre homepage)
  2. How do I use it?
    Organising my e-books, which is what it’s designed for
    I also want to use it for paper books (metadata-only records are possible)
  3. Reality
    Lots of my books are from Olden Times and predate ISBNs
    ISBNs don’t unambiguously define a specific edition
    Calibre smooshes together multiple records found for the same ISBN
    Calibre’s data sources don’t know about my books
  4. Solutions
    Use a custom plugin to integrate with a specific data source
    Enter a different type of identifying information
    Reduce the number of records found to ensure only the correct one is returned
  5. ISFDB http://www.isfdb.org
    Detailed publication information about science fiction and fantasy books
    Each publication (i.e. specific edition) is uniquely identified by an ISFDB ID
    There is an API, but it’s very limited – to get the functionality we want, we need to scrape
  6. How to write a plugin, in theory
    Carefully read the entire plugin tutorial and write beautiful, clean, efficient code from scratch
  7. How to actually write a plugin
    Find a similar plugin and hack it to make it do what you want
    There’s a good chance the author of that plugin did the same thing
    Turtles all the way down!
    Promise to fix it later
    Maybe actually fix it later
    Tweak it every time the website changes and your scraping breaks
  8. Gluing it all together
    How to get the ISFDB IDs? Use a nasty hack to get them from Firefox’s session after browsing to find your books
    How to add them all at once? Use a script and the calibredb command
    How to download the metadata? Disable all other sources to avoid smooshing (and make lookups faster)
  9. THE END
    Who am I? confluency on Twitter, confluence on GitHub
    Where’s the code? https://github.com/confluence/isfdb2-calibre
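
Slide 8 collects ISFDB IDs from pages visited in the browser. Once a list of visited URLs is available (however it was obtained), the ID can be pulled out of each one. The following is a minimal sketch, not the code from the talk. It assumes that publication pages have URLs of the form `http://www.isfdb.org/cgi-bin/pl.cgi?<number>`, and the function name `publication_id_from_url` is hypothetical.

```python
from urllib.parse import urlparse


def publication_id_from_url(url):
    """Return the numeric ISFDB publication ID in a URL, or None.

    Assumption: publication pages look like
    http://www.isfdb.org/cgi-bin/pl.cgi?12345
    Anything else (other hosts, other scripts, non-numeric queries)
    is ignored rather than guessed at.
    """
    parsed = urlparse(url)
    if parsed.hostname not in ("isfdb.org", "www.isfdb.org"):
        return None
    if not parsed.path.endswith("/pl.cgi"):
        return None
    return parsed.query if parsed.query.isdigit() else None


def publication_ids(urls):
    """Collect publication IDs from many URLs, dropping duplicates
    while keeping the order in which they were first seen."""
    seen = []
    for url in urls:
        pub_id = publication_id_from_url(url)
        if pub_id is not None and pub_id not in seen:
            seen.append(pub_id)
    return seen
```

Pages for titles, authors and so on are skipped on purpose, since only a publication ID pins down a specific edition.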
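
Slide 8 also mentions adding all the books at once with a script and the `calibredb` command. A rough sketch of what such a script could look like is below. It creates one empty (metadata-only) record per ID using `calibredb add --empty --identifier`. The identifier type `isfdb` and the helper names are assumptions made for illustration; check which identifier type your plugin actually expects.

```python
import subprocess


def build_add_command(isfdb_id, title=None, library_path=None):
    """Build a calibredb command that adds one empty record.

    Assumption: the metadata plugin looks for an identifier of
    type "isfdb"; adjust the prefix if yours uses another name.
    """
    command = ["calibredb", "add", "--empty",
               "--identifier", f"isfdb:{isfdb_id}"]
    if title:
        command += ["--title", title]
    if library_path:
        command += ["--with-library", library_path]
    return command


def add_all(isfdb_ids, dry_run=True, **options):
    """Build (and, unless dry_run is set, execute) one command per ID.

    Returns the list of commands so they can be inspected first.
    """
    commands = [build_add_command(i, **options) for i in isfdb_ids]
    if not dry_run:
        for command in commands:
            # check=True stops at the first failing calibredb call.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them.
    for command in add_all(["12345", "67890"]):
        print(" ".join(command))
```

Defaulting to a dry run makes it easy to review the commands before anything touches the library.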
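
The idea on slide 4 of reducing the number of records found can be illustrated as a choice of lookup strategy: use the most specific piece of identifying information available, so that a known publication ID yields exactly one record instead of several to be merged. This is only a sketch of that idea and not the plugin's actual logic; the identifier keys `isfdb` and `isbn` and the function name `plan_lookup` are assumptions.

```python
def plan_lookup(identifiers, title=None, authors=None):
    """Pick the most specific lookup available.

    Returns a (strategy, value) pair:
      - "publication_id": exact edition, at most one record expected
      - "isbn": may match several editions
      - "title_author": broadest search, most candidates
      - "none": nothing to search with
    """
    isfdb_id = identifiers.get("isfdb")  # hypothetical identifier key
    if isfdb_id:
        return ("publication_id", isfdb_id)
    isbn = identifiers.get("isbn")
    if isbn:
        return ("isbn", isbn)
    if title:
        return ("title_author", (title, tuple(authors or ())))
    return ("none", None)
```

For example, `plan_lookup({"isfdb": "12345", "isbn": "9780000000000"})` prefers the publication ID even though an ISBN is also present.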