Slide 1

Slide 1 text

Document Liberation Project: Trying to Achieve Freedom from Vendor Lock  Fridrich Štrba, Software Engineer

Slide 2

Slide 2 text

Whois?  Software Engineer in SUSE Linux Enterprise  Used to work for SUSE on LibreOffice and OpenOffice  Diverse background  FLOSS enthusiast  Working in free time on various projects including LibreOffice  Document Liberation Project

Slide 3

Slide 3 text

The Project

Slide 4

Slide 4 text

History  Launched officially on April 2nd 2014 at 11:00 UTC  First talk given by the founding members 4 hours later  LibreGraphics Meeting 2014 in Leipzig, Germany  Group working on file-formats within LibreOffice since the beginning  GSoC 2011 – import filter for Visio file-formats (libvisio)  During the year of 2012 – import filter for CorelDraw file-formats (libcdr)  GSoC 2012 – import filter for Microsoft Publisher (libmspub)  And more is to come...

Slide 5

Slide 5 text

Beyond LibreOffice itself  Clear feeling that this is bigger than LibreOffice itself  Feedback from conferences  Approached by other projects with a lot of interest  Reuse by other projects  Inkscape  Calligra  Scribus  A service of the LibreOffice community to the wider FOSS world  We receive  We give back

Slide 6

Slide 6 text

Philosophy

Slide 7

Slide 7 text

Ownership of documents  Whose painting is “Independência ou Morte”?  A) The oil paint producer's  B) The canvas producer's  C) Pedro Américo's

Slide 8

Slide 8 text

Ownership of documents We believe that documents and their content belong to their creators, not software vendors

Slide 9

Slide 9 text

Access to documents We believe that access to content you own should not be hindered by the fact that the application that created it is not maintained any more or that the application does not work on the particular operating system that you use

Slide 10

Slide 10 text

Role of open standards We believe that use of truly open and free standards for encoding digital content is the only long-term guarantee that a user's digital content will never be beholden to a single vendor

Slide 11

Slide 11 text

Transitory period We believe that implementation of Free and Open Source Software that can read proprietary file-formats is the best solution to escape vendor lock during the transition period to truly open and free standards

Slide 12

Slide 12 text

Our mission

Slide 13

Slide 13 text

File-format understanding Our mission is to try to understand the structure and details of proprietary, undocumented file-formats

Slide 14

Slide 14 text

FOSS parser implementations Our mission is to use the understanding of the file-formats to implement FOSS libraries that are able to parse such documents and extract as much information as possible from them

Slide 15

Slide 15 text

ODF eco-system Our mission is to use our existing framework to encode this data in a truly free and open standard file-format: the Open Document Format

Slide 16

Slide 16 text

The boring specifics

Slide 17

Slide 17 text

Introspection tools  OLEToy  Introspection of different file-formats  We do NOT produce documentation  Here we encode the file-format knowledge  Colupatr  Hexadecimal editor on steroids  Variable length lines  Scripting support

Slide 18

Slide 18 text

Cool feature everybody envies  Binary diff

Slide 19

Slide 19 text

Software Framework  librevenge  APIs and general-use types  libodfgen  Generators of Open Document files from librevenge APIs  Parser libraries  Libwpd, libwpg, libvisio, libcdr, libmspub, libetonyek,...  Parsing file-format  Processing information  writerperfect  Command-line tools to convert to ODF

Slide 20

Slide 20 text

librevenge::RVNGDrawingInterface
    virtual void startDocument (const RVNGPropertyList &propList) = 0;
    virtual void endDocument () = 0;
    virtual void startGraphics (const RVNGPropertyList &propList) = 0;
    virtual void endGraphics () = 0;
    virtual void setStyle (const RVNGPropertyList &propList) = 0;
    virtual void startLayer (const RVNGPropertyList &propList) = 0;
    virtual void endLayer () = 0;
    virtual void startEmbeddedGraphics (const RVNGPropertyList &propList) = 0;
    virtual void endEmbeddedGraphics () = 0;
    virtual void drawRectangle (const RVNGPropertyList &propList) = 0;
    virtual void drawEllipse (const RVNGPropertyList &propList) = 0;
    virtual void drawPolygon (const RVNGPropertyListVector &vertices) = 0;
    virtual void drawPolyline (const RVNGPropertyListVector &vertices) = 0;
    virtual void drawPath (const RVNGPropertyListVector &path) = 0;
    virtual void drawGraphicObject (const RVNGPropertyList &propList) = 0;
    virtual void startTextObject (const RVNGPropertyList &propList) = 0;
    virtual void endTextObject () = 0;
    virtual void openParagraph (const RVNGPropertyList &propList) = 0;
    virtual void closeParagraph () = 0;
    virtual void openSpan (const RVNGPropertyList &propList) = 0;
    virtual void closeSpan () = 0;
    virtual void insertText (const RVNGString &str) = 0;
Callback examples
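A consumer of such a callback interface only has to override the callbacks and react to the properties it receives. The sketch below is not librevenge code: `DrawingInterface`, `PropertyList`, `propOr0` and `SvgSketchGenerator` are hypothetical stand-ins invented for illustration, reduced to four callbacks, and the property keys (`svg:x`, `svg:width`, ...) are assumed names rather than a documented contract.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a property list: plain string keys and values.
using PropertyList = std::map<std::string, std::string>;

// Hypothetical, heavily reduced drawing interface (four callbacks only).
class DrawingInterface {
public:
    virtual ~DrawingInterface() = default;
    virtual void startDocument(const PropertyList &propList) = 0;
    virtual void endDocument() = 0;
    virtual void drawRectangle(const PropertyList &propList) = 0;
    virtual void drawEllipse(const PropertyList &propList) = 0;
};

// Returns the value stored under key, or "0" when the key is absent.
static std::string propOr0(const PropertyList &p, const std::string &key) {
    const auto it = p.find(key);
    return it == p.end() ? std::string("0") : it->second;
}

// Illustrative consumer: turns each callback into a fragment of SVG-like markup.
class SvgSketchGenerator : public DrawingInterface {
public:
    void startDocument(const PropertyList &p) override {
        m_out << "<svg width=\"" << propOr0(p, "svg:width")
              << "\" height=\"" << propOr0(p, "svg:height") << "\">";
    }
    void endDocument() override { m_out << "</svg>"; }
    void drawRectangle(const PropertyList &p) override {
        m_out << "<rect x=\"" << propOr0(p, "svg:x")
              << "\" y=\"" << propOr0(p, "svg:y")
              << "\" width=\"" << propOr0(p, "svg:width")
              << "\" height=\"" << propOr0(p, "svg:height") << "\"/>";
    }
    void drawEllipse(const PropertyList &p) override {
        m_out << "<ellipse cx=\"" << propOr0(p, "svg:cx")
              << "\" cy=\"" << propOr0(p, "svg:cy")
              << "\" rx=\"" << propOr0(p, "svg:rx")
              << "\" ry=\"" << propOr0(p, "svg:ry") << "\"/>";
    }
    // Markup accumulated so far.
    std::string str() const { return m_out.str(); }
private:
    std::ostringstream m_out;
};
```

A caller would invoke `startDocument`, any number of draw callbacks, then `endDocument`, and read the result from `str()`.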

Slide 21

Slide 21 text

librevenge::RVNGTextInterface
    virtual void startDocument (const RVNGPropertyList &propList) = 0;
    virtual void endDocument () = 0;
    virtual void definePageStyle (const RVNGPropertyList &propList) = 0;
    virtual void openPageSpan (const RVNGPropertyList &propList) = 0;
    virtual void closePageSpan () = 0;
    virtual void openHeader (const RVNGPropertyList &propList) = 0;
    virtual void closeHeader () = 0;
    virtual void openFooter (const RVNGPropertyList &propList) = 0;
    virtual void closeFooter () = 0;
    virtual void defineParagraphStyle (const RVNGPropertyList &propList) = 0;
    virtual void openParagraph (const RVNGPropertyList &propList) = 0;
    virtual void closeParagraph () = 0;
    virtual void defineCharacterStyle (const RVNGPropertyList &propList) = 0;
    virtual void openSpan (const RVNGPropertyList &propList) = 0;
    virtual void closeSpan () = 0;
    virtual void defineSectionStyle (const RVNGPropertyList &propList) = 0;
    virtual void openSection (const RVNGPropertyList &propList) = 0;
    virtual void closeSection () = 0;
    virtual void insertTab () = 0;
    virtual void insertSpace () = 0;
    virtual void insertText (const RVNGString &text) = 0;
    virtual void insertLineBreak () = 0;
    virtual void insertField (const RVNGPropertyList &propList) = 0;
    virtual void openOrderedListLevel (const RVNGPropertyList &propList) = 0;
    virtual void openUnorderedListLevel (const RVNGPropertyList &propList) = 0;
    virtual void closeOrderedListLevel () = 0;
    virtual void closeUnorderedListLevel () = 0;
    virtual void openListElement (const RVNGPropertyList &propList) = 0;
    virtual void closeListElement () = 0;
    virtual void openFootnote (const RVNGPropertyList &propList) = 0;
    virtual void closeFootnote () = 0;
    virtual void openEndnote (const RVNGPropertyList &propList) = 0;
    virtual void closeEndnote () = 0;
Callback examples

Slide 22

Slide 22 text

librevenge-stream  RVNGInputStream interface  Virtual interface allowing stream abstraction  Several implementations:  RVNGFileStream  Implementation using file name  RVNGStringStream  Implementation using a buffer of data  RVNGDirectoryStream  Accesses a directory structure as if it was a structured document  OLE2 and ZIP documents handled transparently  No need to know what is the container type  Gives the responsibility to the implementers!

Slide 23

Slide 23 text

librevenge-generators  Useful implementations of the different interfaces  Raw Generators  Implementations of the different librevenge interfaces  printing callbacks called and properties passed  Used for regression testing  CSV generator for spreadsheets, HTML, Text generators  SVG generators  Exception: SVG generator for drawings  Included in librevenge core library  Historical reasons

Slide 24

Slide 24 text

libodfgen  Generators for OpenDocument from librevenge interfaces  OdtGenerator class  Implementations of RVNGDocumentInterface  OdgGenerator class  Implementation of RVNGDrawingInterface  OdpGenerator class  Implementation of RVNGPresentationInterface  OdsGenerator class  Implementation of RVNGSpreadsheetInterface  OdfDocumentHandler interface  SAX-like interface to output XML in a generic way

Slide 25

Slide 25 text

writerperfect  Command-line tools linking the components together  RVNGInputStream implementation  librevenge-stream  Different ODF generators  libodfgen  Different parser libraries  libvisio, libcdr, libmspub, libetonyek, libwpd, libwpg,....  Generates Open Document files  Flat ODF  Package (zipped) ODF

Slide 26

Slide 26 text

Advantage of the design  Parser libraries independent and self-contained  Much easier life for filter writers  Enough to focus on the structure of the document to parse  Call the interface callbacks that one needs  Avoid pulling in unrelated libraries  librevenge itself and libodfgen have only boost as a build-time dependency  No need to link text-related libraries in a drawing application  Considerable reduction of code duplication  Less risk of a bug being fixed in one place while lingering in another  Faster to start a library skeleton
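The decoupling described above can be illustrated with a toy parser that knows only an abstract interface, so any consumer can be plugged in without changing it. Everything here is hypothetical: the "file format" is just one paragraph per line, and `TextInterface`, `parseToyDocument`, `PlainTextSink` and `ParagraphCounter` are names made up for this sketch.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical, heavily reduced text interface.
class TextInterface {
public:
    virtual ~TextInterface() = default;
    virtual void openParagraph() = 0;
    virtual void closeParagraph() = 0;
    virtual void insertText(const std::string &text) = 0;
};

// Toy parser: treats each input line as one paragraph and reports it
// through the interface. It has no knowledge of any output format.
void parseToyDocument(const std::string &data, TextInterface &sink) {
    std::istringstream in(data);
    std::string line;
    while (std::getline(in, line)) {
        sink.openParagraph();
        sink.insertText(line);
        sink.closeParagraph();
    }
}

// Consumer 1: rebuilds plain text, one paragraph per line.
class PlainTextSink : public TextInterface {
public:
    void openParagraph() override {}
    void closeParagraph() override { m_text += "\n"; }
    void insertText(const std::string &text) override { m_text += text; }
    const std::string &text() const { return m_text; }
private:
    std::string m_text;
};

// Consumer 2: ignores the content and only counts paragraphs.
class ParagraphCounter : public TextInterface {
public:
    void openParagraph() override { ++m_count; }
    void closeParagraph() override {}
    void insertText(const std::string &) override {}
    int count() const { return m_count; }
private:
    int m_count = 0;
};
```

The same `parseToyDocument` call serves both consumers, which mirrors how one parser library can feed an ODF generator, a raw generator, or an application's own importer.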

Slide 27

Slide 27 text

I am excited! I want to be part of this!

Slide 28

Slide 28 text

Ways to contribute  Code development  Contribute to one of our existing libraries, or  Start a new one  Understanding and documenting file-formats  OLEToy  Preferred way to visualize documents  Need a bit of knowledge of Python  Preparation of sample documents  Need to access a generating application  Important for regression testing

Slide 29

Slide 29 text

New libraries for dummies
git clone git://git.code.sf.net/p/libwpd/project-generator
cd project-generator/
./project-generator -h
project-generator []
Options -a, -e and -p are required. The project will be created in or in the current directory, if no was given.
General options:
    -h              Show this text.
Setting project parameters:
    -a author       Set main author of the library.
    -c importer     Set the name of the public importer class. Default is ProjectDocument.
    -d description  Set project description. Default is empty.
    -e email        Set author e-mail.
    -p project      Set the name of the project.
    -t tool         Set base name for conversion tools (e.g., tool2raw). Default is project2*.
    -y year         Set year. Default is current year.
Project kind:
    -D              Create a vector drawing importer
    -P              Create a presentation importer
    -S              Create a spreadsheet importer
    -T              Create a text importer. This is the default.

Slide 30

Slide 30 text

Demonstration  [If it does not work, blame it on everything but yourself] ./project-generator -p libfisl -d "FISL15 Document importer library" -a "Fridrich Strba" -e "[email protected]" -T -c FISLDocument -t fisl cd libfisl ./autogen.sh && ./configure --prefix=/usr --libdir=/usr/lib64 --enable-debug --disable-werror make -j4

Slide 31

Slide 31 text

Future file-formats to import?  Google Summer of Code  The possibility for a student to work with outstanding mentors  David Tardon  Fridrich Štrba  Валёк Филиппов  Several formats ready for straight engineering  Apple Numbers, Pages  Adobe PageMaker  Zoner Draw

Slide 32

Slide 32 text

Thank you! www.documentliberation.org @DocLiberation