Writing Import Filters for LibreOffice: Diminishing the Number of Reinvented Wheels

Fridrich Strba

October 11, 2011

Transcript

  1. LibreOffice Paris 2011 Conference

    Writing Import Filters for LibreOffice: Diminishing the Number of Reinvented Wheels
    Fridrich Štrba, SUSE LibreOffice Team
  2. Agenda

    - Why would anybody care about writing import filters
    - What interfaces are available for filter developers
    - The nice and cool way to write an import filter without actually having to understand LibreOffice
  3. Legacy formats out there

    - ODF is the future of humanity
    - Nevertheless, humanity does not know about it as of now
    - Other de facto standards
    - Some people use other Office Suites :(
    - Hard disks full of teenage poetry in funny formats
    - LibreOffice offers the freedom to read that crap
  4. Pure intellectual exercise

    - Lets you program for LibreOffice without having to understand its internals
    - Pretty stand-alone functionality communicating with LibreOffice over well-defined interfaces ... almost
    - Happy users will reward you
    - You will be the hero of the people who can now read their documents ... and they will get on your nerves listing features that are not converted
  5. XSLT filters

    - The easiest way to write an import filter from another XML format
    - Can be added through the user interface ... and exported as an extension
    - Used services
      - com.sun.star.comp.Writer.XMLOasisImporter: XSLT filter that pushes flat ODT to LibreOffice
      - com.sun.star.comp.Calc.XMLOasisExporter: XSLT filter that receives flat ODS from LibreOffice
    - Reasonably fast in LibreOffice
      - Someone rewrote it from Java to C++
  6. Filter description

        <node oor:name="OpenDocument Text Flat XML" oor:op="replace">
          <prop oor:name="FileFormatVersion">
            <value>0</value>
          </prop>
          <prop oor:name="Type">
            <value>writer_ODT_FlatXML</value>
          </prop>
          <prop oor:name="DocumentService">
            <value>com.sun.star.text.TextDocument</value>
          </prop>
          <prop oor:name="UIComponent" />
          <prop oor:name="UserData">
            <value oor:separator=",">
              com.sun.star.documentconversion.XSLTFilter,,
              com.sun.star.comp.Writer.XMLOasisImporter,com.sun.star.comp.Writer.XMLOasisExporter,
              ../share/xslt/odfflatxml/odfflatxmlimport.xsl,
              ../share/xslt/odfflatxml/odfflatxmlexport.xsl
            </value>
          </prop>
          <prop oor:name="FilterService">
            <value>com.sun.star.comp.Writer.XmlFilterAdaptor</value>
          </prop>
          <prop oor:name="TemplateName" />
          <prop oor:name="UIName">
            <value>OpenDocument Text (Flat XML)</value>
          </prop>
          <prop oor:name="Flags">
            <value>IMPORT EXPORT OWN 3RDPARTYFILTER</value>
          </prop>
        </node>
  7. Type description

        <node oor:name="writer_ODT_FlatXML" oor:op="replace">
          <prop oor:name="DetectService">
            <value>com.sun.star.comp.filters.XMLFilterDetect</value>
          </prop>
          <prop oor:name="URLPattern" />
          <prop oor:name="Extensions">
            <value>fodt odt xml</value>
          </prop>
          <prop oor:name="MediaType" />
          <prop oor:name="Preferred">
            <value>false</value>
          </prop>
          <prop oor:name="PreferredFilter">
            <value>OpenDocument Text Flat XML</value>
          </prop>
          <prop oor:name="UIName">
            <value>OpenDocument Text (Flat XML)</value>
          </prop>
          <prop oor:name="ClipboardFormat">
            <value>doctype:office:mimetype="application/vnd.oasis.opendocument.text"</value>
          </prop>
        </node>
  8. Interfaces

    - com.sun.star.document.ImportFilter
      - XImporter
        - setTargetDocument
      - XFilter
        - filter
        - cancel
    - com.sun.star.document.ExportFilter
      - XExporter
        - setSourceDocument
      - XFilter
  9. Document representation

    - com::sun::star::xml::sax::XDocumentHandler
      - startDocument
      - endDocument
      - startElement
      - endElement
      - characters
      - ignorableWhitespace
      - processingInstruction
      - setDocumentLocator
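The handler methods listed above are SAX-style events: a generator pushes elements and text, and the receiver builds the document from them. As a rough sketch only, the following uses a reduced stand-in interface (`MiniDocumentHandler`, `XmlWriter`, and `emitParagraph` are hypothetical names, and `std::string` replaces the UNO string and attribute-list types) to show how one ODF paragraph could be pushed through such a handler.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Attribute list as name/value pairs; a stand-in for the UNO attribute list type.
using Attributes = std::vector<std::pair<std::string, std::string>>;

// Reduced, hypothetical stand-in for XDocumentHandler: only three of its events.
struct MiniDocumentHandler {
    virtual ~MiniDocumentHandler() {}
    virtual void startElement(const std::string &name, const Attributes &attrs) = 0;
    virtual void endElement(const std::string &name) = 0;
    virtual void characters(const std::string &chars) = 0;
};

// Receiver that serializes the events back into XML text.
// No escaping of special characters is done, to keep the sketch short.
struct XmlWriter : MiniDocumentHandler {
    std::string out;

    void startElement(const std::string &name, const Attributes &attrs) override {
        out += "<" + name;
        for (const auto &attr : attrs)
            out += " " + attr.first + "=\"" + attr.second + "\"";
        out += ">";
    }
    void endElement(const std::string &name) override { out += "</" + name + ">"; }
    void characters(const std::string &chars) override { out += chars; }
};

// Pushes a single paragraph to the handler, the way a generator would.
void emitParagraph(MiniDocumentHandler &handler, const std::string &style,
                   const std::string &text) {
    Attributes attrs;
    attrs.emplace_back("text:style-name", style);
    handler.startElement("text:p", attrs);
    handler.characters(text);
    handler.endElement("text:p");
}
```

Calling `emitParagraph` on an `XmlWriter` with style `Standard` and text `Hello` leaves a single `text:p` element in `out`. In the real setup the receiver would be the importer service rather than a string buffer.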
  10. File information

    - com::sun::star::document::MediaDescriptor
      - Represented by a sequence of com::sun::star::beans::PropertyValue
      - Basically an XML-ish structure with property names and their values
    - com::sun::star::io::XOutputStream OutputStream
      - A stream to receive the document data (for export filters)
    - com::sun::star::io::XInputStream InputStream
      - Content of the document
    - string URL
      - URL of the document
      - Create the stream if it does not exist
      - Not necessary now because the XOutputStream or XInputStream is guaranteed
  11. Type detection

    - com::sun::star::document::XExtendedFilterDetection
      - string detect( sequence< ::com::sun::star::beans::PropertyValue > Descriptor )
      - Returns any valid type name (which specifies the detected format) or an empty value for unknown formats
      - Adds the information into the Descriptor
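The contract of `detect` described above (type name on success, empty value otherwise, result recorded in the descriptor) can be illustrated without UNO. This is a minimal sketch under stated assumptions: `PropertyValue` here is a plain struct standing in for the UNO type, the magic bytes `DEMO` and the type name `demo_Example_Document` are invented for the example, and the file header is passed in directly instead of being read from the descriptor's input stream.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for com::sun::star::beans::PropertyValue; the real Value is a UNO Any.
struct PropertyValue {
    std::string Name;
    std::string Value;
};

using Descriptor = std::vector<PropertyValue>;

// Invented signature and type name, used only for this illustration.
static const std::string kMagic = "DEMO";
static const std::string kTypeName = "demo_Example_Document";

// Returns the type name if the header starts with the magic bytes and
// records it in the descriptor; returns an empty string for unknown formats
// and leaves the descriptor untouched.
std::string detect(const std::string &header, Descriptor &descriptor) {
    if (header.compare(0, kMagic.size(), kMagic) != 0)
        return std::string();
    descriptor.push_back({"TypeName", kTypeName});
    return kTypeName;
}
```

A real detector would usually inspect more than a four-byte prefix, since several formats can share a container signature.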
  12. History

    - Import filter using XFilter framework
    - Originally just a module for WordPerfect importer
    - Microsoft Works importer
    - Need to make a bit more generic
    - Initial rewrite / adaptation
    - WordPerfect Graphics import
    - The same framework for graphic files
    - Visio import filter
    - Multipage graphic documents
  13. Advantages

    - Generating ODF is not trivial
      - Settings, Styles, Automatic styles, Content
    - Provide a linear interface
    - Reuse the existing ODF generators instead of copying them
    - Developer does not waste time designing an interface
    - Speeds up development by focusing on the essentials
  14. Text Document Representation

        class WPXDocumentInterface
        {
        public:
            virtual ~WPXDocumentInterface() {}

            virtual void setDocumentMetaData(const WPXPropertyList &propList) = 0;
            virtual void startDocument() = 0;
            virtual void endDocument() = 0;

            virtual void definePageStyle(const WPXPropertyList &propList) = 0;
            virtual void openPageSpan(const WPXPropertyList &propList) = 0;
            virtual void closePageSpan() = 0;
            virtual void openHeader(const WPXPropertyList &propList) = 0;
            virtual void closeHeader() = 0;
            virtual void openFooter(const WPXPropertyList &propList) = 0;
            virtual void closeFooter() = 0;

            virtual void defineParagraphStyle(const WPXPropertyList &propList, const WPXPropertyListVector &tabStops) = 0;
            virtual void openParagraph(const WPXPropertyList &propList, const WPXPropertyListVector &tabStops) = 0;
            virtual void closeParagraph() = 0;

            virtual void defineCharacterStyle(const WPXPropertyList &propList) = 0;
            virtual void openSpan(const WPXPropertyList &propList) = 0;
            virtual void closeSpan() = 0;

            virtual void defineSectionStyle(const WPXPropertyList &propList, const WPXPropertyListVector &columns) = 0;
            virtual void openSection(const WPXPropertyList &propList, const WPXPropertyListVector &columns) = 0;
            virtual void closeSection() = 0;

            virtual void insertTab() = 0;
            virtual void insertSpace() = 0;
            virtual void insertText(const WPXString &text) = 0;
            virtual void insertLineBreak() = 0;
            virtual void insertField(const WPXString &type, const WPXPropertyList &propList) = 0;

            virtual void defineOrderedListLevel(const WPXPropertyList &propList) = 0;
            virtual void defineUnorderedListLevel(const WPXPropertyList &propList) = 0;
            virtual void openOrderedListLevel(const WPXPropertyList &propList) = 0;
            virtual void openUnorderedListLevel(const WPXPropertyList &propList) = 0;
            virtual void closeOrderedListLevel() = 0;
            virtual void closeUnorderedListLevel() = 0;
            virtual void openListElement(const WPXPropertyList &propList, const WPXPropertyListVector &tabStops) = 0;
            virtual void closeListElement() = 0;

            virtual void openFootnote(const WPXPropertyList &propList) = 0;
            virtual void closeFootnote() = 0;
            virtual void openEndnote(const WPXPropertyList &propList) = 0;
            virtual void closeEndnote() = 0;
            virtual void openComment(const WPXPropertyList &propList) = 0;
            virtual void closeComment() = 0;
            virtual void openTextBox(const WPXPropertyList &propList) = 0;
            virtual void closeTextBox() = 0;

            virtual void openTable(const WPXPropertyList &propList, const WPXPropertyListVector &columns) = 0;
            virtual void openTableRow(const WPXPropertyList &propList) = 0;
            virtual void closeTableRow() = 0;
            virtual void openTableCell(const WPXPropertyList &propList) = 0;
            virtual void closeTableCell() = 0;
            virtual void insertCoveredTableCell(const WPXPropertyList &propList) = 0;
            virtual void closeTable() = 0;

            virtual void openFrame(const WPXPropertyList &propList) = 0;
            virtual void closeFrame() = 0;
            virtual void insertBinaryObject(const WPXPropertyList &propList, const WPXBinaryData &data) = 0;
            virtual void insertEquation(const WPXPropertyList &propList, const WPXString &data) = 0;
        };
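To make the callback style of the interface above concrete, here is a small sketch of a consumer. It does not use libwpd: `MiniDocumentInterface` is a hypothetical, heavily reduced stand-in that keeps only a handful of the callbacks, drops the property-list arguments, and uses `std::string` in place of `WPXString`. `PlainTextCollector` is likewise invented for the illustration and simply flattens the callback stream into plain text.

```cpp
#include <cassert>
#include <string>

// Reduced, hypothetical stand-in for WPXDocumentInterface.
// The real interface passes WPXPropertyList arguments to most callbacks.
class MiniDocumentInterface {
public:
    virtual ~MiniDocumentInterface() {}
    virtual void startDocument() = 0;
    virtual void endDocument() = 0;
    virtual void openParagraph() = 0;
    virtual void closeParagraph() = 0;
    virtual void insertText(const std::string &text) = 0;
    virtual void insertTab() = 0;
    virtual void insertLineBreak() = 0;
};

// Consumer that turns the callback sequence into plain text:
// paragraphs and line breaks end with a newline, tabs become '\t'.
class PlainTextCollector : public MiniDocumentInterface {
public:
    void startDocument() override { m_text.clear(); }
    void endDocument() override {}
    void openParagraph() override {}
    void closeParagraph() override { m_text += '\n'; }
    void insertText(const std::string &text) override { m_text += text; }
    void insertTab() override { m_text += '\t'; }
    void insertLineBreak() override { m_text += '\n'; }

    const std::string &text() const { return m_text; }

private:
    std::string m_text;
};
```

An ODF generator plays the same role as `PlainTextCollector`, except that it answers each callback by emitting the corresponding ODF elements and styles instead of appending to a string.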
  15. Graphic Document Representation

        namespace libwpg
        {

        class WPGPaintInterface
        {
        public:
            virtual ~WPGPaintInterface() {}

            virtual void startGraphics(const ::WPXPropertyList &propList) = 0;
            virtual void endGraphics() = 0;
            virtual void setStyle(const ::WPXPropertyList &propList, const ::WPXPropertyListVector &gradient) = 0;
            virtual void startLayer(const ::WPXPropertyList &propList) = 0;
            virtual void endLayer() = 0;
            virtual void startEmbeddedGraphics(const ::WPXPropertyList &propList) = 0;
            virtual void endEmbeddedGraphics() = 0;

            virtual void drawRectangle(const ::WPXPropertyList &propList) = 0;
            virtual void drawEllipse(const ::WPXPropertyList &propList) = 0;
            virtual void drawPolygon(const ::WPXPropertyListVector &vertices) = 0;
            virtual void drawPolyline(const ::WPXPropertyListVector &vertices) = 0;
            virtual void drawPath(const ::WPXPropertyListVector &path) = 0;
            virtual void drawGraphicObject(const ::WPXPropertyList &propList, const ::WPXBinaryData &binaryData) = 0;

            virtual void startTextObject(const ::WPXPropertyList &propList, const ::WPXPropertyListVector &path) = 0;
            virtual void endTextObject() = 0;
            virtual void startTextLine(const ::WPXPropertyList &propList) = 0;
            virtual void endTextLine() = 0;
            virtual void startTextSpan(const ::WPXPropertyList &propList) = 0;
            virtual void endTextSpan() = 0;
            virtual void insertText(const ::WPXString &str) = 0;
        };

        } // namespace libwpg
  16. Key Classes

    - writerperfect/source/filter
      - OdgGenerator.cxx: Implementation of libwpg::WPGPaintInterface
      - OdtGenerator.cxx: Implementation of WPXDocumentInterface
      - OdfDocumentHandler.hxx: Abstract SAX interface to receive ODF document
  17. Importer proper

    - About 100 LOC (apart from the real import logic)
      - Existing stream implementation
      - Existing ODF Generator
    - Read document and determine whether it is supported
    - Read document and call callbacks of the interface
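The two steps named above (check whether the document is supported, then read it and call the interface callbacks) can be sketched for a toy format. Everything here is an assumption made for illustration: the format (a `DEMO1` signature line followed by one paragraph per line), the reduced `ParagraphSink` interface, and the `isSupported`, `parse`, and `CallLog` names are all invented and are not part of any existing library.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Reduced, hypothetical stand-in for the document callback interface.
struct ParagraphSink {
    virtual ~ParagraphSink() {}
    virtual void startDocument() = 0;
    virtual void openParagraph() = 0;
    virtual void insertText(const std::string &text) = 0;
    virtual void closeParagraph() = 0;
    virtual void endDocument() = 0;
};

// Invented toy format: the first line must be this signature.
static const std::string kSignature = "DEMO1";

// Step 1: read the document and determine whether it is supported.
bool isSupported(const std::string &data) {
    std::istringstream in(data);
    std::string first;
    return std::getline(in, first) && first == kSignature;
}

// Step 2: read the document and call the callbacks of the interface.
// Each line after the signature becomes one paragraph.
bool parse(const std::string &data, ParagraphSink &sink) {
    if (!isSupported(data))
        return false;
    std::istringstream in(data);
    std::string line;
    std::getline(in, line); // skip the signature line
    sink.startDocument();
    while (std::getline(in, line)) {
        sink.openParagraph();
        sink.insertText(line);
        sink.closeParagraph();
    }
    sink.endDocument();
    return true;
}

// Sink that records the order of callbacks, useful for checking a parser.
struct CallLog : ParagraphSink {
    std::vector<std::string> calls;
    void startDocument() override { calls.push_back("startDocument"); }
    void openParagraph() override { calls.push_back("openParagraph"); }
    void insertText(const std::string &text) override { calls.push_back("insertText:" + text); }
    void closeParagraph() override { calls.push_back("closeParagraph"); }
    void endDocument() override { calls.push_back("endDocument"); }
};
```

Because the parser only talks to the sink interface, the same `parse` function can feed a logging sink during testing and an ODF generator in the actual filter.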