Give me my drawing back!

Dragging your Visio, Publisher and CorelDraw files to the free-software world

Fridrich Strba

February 11, 2013

Transcript

  1. 1 Give me my drawing back!
    Fridrich Štrba, Software Engineer, SUSE
    Dragging your Visio, Publisher and CorelDraw files to the free-software world
  2. 2 Agenda
    LibreOffice's contribution to the wider FOSS ecosystem
      Visio, CorelDraw, Publisher, ...
    Interesting parts of the reverse-engineering
      Incremental reverse-engineering
      Evolution of file formats observed
  3. 4 Designed to be re-used
    LibreOffice uses technologies available in the FOSS ecosystem
    We love to give back and share the fruit of our sweat
    libwpg, libvisio, libcdr and libmspub
      Standalone libraries
      Using the same interface
      Internal class generating SVG for lazy hackers :)
    More users, more bug reports and (eventually) fixes
      Reverse-engineering is by its nature a trial-and-error exercise
  4. 5 Visio import filter - libvisio
    Google Summer of Code 2011: Eilidh McAdam
    Previous reverse-engineering work by re-lab's Valentin Filippov
    Started with the Visio 2000 – Visio 2010 file formats (LibreOffice 3.5 release)
      Visio 2000 and Visio 2002: version 6 file format
      Visio 2003 to Visio 2010: version 11 file format
    Extended in 2012 to ALL Visio file-format versions that ever existed (upcoming LibreOffice 4.0 release)
      Visio 2013: OOXML-ish version (*.vsdx)
      Visio 1 – 5
      Visio XML Drawings (*.vdx)
  5. 7 CorelDraw import filter - libcdr
    Work started in late 2011
      Released in LibreOffice 3.6.x
      Still improving
    Valek's reverse-engineering work
      cdr_explorer
      Some of it reused in the sk1 project, which is currently dormant
    An interesting challenge after the success of libvisio
      Continuation of a fruitful collaboration
    Support for ALL CorelDraw file formats
      Starting from version 1 (code Waldo)
      Ending with CorelDraw x6, released in March 2012
  6. 8 Microsoft Publisher import filter - libmspub
    Google Summer of Code 2012: Brennan T. Vincent
    Flagship feature of LibreOffice 4.0
    Reverse-engineering started by Valek Filippov
      Completed in tandem
    Version support
      MS Publisher 97
      MS Publisher 98/2000
      MS Publisher 2002-2013
  7. 10 Progressive development of file formats
    Nobody reinvents the wheel from scratch
      It is useful to know the release dates of the different versions when reverse-engineering
      Two consecutive versions of the same file format will have many things in common
    Design the parser to be able to parse lower and higher versions
      Open-ended version conditions
      Guard assumptions with exceptions and be verbose in debug mode
    Try to parse a lower or higher version using the existing parser
      Fix issues as they appear
    Importance of a small number of reference documents covering many features
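The parser-design advice on this slide can be sketched roughly as follows. This is a toy illustration, not code from libcdr or libvisio: the record layout, the `WIDE_COORDS_SINCE` threshold and the names `read_record` and `ParseError` are all assumptions made up for the example.

```python
import logging
import struct

log = logging.getLogger("toyparser")

# Hypothetical: from this version on, the made-up format stores 32-bit coordinates.
WIDE_COORDS_SINCE = 3


class ParseError(Exception):
    """Raised when the input contradicts an assumption the parser relies on."""


def read_record(data, offset, version):
    """Read one (x, y) record of a made-up format; return (record, next_offset)."""
    # Open-ended condition: ">=" keeps working when a newer version shows up,
    # whereas "== 3" would silently send it down the wrong branch.
    fmt = "<ii" if version >= WIDE_COORDS_SINCE else "<hh"
    size = struct.calcsize(fmt)

    # Guard the assumption that the record fits instead of reading garbage.
    if offset < 0 or offset + size > len(data):
        raise ParseError(
            f"record at offset {offset} needs {size} bytes, "
            f"file has {len(data)} (version {version})"
        )

    x, y = struct.unpack_from(fmt, data, offset)
    # Verbose only when debug logging is enabled.
    log.debug("v%d record @%d: x=%d y=%d", version, offset, x, y)
    return {"x": x, "y": y}, offset + size
```

Feeding the same function a lower or higher version number than it was written for is then a cheap experiment: it either parses, or it fails loudly at the first assumption that no longer holds.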
  8. 11 Extending the CorelDraw version coverage (1)
    Starting point
      Support for versions 7 to x3
      Basically the knowledge from cdr_explorer
    Extending the coverage upwards
      x4 and x5: support for RIFF documents inside structured ZIP storage
      x6: more complicated structure inside the ZIP storage
    Extending the coverage downwards
      Version 6 (first 32-bit version): only some RIFF names different
      Versions 4 and 5 (16-bit versions): different way to express coordinates
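For readers unfamiliar with RIFF, the container mentioned on this slide, here is a minimal sketch of walking its chunks. It relies only on the generic RIFF layout (4-byte id, 4-byte little-endian size, payload, pad byte after odd sizes). The chunk names in the usage note and the helper names `iter_riff_chunks` and `make_chunk` are invented for the example and are not the names used in CDR files.

```python
import struct


def iter_riff_chunks(data, start=0, end=None):
    """Yield (fourcc, payload) for each RIFF chunk found in data[start:end]."""
    if end is None:
        end = len(data)
    pos = start
    while pos + 8 <= end:
        fourcc, size = struct.unpack_from("<4sI", data, pos)
        body_start = pos + 8
        body_end = body_start + size
        if body_end > end:
            raise ValueError(f"chunk {fourcc!r} at {pos} overruns its container")
        yield fourcc, data[body_start:body_end]
        # Odd-sized payloads are followed by one pad byte.
        pos = body_end + (size & 1)


def make_chunk(fourcc, payload):
    """Build one chunk; used here only to produce test input."""
    pad = b"\x00" if len(payload) % 2 else b""
    return fourcc + struct.pack("<I", len(payload)) + payload + pad
```

A top-level `RIFF` chunk starts its payload with a 4-byte form type, so nested chunks are read with `iter_riff_chunks(outer_payload, start=4)`. A parser built this way can treat renamed chunks in another version as a lookup-table change and leave the structure alone.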
  9. 12 Extending the CorelDraw version coverage (2)
    Extending the coverage downwards (cont'd)
      Version 3
        First RIFF-based CDR file format, but we did not know it at the time
        Fill and outline information embedded inside the shape
        Shape transform does not accumulate group transforms
      Versions 2 and 1
        Not RIFF-based at all
        Version 2 more structured
        With some exception handling both can be parsed alike
        A header with pointers to different sequences of chunks
        Implementation of linked list (“type 1”) and shape information (“type 2”)
        Embedded raster (“type 3” and “6”), group transforms (“type 7”), arrow information (“type 8”),
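The "header with pointers to sequences of chunks" idea can be illustrated with a toy format. The field layout below (type, next offset, payload length) is invented for the example and is not the actual CorelDraw 1/2 layout, which the slides do not spell out. What the sketch shows is the general technique of following offset links while guarding against loops and out-of-range pointers.

```python
import struct

# Hypothetical chunk header: type, offset of next chunk (0 = end), payload length.
CHUNK_HEAD = struct.Struct("<HHH")


def walk_chain(data, first_offset):
    """Follow an offset-linked chain of chunks, yielding (type, payload)."""
    seen = set()
    offset = first_offset
    while offset != 0:
        if offset in seen:
            raise ValueError(f"chunk chain loops back to offset {offset}")
        if offset + CHUNK_HEAD.size > len(data):
            raise ValueError(f"chunk header at {offset} lies outside the file")
        seen.add(offset)

        ctype, next_offset, length = CHUNK_HEAD.unpack_from(data, offset)
        start = offset + CHUNK_HEAD.size
        if start + length > len(data):
            raise ValueError(f"chunk payload at {start} runs past end of file")

        yield ctype, data[start:start + length]
        offset = next_offset
```

In a real file the starting offsets would come from the header, one per sequence, and each chunk type would be dispatched to its own reader.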
  10. 13 Extending the Visio version coverage (1)
    Starting point
      Versions 6 and 11
      Difference in some offsets and in text encoding
    Common structure
      A trailer pointing to “streams”
      Some “streams” consist of a hierarchical sequence of “chunks”
      Shapes and text content in “chunks”
    Bug-driven rewrite
      A document (most likely generated by the SDK)
      Completely challenged our assumptions and led to a more generalized parser
  11. 14 Extending the Visio version coverage (2)
    Microsoft Visio 2013 Preview
      We wanted to support it before the official release
      XML-based (OOXML-ish) file format (*.vsdx)
      Another rewrite of the parsers
      Need to separate parsing and information processing more clearly
      Side effect: support for Visio XML Drawing (*.vdx)
    Versions 1 to 5
      Some “chunks” of type list are different
      An override for readers of some chunks
      “Streams” format very similar
      Little abstractions and generalizations needed
    Improved understanding of the file format
      Cleaner and simpler parser
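The separation of parsing from information processing mentioned on this slide might look like the sketch below: two front-ends, one binary and one XML, report what they find to the same collector, which knows nothing about file layouts. Both input formats and all names (`Collector`, `parse_binary`, `parse_xml`, the `Rect` element) are hypothetical and much simpler than anything in libvisio.

```python
import struct
import xml.etree.ElementTree as ET


class Collector:
    """Receives format-independent events and builds the drawing."""

    def __init__(self):
        self.shapes = []

    def rectangle(self, x, y, width, height):
        self.shapes.append(("rect", x, y, width, height))


def parse_binary(data, collector):
    """Hypothetical binary flavour: records of four little-endian doubles."""
    for x, y, w, h in struct.iter_unpack("<4d", data):
        collector.rectangle(x, y, w, h)


def parse_xml(text, collector):
    """Hypothetical XML flavour: <Rect X=".." Y=".." W=".." H=".."/> elements."""
    for node in ET.fromstring(text).iter("Rect"):
        values = [float(node.get(key)) for key in ("X", "Y", "W", "H")]
        collector.rectangle(*values)
```

With this split, adding a new on-disk representation means writing one more parser function, while everything downstream of the collector stays untouched. That is one plausible reading of why the *.vdx support came as a side effect.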
  12. 16 Future file formats to import?
    Google Summer of Code
      The possibility for a student to work with outstanding mentors
        Valentin Filippov
        Yours truly
    (Altsys, Aldus, Macromedia & Adobe) Freehand
      File format partially reverse-engineered
        The broad outline of the structure
      Ripe to be a successful project
    A talented student can make a difference in LibreOffice
  13. 17 Impact within LibreOffice and the known universe
    Happy users will reward you
      You will be the hero of the people who can now read their documents...
      … and they will get on your nerves listing features that are not converted.
    Users outside LibreOffice
      Inkscape reuses libvisio and libcdr in 0.49
      Calligra reuses libvisio and (possibly) libcdr since 2.5
  14. 18 All text and image content in this document is licensed under the Creative Commons Attribution-Share Alike 3.0 License (unless otherwise specified). "LibreOffice" and "The Document Foundation" are registered trademarks. Their respective logos and icons are subject to international copyright laws. The use of these therefore is subject to the trademark policy.
    QA and Stoning session