for vector graphic formats The fun of reverse- and straight engineering Fridrich Štrba – [email protected] The Document Foundation Software Engineer, SUSE
graphic import filters resulting from the work of LibreOffice community How we did it Framework used Missing file-format documentation Collaboration patterns Incremental reverse-engineering
out there ODF is the future of the humanity Nevertheless, humanity does not know about it as of now Other de facto standards Some people use other Office Suites and graphic applications :( Hard-disks full of bad teenage poetry and indecent drawings in funny formats LibreOffice offers the freedom to read that pile of …
exercise Allows to program for LibreOffice without having to understand its internals Pretty stand-alone functionality communicating with LibreOffice over well defined interfaces … … almost Happy users will reward you You will be the hero of the people who can now read their documents... … and they will get on your nerves listing features that are not converted.
import filters based on libwpg WordPerfect Graphics import filter and libwpg Started by Marc Oude-Kotte and yours faithful Google Summer of Code by Ariya Hidayat in 2006 MS Visio import filter and libvisio Google Summer of Code by Eilidh McAdam in 2011 Guest appearance of re-lab's Valek Filippov Corel Draw import filter and libcdr Work in progress (kind of) started basically at the end of 2011 Will be in LibreOffice 3.6 Check http://dev-builds.libreoffice.org for preview fun
I prefer to speak about future when it becomes a feature Too many projects with declarations of intentions and nothing at the arrival Code speaks louder then press releases Google Summer of Code 2012 An attempt at MS Publisher file-format Valek's personal pet file-format Macromedia Freehand Trying to crowd-source the development
count of reinvented wheels Reuse, embrace and extend ODF as interchange format Way import filters communicate with LibreOffice libwpg's application programming interface Reusing OdgGenerator class implementing this interface Speedy development No need to write any boilerplate code LibreOffice import filter itself about 100 LOC The core written as a standalone library Faster testing
OdgGenerator.?xx Implementation of libwpg::WPGPaintInterface OdfDocumentHandler.hxx Abstract SAX interface to receive ODF document Code that serializes the SAX calls into file (flat ODF and zip- based ODF) *SVGGenerator.?xx Each library has an internal SVG Generator (suboptimal) New libwpg will make the SVG Generator part of the public API
ODF is not trivial Settings Styles Automatic styles Content Provide a linear interface Reuse instead of copying the existing ODF generators Developer does not waste time designing interface Speeds up development by focusing on the essentials
Almost none For libvisio Marginally useful user and developer documentation of MSDN Possibility to save using the VDX (xml) file-format Basically XML dump of the binary (the same concepts) For libcdr Document explaining a bit the CMX exchange format (similar concepts) Reverse engineering Based on re-lab's work oletoy
Focus on getting “some” result early First embedded raster images Libreoffice is able to render them without further processing Next graphic primitives “Everything is just a path” Develop tools along the implementation Introspection tool improved constantly Driven by the need of the implementation Reflecting growing understanding of file-format Don't solve problems that don't exist
II Design the software as you go Some code is better then an abstract design Possibility to find and fix real bugs Little communication overhead Communication by code in git Learning by doing mistakes and fixing Release soon, release often A release every 2-3 weeks Good to have intermediary targets
file-format coverage Incremental reverse-engineering Nobody reinvents a wheel completely Two subsequent versions of the same file-format will have some common DNA Try to parse lower or higher version using the existing parser Fix issues as they appear Importance of a small number of reference documents covering many features Importance of having a working parser for other versions Experience makes difference Different ways of encoding the same information Different ways of structuring
and image content in this document is licensed under the Creative Commons Attribution-Share Alike 3.0 License (unless otherwise specified). "LibreOffice" and "The Document Foundation" are registered trademarks. Their respective logos and icons are subject to international copyright laws. The use of these therefore is subject to the trademark policy. Q&A and Stoning session