LibreOffice Writer Training

Miklos V
July 16, 2013

What's good to know before reading the source code


Transcript

  1. Writer Training
    What's good to know before reading the source code
    Miklós Vajna, LibreOffice Developer / Writer, 16 July 2013
  2. Overview
    • Tools helping development
    • Writer
    • Document model
    • UNO API
    • Layout
    • Filters
    • Testing
    • UI
    • Help
    • Extending ODF
    • Editing
  3. Tools helping development
    • Git: log, blame, bisect
    • Ctags / id-utils + http://docs.libreoffice.org
    • Gdb, Xray, tpconv
    • Vim / emacs
    • Pretty-printing:
    • SAL_DEBUG()
    • Edit zip file in-place
    • XML / RTF pretty-printer
    • Doc-dumper
    • Specifications: ODF, DOCX, DOC, RTF, etc.
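
ODF and DOCX files are ZIP packages of XML streams, which is why zip editing and an XML pretty-printer appear in the list above. A minimal sketch of the idea using only the Python standard library; the function names and the sample package are illustrative assumptions, not tools from the slides.

```python
import io
import zipfile
from xml.dom import minidom


def pretty_print_stream(package_bytes: bytes, stream_name: str) -> str:
    """Return one XML stream of a ZIP-based document, indented for reading."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        raw_xml = package.read(stream_name)
    return minidom.parseString(raw_xml).toprettyxml(indent="  ")


def make_sample_package() -> bytes:
    """Build a tiny in-memory ZIP standing in for a real .odt or .docx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", "<doc><p>Hello</p><p>World</p></doc>")
    return buffer.getvalue()


# Print the sample stream with one element per line.
print(pretty_print_stream(make_sample_package(), "content.xml"))
```

For a real file, the bytes would come from reading the document from disk, and the stream name would be something like `content.xml` (ODF) or `word/document.xml` (DOCX).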
  4. Where is the code?
    • LibreOffice has many modules (225 at the moment on master)
    • Writer-related modules
    • sw (StarWriter): Writer itself
    • Document model, layout, some filters
    • xmloff: (most of) ODF import/export
    • writerfilter: UNO-based DOCX/RTF import
    • oox: shared OOXML bits (between DOCX, XLSX, PPTX)
    • starmath: equation editor
  5. Document model
    • Writer does MVC as well
    • View is called layout, built from frames → also called FCM
    • One opened document ↔ SwDoc
    • SwDoc::GetNodes() → SwNode array (has pretty-printer in gdb)
    • Inside that, building block: paragraphs
    • One paragraph ↔ SwNode
    • Terminology:
    • Word has sections, paragraphs and runs
    • Writer has page styles, sections, paragraphs and text portions
  6. How properties are stored
    • SwNode has the paragraph text as a single OUString
    • Properties:
    • SfxPoolItem
    • Stored in an SfxItemSet
    • Think of it as a map<int, any>
    • “int” is called a WhichId:
    • Writer-specific ones are in sw/inc/hintids.hxx
    • SfxPoolItem has many subclasses, examples:
    • Bold: SvxWeightItem (Sv: StarView)
    • Paragraph adjust: SvxAdjustItem
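
The "map<int, any>" picture above can be sketched as a toy model. This is not LibreOffice code: the class names only mirror SfxPoolItem and SfxItemSet, and the numeric ids are made up (the real WhichIds are defined in sw/inc/hintids.hxx).

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical ids for illustration only.
WHICH_WEIGHT = 1
WHICH_ADJUST = 2


@dataclass(frozen=True)
class PoolItem:
    """Stand-in for SfxPoolItem: one property value tagged with its WhichId."""
    which: int


@dataclass(frozen=True)
class WeightItem(PoolItem):
    """Stand-in for SvxWeightItem."""
    bold: bool = False


@dataclass(frozen=True)
class AdjustItem(PoolItem):
    """Stand-in for SvxAdjustItem."""
    adjust: str = "left"


class ItemSet:
    """Stand-in for SfxItemSet: maps a WhichId to at most one item."""

    def __init__(self) -> None:
        self._items: Dict[int, PoolItem] = {}

    def put(self, item: PoolItem) -> None:
        # A later item with the same WhichId replaces the earlier one.
        self._items[item.which] = item

    def get(self, which: int) -> Optional[PoolItem]:
        # None means the property is not set in this set.
        return self._items.get(which)


paragraph_props = ItemSet()
paragraph_props.put(WeightItem(WHICH_WEIGHT, bold=True))
paragraph_props.put(AdjustItem(WHICH_ADJUST, adjust="center"))
```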
  7. More on SfxItemSet
    • Can contain ranges of WhichIds: _pWhichRanges
    • Array of pointers: value “n”: start of a range
    • Value “n+1”: end of a range
    • End of the list: 0
    • Can have a parent: think of style inheritance
    • While debugging: _nCount contains the size
    • Items are pointers: _aItems
    • If a property is “set”, its pointer is non-zero
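
The range list and the parent lookup described above can be sketched as follows. This is a toy model under stated assumptions: ranges are treated as inclusive, "size" is read as the number of items that are set, and `None` stands in for a null pointer. The class and method names are hypothetical.

```python
from typing import Any, Iterator, List, Optional, Tuple


class RangedItemSet:
    """Toy item set with WhichId ranges and an optional parent set."""

    def __init__(self, which_ranges: List[int],
                 parent: Optional["RangedItemSet"] = None) -> None:
        # Flat list of (start, end) pairs terminated by 0, e.g. [1, 3, 10, 12, 0].
        self.which_ranges = which_ranges
        self.parent = parent
        # One slot per accepted WhichId; None plays the role of a null pointer.
        self.items: List[Any] = [None] * sum(
            end - start + 1 for start, end in self._pairs())

    def _pairs(self) -> Iterator[Tuple[int, int]]:
        i = 0
        while self.which_ranges[i] != 0:
            yield self.which_ranges[i], self.which_ranges[i + 1]
            i += 2

    def _slot(self, which: int) -> Optional[int]:
        """Index into self.items for a WhichId, or None if out of range."""
        offset = 0
        for start, end in self._pairs():
            if start <= which <= end:
                return offset + (which - start)
            offset += end - start + 1
        return None

    def put(self, which: int, value: Any) -> None:
        slot = self._slot(which)
        if slot is None:
            raise KeyError(f"WhichId {which} is outside the accepted ranges")
        self.items[slot] = value

    def count(self) -> int:
        """Number of items that are actually set in this set."""
        return sum(1 for item in self.items if item is not None)

    def get(self, which: int) -> Any:
        slot = self._slot(which)
        if slot is not None and self.items[slot] is not None:
            return self.items[slot]
        # Not set here: fall back to the parent, like style inheritance.
        return self.parent.get(which) if self.parent is not None else None


style = RangedItemSet([1, 3, 10, 12, 0])
style.put(2, "bold")
paragraph = RangedItemSet([1, 3, 10, 12, 0], parent=style)
paragraph.put(11, "centered")
```

Here `paragraph.get(2)` finds nothing set locally and returns the value from `style`, which is the inheritance behaviour the slide alludes to.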
  8. Character attributes
    • Direct formatting is in SwTxtNode::m_pSwpHints
    • Each such formatting is a “hint”
    • Either just a character index
    • E.g. field
    • Or a start-end (e.g. bold)
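
The two kinds of hints can be pictured with a small model. This is a sketch, not the real SwTxtNode layout: the class names are hypothetical, and treating the end position as exclusive is an assumption of this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hint:
    """One piece of direct character formatting attached to a paragraph."""
    name: str
    start: int
    end: Optional[int] = None  # None: the hint sits at a single character index

    def covers(self, index: int) -> bool:
        if self.end is None:
            return index == self.start
        # Assumption of this sketch: the end position is exclusive.
        return self.start <= index < self.end


@dataclass
class TextNode:
    """Paragraph text as one string plus a list of hints over it."""
    text: str
    hints: List[Hint]

    def hints_at(self, index: int) -> List[str]:
        return [hint.name for hint in self.hints if hint.covers(index)]


node = TextNode("Hello world", [Hint("bold", 0, 5), Hint("field", 6)])
```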
  9. How to debug the doc. model
    • Demo:
    • Gdb
    • Document model XML dump
    • Xray
  10. UNO API
    • This is the public API, any change to it comes with some cost
    • Still, not set in stone
    • Extensions use this, UNO-supported languages (C++, Java, Python etc) can connect to a running soffice using URP
    • If the document model is changed, the API has to be updated in most cases
    • We serialize everything to ODF, and that uses the UNO API as well
    • Bad: slower than necessary
    • Good: UNO API is kept up to date
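
Connecting to a running soffice over URP from Python can look roughly like this. The sketch assumes the `uno` module that ships with LibreOffice's Python, and an soffice instance started to accept socket connections; the host and port values are example choices, not something the slides specify.

```python
def build_uno_url(host: str = "localhost", port: int = 2002) -> str:
    """Build the UNO URL for a socket connection speaking the URP protocol."""
    return f"uno:socket,host={host},port={port};urp;StarOffice.ComponentContext"


def connect(host: str = "localhost", port: int = 2002):
    """Return the Desktop service of a running soffice instance.

    Only works inside a Python that can import `uno`, and only if soffice
    is already listening on the given host and port.
    """
    import uno  # imported lazily so the helper above works without LibreOffice

    local_context = uno.getComponentContext()
    resolver = local_context.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", local_context)
    remote_context = resolver.resolve(build_uno_url(host, port))
    return remote_context.ServiceManager.createInstanceWithContext(
        "com.sun.star.frame.Desktop", remote_context)
```

With a connection established, `connect().getCurrentComponent()` would return the document that currently has focus, and its content can then be inspected through the same UNO API that extensions use.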
  11. UNO API (continued)
    • When adding a new feature, once the UNO API is implemented, the document model can be read / written
    • Other approach: implement the UI
    • Properties themselves:
    • Most SfxPoolItems have two methods to load / save:
    • QueryValue() + PutValue()
    • New frame, paragraph, character, list (etc.) property:
    • sw/source/core/unocore/
    • Maps between UNO's string + any key-value and WhichIds + SfxPoolItems
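
The mapping between UNO's string + any pairs and WhichIds + items can be sketched as a lookup table plus a pair of conversion methods. This is a toy model: the WhichId value, the table names and the bold threshold are assumptions of the example. `CharWeight` and the values 100.0 (normal) and 150.0 (bold) are taken from the public UNO API.

```python
from typing import Any, Dict, Optional

WHICH_WEIGHT = 1  # hypothetical id, for illustration only


class WeightItem:
    """Stand-in for a pool item with QueryValue() / PutValue() style methods."""

    def __init__(self, bold: bool = False) -> None:
        self.bold = bold

    def query_value(self) -> float:
        # Save direction: internal item -> UNO value.
        return 150.0 if self.bold else 100.0

    def put_value(self, value: float) -> None:
        # Load direction: UNO value -> internal item (simplified threshold).
        self.bold = value >= 150.0


# UNO property name -> WhichId, and WhichId -> item class.
PROPERTY_MAP: Dict[str, int] = {"CharWeight": WHICH_WEIGHT}
ITEM_CLASSES: Dict[int, type] = {WHICH_WEIGHT: WeightItem}


def set_property(items: Dict[int, Any], name: str, value: Any) -> None:
    """Translate a (name, value) pair into an item stored under its WhichId."""
    which = PROPERTY_MAP[name]
    item = ITEM_CLASSES[which]()
    item.put_value(value)
    items[which] = item


def get_property(items: Dict[int, Any], name: str) -> Optional[Any]:
    """Translate a stored item back into its UNO value, or None if unset."""
    item = items.get(PROPERTY_MAP[name])
    return None if item is None else item.query_value()
```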
  12. Layout
    • Most complex part:
    • No easy way to test automatically
    • Think of missing fonts on test machines
    • Document model has only paragraphs, not pages
    • One opened document ↔ multiple layouts
    • Try it: Window → new window
    • Typically single layout: SwRootFrm (root frame)
    • Inside: pages ↔ SwPageFrm
    • Paragraphs ↔ SwTxtFrm
  13. Doc. model → layout notification
    • SwModify: kind of a server, e.g. SwTxtNode
    • SwClient: the client, e.g. SwTxtFrm
    • SwModify ↔ SwClient is 1:N
    • SwModify has Modify(SfxPoolItem* pOld, SfxPoolItem *pNew)
    • So layout can react without building from scratch
    • SwClient can only be registered in one SwModify
    • But SwClient can have multiple SwDepend (which is an SwClient)
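
The notification scheme above is an observer pattern with a one-registration rule. A minimal sketch, assuming that registering a client somewhere new silently removes it from its previous server and that a depend object forwards to its owner; the Python class names only echo SwModify, SwClient and SwDepend.

```python
from typing import Any, List, Optional


class Client:
    """Stand-in for SwClient: registered in at most one Modify at a time."""

    def __init__(self) -> None:
        self.registered_in: Optional["Modify"] = None
        self.seen: List[Any] = []

    def modify(self, old: Any, new: Any) -> None:
        # A real frame would invalidate only the affected part of the layout.
        self.seen.append((old, new))


class Depend(Client):
    """Stand-in for SwDepend: an extra client that forwards to its owner."""

    def __init__(self, owner: Client) -> None:
        super().__init__()
        self.owner = owner

    def modify(self, old: Any, new: Any) -> None:
        self.owner.modify(old, new)


class Modify:
    """Stand-in for SwModify: notifies all registered clients of a change."""

    def __init__(self) -> None:
        self.clients: List[Client] = []

    def add(self, client: Client) -> None:
        # Enforce the rule: one client, one server.
        if client.registered_in is not None:
            client.registered_in.remove(client)
        self.clients.append(client)
        client.registered_in = self

    def remove(self, client: Client) -> None:
        self.clients.remove(client)
        client.registered_in = None

    def modify(self, old: Any, new: Any) -> None:
        for client in list(self.clients):
            client.modify(old, new)
```

A frame that needs to follow two servers registers itself in one and creates a `Depend(frame)` for the other, so both notifications end up at the same frame.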
  14. Related: textframes and drawings
    • Writer has its own text frame
    • Can contain anything: tables, columns, fields, etc.
    • Does not support advanced drawing features
    • Like rounded corners
    • Drawinglayer (shared) takes care of all other drawings
    • Also has a rectangle, with all features one can ever wish for
    • Rounded edges, rotations, etc.
    • Except it doesn't know about Writer layout, so can't contain fields, etc.
    • Problem for Word interop:
    • They don't have this code shared, so combining the two feature lists above is possible there
  15. Filters
    • Every feature stored in the document model has to be serialized / loaded back to every file format
    • Or you lose data
    • In practice: ODF should not lose data, the rest should be good enough
    • Important filters:
    • ODF (.odt)
    • OOXML (.docx)
    • WW8 (.doc)
    • RTF (.rtf)
    • Rest: HTML, plain text, etc.
  16. ODF filter
    • If you extend the document model, this has to be updated before the change hits a release
    • So users have at least one format which doesn't lose data for sure
    • Mostly uses the UNO API:
    • Code under xmloff/
    • Some Writer-specific bits are using the internal API:
    • sw/source/filter/xml/
    • Is an open standard, proposals for new features can be submitted
  17. OOXML: DOCX
    • Import:
    • Uses the UNO API, code under writerfilter/
    • Tokenizer:
    • Shared XML parser, model.xml → tokens
    • Domain mapper:
    • Handles the incoming stream of tokens and maps them to UNO API
    • Tokenizer → dmapper traffic is XML logged:
    • cd writerfilter; make -sr dbglevel=2, then /tmp/test.docx*.XML after load
    • Export:
    • Shared with RTF/WW8, uses internal API
    • sw/source/filter/ww8/docx*
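
The tokenizer → domain mapper split can be illustrated with a few lines of Python. This is only a sketch of the idea, not how writerfilter is implemented: the function names are hypothetical, only two elements are handled, and the paragraph alignment is passed through as a plain string where the real UNO property expects an enum value.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Iterator, Tuple

Token = Tuple[str, Dict[str, str]]

SAMPLE = (
    '<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:pPr><w:jc w:val="center"/></w:pPr>'
    '<w:r><w:rPr><w:b/></w:rPr><w:t>Hi</w:t></w:r>'
    '</w:p>'
)


def tokenize(xml_text: str) -> Iterator[Token]:
    """Turn XML into a flat stream of (local name, attributes) tokens."""
    for element in ET.fromstring(xml_text).iter():
        local_name = element.tag.split("}")[-1]
        attributes = {key.split("}")[-1]: value
                      for key, value in element.attrib.items()}
        yield local_name, attributes


def map_tokens(tokens: Iterable[Token]) -> Dict[str, object]:
    """Map the token stream to UNO-style property names and values."""
    properties: Dict[str, object] = {}
    for name, attributes in tokens:
        if name == "b":
            # Simplification: ignores an explicit w:val="false" on the element.
            properties["CharWeight"] = 150.0
        elif name == "jc":
            properties["ParaAdjust"] = attributes.get("val")
    return properties


print(map_tokens(tokenize(SAMPLE)))
```

Because the mapper only sees tokens, a second tokenizer for another format can feed the same mapper, which is the design the RTF import relies on.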
  18. OOXML: shared parts
    • For drawing and other shared parts, writerfilter calls into oox
    • VML import: oox/source/vml/
    • VML export: oox/source/export/vmlexport.cxx
    • Also: metadata parsing (author, date, etc.)
    • Math expressions: both import/export under starmath/
    • starmath/source/ooxml*
  19. WW8 (.doc)
    • Oldest Writer filter:
    • Binfilter was even older, but it's removed
    • Import/export somewhat shared
    • Uses the internal API
    • Code under sw/source/filter/ww8/
    • Shared (doc, xls, ppt) parts:
    • filter/source/msfilter/
    • Using doc-dumper may help
  20. RTF (.rtf)
    • Export is shared with DOC/DOCX:
    • Code under sw/source/filter/ww8/rtf*
    • Import is shared with DOCX:
    • Code under writerfilter/source/rtftok/
    • Domain mapper is the same for RTF and DOCX
    • Math:
    • Import generates OOXML tokens (RTF-specific part is inside the normal RTF tokenizer)
    • Export is shared with DOCX:
    • Code under starmath/source/rtf*
  21. Testing
    • What's easy: filter tests
    • Both import / export
    • Poke around with xray, then assert the UNO document model
    • The rest is more challenging
    • We have uwriter, which has access to private sw symbols
    • No UI tests: that's still to be figured out
  22. UI
    • Again, shared with other modules where it makes sense
    • Doesn't use the UNO API
    • Input/output for the dialog is an SfxItemSet
    • Own toolkit: VCL
    • Newer dialogs use the GTK .ui format
    • Glade is a GUI to edit those
    • If you have to touch an older dialog: best to convert it first
    • Doesn't take too much time
  23. Help
    • Lots of help buttons on UI
    • Typically every existing dialog has a related help page
    • If you add a new UI element, it makes sense to spend a minute on updating the related help
    • Requires a --with-help build
    • XML based, also stored in git, just a different repo
    • Offline / online help is generated from that
  24. Extending ODF
    • ODF is really close to the UNO API that we offer
    • Typically 1 UNO property ↔ 1 XML attribute in ODF
    • If you extend the UNO API
    • Go ahead with updating the ODF filter
    • After implementation is ready:
    • See https://wiki.documentfoundation.org/Development/ODF_Implementer_Notes#LibreOffice_ODF_extensions
    • Submit a proposal to OASIS, so it can be part of the next version of the standard
  25. Bookmarks
    • Wiki: https://wiki.documentfoundation.org/Development/Writer
    • New feature checklist, ODF implementer notes, etc.
    • sw README: http://opengrok.libreoffice.org/xref/core/sw/README
    • Older Writer notes:
    • http://cgit.freedesktop.org/libreoffice/build/tree/doc/sw-flr.otl?h=master-backup
    • http://cgit.freedesktop.org/libreoffice/build/tree/doc/sw.txt?h=master-backup