Collabora Online 25.04 Performance Improvements

Miklos V
June 04, 2025

Transcript

  1. 2/15 Meet Miklos Vajna

    • From Hungary
    • More details: https://www.collaboraonline.com/about-us/
    • Google Summer of Code 2010 / 2011
    • Rewrite of the Writer RTF import/export
    • Then a full-time LibreOffice developer for SUSE
    • Finally a contractor at Collabora since 2013
  2. 4/15 Speed of things: do you think it’s latency?

    [Figure: "Sample latencies - Milliseconds - linear plot", axis from 1 to 701. Labels and values (ms) in the order they appear on the slide:]
    • tile render + delta ...: 1
    • wired keyboard good: 2
    • tile render + delta bad: 3
    • hard disk seek time: 9
    • bluetooth keyboard ...: 10
    • Frankfurt – Milan: 12
    • wired keyboard bad: 15
    • 60Hz frame time: 16.6
    • Frankfurt – London: 27
    • bluetooth keyboard...: 30
    • mash keyboard / key: 30
    • Meeks typing / key: 90
    • Frankfurt – US East: 100
    • Screen render good: 100
    • Human eye blink: 100
    • Frankfurt – US West: 150
    • Screen render bad: 150
    • pro typist / key: 160
    • Frankfurt – Hong Kong: 196
    • average typist / key: 300
    • "good [web] start/ren...: 700

    Thanks to: RTINGS (hardware latency), Cloudping (network latency), Web latency, JsFiddle (typing latency)
  3. 5/15 Smarter Writer layout on editing

    • Default view-port: entire document, to track invalidations
    • LOK view-port: visible in the browser window
    • Sync layout for the latter
    • Async layout for the former
    • 446ms → 19ms for a document of 300 pages: 19x faster
    • [Figure: stack of function callers; width is proportional to time]
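The split between synchronous layout for the visible area and deferred layout for everything else can be sketched roughly as below. This is a toy Python model, not Writer's C++ layout code: `Page`, `LayoutScheduler`, `on_edit` and `on_idle` are hypothetical names, and the "layout" is just a flag flip.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Page:
    number: int
    laid_out: bool = False


@dataclass
class LayoutScheduler:
    """Hypothetical model: lay out visible pages now, defer the rest."""
    pages: list
    pending: deque = field(default_factory=deque)

    def on_edit(self, first_dirty: int, visible: range) -> int:
        """Invalidate pages from first_dirty on; return the sync work done."""
        for page in self.pages[first_dirty:]:
            page.laid_out = False
        sync_count = 0
        for number in visible:
            page = self.pages[number]
            if not page.laid_out:
                self._layout(page)  # synchronous: the user is looking at it
                sync_count += 1
        # Everything else waits for idle time.
        self.pending = deque(p for p in self.pages if not p.laid_out)
        return sync_count

    def on_idle(self, budget: int = 10) -> None:
        """Lay out a bounded number of deferred pages per idle callback."""
        for _ in range(min(budget, len(self.pending))):
            self._layout(self.pending.popleft())

    @staticmethod
    def _layout(page: Page) -> None:
        page.laid_out = True  # stand-in for the real, expensive layout
```

With 300 pages and a two-page visible area, an edit on the first page costs two synchronous page layouts in this model; the remaining 298 are worked off in idle slices.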
  4. 6/15 Smarter document load in Writer

    • Document load creates the document
    • View-port is a setting on the loaded document
    • We do a sync full layout on the document as part of load
    • Set a rough viewport as a load parameter
    • 540 ms → 112 ms speedup
  5. 7/15 Invalidation optimization

    • For all components
    • Track a bounding box of all painted areas
    • When there is a clean-start, or a full invalidation: reset to empty
    • Send a new tile → grow that bbox
    • When we get an invalidate → crop it against that bbox
    • Take the canonical view ID into account when doing this (dark mode, spellcheck, etc)
    • 477 → 8 invalidates during file load (source)
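The bounding-box bookkeeping above might look roughly like this. It is a minimal Python sketch under stated assumptions: `Rect` and `PaintedAreaTracker` are illustrative names, rectangles are half-open pixel boxes, and one bounding box is kept per canonical view ID.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int
    bottom: int

    def union(self, other: "Rect") -> "Rect":
        return Rect(min(self.left, other.left), min(self.top, other.top),
                    max(self.right, other.right), max(self.bottom, other.bottom))

    def intersect(self, other: "Rect") -> Optional["Rect"]:
        left, top = max(self.left, other.left), max(self.top, other.top)
        right, bottom = min(self.right, other.right), min(self.bottom, other.bottom)
        if left >= right or top >= bottom:
            return None  # no overlap: nothing to invalidate
        return Rect(left, top, right, bottom)


class PaintedAreaTracker:
    """Hypothetical tracker: one painted bounding box per canonical view ID."""

    def __init__(self):
        self._painted = {}

    def reset(self, view_id: int) -> None:
        # Clean start or full invalidation: nothing is considered painted.
        self._painted.pop(view_id, None)

    def tile_sent(self, view_id: int, tile: Rect) -> None:
        bbox = self._painted.get(view_id)
        self._painted[view_id] = tile if bbox is None else bbox.union(tile)

    def crop(self, view_id: int, invalidation: Rect) -> Optional[Rect]:
        """Return the part of the invalidation worth sending, or None."""
        bbox = self._painted.get(view_id)
        return None if bbox is None else bbox.intersect(invalidation)
```

An invalidation that falls entirely outside what the client has been sent is dropped; one that overlaps is shrunk to the overlap.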
  6. 8/15 Calc: atomic / threading improvement

    • ScFormulaCell::InterpretFormulaGroup 25997.699 ms
    • 20 threads, mutex locked shared SvNumberFormatter, etc
    • ScFormulaCell::InterpretFormulaGroup 3215.96 ms
    • Big rework to avoid mutex, cache NumberFormatter results
    • ScFormulaCell::InterpretFormulaGroup 2513.94 ms
    • RefCntPolicy None/ThreadLocal
    • Thanks to Caolán McNamara
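The general idea of caching results per thread so that the shared, mutex-protected object is rarely touched can be illustrated in a few lines. This is a Python analogy only, not the LibreOffice C++ change: `SharedFormatter` and `CachingFormatter` are made-up classes, and the "formatting" is a trivial lookup.

```python
import threading


class SharedFormatter:
    """Stand-in for a shared, mutex-protected number formatter."""

    def __init__(self):
        self._lock = threading.Lock()
        self.lock_acquisitions = 0  # counted to show how often we contend

    def format_type(self, format_key: int) -> str:
        with self._lock:
            self.lock_acquisitions += 1
            return "percent" if format_key % 2 else "number"


class CachingFormatter:
    """Per-thread cache in front of the shared formatter (illustrative)."""

    def __init__(self, shared: SharedFormatter):
        self._shared = shared
        self._local = threading.local()

    def format_type(self, format_key: int) -> str:
        cache = getattr(self._local, "cache", None)
        if cache is None:
            cache = self._local.cache = {}
        if format_key not in cache:
            # Only a cache miss takes the shared lock.
            cache[format_key] = self._shared.format_type(format_key)
        return cache[format_key]
```

With four worker threads each asking for three distinct keys a thousand times over, the shared lock is taken twelve times rather than four thousand.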
  7. 9/15 Server-side watchdog profiling

    • Raise a warning when a job takes >100ms on the main loop
    • Watchdog case: trigger something on “stalls”
    • Specialized watchdog thread which the event loop pings when there is some activity
    • If nothing updates the watchdog in a reasonable time then the watchdog fires something that perf can detect
    • What we use is an obscure syscall “futimesat” that basically nothing uses and profile with perf using:
    • perf record -e syscalls:sys_enter_futimesat
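The watchdog pattern described above (a separate thread that expects regular pings from the event loop and fires when they stop) can be sketched as follows. This is a Python illustration, not the server's C++ code: the `Watchdog` class is hypothetical, and the `on_stall` callback stands in for whatever marker the real watchdog emits (the slide mentions the `futimesat` syscall).

```python
import threading
import time


class Watchdog:
    """Hypothetical watchdog: calls on_stall when pings stop arriving."""

    def __init__(self, timeout_s: float, on_stall):
        self._timeout_s = timeout_s
        self._on_stall = on_stall
        self._pinged = threading.Event()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def ping(self) -> None:
        """Called by the event loop whenever it makes progress."""
        self._pinged.set()

    def stop(self) -> None:
        self._stop.set()
        self._pinged.set()  # wake the watchdog thread so it can exit
        self._thread.join()

    def _run(self) -> None:
        while not self._stop.is_set():
            if self._pinged.wait(self._timeout_s):
                self._pinged.clear()  # activity seen, start a new interval
            else:
                self._on_stall()      # no ping within the timeout: stall
```

While the loop keeps pinging, the callback never runs; once the loop blocks for longer than the timeout, the callback fires once per elapsed timeout interval until pings resume.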
  8. 11/15 Client-side watchdog

    • Warn after 50ms if we don't make it back to the main loop
    • Thanks to Chris Lord
    • No warnings → avoids unresponsive UI
    • Also: more aggressive JavaScript tile caching
    • Reusing tiles is better than nothing (avoids destruct + create)
    • 150 - 250 tiles as canvases (30-60 MB)
    • Manage canvas memory better
    • JS ‘GC’ is not your friend; need to explicitly memory manage these.
    • Store & manage zstd compressed tiles
    • Creating an actual ImageBitmap from data: hydrate the tile
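The tile-caching points (keep tiles compressed, hydrate on demand, and evict explicitly rather than relying on garbage collection) can be modelled in a small sketch. This is Python rather than the browser-side JavaScript, `TileCache` is a hypothetical name, and `zlib` from the standard library is swapped in for zstd purely to keep the example self-contained.

```python
import zlib
from collections import OrderedDict


class TileCache:
    """Illustrative cache: compressed tiles plus a bounded hydrated set."""

    def __init__(self, max_hydrated: int):
        self._compressed = {}           # tile key -> compressed bytes
        self._hydrated = OrderedDict()  # tile key -> raw pixels, LRU order
        self._max_hydrated = max_hydrated

    def store(self, key, pixels: bytes) -> None:
        self._compressed[key] = zlib.compress(pixels)  # zstd in the talk
        self._hydrated.pop(key, None)   # drop any stale hydrated copy

    def hydrate(self, key) -> bytes:
        """Return raw pixels, decompressing only when needed."""
        if key in self._hydrated:
            self._hydrated.move_to_end(key)  # mark as most recently used
            return self._hydrated[key]
        pixels = zlib.decompress(self._compressed[key])
        self._hydrated[key] = pixels
        while len(self._hydrated) > self._max_hydrated:
            self._hydrated.popitem(last=False)  # explicit eviction of the LRU tile
        return pixels

    def hydrated_keys(self) -> list:
        return list(self._hydrated)
```

The compressed copy stays around after eviction, so a tile that scrolls back into view is re-hydrated locally instead of being fetched again.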
  9. 12/15 New WebGL-based slideshow

    • Faster loading presentations, moving away from animated SVG
    • Up to 16x for large slide decks
    • 4K UHD support
    • Present from current slide
    • Present in window
    • 3D transitions
  10. 13/15 Improved pre-loading

    • Slideshow: more aggressive pre-fetching
    • Previous / next slide in direction of movement – 100ms after switch
    • Tracking global invalidations to manage larger cache properly
    • Scrolling: fetching and caching around the view area
    • Take the direction into account
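Direction-aware pre-fetching for the slideshow could be ordered along these lines. The function name, the `depth` parameter and the choice to also keep one slide behind the viewer are assumptions made for illustration; the talk only states that the direction of movement is taken into account.

```python
def slides_to_prefetch(previous: int, current: int, slide_count: int,
                       depth: int = 2) -> list:
    """Return slide indices to fetch, most likely destination first.

    Hypothetical policy: look `depth` slides ahead in the direction the
    presenter is moving, then one slide behind in case they step back.
    """
    step = 1 if current >= previous else -1
    ahead = [current + step * i for i in range(1, depth + 1)]
    behind = [current - step]
    # Drop candidates that fall outside the deck.
    return [s for s in ahead + behind if 0 <= s < slide_count]
```

Moving forward from slide 3 to slide 4 in a ten-slide deck yields slides 5 and 6 first and slide 3 last; moving backward reverses the priority.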
  11. 14/15 Ad-hoc DOM touching → LayoutingService

    • Ad-hoc DOM touching shows up as flicker
    • Instead: LayoutingService
    • Queue of jobs to be executed inside a single time slot
    • Similar to a rendering transaction
    • Meant to produce perfect frames
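A queue of jobs drained inside a single time slot can be sketched as below. This is a Python model of the idea, not the actual LayoutingService in the browser code: `LayoutJobQueue`, `run_frame` and the 4 ms default budget are assumptions, and the clock is injectable so the behaviour can be checked deterministically.

```python
import time
from collections import deque


class LayoutJobQueue:
    """Illustrative job queue: run queued DOM work together, within a budget."""

    def __init__(self, clock=time.monotonic):
        self._jobs = deque()
        self._clock = clock

    def append(self, job) -> None:
        """Queue a callable instead of touching the DOM immediately."""
        self._jobs.append(job)

    def run_frame(self, budget_s: float = 0.004) -> int:
        """Run queued jobs until the budget is spent; return how many ran.

        At least one job runs per frame so the queue always makes progress.
        """
        deadline = self._clock() + budget_s
        ran = 0
        while self._jobs and (ran == 0 or self._clock() < deadline):
            self._jobs.popleft()()
            ran += 1
        return ran
```

Callers queue their changes with `append`, and a single per-frame callback invokes `run_frame`, so all pending changes land together rather than at arbitrary points between frames.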
  12. 15/15 Summary

    • Performance is not only about CPU usage:
    • Interactivity / latency: server-side watchdog
    • Memory usage
    • Perfect frames in the browser: client-side watchdog
    • Demo servers → long-running CPU profiling
    • Multi-user testing
    • How does it “feel” in our community call with ~20 people
    • Profiling interactive stress testing